The Handbook of Clinical Linguistics
Blackwell Handbooks in Linguistics

This outstanding multi-volume series covers all the major subdisciplines within linguistics today and, when complete, will offer a comprehensive survey of linguistics as a whole. The most recent publications in the series can be found below. To see the full list of titles available in the series, please visit www.wiley.com/go/linguistics-handbooks

The Handbook of Clinical Linguistics. Edited by Martin J. Ball, Michael R. Perkins, Nicole Müller, & Sara Howard
The Handbook of Pidgin and Creole Studies. Edited by Silvia Kouwenberg & John Victor Singler
The Handbook of Language, Gender, and Sexuality, Second Edition. Edited by Susan Ehrlich, Miriam Meyerhoff, & Janet Holmes
The Handbook of Language Emergence. Edited by Brian MacWhinney & William O'Grady
The Handbook of Language Teaching. Edited by Michael H. Long & Catherine J. Doughty
The Handbook of Discourse Analysis, Second Edition. Edited by Deborah Tannen, Heidi E. Hamilton, & Deborah Schiffrin
The Handbook of Phonetic Sciences, Second Edition. Edited by William J. Hardcastle & John Laver
The Handbook of Korean Linguistics. Edited by Lucien Brown & Jaehoon Yeon
The Handbook of Language and Speech Disorders. Edited by Jack S. Damico, Nicole Müller, & Martin J. Ball
The Handbook of Speech Production. Edited by Melissa A. Redford
The Handbook of Language Contact. Edited by Raymond Hickey
The Handbook of Classroom Discourse and Interaction. Edited by Numa Markee
The Handbook of Computational Linguistics and Natural Language Processing. Edited by Alexander Clark, Chris Fox, & Shalom Lappin
The Handbook of Narrative Analysis. Edited by Anna De Fina & Alexandra Georgakopoulou
The Handbook of English Pronunciation. Edited by Marnie Reed & John M. Levis
The Handbook of Language and Globalization. Edited by Nikolas Coupland
The Handbook of Hispanic Sociolinguistics. Edited by Manuel Díaz-Campos
The Handbook of Language Socialization. Edited by Alessandro Duranti, Elinor Ochs, & Bambi B. Schieffelin
The Handbook of Phonological Theory, Second Edition. Edited by John A. Goldsmith, Jason Riggle, & Alan C. L. Yu
The Handbook of Intercultural Discourse and Communication. Edited by Christina Bratt Paulston, Scott F. Kiesling, & Elizabeth S. Rangel
The Handbook of Historical Sociolinguistics. Edited by Juan Manuel Hernández-Campoy & Juan Camilo Conde-Silvestre
The Handbook of Hispanic Linguistics. Edited by José Ignacio Hualde, Antxon Olarrea, & Erin O'Rourke
The Handbook of Conversation Analysis. Edited by Jack Sidnell & Tanya Stivers
The Handbook of English for Specific Purposes. Edited by Brian Paltridge & Sue Starfield
The Handbook of Bilingualism and Multilingualism, Second Edition. Edited by Tej K. Bhatia & William C. Ritchie
The Handbook of Language Variation and Change, Second Edition. Edited by J. K. Chambers & Natalie Schilling
The Handbook of Spanish Second Language Acquisition. Edited by Kimberly L. Geeslin
The Handbook of Chinese Linguistics. Edited by C.-T. James Huang, Y.-H. Audrey Li, & Andrew Simpson
The Handbook of Bilingual and Multilingual Education. Edited by Wayne E. Wright, Sovicheth Boun, & Ofelia García
The Handbook of Contemporary Semantic Theory, Second Edition. Edited by Shalom Lappin & Chris Fox
The Handbook of Portuguese Linguistics. Edited by W. Leo Wetzels, João Costa, & Sérgio Menuzzi
The Handbook of Linguistics, Second Edition. Edited by Mark Aronoff & Janie Rees-Miller
The Handbook of Translation and Cognition. Edited by John W. Schwieter & Aline Ferreira
The Handbook of Technology and Second Language Teaching and Learning. Edited by Carol A. Chapelle & Shannon Sauro
The Handbook of Psycholinguistics. Edited by Eva M. Fernández & Helen Smith Cairns
The Handbook of Dialectology. Edited by Charles Boberg, John Nerbonne, & Dominic Watt
The Handbook of Advanced Proficiency in Second Language Acquisition. Edited by Paul A. Malovrh & Alessandro G. Benati
The Handbook of the Neuroscience of Multilingualism. Edited by John W. Schwieter
The Handbook of TESOL in K-12. Edited by Luciana C. de Oliveira
The Handbook of World Englishes, Second Edition. Edited by Cecil L. Nelson, Zoya G. Proshina, & Daniel R. Davis
The Handbook of Clinical Linguistics, Second Edition. Edited by Martin J. Ball, Nicole Müller, & Elizabeth Spencer
The Handbook of Clinical Linguistics
Second Edition

Edited by
Martin J. Ball, Nicole Müller, and Elizabeth Spencer
This edition first published 2024
© 2024 John Wiley & Sons Ltd

Edition History
1e, © Blackwell Publishing Ltd (hardback, 2008)
1e, © Blackwell Publishing Ltd (paperback, 2011)

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Martin J. Ball, Nicole Müller, and Elizabeth Spencer to be identified as the authors of the editorial material in this work has been asserted in accordance with law.

Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

A catalogue record for this book is available from the Library of Congress.

Hardback ISBN: 9781119875901; ePDF ISBN: 9781119875925; ePub ISBN: 9781119875932; oBook ISBN: 9781119875949

Cover images: © Supremus No. 55, Kazimir Malevich, 1916, public domain (Wikimedia Commons, photographic reproduction)
Cover design by Wiley

Set in 9.5/11.5pt Palatino by Integra Software Services Pvt. Ltd, Pondicherry, India
Contents
List of Figures
Notes on Contributors
Introduction
Martin J. Ball, Nicole Müller, and Elizabeth Spencer

Part 1: Pragmatics, Discourse, and Sociolinguistics

1 Discourse Analysis and Communication Impairment
Louise C. Keegan, Jacqueline A. Guendouzi, and Nicole Müller
2 Conversational Implicature and Communication Disorders
Francesca Foppolo and Greta Mazzaggio
3 Relevance Theory and Communication Atypicalities
Elly Ifantidou and Tim Wharton
4 Neuropragmatics
Luca Bischetti, Federico Frau, and Valentina Bambini
5 Pragmatic Impairment as an Emergent Phenomenon
Michael R. Perkins and Jamie H. Azios
6 Conversation Analysis and Communication Disorders
Ray Wilkinson
7 Clinical Sociolinguistics
Brent Archer, Eleanor Gulick, Jack S. Damico, and Martin J. Ball
8 Systemic Functional Linguistics and Communication Disorders
Elizabeth Spencer and Alison Ferguson
9 Multimodal Analysis of Interaction
Scott Barnes and Francesco Possemato
10 Cross-Linguistic and Multilingual Perspectives on Communicative Competence and Communication Impairment: Pragmatics, Discourse, and Sociolinguistics
Zhu Hua and Li Wei
11 Clinical Corpus Linguistics
Davida Fromm and Brian MacWhinney

Part 2: Syntax and Semantics

12 Generative Syntactic Theory and Language Disorders
Martina Penke and Eva Wimmer
13 Formulaic Sequences and Language Disorders
Alison Wray
14 Syntactic Processing in Developmental and Acquired Language Disorders
Theodoros Marinis
15 Inflectional Morphology and Language Disorders
Martina Penke
16 Normal and Impaired Semantic Processing of Words
Marilyne Joyal, Maximiliano A. Wilson, and Yves Joanette
17 Neural Correlates of Neurotypical and Pathological Language Processing
Sonja A. Kotz, Stefan Frisch, and Angela D. Friederici
18 Developmental Language Disorder in a Bilingual Context
Jan de Jong
19 Cross-Linguistic Perspectives on Morphosyntax in Child Language Disorders
Stanislava Antonijevic and Natalia Meir
20 The Complex Relationship between Cognition and Language: Illustrations from Acquired Aphasia
Lyndsey Nickels, Bruna Tessaro, Solène Hameau, and Christos Salis
21 Linguistic and Motoric Disorders in the Sign Modality
Martha E. Tyrone

Part 3: Phonology

22 Phonology and Clinical Phonology
Elena Even-Simkin
23 Constraints-based Nonlinear Phonological Theories in Clinical Phonology Across Languages
Barbara May Bernhardt, Joseph P. Stemberger, Glenda Mason, and Daniel Bérubé
24 Articulatory Phonology and Speech Impairment
Christina Hagedorn and Aravind Namasivayam
25 Government Phonology and Speech Impairment
Martin J. Ball and Ben Rutter
26 A Usage-based Approach to Clinical Phonology
Anna V. Sosa and Joan L. Bybee
27 Typical and Nontypical Phonological Development
Michelle Pascoe
28 Vowel Development and Disorders
Karen Pollock and Carol Stoel-Gammon
29 Cross-Linguistic Phonological Acquisition
David Ingram and Elena Babatsouli
30 Cross-linguistic Aspects of System and Structure in Clinical Phonology
Mehmet Yavaş and Margaret Kehoe
31 Connected Speech
Caroline Newton, Sara Howard, Bill Wells, and John Local
32 Clinical Phonology and Phonological Assessment
Barbara Dodd, Alison Holm, and Sharon Crosbie

Part 4: Phonetics

33 Phonetic Transcription in Clinical Practice
Sally Bates, Jocelynne Watson, Barry Heselwood, and Sara Howard
34 Instrumental Analysis of Speech Production
Lucie Ménard and Mark Tiede
35 Instrumental Analysis of Articulation
Yunjung Kim, Raymond D. Kent, and Austin Thompson
36 Instrumental Analysis of Voice
Meike Brockmann-Bauser
37 Measures of Speech Perception
Jan Wouters, Robin Gransier, and Astrid van Wieringen
38 Neurophonetics
Wolfram Ziegler, Ingrid Aichert, Theresa Schölderle, and Anja Staiger
39 Coarticulation and Speech Impairment
Ivana Didirková
40 Prosodic Impairments
Bill Wells and Traci Walker
41 Speech Intelligibility
Julie Liss
42 Sociophonetics and Clinical Linguistics
Gerard Docherty and Ghada Khattab

Index
List of Figures
1.1. Structure of language
3.1. The bi-dimensional continuum (redrawn from Sperber & Wilson, 2015). Reproduced with permission of the Croatian Journal of Philosophy, Vol. XV, No. 44, p. 123 (from Sperber, D. & Wilson, D., "Beyond Speaker's Meaning")
4.1. Conceptual representation of the application of electroencephalography (EEG) to the study of pragmatic phenomena
11.1. TBI Bank homepage
11.2. Browsable database
11.3. Collaborative commentary example
12.1. Typical three-person scenario as tested in a picture-pointing task (Wimmer, 2010)
12.2. Syntactic tree of the wh-question "Wen kitzelt der Junge?" (= who is the boy tickling)
12.3. Pruned syntactic tree of the wh-question "Wen kitzelt der Junge?" (= who is the boy tickling)
12.4. Trace Deletion Hypothesis (a) and Relativized Minimality approach (b, c) in individuals with language disorders and absent morphosyntactic features
17.1. Left-hand side: view of the left hemisphere with the cortical gyri (IFG, STG, and MTG) that are most relevant for language processing. The respective Brodmann areas (BA) are numbered. BA39 is the Angular Gyrus (AG). Right-hand side: the neurocognitive model, an adapted version of Friederici (2017), showing the subsequent phases of syntactic and semantic processing (associated with the different language-related ERP components) and the brain regions that support them. Note that early processes of speech segmentation and phonological processing are not depicted here, since they are not discussed in the present chapter. For further explanation see text
21.1. The ASL signs FATHER, MOTHER, and FINE
21.2. The ASL sign ASK
23.1. Phonological hierarchy from the prosodic phrase to the segmental tier
23.2. Feature hierarchy
24.1. Coupling graph corresponding to the words "mad," "bad," and "pad." In-phase gestures are connected by solid lines, whereas anti-phase gestures are connected by dashed lines
24.2. Gestural scores corresponding to the words "mad," "bad," and "pad."
26.1. Emergent morphological analysis of unbelievable (Bybee, 1998). Reproduced by permission of the Chicago Linguistics Society
27.1. The developmental phase model (adapted from Pascoe et al., 2006; Stackhouse & Wells, 2001, with permission from John Wiley and Sons)
27.2. Developmental phase model indicating difficulties that could occur at each phase (adapted from Pascoe et al., 2006; Stackhouse & Wells, 2001, with permission from John Wiley and Sons)
34.1. Experimental setup for video tracking using blue markers (left), and example of OpenFace landmark fitting (right)
34.2. Carstens AG501 (left) and typical EMA sensor layout (right)
34.3. Landmarks for delimiting an apical nasal gesture, showing the vertical component of tongue tip movement (TTy) and corresponding absolute velocity (TTvel). Gesture on/offset (GONS/GOFFS), peak velocity (PVEL), and target maximum constriction (MAXC) are determined by inflections on the velocity signal
34.4. Midsagittal B-mode ultrasound image with superimposed tongue surface contour (left); probe stabilization helmet (right)
34.5. Example EPG contact pattern showing co-constriction of /k/ and /t/ in "pact."
35.1. Number of articles reporting EMA data on speech published annually for the period 1987–2022. The search was conducted in August 2022. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.2. The classic chart for vowel acoustics and kinematics. The labeled vowels are the corner vowels of American English. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.3. The effect size (Hedges' g; Hedges & Olkin, 2014) and 95% confidence intervals for acoustic measures capturing the size of the articulatory movement across studies of typical speech and speech produced by individuals with Parkinson's disease. Group comparisons that were not statistically significant are represented in gray. VSA = vowel space area; tVSA = triangular vowel space area; VSAlax = vowel space area calculated using lax vowels; FCR = formant centralization ratio; VSD10, 90 = vowel space density of the innermost 10 and 90% of the formant density distribution, respectively; AAVS = articulatory-acoustic vowel space; AVD = acoustic vowel distance. The inverse effect size for FCR is reported for ease of interpretation. Speakers with Parkinson's disease demonstrated higher FCR values compared to neurotypical speakers, indicating more centralized formants and reduced movement. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.4. An example of F2 slope. The waveform and spectrogram of the word "buy" are shown at the top and bottom of the figure, respectively. The transition state of F2, defined as a time interval including a spectral change greater than 20 Hz over 20 ms, is indicated by a boxed area. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.5. An example of articulatory contrast for stop consonants in a neurotypical speaker (left) and a speaker with Parkinson's disease (right). Each dot represents the average position of lingual (TF: tongue front; TB: tongue back) and labial sensors (UL: upper lip; LL: lower lip) during passage reading for /p/, /t/, and /k/. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.6. Illustration of voice onset time (VOT) in relation to the categories of simultaneous voicing, prevoicing, short lag, and long lag. The first three of these are typically associated with voiced consonants in English, whereas long lag is associated with voiceless consonants. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
35.7. Kinematic hull areas of the tongue (gray) and jaw (black), respectively, obtained from a passage reading task. The tongue sensor was placed approximately 5 cm from the tongue tip, and the jaw sensor was adhered to the labial surface of the lower central incisors
35.8. An example of the spatiotemporal index (STI) comparing a neurotypical speaker (top, STI = 14.21) and a speaker with Parkinson's disease (bottom, STI = 20.93). © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023
36.1. Voice Range Profile (VRP) example of a male adult with a unilateral polyp before and after treatment
36.2. Schema of EGG and dEGG waves
37.1. Illustration of three elements of the schema of this chapter on speech perception measures: measures of speech perception of the listener, with the three families of speech perception measures; the speech to be perceived by the listener; and the impact of noise maskers and the listening environment. The impact of neural and cognitive processing specific to the listener (in person or model, real or virtual) is listed in Table 37.1
42.1. Predicted formant trajectories (F1 and F2) for goat and thought by Tyneside females (top) and males (bottom) across three age groups (Warburton, 2020)
42.2. Frequency of occurrence of fricative and stop variants of (ð) in Shetewi's (2018) data from Khan Eshieh Camp in Syria
42.3. Frequency of occurrence of pre-aspirated variants of (t) in Docherty et al.'s (2006) data from Newcastle children
Notes on Contributors
Ingrid Aichert, PhD, is a speech-language pathologist. Since 2002, she has worked as a Research Associate in the Clinical Neuropsychology Research Group (EKN) at the Institute of Phonetics and Speech Processing, Ludwig-Maximilian University of Munich, Germany. Her main research interests are neurogenic speech-sound disorders, in particular apraxia of speech, and phonological impairments in aphasia.

Stanislava Antonijevic-Elliott is an Associate Professor at the School of Health Sciences, University of Galway, Ireland. She serves as the Head of Discipline of Speech and Language Therapy and is the co-director of the MSc in Applied Multilingualism. She is a member of the Multilingual and Multicultural Affairs Committee of the International Association of Logopedics and Phoniatrics (IALP). Her research interests include monolingual and multilingual typical and atypical language development, focusing on language assessment. She is also interested in the use of narratives as a language assessment tool, as well as in their potential for language enrichment in school-age children and adult language learners.

Brent Archer hails from Johannesburg, South Africa. After completing a degree in speech-language pathology/audiology at the University of the Witwatersrand in 2006, Brent worked as an SLP/audiologist at hospitals and schools in the Free State province of South Africa. In 2012, he entered the Applied Languages and Speech Sciences PhD program at the University of Louisiana at Lafayette. For his dissertation, Brent used frameworks such as conversation analysis and cognitive ethnography to study conversation groups for people with aphasia, investigating how groups of people with language deficits collaborated and used resources to remain effective, competent communicators. He is now an Associate Professor in the Communication Disorders and Sciences Department at Bowling Green State University. In 2020, Brent was chosen as a Tavistock Aphasia Distinguished Scholar.

Jamie H. Azios, PhD, CCC-SLP, is the Doris B. Hawthorne Endowed Chair in the Department of Communicative Disorders at the University of Louisiana at Lafayette. Her research interests include qualitative research methodologies, understanding the perspectives of people living with communication disabilities, co-constructed conversation in aphasia, and the impact of communicative environments on social participation and inclusion. She has published articles related to person-centeredness, communication access, functional outcomes of aphasia therapy, and friendship and aphasia.
Elena Babatsouli is an Associate Professor and Blanco/BORSF Endowed Professor in Communicative Disorders at the University of Louisiana at Lafayette. She is founding co-editor of the Journal of Monolingual and Bilingual Speech (Equinox) and founding chair of the biennial International Symposium on Monolingual and Bilingual Speech. Elena publishes on child monolingual/bilingual phonological acquisition and assessment, SSDs, SLA, CLD in speech-language pathology, psycholinguistics, clinical linguistics, and quantitative methods. She has edited/co-edited several books, journal special issues, and conference proceedings. A current project is Multilingual Acquisition and Learning: An Ecosystemic View of Diversity (John Benjamins). She also serves on ASHA's Multicultural Issues Board and as an ERC Consolidator Grant referee for the European Research Council Executive Agency (ERCEA).

Martin J. Ball is honorary professor of linguistics at Bangor University and visiting professor in speech-language pathology at Wrexham Glyndŵr University, both in Wales. He previously held positions in Wales, Ireland, the US, and Sweden. He co-edits two academic journals and two book series. He has published widely in communication disorders, phonetics, sociolinguistics, bilingualism, and Welsh linguistics. He recently edited the Manual of Clinical Phonetics for Routledge (2021). He is an honorary fellow of the Royal College of Speech and Language Therapists and a fellow of the Learned Society of Wales. He currently lives in the Republic of Ireland.

Valentina Bambini is Full Professor of Linguistics at the University School for Advanced Studies IUSS Pavia, Italy. Her research interests revolve around neurolinguistics, experimental pragmatics, and neuropragmatics, with a focus on figurative language. Among her works, she has developed the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS) test, a tool for evaluating pragmatic language disorder in clinical populations. Currently, she is the PI of an ERC Consolidator Grant entitled "PROcessing MEtaphors: Neurochronometry, Acquisition and DEcay" (PROMENADE), devoted to investigating the neurocognitive correlates of metaphor processing across the life span and in clinical populations. She co-founded the Experimental Pragmatics in Italy (XPRAG.it) research network.

Scott Barnes is an Associate Professor in the Department of Linguistics at Macquarie University, Sydney, Australia. He is a speech pathologist and conversation analyst, and his research focuses on communication in the course of everyday life. Scott is especially interested in exploring the interfaces between the organization of interaction, language, cognition, and related impairments, and considering how this may inform speech pathology assessment and intervention strategies. Scott is currently the director of the Master of Speech and Language Pathology course at Macquarie University.

Sally Bates is a Senior Lecturer in Speech and Language Therapy at the University of St Mark and St John, Plymouth, UK. She is a dual-trained linguist and SLT with a clinical specialism in developmental Speech Sound Disorder and an academic interest in the interface between theory, practice, and student education. Sally is co-author of two speech assessment tools: the PPSA (Phonetic and Phonological Systems Analysis) and the CAV-ES (Clinical Assessment of Vowels – English Systems). She is a member of the UK and Ireland's Children's Speech Disorder Research Network and a lead author of their Good Practice Guidelines for the Analysis of Child Speech (2021), endorsed by the Royal College of Speech and Language Therapists. Sally is also the author of the award-winning Early Soundplay Series supporting early speech and language development.
Barbara May Bernhardt, now Professor Emerita, was on faculty at the School of Audiology and Speech Sciences at the University of British Columbia, Vancouver, Canada, from 1990 to 2017. She was a speech-language pathologist from 1972 to 2022. Her primary focus is phonological development, assessment, and intervention across languages. In collaboration with co-investigator Joseph Paul Stemberger and colleagues in over 15 countries, she has been conducting an international cross-linguistic project in children's phonological acquisition (phonodevelopment.sites.olt.ubc.ca) since 2006. Other areas of expertise include the utilization of ultrasound in speech therapy; language development, assessment, and intervention; and approaches to service delivery for Indigenous people in Canada.

Daniel Bérubé is a speech-language pathologist and associate professor in the speech-language pathology program at the University of Ottawa. Recently, he has collaborated on an international cross-linguistic project in phonological development and a Health Canada project assessing the oral language and literacy skills of multilingual children.

Luca Bischetti is a post-doctoral research fellow in the Department of Humanities and Life Sciences, University School for Advanced Studies IUSS Pavia, where he obtained his PhD in Cognitive Neuroscience and Philosophy of Mind with a dissertation on the relationship between verbal humor and Theory of Mind. His current research interests lie in the area of neuropragmatics, focusing on the assessment of pragmatic skills across the lifespan and the processing of figurative language phenomena, in particular humor.

Meike Brockmann-Bauser is head of research and speech pathology at the Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, Switzerland. She is a certified speech-language pathologist with a clinical interest in diagnosing and treating complex voice, swallowing, and speech disorders, and holds a PhD in Biomedical Sciences from Newcastle University (UK). Her research, in collaboration with US, Brazilian, and Swedish groups, focuses on improving clinical voice diagnostics and treatment, primarily using instrumental acoustic voice assessment techniques. She is co-author of the German national guideline for the assessment and treatment of voice disorders, of two textbooks on voice and swallowing disorders, and of the first master's degree program in speech therapy in Switzerland.

Joan Bybee is Distinguished Professor Emerita of Linguistics at the University of New Mexico; she was previously a Professor at the University at Buffalo from 1973 to 1989. Her research focuses on theoretical issues in phonology, morphology, language universals, and linguistic change. Her work utilizing large cross-linguistic databases provides diachronic explanations for typological phenomena. Her books and articles presenting a usage-based perspective on synchrony and diachrony include Phonology and Language Use (Cambridge University Press, 2001) and Language, Usage and Cognition (Cambridge University Press, 2010). Professor Bybee served as President of the Linguistic Society of America in 2004 and received an honorary doctorate from the University of Oslo in 2005. She is a Fellow of the LSA and the Cognitive Science Society.

Sharon Crosbie is a senior lecturer in speech pathology at the Australian Catholic University in Brisbane. Her research is clinically driven, with the aim of contributing to the evidence on how to assess and provide effective intervention for children with communication difficulties. Sharon is one of the authors of the Diagnostic Evaluation of Articulation and Phonology (DEAP), a widely used speech pathology clinical assessment for speech sound disorders. She is a certified practising speech pathologist and works in education to promote effective speech, language, and communication skills for all children.
Jack S. Damico is a clinical linguist and a speech-language pathologist with a master's degree in communicative disorders and a PhD in linguistics. With over 12 years of clinical experience as a speech-language pathologist in public schools, medical settings, and private practice, his research focuses on the authentic implications for individuals with atypical language and communication skills and on the development of clinical applications to assist in overcoming communicative problems. Working primarily in the areas of aphasia in adults and language and literacy difficulties in children from both monolingual and bilingual backgrounds, he specializes in the utilization of various qualitative research methodologies to investigate language and communication as social action. An ASHA Fellow, he is the editor of the Journal of Interactional Research in Communication Disorders. He joined the University of Colorado Boulder faculty in 2019 after 28 years as the Doris B. Hawthorne Eminent Scholar Chair at the University of Louisiana at Lafayette.

Ivana Didirková is Associate Professor of Phonetics and Phonology at Paris 8 University Vincennes – Saint-Denis (France). After her PhD thesis on stuttering-like disfluencies, defended in 2016 at the University Paul-Valéry Montpellier 3 (France), she held postdoctoral positions at the University of Louvain (Belgium) and Sorbonne Nouvelle Paris 3 University. Her main research interests include speech production in persons who stutter, disfluency production and perception in stuttered and typical speech, and auditory and somatosensory feedback disturbances. She has also published on the prosody-discourse interface in French.

Gerard Docherty is Professor of Phonetics at Griffith University in Queensland, Australia. His research explores the properties of social-indexical variability within and across speakers, and how these are interpreted by listeners and acquired by children. His various funded research projects have had a particular focus on diverse varieties of English. His 1989 PhD from Edinburgh University was a study of the timing of voicing in English obstruents, and for over 20 years prior to relocating to Australia, he contributed to phonetics teaching on the Speech and Language Therapy programmes at Newcastle University (UK).

Barbara Dodd is retired but still engages in research that explores aspects of typical and atypical phonological development. She is an honorary professor at the Murdoch Children's Research Institute and the University of Queensland. Her clinical work with children who have communication difficulties has motivated research in audio-visual speech perception, literacy, executive function, and the assessment of phonological abilities across languages. As an educator, Barbara collaborated with peers to develop novel teaching approaches in speech-language pathology, in Australia and Europe.

Elena Even-Simkin, PhD, is a psycholinguist, behavioral analyst in psychiatry, and diagnostician with a background in neuroscience. She holds positions at SCE, Shamoon College of Engineering, and Bar-Ilan University. She is a member of the Brain and Language Lab at Bar-Ilan University, a fellow of the Columbia School Linguistics Society (CSLS), and a fellow of the Autism Research Community in the National Center for Autism and Neurodevelopment Research. She is the author of scientific publications, including academic articles, book chapters, books, and encyclopedic entries, in the fields of linguistics, discourse and text analysis, semiotics, language disorders, and language acquisition and development in neurotypical individuals and in children, adolescents, and adults with learning disabilities and neurodevelopmental disorders.
Alison Ferguson, PhD, was, prior to her retirement, Professor of Speech Pathology at the University of Newcastle, Australia. Her research explored the applications of sociolinguistic analyses to the assessment and treatment of communication disorders. Her published research demonstrated the application of Conversation Analysis to working with partners of people with aphasia (an acquired language disorder). Systemic Functional Linguistics proved to be a rich resource to illuminate the role of supervisor feedback in speech pathology education, as well as to explore the use of metaphor by people with aphasia and their partners in their talk about their goals for rehabilitation.

Francesca Foppolo is Associate Professor of Linguistics and Psycholinguistics in the Department of Psychology of the University of Milan-Bicocca, Italy. She is a member of the advisory board of the international network Bilingualism Matters, based in Edinburgh (UK), and has participated in several European-funded research networks. She is currently Associate Editor of Applied Psycholinguistics (Cambridge University Press). Her research focuses on language processing in typically and atypically developing children, bilinguals, and adults by means of off-line and on-line techniques, particularly eye-tracking in visual contexts and reading.

Federico Frau is a doctoral candidate in Cognitive Neuroscience and Philosophy of Mind at the University School for Advanced Studies IUSS Pavia, Italy. He earned a master's degree in Linguistic Sciences with a thesis on linguistic and communicative deficits in Parkinson's disease. His current training and research activities concern the field of neuropragmatics, focusing in particular on pragmatic impairment in psychiatric and neurological conditions and on the role of multimodal processes (e.g., sensory-motor and bodily experience as well as mental imagery) in language processing.

Angela D. Friederici is director at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig and honorary professor at the University of Leipzig, the University of Potsdam, and the Charité University Medicine Berlin. She has published more than 500 papers on the neuroscientific bases of language in the adult and developing brain, as well as a monograph, Language in Our Brain (MIT Press, 2017). She is a member of the Berlin-Brandenburg Academy of Sciences, the National Academy of Sciences Leopoldina, and the Academia Europaea. She holds a doctorate honoris causa from the University of Mons, Belgium, and has received a number of international scientific prizes, including the Huttenlocher Award and the William James Award (USA).

Stefan Frisch is a psychotherapist and clinical neuropsychologist and head of the psychology group at the Department of Geriatric Psychiatry, Psychosomatic Medicine, and Psychotherapy (Pfalzklinikum Klingenmünster, Germany). He has clinical experience in neurology/neuropsychology, psychotraumatology, psychosomatic medicine, and psychiatry. He has done research on neural language processing, on the neuropsychology of different neurological and psychiatric diseases, and on conceptual and historical perspectives of neuropsychology and psychiatry.

Davida Fromm is a research professor in psychology at Carnegie Mellon University. Previously, she taught in the Communication Sciences and Disorders department at the University of Pittsburgh and the Speech Pathology department at Duquesne University. She has conducted research on aphasia, dementia, apraxia of speech, and other neurologically based communication disorders in adults. Currently, her focus is on the adult clinical language banks in the TalkBank system – AphasiaBank, DementiaBank, TBIBank, and RHDBank – with emphasis on managing and expanding the shared discourse databases, creating educational resources, and developing automated language analysis programs.

Robin Gransier is a postdoctoral researcher at Experimental ORL, Department of Neurosciences, KU Leuven, Belgium. His research focusses on unraveling how the acoustically and electrically stimulated auditory pathway processes sound, based on electrophysiological measures, and how these insights can be used to develop novel neuro-inspired electrical stimulation strategies to restore hearing with cochlear implants. He is also involved in several studies that investigate the limitations of the human brain in processing speech in challenging listening conditions, the effects of electrical stimulation on the nervous system, and how anthropogenic noise affects the hearing and behavior of marine mammals. He has published over fifty scientific journal articles in the field of audiology and hearing research and is currently a board member of the Belgian Scientific Society of Audiology (B Audio).

Jacqueline Guendouzi is department head of Health and Human Sciences and professor of communication sciences and disorders at Southeastern Louisiana University. She previously held positions at the University of South Alabama and Birmingham City University in the UK. She holds a PhD in Linguistics from Cardiff University. Her areas of research and publication focus on communication in the context of dementia and social aspects of women's conversations.

Eleanor Gulick is a doctoral student pursuing a degree in communication sciences and disorders at Bowling Green State University. In 2020, she obtained her Master's degree in speech-language pathology from Bowling Green State University. She is currently practicing clinically in an urban hospital and has worked in a variety of medical facilities in rural and urban communities. Eleanor works in the Interactional Aphasiology Lab under Dr Brent Archer. Her research interests include aphasia and other cognitive communication disorders, with a focus on enhancing conversation and life participation.

Christina Hagedorn is Associate Professor of Linguistics and Speech and Language Pathology at the City University of New York (CUNY), where she directs the College of Staten Island Motor Speech Laboratory and also holds appointments in the doctoral programs in Linguistics and Speech-Language-Hearing Sciences at The Graduate Center, CUNY. She received her PhD in Linguistics from the University of Southern California and her M.S. in Communicative Sciences and Disorders from New York University. Her research interests include using modalities such as real-time magnetic resonance imaging and electromagnetic articulography to shed light on the precise nature of articulatory breakdowns in impaired speech, in order to inform theories of speech production and refine therapeutic techniques used to address these speech impairments.

Solène Hameau is a lecturer in Speech Pathology at the Catholic University of Louvain-la-Neuve (Belgium). She began her career as a clinical speech pathologist in France before moving to research. She holds a Masters in Linguistics from the University of Toulouse (France) and a PhD in Cognitive Science from Macquarie University, where she currently holds an additional research position. She is interested in factors affecting word production in different populations (monolinguals and bilinguals with and without aphasia), with a focus on how the similarity of words in the lexicon(s) of a speaker influences language processing.
Barry Heselwood is Honorary Senior Lecturer in the School of Languages, Cultures, and Societies at the University of Leeds, UK. Since retiring, he has continued to do research and has published on aspects of the phonetics of atypical speech, on phonetic transcription, and on the phonetics and phonology of Arabic and the Modern South Arabian languages Mehri and Shehret.

Alison Holm is an associate professor of speech pathology at the University of Tasmania in Australia. She has previously worked in Queensland (Australia) and the United Kingdom. Alison is an associate editor of the International Journal of Speech-Language Pathology. She has published widely in speech sound disorders and in language and literacy acquisition and disorders, particularly for multilingual children. Alison is one of the authors of the Diagnostic Evaluation of Articulation and Phonology (DEAP), a widely used speech pathology clinical assessment.

Sara Howard is Emeritus Professor of Clinical Phonetics in the Division of Human Communication Sciences at the University of Sheffield, UK. Sara has published and presented widely in the area of clinical phonetics and phonology, focusing particularly on children with developmental speech difficulties and individuals whose speech has been affected by a cleft palate. She has specific interests in the analysis of connected speech and the production of speech in conversation, using both perceptual and instrumental phonetic techniques. Sara is an ex-president of the International Clinical Phonetics and Linguistics Association.

Zhu Hua is Professor of Language Learning and Intercultural Communication and Director of the International Centre for Intercultural Studies at the Institute of Education, University College London (UCL). She is an elected Fellow of the Academy of Social Sciences, UK, and an elected Fellow and Board member of the International Academy for Intercultural Research. She is Chair of the British Association for Applied Linguistics (BAAL), 2021–2024.

Elly Ifantidou is Professor in Language and Linguistics at the Department of English Language and Literature of the National and Kapodistrian University of Athens (NKUA). She is the author of Evidentials and Relevance (2001, John Benjamins) and Pragmatic Competence and Relevance (2014, John Benjamins), and co-editor of Developmental and Clinical Pragmatics (2020, De Gruyter Mouton) and Beyond Meaning (2021, John Benjamins). Her publications include articles in journals such as Journal of Pragmatics, Pragmatics & Cognition, Pragmatics, International Review of Pragmatics, Intercultural Pragmatics, Functions of Language, Lingua, and Language, and chapters in edited volumes published by Cambridge University Press, John Benjamins, De Gruyter Mouton, and Routledge. Her research interests are in semantics, pragmatics, and pragmatic development in first and second language acquisition. Currently, she is co-editor-in-chief of Pragmatics & Cognition.

David Ingram is an Emeritus Professor in the Department of Speech and Hearing Sciences at Arizona State University. He received his B.S. from Georgetown University and his PhD in Linguistics from Stanford University. His research interests are in language acquisition in typically developing children and children with language disorders, with a cross-linguistic focus. The language areas of interest are phonological, morphological, and syntactic acquisition. He is the author of Phonological Disability in Children (1976), Procedures for the Phonological Analysis of Children's Language (1981), and First Language Acquisition (1989). His most recent work has focused on whole-word measures of phonological acquisition.
Yves Joanette is Professor in cognitive neurosciences of aging and communication at the Speech-Language Department of the Faculty of Medicine of the Université de Montréal. He is also a laboratory director at the CRIUGM Research Center on Aging and is currently Deputy Vice-Principal, Research at the Université de Montréal. He previously served as CEO of the Quebec Health Research Fund (FRQS) and Director of the CIHR Institute of Aging. He was a founding member and Chair of the World Dementia Council and contributed extensively to the WHO dementia and healthy ageing strategies. Yves Joanette has been a pioneer on the topic of right-hemisphere communication disorders as well as on the heterogeneity of normal aging and dementia. He is a Fellow of the Canadian Academy of Health Sciences and has received two honorary doctorates, from the Lyon-Lumière and Ottawa universities.

Jan de Jong is an Associate Professor in the logopedics master's program at the University of Bergen (Norway). Before that, he worked in linguistics departments at the universities of Groningen, Utrecht, and Amsterdam (The Netherlands). He has wide experience in teaching typical and atypical language acquisition at the graduate and undergraduate levels. His research primarily concerns the grammatical aspects of developmental language disorder, also in a bilingual context. He was the vice chair of the European COST Action "Language Impairment in a Multilingual Society: Linguistic Patterns and the Road to Assessment."

Marilyne Joyal holds a Master's degree in speech-language pathology and a PhD in experimental medicine from Université Laval. Her thesis focused on brain and behavioral correlates of semantic processing in healthy and pathological aging, including Alzheimer's disease and the semantic variant of primary progressive aphasia. She currently works as a research associate in the Speech and Hearing Neuroscience Laboratory at the CERVO Brain Research Centre in Quebec, Canada. She also teaches cognitive neuroscience of language at Université Laval as a lecturer.

Louise Keegan, PhD, CCC-SLP, BC-ANCDS, is an Associate Professor of Speech-Language Pathology and the Associate Dean of the School of Rehabilitation Sciences at Moravian University, Pennsylvania. Her primary research focuses on social communication after Traumatic Brain Injury, specifically in the contextualized group context. She investigates optimal interventions and employs linguistic analysis methods to examine the communication strengths and skills of this population. In addition to clinical research, Dr Keegan also conducts research in the scholarship of teaching and learning as related to the areas of experiential learning and problem-based learning. She has numerous peer-reviewed publications and has presented her work at many national and international conferences.

Margaret Kehoe is a senior lecturer in the Psycholinguistics and Speech Therapy department at the University of Geneva. She teaches classes in Speech Sound Disorders and Bilingualism. She has conducted research on phonetic and phonological acquisition in English-, German-, Spanish-, and French-speaking children as well as in bilingual children. She is also interested in the relationship between lexical and phonological development. She has been associate editor for two academic journals. Apart from her university position, she is a speech-language therapist who works with bilingual children in an international setting.

Raymond D. Kent, PhD, is Professor Emeritus of the Department of Communication Sciences and Disorders at the University of Wisconsin-Madison. He has authored or edited 20 books and has published more than 200 journal articles and book chapters. His awards include the Honors of the American Speech-Language-Hearing Association (ASHA); the Alfred Kawana Council of Editors Award from ASHA; the Claude Pepper Award from NIDCD; a Docteur Honoris Causa from the Université de Montréal, Canada; the Garcia Prize from Folia Phoniatrica et Logopaedica; and an Honorary Doctorate from the University of Oulu, Finland. He has served as editor of the Journal of Speech and Hearing Research, Clinical Linguistics and Phonetics, and Folia Phoniatrica et Logopaedica, and currently is a Handling Editor (Speech) for the Journal of Speech, Language, and Hearing Research.

Ghada Khattab is Professor in Phonetics and Phonology in the School of Education, Communication and Language Sciences at Newcastle University, UK. A native speaker of Lebanese Arabic, she grew up in Lebanon and started her career as an English teacher before moving to the UK to complete her master's and PhD in Linguistics at the University of Leeds. Since then, she has spearheaded research in the areas of Arabic phonetics and phonology, monolingual and bilingual phonological acquisition, and sociophonetics. She has supervised several doctoral and post-doctoral researchers in these areas and published widely on all aspects of her research. She has also collaborated with researchers on clinical assessment and intervention with multilingual children, exploring cultural and sociolinguistic considerations for working with diverse communities in the UK and across the Arab world.

Yunjung Kim is Associate Professor in the School of Communication Science and Disorders at Florida State University. Her research and publications primarily focus on the acoustic and kinematic analysis of speech, speech intelligibility, and neurogenic speech disorders in adults, including those with diverse language backgrounds.

Sonja A. Kotz is a translational cognitive neuroscientist investigating temporal, rhythmic, and formal predictions and control mechanisms in audition, speech/language, and music. In her basic, translational, and comparative research she utilizes a wide range of behavioral and neuroimaging methods (M/EEG, s/f/rsMRI, TMS, DBS). She heads the Neuropsychology section at the Faculty of Psychology and Neuroscience at Maastricht University, The Netherlands, and holds several honorary professorships (Manchester, Leipzig, and Lisbon). She is a senior/associate editor for several leading journals in the field (e.g., Imaging Neuroscience, Cortex, Neurobiology of Language). More about her lab can be found at www.band-lab.com.

Julie M. Liss is associate dean of the College of Health Solutions and professor of speech and hearing science at Arizona State University. She has served as editor-in-chief for the Journal of Speech, Language and Hearing Research (speech section) and currently serves as the Senior Editor for Registered Reports for Speech and Language for the American Speech-Language-Hearing Association (ASHA). She has published widely in communication disorders, acoustic and perceptual characterization of disordered speech, and speech intelligibility. She is a fellow of ASHA and currently lives in Arizona.

John Local is Emeritus Professor of Phonetics and Linguistics at the University of York. His research over the past 40 years has engaged with non-segmental phonology and phonetic interpretation, speech synthesis, and, latterly, phonetics and talk-in-interaction. This work developed out of a close study of Firthian Prosodic Analysis (FPA) and an attempt to elaborate an impressionistic parametric phonetics supported by instrumental findings. Some of the results of this research, conducted with his colleague John Kelly, were presented in their book Doing Phonology (Manchester University Press, 1989). In recent years he has worked on phonetic variability in the "liquid" system of English and the phonetic and interactional features of attitude in everyday conversation (supported by grants from the Economic and Social Research Council). He was awarded a two-year British Academy Readership for an interactionally grounded analysis of the phonetics of everyday talk.

Brian MacWhinney is Teresa Heinz Professor of Psychology, Computational Linguistics, and Modern Languages at Carnegie Mellon University. His Unified Competition Model analyzes first and second language learning as aspects of a single basic system. He has developed a series of 14 TalkBank open-access online databases for the study of language learning, multilingualism, and language disorders. The databases for language disorders include AphasiaBank, ASDBank, DementiaBank, FluencyBank, RHDBank, and TBIBank. These databases provide transcriptions of spoken language linked to audio and video media, along with programs for analysis and linguistic profiling. His other research topics include methods for online learning of second language vocabulary and grammar, neural network modeling of lexical development, fMRI studies of children with focal brain lesions, ERP studies of between-language competition, and the role of embodied perspectival imagery in sentence processing.

Theodoros Marinis is professor of linguistics and the director of the Centre for Multilingualism at the University of Konstanz in Germany. His research focuses on language acquisition and processing across monolingual and multilingual populations of typically and atypically developing children and adults and aims to uncover the nature of language processing in typical and atypical language development. He has conducted studies on a large range of languages, including Arabic, Dutch, English, Farsi, German, Greek, Italian, and Turkish. He has published widely in language processing, multilingualism, developmental language disorders, and autism.

Glenda Mason, PhD, is a Canadian registered and certified Speech-Language Pathologist and a Lecturer and Research Associate in the School of Audiology and Speech Sciences at the University of British Columbia, Canada. Dr Mason conducts research regarding typical and disordered speech sound development of school-aged children, using in-person and online data collection methodologies. Her particular interest is in phonological development in multisyllabic words. She has developed a whole-word nonlinear phonological metric and analysis (MNLA), which has been automated in Phon (Hedlund & Rose, 2020).

Greta Mazzaggio is currently an assistant professor of linguistics at the University of Florence, Italy. She previously won the Swiss Confederation Excellence Scholarship, working for one year at the University of Neuchâtel (Switzerland), and then moved to the University of Nova Gorica, Slovenia. She has also spent research periods in France (CNRS, Lyon) and the USA (University of Notre Dame, Indiana). Her main research interests focus on three interconnected topics: experimental pragmatics, clinical linguistics, and bilingualism. She is the author of the first Italian introductory book on experimental pragmatics.

Natalia Meir is professor in the Department of English Literature and Linguistics at Bar-Ilan University, Israel, where she also serves as the Coordinator of the Linguistics in Clinical Research Program. She is a member of the Bilingualism Matters network, a community of international experts engaging the public with the latest research about multilingualism. Furthermore, she is a member of the Multilingual and Multicultural Affairs Committee of the International Association of Logopedics and Phoniatrics (IALP). Meir's research interests cover monolingual and multilingual (a)typical language development (including Autism Spectrum Disorder and Developmental Language Disorder). In multilingual populations, she focuses on Heritage Language development and maintenance across the lifespan.
Notes on Contributors xxiii Lucie Ménard is the Founder and Director of the “Phonetics Laboratory at Université du Québec à Montréal” and Adjunct-director of the “Center for Research on Brain, Language, and Music—CRBLM.” Full professor at the University of Québec à Montreal, Dr Menard has been engaged in a sustained program of studies of the development of speaker strategies for reaching intelligible multisensory speech goals. Her research has used a combination of instrumental measures (ultrasound imaging, optical and electromagnetic tracking of orofacial articulators), acoustic measures, and modeling in sensory-deprived young children and adults. To support this work, her lab has devised methods of obtaining and analyzing accurate ultrasound data on the tongue movements of children as young as two years old. Most recently, Prof Ménard has been involved in the development of clinical assessment tools (ultrasound imaging and virtual reality) for children with neuromuscular disease at the Living Lab Ste-Justine Pediatric Hospital, as a researcher. Nicole Müller is Professor of Speech and Hearing Sciences at University College, Ireland. She also holds a visiting professorship at Linköping University, Sweden. In the past, she has held academic appointments in England, Wales, the USA, and Sweden. Her research interests are in clinical linguistics and phonetics, multilingualism, and adult-acquired impairments of communication and cognition, for example in consequence to neurodegeneration, brain injury, or stroke. She is a legacy editor of the journal Clinical Linguistics and Phonetics. Aravind Namasivayam is an internationally recognized expert in speech motor control and speech disorders in children, with a clinical degree in Speech-Language Pathology, and a specialization in Neuroscience at the Doctoral/Post-Doctoral level. He has led several governmental and industry-funded randomized clinical trials, resulting in policy changes at the governmental level and the development of evidence-based care pathways for children. Dr Namasivayam has received numerous awards for his contributions to the field, including but not limited to the National Award for Excellence in Applied Research from Speech-Language and Audiology Canada, the Distinguished Service Award from the University of Toronto. Caroline Newton is an Associate Professor in Clinical Linguistics at University College London. Her research focuses particularly on the impact of speech and language difficulties on everyday communication, including language beyond the level of the single word and in challenging contexts such as listening in noise or to an unfamiliar speaker. Recent work has also explored the interaction between language and other areas of cognition, such as the role of foundational number language in numeracy ability. Her publications include work with both children and adults with a variety of communication difficulties. Lyndsey Nickels is professor in the School of Psychological Sciences, Macquarie University, Sydney. She graduated as a Speech and Language Therapist from Reading University, UK and completed a PhD at Birkbeck College, University of London. Taking a psycholinguistic and cognitive neuropsychological approach, her research has focused on informing theories of language processing, its breakdown in aphasia, treatment of language impairment in aphasia and methodological rigor in single case experimental design. 
She has published widely in all these areas, including a monograph, five edited volumes, and over 200 journal articles. She is a fellow of the Academy of the Social Sciences in Australia.

Michelle Pascoe is an honorary Associate Professor in the Division of Communication Sciences and Disorders at the University of Cape Town, South Africa. She is a speech-language therapist whose research focuses on typical and atypical speech, language, and literacy
acquisition. Her particular interest is in speech and language development in the languages of Southern Africa, multilingualism, and ways to support clinicians working with families from a range of language and cultural backgrounds. Michelle is an associate editor of Child Language Teaching and Therapy and Logopedics Phoniatrics Vocology and has published more than 50 journal articles and book chapters to date.

Martina Penke is Professor of Psycholinguistics at the University of Cologne, Germany. Her research focuses on typical child language acquisition, on developmental language disorders in children with hearing loss and children with genetic disorders (Down syndrome, Williams syndrome), as well as on acquired language disorders (Broca's and Wernicke's aphasia, Parkinson's disease). The aim of her research on language disorders is to provide linguistically sound profiles of retained and impaired language abilities in the areas of syntax and morphology and to identify syndrome-specific performance patterns within these areas.

Michael (Mick) Perkins is Emeritus Professor of Clinical Linguistics in the Division of Human Communication Sciences at the University of Sheffield, UK. He has published and presented widely in clinical linguistics, pragmatics, semantics, and language development, with a more recent focus on the interactions between semantics, pragmatics, and cognition in a wide range of developmental and acquired communication disorders. He was a founder member of the International Clinical Phonetics and Linguistics Association (ICPLA), was its Vice-President from 2000 to 2006, and is an honorary fellow of the Royal College of Speech and Language Therapists.

Karen E. Pollock is a Professor in the Department of Communication Sciences and Disorders at the University of Alberta in Edmonton, Canada, where she teaches speech sound disorders and supervises graduate student research. She is a certified/registered speech-language pathologist and previously held academic and clinical positions in the US. She has co-edited four books and published and presented research on the development of vowels in children with and without speech sound disorders and on speech-language development in children adopted internationally. Her current research explores speech development in young Canadian children learning Mandarin Chinese as a heritage language or minority second language in bilingual schools.

Francesco Possemato completed his PhD at the University of Sydney (2018). Using the methods of Conversation Analysis and Interactional Linguistics, his research addresses language and social interaction in a variety of contexts. Francesco has worked on the Conversational Interaction in Aboriginal and Remote Australia (CIARA) project (Macquarie University, University of Melbourne, University of Queensland) and is currently a Postdoctoral Associate in the Communication and Assistive Device Laboratory (CADL), Department of Communicative Disorders and Sciences, State University of New York (SUNY) at Buffalo. Francesco is an Honorary Research Fellow at Macquarie University, where he is also co-investigator on the Aphasia, correction, and micro-collaboration project, addressing interactions involving people with aphasia, and external investigator on the Students' flourishing through Italian classroom interaction project (La Trobe University).
While in Sydney he was the coordinator of the Conversation Analysis in Sydney (CAIS) group of the Australasian Institute of Ethnomethodology and Conversation Analysis (AIEMCA). Francesco has published on Italian L2 teaching, atypical interaction, and pragmatic typology.
Ben Rutter is a Lecturer in Clinical Linguistics in the Division of Human Communication Sciences, Health Sciences School at the University of Sheffield. He holds a BA (Hons.) in Linguistics from the University of York and a PhD in Clinical Linguistics from the University of Louisiana at Lafayette. His research interests are in the application of phonetics and linguistics to the study of communication disorders, with a particular focus on motor speech.

Christos Salis is Senior Lecturer in Speech and Language Science at Newcastle University (UK). He is a qualified speech and language therapist supervising emerging clinicians at the Tavistock Aphasia Centre (North East) and is also Editor in Chief of Aphasiology. His current research focuses on the confluence of language and cognition in typical older people and people affected by post-stroke aphasia, especially the contributions of working memory and attention to sentence and discourse production.

Theresa Schölderle, PhD, is an academic speech-language therapist. She currently holds a postdoctoral position at the Clinical Neuropsychology Research Group (EKN) at the Institute of Phonetics and Speech Processing, Ludwig Maximilian University, Munich, Germany. Her research focus is on dysarthrias acquired early in life (e.g., in children and adults with cerebral palsy).

Anna Sosa is Professor and Chair of the Department of Communication Sciences and Disorders at Northern Arizona University (NAU). She completed a master's degree in Linguistics at the University of New Mexico in 2000 and a PhD in Speech and Hearing Sciences at the University of Washington in 2008. Prior to joining the NAU faculty in 2009, she worked as a school-based Speech-Language Pathologist in Washington State. Her research focuses on phonological development in young children with and without speech sound disorder and on the relationship between lexical and phonological development. She teaches undergraduate and graduate courses in phonetics, phonological development and disorders, bilingual language development, and early intervention, and supervises graduate students in the NAU Speech-Language-Hearing Clinic in the areas of evaluation and assessment of infants and toddlers with delayed communication, intervention for speech sound disorder, and dyslexia.

Elizabeth Spencer, PhD, is a Senior Lecturer in Speech Pathology at the University of Newcastle, Australia. She is a qualified speech pathologist with a particular interest in the application of clinical linguistics in speech pathology. Her current research interests are in developing clinical applications and clinically viable methods for the analysis of language and discourse across the lifespan. She has a particular interest in clinical applications of Systemic Functional Linguistics.

Anja Staiger, PhD, is a postdoctoral researcher at the Clinical Neuropsychology Research Group (EKN) at the Institute of Phonetics and Speech Processing, Ludwig Maximilian University, Munich, Germany. She is trained as a speech-language pathologist and clinical linguist. Her main areas of research are motor speech disorders in progressive and nonprogressive neurological disease (apraxia of speech and dysarthria).

Joseph Stemberger, PhD, is Professor Emeritus of Linguistics at the University of British Columbia. He is primarily interested in phonology, morphology, and their interaction.
Most of his research is in adult psycholinguistics and in first language acquisition and investigates the nature of cognitive representations. He is also involved in a study of first language acquisition by monolingual Zapotec-learning children in Mexico, and a similar study on Slovene. His theoretical orientations are toward Optimality Theory, and toward connectionist models.
His current research tends to focus on the degree to which phonological and morphological information is concentrated in particular lexical items versus the degree to which lexical processing is supplemented with system-general information. Other languages he has worked on include Choctaw, Cambodian, and Ojibwe, as well as (collaborating with other scholars) Seri and German.

Carol Stoel-Gammon, PhD, is Professor Emerita in the Department of Speech and Hearing Sciences at the University of Washington, Seattle. Her research and publications focus on prelinguistic and early linguistic development, on cross-linguistic studies of phonological acquisition, on investigations of early speech and language acquisition in children with typical and atypical development, and on the relationship between babble and speech. She has served on the editorial boards of several journals of the American Speech-Language-Hearing Association (ASHA), is an ASHA fellow, and is a recipient of ASHA Honors.

Bruna Tessaro is a PhD student in the International Doctorate for Experimental Approaches to Language and Brain at the School of Psychological Sciences at Macquarie University (Australia) and the Department of Speech and Language Sciences at Newcastle University (United Kingdom). Her work focuses on the assessment of cognition beyond language in people with aphasia, in research and clinical practice. She is also interested in the relationship between semantic processing and cognitive aspects in aphasia.

Austin Thompson is an Assistant Professor at the University of Houston. Prior to his current appointment, he earned his PhD at Florida State University. His research focuses on motor control of speech production in speakers with and without dysarthria, instrumental analysis of speech using acoustic and kinematic methods, and linguistic influences on speech production in bilingual or multilingual speakers. Additionally, he is a licensed and certified speech-language pathologist with a clinical interest in diagnosing and treating speech deficits in individuals with dysarthria secondary to neurodegenerative diseases such as Parkinson's disease and amyotrophic lateral sclerosis.

Mark Tiede, PhD, is a phonetician specializing in speech production and instrumental methods for studying it. He earned a doctorate in Linguistics from Yale, and has been part of the ATR Human Information Processing Laboratory (Kyoto) and the MIT-RLE Speech Communication group. Currently he is associated with Haskins Laboratories and the Brain Function Laboratory in the Yale Department of Psychiatry. His current research interests are focused on identifying characteristic patterns of speech articulatory movements as they evolve under changes in rate and prosodic influence.

Martha Tyrone is an Associate Professor in the Speech-Language Pathology program at Gallaudet University. For 15 years, she was a Senior Research Scientist at Haskins Laboratories, where she was awarded multiple research grants from the National Institutes of Health. Her research deals with the relationship between motor control, language, and communication in clinical and non-clinical populations. She uses instrumented techniques, such as motion capture and electromagnetic articulography, to examine the structure of speech, gesture, and sign language in deaf and hearing adults. Dr Tyrone teaches courses on Speech Science, Research Methods, Neuroanatomy, and Motor Speech Disorders.
Traci Walker is a Senior Lecturer in the Department of Human Communication Sciences, University of Sheffield. Her research uses the methodology of Conversation Analysis (CA) to investigate the function and use of linguistic structures (both syntactic and phonetic) in both typical and atypical communication. She has published on topics including repetition
and repair in conversations involving both adults and children; claims and displays of understanding by people with aphasia; and the automatic detection of dementia and epilepsy based on the analysis of conversations.

Jocelynne Watson is Visiting Professor at the University of St Mark and St John, Plymouth, having previously held the positions of Senior Lecturer and Clinical Director for the Speech and Hearing Sciences Department at Queen Margaret University, Edinburgh. Jocelynne has a particular interest in developmental Speech Sound Disorder and its implications for literacy development. Jocelynne's publications focus on etiology, vowel disorder, assessment, and treatment. She is co-author of two speech assessment tools: PPSA (Phonetic and Phonological Systems Analysis) and CAV-ES (Clinical Assessment of Vowels – English Systems).

Astrid van Wieringen is Full Professor at Experimental ORL, Department of Neurosciences, University of Leuven (Belgium), where she combines research and teaching (five courses). Her interdisciplinary research focuses on understanding the neural consequences of deprived auditory input and on optimizing hearing for adults and children with hearing aids and/or cochlear implants. She is program director of the five-year program in Speech-Language Pathology and Audiological Sciences (Faculty of Medicine), and also Professor II at the University of Oslo, Department of Special Education. She is president-elect of the International Society of Audiology, secretary-treasurer of the International Collegium of Rehabilitative Audiology (ICRA), a (founding) board member of the Belgian Scientific Society of Audiology (B-Audio), and a member of the World Rehabilitation Alliance (WHO).

Li Wei is Director and Dean of the UCL Institute of Education, University College London, UK, where he also holds the Chair in Applied Linguistics. He is the Editor of the International Journal of Bilingual Education and Bilingualism and of Applied Linguistics Review. He is a Fellow of the British Academy, Academia Europaea, the Academy of Social Sciences (UK), and the Royal Society of Arts (UK).

Bill Wells is Emeritus Professor in Human Communication Sciences at the University of Sheffield, where he worked from 2000 to 2015. He was previously Professor of Clinical Linguistics at University College London and Principal of the National Hospital's College of Speech Sciences. He has published extensively in the fields of interactional linguistics and clinical linguistics, including a book series with Joy Stackhouse on Children's Speech and Literacy Difficulties. He is an Honorary Fellow of the Royal College of Speech and Language Therapists.

Tim Wharton is a Principal Lecturer in Linguistics at the University of Brighton, UK. His primary research interest is pragmatics, the study of utterance interpretation. In particular, his research explores how "natural," non-linguistic behaviors – tone of voice, facial expressions, gesture – interact with the linguistic properties of utterances (broadly speaking, the words we say). His main theses are outlined in his 2009 book Pragmatics and Non-Verbal Communication, which charts a point of contact between pragmatics, linguistics, philosophy, cognitive science, ethology, and psychology, and in his latest book, Pragmatics and Emotion (with Louis de Saussure), published by CUP in 2023.
He edited the recent John Benjamins volume Beyond Meaning and has published papers on pragmatics and related subjects (and sometimes unrelated subjects) in numerous international journals.

Ray Wilkinson is Professor of Human Communication at the University of Sheffield. He uses conversation analysis to study social interaction, in particular the use of talk in interaction. His research includes the study of communication disorders such as
aphasia, dementia, and stammering and their impact on conversation, and he has recently co-edited (with John Rae and Gitte Rasmussen) Atypical Interaction: The Impact of Communicative Impairments within Everyday Talk (2020, Palgrave Macmillan). He has also published on typical conversation, conversation involving second-language speakers, conversation involving children, and the social interaction of non-human primates. In addition, he has developed, implemented, and evaluated intervention programs to improve conversation.

Maximiliano A. Wilson is Full Professor in the speech-language pathology program of Université Laval. He is also a researcher at Cirris, the research centre on interdisciplinary rehabilitation and social integration in Quebec City, Canada. His research focuses on the changes that occur in the brain in normal and pathological ageing (neurodegenerative diseases) to sustain language processing, specifically reading and semantics. He combines the study of behavior with brain imaging techniques such as magnetic resonance imaging and evoked responses recorded by electroencephalography.

Eva Wimmer is a senior researcher and lecturer in the research unit Language and Communication of the Department of Rehabilitation Sciences at TU Dortmund University, Germany. After receiving her PhD in linguistics in 2009, she worked at the German universities of Düsseldorf, Bremen, and Cologne in the areas of psycholinguistics and special needs education. From 2018 to 2019, she was an interim professor for Special Education and Rehabilitation of Speech and Language Disabilities at the University of Cologne. Her research focus is on the study of individuals with acquired or developmental language disorders, with a specific interest in the acquisition of syntax and morphology in children with developmental language disorder (DLD), Down syndrome, and hearing impairment.

Jan Wouters is Full Professor at the Department of Neurosciences of the University of Leuven, Belgium. The core of his research focuses on audiology, auditory neural processing, hearing aids, and hearing implants, building bridges between technology, neuroscience, perception, and medicine. He is (co)author of about 370 articles in international peer-reviewed journals and teaches four major physics and audiology courses at KU Leuven. He has served on the executive boards of ICRA and EFAS, and is president of the Belgian Audiology Society (B-Audio). He is an honorary fellow of the Deutsche Gesellschaft für Audiologie (DGA) and received the Lifetime Achievement Award of the European Federation of Audiology Societies (EFAS).

Alison Wray is a Research Professor of Language and Communication at Cardiff University, UK. Her research has focused on modelling the forms and functions of formulaic language in normal and disordered language and in second language acquisition, with further applications to the evolution of language. She has published articles, books, and animated films on the topic of communication in the dementia context, including her award-winning 2020 monograph The Dynamics of Dementia Communication and her 2021 book aimed at people with dementia, carers, and bystanders, Why Dementia Makes Communication Difficult: A Guide to Better Outcomes.

Mehmet Yavaş is Professor of Linguistics at Florida International University, USA. He has published numerous articles and books on applied phonology.
Among those are his widely read Applied English Phonology (4th edition, 2020), Romance-Germanic Bilingual Phonology (2017), Unusual Productions in Phonology: Universals and Language-Specific Considerations (2015), Phonology: Development and Disorders (1998), First and Second Language Phonology (1994), Phonological Disorders in Children: Theory, Research and Practice (1981), and Avaliação fonológica da criança (1990), a phonological assessment procedure for Brazilian Portuguese.
Wolfram Ziegler is Professor of Neurophonetics and head of the Clinical Neuropsychology Research Group (EKN) at the Institute of Phonetics and Speech Processing, Ludwig Maximilian University, Munich. He has a diploma and PhD in Mathematics from the Technical University of Munich and spent 10 years as a research assistant at the Max Planck Institute for Psychiatry. From 1995 to 2015 he headed the EKN at the Clinic for Neuropsychology, City Hospital Bogenhausen, Munich, and since 2015 he has headed it at LMU Munich. His research focus is on speech disorders in neurologic populations.
Introduction
MARTIN J. BALL, NICOLE MÜLLER, AND ELIZABETH SPENCER
The first edition of this Handbook was published in 2011, and the fields of clinical linguistics and clinical phonetics have developed considerably since that volume was prepared. Partly, this can be seen in the fact that this new edition has four more chapters and one more Part than the first edition. But, more importantly, the chapters themselves record important advances in their respective areas. We were fortunate that many of the scholars who contributed to the first edition were available to update their chapters. In some cases, however, both for new chapters and where the original author was unavailable, we have recruited new contributors. The editorial team, too, has changed. Two of the original editors were not free to work on the new edition (though they are still co-authors of chapters in this collection), but the new team is balanced both in terms of areas of expertise and geographical areas.

Part 1 (Pragmatics, Discourse, and Sociolinguistics) has two more chapters in this edition compared to the first. Multimodal analysis of interaction – an area of recent interest in our field – has been added to this Part; also, the important topic of corpus linguistics as applied to clinical corpora is covered. Both these new chapters are authored by leading scholars in their subjects. Of the remaining chapters in Part 1, three have brand new author teams and most of the remainder have strengthened their teams with additional scholars.

Part 2 (Syntax and Semantics) has ten chapters, that is, one more than the first edition. The new chapter deals with disruptions to language in the sign modality, a welcome addition to the topics covered in this part of the book. Six of the other chapters have been updated by the authors of the equivalent chapters in the first edition (some with added co-contributors), while three have new authorial teams.

The original Part 3 (Phonetics and Phonology) has undergone the greatest transformation. In the first edition, chapters dealing mostly with phonology were intermingled with those better classed as phonetics. For this new edition the editors decided to separate the two subject fields, at least partly due to a recognition that phonetics (unlike phonology) falls outside the discipline of linguistics. Part 3 of this new edition, therefore, contains chapters dealing with a variety of phonological topics. Of the eleven chapters in this part, five are new and cover clinical phonology and the assessment of phonology, phonological development, and cross-linguistic aspects. One chapter from the first edition has been rewritten by a new team, and the remaining five have been updated by the original authors (some with new co-authors).

Part 4 (Phonetics) contains ten chapters. A reorganization of topics since the first edition now sees three chapters dealing with instrumental analysis of speech, with the remaining topics similar to those included previously. Two of the three instrumental chapters have new
author teams, with the third adding a new co-author. Three of the other chapters also have new authors, with the remaining four being written by the original authors, often with new co-authors added.

The Editors are proud of the Handbook's diversity, in particular its coverage of different clinical contexts, a variety of theoretical frameworks, and a range of scholarly traditions. Further, the contributors are from a number of different countries and work with numerous languages; they are from both clinical and non-clinical backgrounds and include linguists, phoneticians, speech-language pathologists, and psychologists. This diversity reflects the multifarious nature of clinical linguistics and communication rehabilitation. We are confident that this updated and expanded Handbook will be a valuable resource for clinicians, phoneticians, and linguists interested in research in all parts of the subject, reviewed and presented by the leading scholars of their areas.
Part 1: Pragmatics, Discourse, and Sociolinguistics
1 Discourse Analysis and Communication Impairment
LOUISE C. KEEGAN, JACQUELINE A. GUENDOUZI, AND NICOLE MÜLLER

1.1 Introduction

1.1.1 Discourse
The terms discourse and discourse analysis are used in many ways by different people both within and across different disciplines. The Latin word discursus, which became “discourse” in English (Onions, 1966, p. 272), means “running to and fro,” from which derived the medieval Latin meaning “argument.” Thus, within disciplines that deal with human language, speech, and communication, “discourse” can be understood, in the widest sense, as both the process of running to and fro, an exchange between a human being and their environment, and the products arising from such exchanges. Many of the differences in defining discourse stem from the distinction between discourse as process and discourse as product. There is a thorough discussion of this distinction in the prior edition of this chapter (Müller et al., 2008). The analysis of discourse is typically carried out on the product itself; more recently, however, there has been a focus on how such analyses may be concerned with the mechanisms that underlie the processes involved. For example, Cherney et al. (1998, p. 2) define discourse as “continuous stretches of language or a series of connected sentences or related linguistic units that convey a message.” These linguistic units are the product and the primary focus of analysis. Yet in the more recent edition of their text (Coelho et al., 2023) they update this definition to highlight how discourse involves both comprehension and production, as well as the complex interplay between the linguistic, cognitive, and social elements that impact the discourse process. This focus on the “behind the scenes” aspects of discourse, including the sociolinguistic and sociocultural aspects, indicates a growing recognition of the discourse process as an integral component of the product (see Keegan, Hoepner et al., 2023; Chapters 5, 6, 7, and 8 in this volume). The focus of this chapter is the analysis of discourse in communication disorders, and so the majority of the literature discussed is from the field of Speech-Language Pathology/Therapy. However, it is important to note that there are many other applications of discourse analysis. There have been many studies that use discourse analyses to investigate the communication of various healthcare professionals, including physicians (e.g., Park et al., 2021), nurses (e.g., Lenzen et al., 2018), and pharmacists (e.g., Kellar et al., 2020). Such analyses are not
restricted to healthcare. Critical Discourse Analysis (e.g., Fairclough, 2012) has been used to examine the discourses of politics for several decades. As a method it has also been applied to healthcare settings and clinical guidelines (e.g., Jørgensen et al., 2022; Tomkow et al., 2023) and examines power relations and how language and communication are used within social structures. Similarly, systemic functional linguistics (discussed in depth in Chapter 8) has also been applied to discourses of power structures and systems (e.g., Chen, 2018; Sharififar & Rahimi, 2015).
1.1.2 Analysis and Structure
The many and varied uses of discourse analysis methods contribute to discrepancies in the way such analyses are applied. There have been many efforts to simplify the approaches to the analysis of discourse and to provide a structure for conceptualization. We describe these here with the caveat that discourse analysis is inherently complex and “messy.” The categorizations that follow are simplifications of language; although somewhat arbitrary in nature because of the overlap between them, they allow us to assign structures and analytical functions and so provide a means of conceptualizing and breaking down the process for analysis.

One way to categorize discourse is based on the type or genre of discourse being examined. This is often organized by the speaker's main purpose, whether it be conversational, expository, informative, persuasive, or narrative in nature. This widely used taxonomy is defined and explained in Müller et al. (2008; Table 1.1) and Coelho et al. (2023; Table 3.1). However, these types or genres are typically oversimplifications. For example, a business negotiation may have a conversational structure overall, but is likely to contain elements of expository and persuasive discourse, and possibly even narrative material.

Similarly, the categories micro- and macrostructure, and by extension micro- and macrostructural deficits, are frequently employed in the clinical literature. Kintsch and Van Dijk (1978) distinguish between microstructure and macrostructure in terms of local information and global information, respectively. The local information is related to individual words and relationships, while the macrostructure represents the gist, topic, or main ideas (Müller et al., 2008). Finally, many authors also discuss a superstructure, essentially a means of describing the genre or discourse type (Cherney et al., 1998), as it relates to the relationship between language and society. While these categories are well defined and there is literature that differentiates them (e.g., Altman et al., 2012, found that individuals with aphasia typically have strengths in macrostructure but increased difficulty with microstructure), it should be noted that the categories are inherently linked (Bitetti et al., 2020). For example, if one tells a story about a past event, it usually highlights a main topic and structure (macrostructure); however, a morphological verb ending such as “-ed” in English at the word level (microstructure) also contributes to the macrostructure. It is also likely that such a story is presented in a narrative format (superstructure), and the cultural context in which this story occurs and is being presented also contributes to the superstructure.

Another categorization system often referenced in analyses involves cohesion and coherence. Yule (2020) identified cohesion as the connections that exist within a discourse sample. Coherence, then, is considered the overarching logical and unified whole (Yule, 2020). Much of the literature describes these as separate entities (Armstrong, 2000; Carrell, 1982) to facilitate analysis; however, these concepts are not completely separable, and coherence can be impacted by difficulties with cohesion. There is an extensive literature on cohesion in the Systemic Functional Linguistic tradition (see Chapter 8). Halliday and Hasan (2014) describe multiple types of cohesion: reference (linguistic markers that refer to an earlier element),
substitution (replacing a previously mentioned item with a substitute form such as one or do), ellipsis (omitting information that can be contextually inferred), conjunction (linking devices), and lexical cohesion (related words and phrases).

In terms of clinical application, the simplification inherent in the categorizations outlined is deliberate. This limits the variables of analysis that must be considered and makes comparisons and generalizations easier. This is also the reason why, in clinical assessment or research contexts, “naturalness” tends to be sacrificed for the sake of standardization in terms of the tasks and stimuli used. For example, one of the frequently used picture stimuli to elicit descriptive discourse is the well-known “Cookie Theft Picture” from the Boston Diagnostic Aphasia Examination (Goodglass & Kaplan, 1983). Similarly, the Cinderella story retell is frequently used to collect narrative samples (Fromm et al., 2020). Thus, in these situations a balance is attempted between achieving comparability and predictability and, if not necessarily a discourse context that is entirely personally relevant and natural to the participant, one that is engaging enough to produce data that reflect the best of the participant's ability. Despite this, there is a recent effort to acknowledge the sociolinguistic complexity in discourse analysis, and there is a growing body of literature (e.g., Armstrong et al., 2023; Keegan, Hoepner et al., 2023) that considers contextualized conversations and focuses on the description of discourse while acknowledging the difficulty of attempting to standardize interactional clinical assessment.
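To suggest how categories such as cohesion might be operationalized computationally, the sketch below tallies candidate surface markers of two of Halliday and Hasan's categories (reference and conjunction) in an invented, “Cookie Theft”-style picture description. This is a minimal sketch for illustration only, not a validated clinical tool: the marker lists are our own simplified assumptions, and the sample is not from a real participant.

```python
# A minimal sketch (not a validated clinical tool): tally surface markers
# that *may* realize two of Halliday and Hasan's cohesion categories.
# The marker lists are simplified assumptions for illustration only.
import re
from collections import Counter

REFERENCE_MARKERS = {"he", "she", "it", "they", "him", "her", "his", "their",
                     "this", "that", "these", "those"}
CONJUNCTION_MARKERS = {"and", "but", "so", "because", "then", "however"}

def count_cohesion_markers(text: str) -> Counter:
    """Tally candidate reference and conjunction markers in a text sample."""
    tally = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in REFERENCE_MARKERS:
            tally["reference"] += 1
        elif word in CONJUNCTION_MARKERS:
            tally["conjunction"] += 1
    return tally

# An invented "Cookie Theft"-style description (not from a real participant):
sample = ("The boy is on a stool and it is tipping. "
          "He reaches for the cookies but his mother does not see him.")
print(count_cohesion_markers(sample))  # Counter({'reference': 4, 'conjunction': 2})
```

Even these counts need interpretation: within Halliday and Hasan's framework, for example, “and” and “but” inside a single sentence do not create ties across sentences, so automated tallies of this kind can feed, but never replace, the analyst's judgment.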
1.2 Theoretical Approaches to Discourse Analysis
The conceptualization of the discourse analysis process that is adopted, whether explicitly defined or left implicit to emerge from the data gathered and analyzed, depends on the clinical or research question being asked. The question is in turn constrained by the theoretical or analytical framework within which a researcher or a clinician works. This section provides a brief overview of some of the more common theoretical approaches to discourse that have been developed over the years. The first chapter in the prior (first) edition of this collection reviews these in more depth (Müller et al., 2008), and subsequent chapters in this text are dedicated to specific theoretical approaches such as Speech Act Theory and Relevance Theory.

Speech Act Theory (SAT; Austin, 1962; see also Chapter 2) differentiates three major categories of speech acts, namely locutionary acts, illocutionary acts, and perlocutionary acts. A locutionary act, or locution, is a speaker's use of words with determinate sense (in other words, unambiguous meaning) and reference. The illocutionary act, or illocution, is the act carried out by the speaker uttering the locution; in other words, the illocution embodies the speaker's intention in making an utterance. The illocutionary effect is the addressee's (or listener's) recognition of the speaker's intention (Searle, 1969). The listener's acting upon the speaker's expressed intention is the perlocution, or perlocutionary act. Classifications from SAT have been used in the construction of communication assessments. The early version of Prutting and Kirchner's (1983) Pragmatic Protocol includes a taxonomy of behaviors based on SAT, distinguishing between utterance acts (how a speaker presents a message), propositional acts (linguistic meaning), illocutionary acts, and perlocutionary acts. In the later (1987) version of the Pragmatic Protocol, this classification was abandoned; the notion of speech acts (speech act pair analysis and variety of speech acts are to be rated as “appropriate,” “inappropriate,” or “no opportunity to observe”) was, however, maintained. The Profile of Communicative Appropriateness (Penn, 1988) includes the management of indirect speech acts (without further subclassification) as one aspect under the meta-category of sociolinguistic sensitivity. (See Adams, 2002, for a review of other
assessment methods and analytic procedures that employ concepts from SAT, among other aspects of interaction and pragmatics.)

Grice's (1975) theory of conversational implicature proposes that interlocutors follow a cooperative principle that determines maximum efficiency of language use for communication. A conversational maxim is any of four rules: a speaker's contribution should be adequately informative (quantity), truthful (quality), relevant (relation), and orderly (manner). These maxims therefore complement SAT by indicating how a listener arrives at a speaker's intended interpretation of an illocutionary act (Grice, 1975; Searle, 1975). The maxims represent an “idealized” framework for how speakers (by adhering to or flouting the maxims) create implicatures and how listeners might draw particular inferences; it is important to note that they are based on traditional norms of western conversational patterns. Damico's Systematic Observation of Communicative Interaction, or Clinical Discourse Analysis, approach (Damico, 1991; Hoepner et al., 2018) is a tool for “real-time” observation of communicative interaction which uses Bach and Harnish's (1979) modification of Searle's original classification of illocutionary acts as a framework, and uses Grice's maxims of conversational cooperation to categorize the inappropriate execution of illocutionary acts. (See also Chapter 2 of this collection.)

Relevance theory (Sperber & Wilson, 1997; see also Chapter 3) is a framework that explores discourse as a cognitive process. It examines meaning in terms of storage (memory) and processing and thus explains relationships between semantics and pragmatics in terms of inferences based on the evidence the listener has at their disposal. Relevance theory has been applied to examining communication in children and adults who have developmental or acquired cognitive-communication difficulties (e.g., Guendouzi, 2013; Happé, 1995; McDonald et al., 2000) and has been effective at explaining pragmatic breakdowns (Guendouzi, 2013; Ryder & Leinonen, 2011).

Conversation analysis (CA) is a sociolinguistic approach to interaction, in that it examines communication in contextualized conversation (Wilkinson, 2011). Among the major methodological tenets of CA and clinical approaches based on CA is the principle that one's data must be approached with as few preconceptions as possible as to how mutual understanding (the joint negotiation of meaning within the interactional context) is or is not achieved. Further, the analyst's role is to discover, by way of detailed description of the “local” (i.e., turn-by-turn, in conversational data) organization of a text (e.g., a transcript of a conversation), the mechanisms that interactants use to jointly negotiate meaning. Thus, there is no a priori “ill-formed” or “well-formed” structure; rather, what is or is not successful emerges out of the unfolding interaction (see also Atkinson & Heritage, 1984). The focus of the analysis is typically the turn-taking in conversation that facilitates the co-construction of meaning (Azios & Simmons-Mackie, 2023). CA has been used extensively in the aphasia literature to examine repair, multimodal communication, and topic and turn management (see summaries in Azios & Simmons-Mackie, 2023; Keegan, Behn et al., 2023). Chapter 6 in this volume discusses conversation analysis and its clinical application in further detail.
Systemic Functional Linguistics (SFL) is a theory of language use (Halliday & Matthiessen, 2014), and its analytical tools have in recent years become popular in the field of clinical discourse analysis. It allows one to examine language from the perspective of three overarching metafunctions (textual, ideational, and interpersonal). The textual metafunction is the facilitating metafunction, referring to the speakers' ability to organize and construct the text in a cohesive manner (Halliday & Matthiessen, 2014). The ideational metafunction of language is to understand and represent the world and the speakers' experience of the world; it can be experiential (meanings at and below clause level) or logical (meanings created at the level of the clause complex). Finally, the interpersonal metafunction involves the representation of the speakers' experiences to each other, that is, the roles and
relationships they form with one another and the world. Chapter 8 of this volume (Spencer & Ferguson) provides more detail on systemic functional linguistics and its clinical application.

These theories and approaches, among others (the list here is not exhaustive), provide frameworks and philosophical groundings for clinical discourse analysis. However, the application of the analysis, even where it references theory, often proceeds from a unit perspective. The following section discusses the units of analysis that are typically utilized during clinical discourse analysis.
1.3 Units of Analysis for Clinical Application
Since discourse analysis methods typically involve examining units of language, it is not surprising that different approaches frame analysis in different ways and utilize different units. Units may be selected for analysis on the basis of the structure of language (see Figure 1.1) and analyst preference. Using this approach, sounds and symbols serve as the smallest units for examination. Analyses of phonetics and phonology, sounds and sound patterns, are discussed in Parts III and IV of this volume. The next unit of analysis involves an examination of morphology. Morphemes are often described as the minimal units of meaning or grammatical function (Yule, 2020). Syntax, or the structure or form of language, provides information on an individual's knowledge of grammatical rules. Semantics addresses the meaning of the words or utterances. Morphological, syntactic, and semantic analyses are addressed in Part II of this collection. Finally, pragmatics applies to a higher level of discourse that describes the social use of these linguistic units in context. However, within these categories, units of analysis may be addressed in terms of the amount of text that is analyzed rather than just the linguistic characteristics that are examined. One can examine an allophone, a phoneme or grapheme, a morpheme, a word, a clause, a T-unit (terminable unit), a C-unit (communication unit), a phrase, a sentence, or larger components of language in context.
Figure 1.1 Structure of language (levels shown: pragmatics, semantics, syntax, morphology, phonology). © L. Keegan, J. Guendouzi & N. Müller, 2023.
Different analytical methods may have varied approaches to defining these concepts, including the terms clause, C-unit, T-unit, and phrase. Nevertheless, it is typically agreed that a clause is the unit of language below the sentence and, in English, includes a subject and a verb or predicate. T-units can be described as the “shortest possible grammatically allowable sentence” (Hunt, 1965, p. 305) and usually serve as units for the analysis of written language, but the term “T-unit” has been applied to spoken language, too (Foster et al., 2000). A T-unit involves a main clause and its modifiers (subordinate clauses attached to it) and cannot be subdivided without loss of essential meaning. Similarly, a C-unit is most typically described as similar to a T-unit, with the added distinction that it may include segmented or partial sentences that are more typical of conversation (Hughes et al., 1997; Loban, 1976). Phrases are typically distinguished on the basis of speaker pauses and natural breaks. Finally, a sentence is typically defined as a linguistic expression, unit of grammar, or group of words that contains at least a subject and a finite verb phrase, and functions as a statement, question, exclamation, or command (Halliday & Matthiessen, 2014). Halliday and Matthiessen also specify that, within SFL, the term sentence usually references written language, and they favor the term “clause complex” in the context of spoken language (Eggins & Slade, 2004; Halliday & Matthiessen, 2014). Beyond these units of analysis, other approaches examine larger segments of text or interaction (e.g., paragraphs in written texts, conversations in interactional contexts) in various contexts and environments, where analysis methods take a more holistic approach to examining aspects of language such as cohesion (e.g., Zhang et al., 2021), coherence (e.g., Henry et al., 2020), topics (e.g., Mentis & Prutting, 1991), repair (e.g., Clark, 2020; Leaman & Archer, 2022), turn taking (e.g., Young et al., 2016), and more. Examples of these are provided throughout this text, and briefly in the section that follows; a simple unit-based computation is sketched below.
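As a concrete illustration of unit-based measurement, the following sketch computes mean words per C-unit for a sample that has already been hand-segmented (one unit per line, as in the excerpt in the next section). It is a minimal sketch under that assumption: the hard analytical work of segmenting the stream of talk into C-units in the first place is left to the analyst.

```python
# A minimal sketch of a unit-based length measure, assuming the analyst
# has already hand-segmented the sample into C-units (one unit per line).
# Automatic segmentation is the hard part and is not attempted here.
from statistics import mean

def words_per_unit(units: list[str]) -> float:
    """Mean number of words per pre-segmented C-unit."""
    return mean(len(unit.split()) for unit in units)

# Pre-segmented units in the style of the excerpt in Section 1.4:
sample_units = [
    "you didn't check the mail today did you dear?",
    "no.",
    "is this our popcorn then?",
    "the bread got smashed.",
]
print(round(words_per_unit(sample_units), 2))  # 4.75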
1.4 Application of Clinical Analysis
Consider the following exchange, a transcribed casual conversation between three family members while unpacking groceries.

W: you didn’t check the mail today did you dear?
J: no.
W: no.
W: I’ll be right back.
J: is this our popcorn then?
M: yeah.
M: I think a marketer must have brought that to me from work.
J: okay.
J: the bread got smashed.
M: yeah it did?
M: oh shoot.
M: oh man that’s gonna be fun trying to make a sandwich out of.
M: what they pack in that sack?
M: that (interrupted)
J: oh that’s how it happened.
M: that cereal.
J: huh {laugh}
M: well hmm.
M: so much for nice shaped bread.
Discourse Analysis and Communication Impairment 9 For analysis purposes each line in the excerpt contains only one clause complex or C-unit, despite the fact that there were numerous multipart utterances (e.g., that’s gonna be fun trying to make a sandwich out of, what they pack in that sack?). At the level of the allophone or phoneme, there were no notable errors or issues, at least per this transcription. A phonetic transcription of the conversation might provide additional information about phonology or speech patterns (e.g., accent) (see parts III and IV of this collection). Similarly, syntax, including morphology seemed to be unremarkable. It was notable that the transcriber used “gonna” to represent a contraction of “going to” which is often encountered in casual speech. The utterance “What they pack in that sack?” includes an ellipsis of the verb “did”, which is implied. Similarly, the word “sack” to refer to a (typically paper) bag for groceries p rovides a clue about the cultural context of the conversation. Pragmatically, all three individuals appear to be communicating effectively. Insight into their relationship may be inferred from word choices such as “dear”. Analyses of this excerpt would likely not identify the fact that one of these communication partners experiences communication difficulties post brain injury. Thus, it is very relevant to note that clinical discourse analysis is not only about identifying difficulties, but also highlighting language strengths of individuals with communication difficulties. Clinical analysis of discourse has been applied to data from a variety of populations with communication difficulties and using different theories and approaches, many of which are addressed in this collection. Regardless of the approach, context is a key consideration. In the example above the participant with a brain injury (J is “Joe Johnson” from Keegan et al., 2022) communicated effectively in the casual exchange of unpacking shopping. However, in situations where there are memory demands and multiple social conventions to follow, his communication was not so effective (see Keegan et al., 2022). Thus, there are some key considerations for clinical application as related to context. Keegan, Hoepner and colleagues (2023) categorize these contextual considerations as (1) abilities (2) environment (3) demands and (4) identity. While many approaches and theories do account for context (e.g., genre in SFL), many of our clinical tools (e.g., pragmatic scales, clause level analyses methods) do not explicitly address these. Therefore, it is not unusual for researchers and clinicians to focus on the unit (e.g., a clause) without considering the context either within or beyond the text. The communication abilities of the client involved in the interaction is usually the key consideration for clinical discourse analysis, however it is relevant to note that the abilities of the communication partner(s) will also play a role. Analyses of discourse in recent years (Simmons-Mackie & Kagan, 1999; Togher, 2000) have led to the development of communication partner training programs (e.g., Kagan, 1998; Togher et al., 2016; Kagan et al, 2018) that allow interlocutors to contribute more supports in conversations with individuals with communication disorders. The environment, which includes the physical location but also the communication partners, can greatly influence an interaction (Keegan & Müller, 2022). 
Approaches such as Clinical Discourse Analysis (Damico, 1991; Hoepner et al., 2018) emphasize the relevance of collecting language samples of individuals in different situational environments. The demands placed on an individual's cognitive and linguistic skills in any one interaction vary greatly. The linguistic and cognitive demands are shaped by the purpose of the interaction, or the interactional (and transactional) goals of the participants. The norms, or what is considered appropriate, and indeed what is effective in terms of speakers achieving their goals, in any one interaction, are shaped by the sociocultural and sociopolitical environment in which an interaction takes place, including the negotiation of power dynamics and identities. Thus, a workplace negotiation with a manager with the goal of achieving a pay increase follows different patterns and norms than a casual gossip with a close friend, where no specific goal needs to be achieved. There is much literature that discusses the varying pragmatic norms of different cultural contexts (e.g., Ishihara & Cohen, 2021), and these
should be considered as analyses occur. Finally, identity, or sense of self, is shaped by culture, experiences, values, and so on. Social construction theory (Coulter, 1979; Harré, 1983, 1991; Sabat & Harré, 1992) and Positioning Theory (Harré & van Langenhove, 1999) provide foundations for analyzing projections of self within daily interactions that have since been applied in clinical discourse analysis (e.g., Keegan, Behn et al., 2023; Müller & Guendouzi, 2009).
1.5 Applying Discourse Analysis in Clinical Contexts and Future Directions
Discourse analysis is well established as a means of examining naturally occurring communication. As noted above, the analysis of discourse has been applied to a great many fields. However, communication professionals and researchers who work with individuals who experience communication difficulty are moving toward methods that emphasize the ecological validity of the interaction (Keegan, Hoepner et al., 2023) and away from discourse analysis of contrived, elicited utterances in decontextualized settings. Discourse analysis can tell us a lot about the strengths and skills of individuals with communication difficulties, and it directly informs intervention goals and strategies that allow individuals to capitalize on strengths and skills while addressing any communication difficulties. This analysis is relevant for all populations regardless of diagnosis, and it should ideally be done in the context or communication environment of interest. There are many available approaches to discourse analysis, which are outlined in detail in this volume.

Technological developments in recent years provide new opportunities in this regard. Automated transcription developments (MacWhinney & Fromm, 2022) increase the speed with which contextual language samples can be collected. Automated analysis programs such as Computerized Language ANalysis (CLAN; MacWhinney, 2000) and Systematic Analysis of Language Transcripts (SALT; Miller & Iglesias, 2015) have been available for several decades. With technological advancements that serve to increase the speed of transcriptions that incorporate the conventions for these programs, such analyses will be much more efficient in terms of the time spent on them. Alternative automated analysis programs that require minimal transcription (e.g., SUGAR; Pavelko & Owens, 2019) or are “transcription-less” are in development (e.g., Howell et al., 2022; Yiew et al., 2023). Thus, in summary, technology is poised to facilitate the ease of discourse analysis over the next few years, regardless of the unit, approach, or aspects of discourse being examined; a toy example of this kind of automated computation is sketched at the end of this section. Finally, corpus data banks of individuals with communication disorders are growing (see Chapter 11), and as analysis methods are applied to these corpora, we will obtain additional comparative data about the communication strengths and difficulties of individuals with and without communication disorders.

Technological advances are also shaping how humans interact with machines. These range from very scripted interactions (such as the use of automated telephone response systems which sort callers into different categories) to artificial intelligence (AI) tools such as virtual reality (Brassel et al., 2021) or ChatGPT, which is capable of dialogic interaction that includes reacting to follow-up questions such as requests for clarification, admitting mistakes, challenging incorrect assumptions, and rejecting inappropriate requests (https://openai.com/blog/chatgpt). The implications of AI tools for questions of authorship, for example in academic and educational contexts, are being vigorously debated at the time of chapter preparation (Cotton et al., 2023). Similarly, the exponential growth of telehealth services also has implications for the context of interactions as well as for the automation of transcription and analysis (Yiew et al., 2023).
Implications of such tools for overcoming communication limitations and disabilities remain to be formulated and are a fruitful field for future investigations.
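To make the idea of automated transcript analysis concrete, here is the deliberately simple sketch promised above. It is not CLAN's or SALT's actual interface: the speaker codes (*PAR: and *INV:) follow the general CHAT convention, the utterances are invented, and the two measures computed (mean length of utterance in words and type-token ratio) are word-level approximations; real analyses should use the dedicated tools and their validated commands.

```python
# A toy sketch only: NOT the actual CLAN or SALT interface. Speaker-coded
# lines follow the CHAT convention (*PAR: participant, *INV: investigator);
# the utterances are invented. CLAN itself computes MLU over morphological
# tiers, which this word-level approximation does not attempt.
import re

transcript = """\
*INV: you didn't check the mail today did you ?
*PAR: no .
*PAR: the bread got smashed .
*INV: oh that's how it happened .
"""

def utterances(chat_text: str, speaker: str) -> list[list[str]]:
    """Return tokenized utterances for one speaker from CHAT-style text."""
    utts = []
    for line in chat_text.splitlines():
        if line.startswith(f"*{speaker}:"):
            body = line.split(":", 1)[1]
            utts.append(re.findall(r"[A-Za-z']+", body))
    return utts

par_utts = utterances(transcript, "PAR")
tokens = [w.lower() for utt in par_utts for w in utt]
mlu_words = sum(len(utt) for utt in par_utts) / len(par_utts)  # MLU in words
ttr = len(set(tokens)) / len(tokens)                           # type-token ratio
print(f"MLU-w = {mlu_words:.2f}, TTR = {ttr:.2f}")  # MLU-w = 2.50, TTR = 1.00
```

Scaled up to growing clinical corpora, even simple computations of this kind can supply the comparative baseline data discussed above, provided the contextual caveats from Section 1.4 are kept in view.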
REFERENCES

Adams, C. (2002). Practitioner review: The assessment of language pragmatics. Journal of Child Psychology and Psychiatry, 43(8), 973–987.
Altman, C., Goral, M., & Levy, E. (2012). Integrated narrative analysis in multilingual aphasia: The relationship among narrative structure, grammaticality and fluency. Aphasiology, 26(8), 1029–1052.
Armstrong, E. (2000). Aphasic discourse analysis: The story so far. Aphasiology, 14(9), 875–892.
Armstrong, E., Lewis, T., Robins, A., Malcolm, I., & Ciccone, N. (2023). Cross-cultural perspectives on conversational assessment and treatment in aphasia: Learnings from a First Nations context. In C. Coelho, L. R. Cherney, & B. B. Shadden (Eds.), Discourse analysis in adults with and without communication disorders: A resource for clinicians and researchers (pp. 131–148). Plural Publishing.
Atkinson, J. M., & Heritage, J. (1984). Structures of social action. Cambridge University Press.
Austin, J. L. (1962). How to do things with words. Oxford University Press.
Azios, J., & Simmons-Mackie, N. (2023). Clinical application of conversation analysis in aphasia. In C. Coelho, L. R. Cherney, & B. B. Shadden (Eds.), Discourse analysis in adults with and without communication disorders: A resource for clinicians and researchers (pp. 109–130). Plural Publishing.
Bach, K., & Harnish, R. M. (1979). Linguistic communication and speech acts. MIT Press.
Bitetti, D., Hammer, C. S., & López, L. M. (2020). The narrative macrostructure production of Spanish–English bilingual preschoolers: Within- and cross-language relations. Applied Psycholinguistics, 41(1), 79–106.
Brassel, S., Power, E., Campbell, A., Brunner, M., & Togher, L. (2021). Recommendations for the design and implementation of virtual reality for acquired brain injury rehabilitation: Systematic review. Journal of Medical Internet Research, 23(7), e26344.
Carrell, P. L. (1982). Cohesion is not coherence. TESOL Quarterly, 16(4), 479–488.
Chen, W. (2018). A critical discourse analysis of Donald Trump’s inaugural speech from the perspective of systemic functional grammar. Theory and Practice in Language Studies, 8(8), 966–972.
Cherney, L. R., Shadden, B. B., & Coelho, C. A. (1998). Analyzing discourse in communicatively impaired adults. Aspen Publishers.
Clark, E. V. (2020). Conversational repair and the acquisition of language. Discourse Processes, 57(5–6), 441–459.
Coelho, C. A., Cherney, L. R., & Shadden, B. B. (2023). Discourse analysis in adults with and without communication disorders. Plural Publishing.
Cotton, D. R., Cotton, P. A., & Shipway, J. R. (2023). Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International, 1–12. https://doi.org/10.1080/14703297.2023.2190148
Coulter, J. (1979). The social construction of mind: Studies in ethnomethodology and linguistic philosophy. Macmillan.
Damico, J. (1991). Clinical discourse analysis: A functional approach to language assessment. In C. Simon (Ed.), Communication skills and classroom success: Assessment and therapy methodologies for language and learning disabled students (pp. 125–148). Thinking Publications.
Eggins, S., & Slade, D. (2004). Analysing casual conversation. Equinox Publishing.
Fairclough, N. (2012). Critical discourse analysis. In J. P. Gee & M. Handford (Eds.), The Routledge handbook of discourse analysis (pp. 9–20). Routledge.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons. Applied Linguistics, 21(3), 354–375.
Fromm, D., Forbes, M., Holland, A., & MacWhinney, B. (2020). Using AphasiaBank for discourse assessment. Seminars in Speech & Language, 41(1), 10–19.
Goodglass, H., & Kaplan, E. (1983). The Boston diagnostic aphasia examination (2nd ed.). Lea and Febiger.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics. Volume 3: Speech acts (pp. 41–58). Academic Press.
Guendouzi, J. (2013). ‘So what’s your name?’ Relevance in dementia. In B. Davis & J. Guendouzi (Eds.), Pragmatics in dementia discourse (pp. 29–54). Cambridge Scholars Publishing.
Halliday, M. A. K., & Hasan, R. (2014). Cohesion in English. Routledge.
Halliday, M. A. K., & Matthiessen, C. M. (2014). Halliday’s introduction to functional grammar. Routledge.
Happé, F. G. (1995). The role of age and verbal ability in the theory of mind task performance of subjects with autism. Child Development, 66(3), 843–855.
12 Louise C. Keegan, Jacqueline A. Guendouzi, and Nicole Müller of subjects with autism. Child Development, 66(3), 843–855. Harré, R. (1983). Personal being. Blackwell. Harré, R. (1991). The discursive production of selves. Theory and Psychology, 1(1), 51–63. Harré, R., & van Langenhove, L. (Eds.). (1999). Positioning theory: Moral contexts of intentional action. Blackwell. Henry, L. A., Crane, L., Fesser, E., Harvey, A., Palmer, L., & Wilcock, R. (2020). The narrative coherence of witness transcripts in children on the autism spectrum. Research in Developmental Disabilities, 96, 103518. Hoepner, J. K., Buhr, H., Johnson, M., Sather, T., & Clark, M. (2018). Interactions between the environment, physical demands, and social engagement at an aphasia camp. Journal of Interactional Research in Communication Disorders, 9(1), 44–75. https://doi.org/10.1558/jircd.34497 Howell, S., Beeke, S., Sinnott, E. L., Varley, R., & Pring, T. (2022). An examination of sample length and reliability of the Interactional Network Tool, a new measure of group interactions in acquired brain injury. Aphasiology, 1–15. https://doi.org/10.1080/02 687038.2022.2118517 Hughes, D. L., McGillivray, L., & Schmidek, M. (1997). Guide to narrative language: Procedures for assessment. Thinking Publications. Hunt, K. W. (1965). A synopsis of clause-tosentence length factors. The English Journal, 54(4), 300–309. Ishihara, N., & Cohen, A. D. (2021). Teaching and learning pragmatics: Where language and culture meet. Routledge. Jørgensen, K., Andreasson, K., Rasmussen, T., Hansen, M., & Karlsson, B. (2022). Recoveryoriented cross-sectoral network meetings between mental health hospital professionals and community mental health professionals: A critical discourse analysis. International Journal of Environmental Research and Public Health, 19(6). https://doi.org/10.3390/ijerph19063217 Kagan, A. (1998). Supported conversation for adults with aphasia: Methods and resources for training conversation partners. Aphasiology, 12(9), 816–830. Kagan, A., Simmons-Mackie, N., & Victor, J. C. (2018). The impact of exposure with no training: Implications for future partner training research. Journal of Speech, Language, and Hearing Research, 61(9), 2347–2352. Keegan, L., Hoepner, J. K., Togher, L., & Kennedy, M. (2023). Clinically applicable sociolinguistic assessment for cognitive-communication disorders. American Journal of Speech Language
Pathology, 32(2 S), 966–976. https://doi. org/10.1044/2022_AJSLP-22-00102 Keegan, L. C., Behn, N., Power, E., Howell, S., & Rietdijk, R. (2023). Assessing conversation after traumatic brain injury. In C. Coelho, L. R. Cherney, & B. B. Shadden (Eds.), Discourse analysis in adults with and without communication disorders: A resource for clinicians and researchers (pp. 173–192). Plural Publishing. Keegan, L. C., & Müller, N. (2022). The influence of context on identity construction after traumatic brain injury. Journal of Interactional Research in Communication Disorders, 13(2), 171–195. Keegan, L. C., Müller, N., Ball, M. J., & Togher, L. (2022). Anger & aspirations: Linguistic analysis of identity after traumatic brain injury. Neuropsychological Rehabilitation, 32(8), 2029–2053. https://doi.org/10.1080/09602011. 2022.2071949 Kellar, J., Paradis, E., van der Vleuten, C. P., Oude Egbrink, M. G., & Austin, Z. (2020). A historical discourse analysis of pharmacist identity in pharmacy education. American Journal of Pharmaceutical Education, 84(9), 1251–1258. Kintsch, W., & Van Dijk, T. (1978). Towards a model of text comprehension and production. Psychological Review, 85(5), 363–394. Leaman, M. C., & Archer, B. (2022). “If you just stay with me and wait … You’ll get an idea of what I’m saying”: The communicative benefits of time for conversational self-repair for people with aphasia. American Journal of Speech-Language Pathology, 31(3), 1264–1283. Lenzen, S. A., Stommel, W., Daniëls, R., van Bokhoven, M. A., van der Weijden, T., & Beurskens, A. (2018). Ascribing patients a passive role: Conversation analysis of practice nurses’ and patients’ goal setting and action planning talk. Research in Nursing & Health, 41(4), 389–397. Loban, W. (1976). Language development: Kindergarten through grade twelve. NCTE Committee on Research Report No. 18. MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk: Transcription format and programs (3rd ed.). Lawrence Erlbaum Associates Publishers. MacWhinney, B., & Fromm, D. (2022). Language sample analysis with TalkBank: An update and review. Frontiers in Communication, 7, 91. McDonald, S., Code, C., & Togher, L. (Eds.). (2000). Communication disorders following traumatic brain injury. Psychology Press. Mentis, M., & Prutting, C. A. (1991). Analysis of topic as illustrated in a head-injured and a normal adult. Journal of Speech and Hearing
Discourse Analysis and Communication Impairment 13 Research, 34(3), 583–595. https://doi.org/ 10.1044/jshr.3403.583 Miller, J. F., & Iglesias, A. (2015). Systematic Analysis of Language Transcripts (SALT), Version 16.1.5 [Computer software]. SALT Software, LLC Müller, N., & Guendouzi, J. A. (2009). Discourses of dementia: A call for an ethnographic, action research approach to care in linguistically and culturally diverse environments. Seminars in Speech and Language, 30(03), 198–206. Müller, N., Guendouzi, J. A., & Wilson, B. (2008). Discourse analysis and communication impairment. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), Handbook of clinical linguistics (pp. 1–31). Blackwell Publishing. Onions, C. T. (1966). The Oxford dictionary of English etymology. Oxford University Press. Park, J., Saha, S., Chee, B., Taylor, J., & Beach, M. C. (2021). Physician use of stigmatizing language in patient medical records. JAMA Network Open, 4(7), e2117052–e2117052. Pavelko, S. L., & Owens, R. E., Jr. (2019). Diagnostic accuracy of the Sampling Utterances and Grammatical Analysis Revised (SUGAR) measures for identifying children with language impairment. Language, Speech, & Hearing Services in Schools, 50(2), 211–223. Penn, C. (1988). The profiling of syntax and pragmatics in aphasia. Clinical Linguistics & Phonetics, 2(3), 179–207. Prutting, C., & Kirchner, D. (1983). Applied pragmatics. In T. Gallagher & C. Prutting (Eds.), Pragmatic assessment and intervention issues in language (pp. 29–64). College Hill. Prutting, C., & Kirchner, D. (1987). A clinical appraisal of the pragmatic aspects of language. Journal of Speech and Hearing Disorders, 52(2), 105–119. Ryder, N., & Leinonen, E. (2011). Relevance theory and language interpretation. In J. Guendouzi, F. Loncke & M. Williams (Eds.), The handbook of psycholinguistics and cognitive processing: Perspectives in communication disorders (pp. 747–760). Psychology Press. Sabat, S. R., & Harré, R. (1992). The construction and deconstruction of self in Alzheimer’s disease. Ageing & Society, 12(4), 443–461. Searle, J. (1969). Speech acts. Cambridge University Press. Searle, J. (1975). A taxonomy of illocutionary acts. In K. Gunderson (Ed.), Language, mind and knowledge. Minnesota studies in the philosophy of science (pp. 344–369). University of Minnesota Press.
Sharififar, M., & Rahimi, E. (2015). Critical discourse analysis of political speeches: A case study of Obama’s and Rouhani’s speeches at UN. Theory and Practice in Language Studies, 5(2), 343. Simmons-Mackie, N., & Kagan, A. (1999). Communication strategies used by ‘good’ versus ‘poor’ speaking partners of individuals with aphasia. Aphasiology, 13(9–11), 807–820. Sperber, D., & Wilson, D. (1997). Remarks on relevance theory and the social sciences. Multilingua, 16(2/3), 145–151. Togher, L. (2000). Giving information: The importance of context on communicative opportunity for people with traumatic brain injury. Aphasiology, 14(4), 365–390. Togher, L., McDonald, S., Tate, R., Rietdijk, R., & Power, E. (2016). The effectiveness of social communication partner training for adults with severe chronic TBI and their families using a measure of perceived communication ability. NeuroRehabilitation, 38(3), 243–255. Tomkow, L., Pascall-Jones, P., & Carter, D. (2023). Frailty goes viral: A critical discourse analysis of COVID-19 national clinical guidelines in the United Kingdom. Critical Public Health, 33(1), 116–123. https://doi.org/10.1080/09581596.20 22.2090316 Wilkinson, R. (2011). Changing interactional behaviour: Using conversation analysis in intervention programmes for aphasic conversation. In C. Antaki (Ed.), Applied conversation analysis: Intervention and change in institutional talk (pp. 32–53). Palgrave-Macmillan. Yiew, K., Togher, L., Power, E., Brunner, M., & Rietdijk, R. (2023). Differentiating use of facial expression between individuals with and without traumatic brain injury using affectiva software: A pilot study. International Journal of Environmental Research and Public Health, 20(2), 1169. Young, J. A., Lind, C., & Van Steenbrugge, W. (2016). A conversation analytic study of patterns of overlapping talk in conversations between individuals with dementia and their frequent communication partners. International Journal of Language & Communication Disorders, 51(6), 745–756. Yule, G. (2020). The study of language (7th ed.). Cambridge University Press. Zhang, M., Geng, L., Yang, Y., & Ding, H. (2021). Cohesion in the discourse of people with post-stroke aphasia. Clinical Linguistics & Phonetics, 35(1), 2–18.
2 Conversational Implicature and Communication Disorders
FRANCESCA FOPPOLO AND GRETA MAZZAGGIO

2.1 Conversational Implicatures

When we speak, we often convey more than what we literally say, enriching our message with implicit content. We also provide cues to our listeners or readers to derive inferences from what we say. To exemplify, consider the conversation in (1), in which a teacher asks their students about their homework, which comprised exercises on pages 41 and 43:

(1) Teacher: Did you do your homework?
a. Student A replies: I did some of the exercises.
b. Student B replies: I did the exercises on page 41.
c. Student C replies: I had a terrible headache yesterday.

From the students' responses, the teacher will infer that none of the students completed their homework, and, in particular, the Conversational Implicatures in (2) will be derived:

(2) a. Student A completed some but not all of the assignments.
b. Student B completed only page 41, not page 43.
c. Student C did none of the exercises.

How? Conversational Implicatures arise in discourse by virtue of the mutual agreement between speakers and hearers to be cooperative and obey some maxims (Grice, 1975). These maxims urge us to make our contributions maximally informative (Quantity); to say something which is true and reliable (Quality); to be relevant (Relevance); and to be clear, brief, orderly, and unambiguous (Manner). Assuming that participants in a conversation are cooperative and eager to provide a contribution that is qualitatively and quantitatively appropriate to the purposes of the exchange, the students' answers in (1a–c) can be judged informative and cooperative only if they are evaluated with respect to what is relevant in the exchange, or in contrast with what the students could have said instead but didn't. For example, students A and B could have uttered the alternative (and more informative) statements in (3a) and (3b), respectively:

(3) a. I did all the exercises.
b. I completed pages 41 and 43.
The fact that they did not constitutes a cue for their teacher to derive the implicatures in (2). Such implicatures, albeit conveying the same message, are distinct on a theoretical basis. The inference in (2a) is called a scalar implicature, given that it is based on an ordered scale in which the elements (in this case, the quantifiers some and all) are ordered with respect to the information they convey (Horn, 1972). All, which is the most informative quantifier on the scale, implies some: if someone did all the exercises, they also did some of the exercises, by virtue of the logical entailment between all and some. Mentioning the weaker scalar alternative (in this case, some, as in (1a)) triggers the scalar inference in (2a) that the most informative alternative (in this case, all, as in (3a)) is not true. The inference in (2b) is called an ad hoc implicature, and it is based on alternatives that are contextually, not linguistically, triggered. In this case, mentioning only one of the available alternatives in the discourse (as in (1b)) triggers the ad hoc inference that the other alternatives (as in (3b)) are not true. In both cases, the use of a weaker utterance in a situation in which the more informative alternative holds constitutes a violation of the maxim of Quantity, resulting in an utterance that is said to be "underinformative."

Typically, these inferences are derived with ease in adult conversations: they are a means of conveying our message in a fast and efficient way. Nonetheless, their derivation requires the steps in (4), and the abilities needed to carry out these steps might be impaired or delayed in certain populations.

(4) i. an alignment of speakers and hearers with respect to the conversational maxims and the principle of cooperation;
ii. the recognition, both by the speaker and the hearer, of the alternatives that are relevant in the discourse, and an alignment between them on this issue. Note that, in the case of scalar implicatures, the alternatives are lexically triggered by a scale which is based on lexical knowledge (i.e., knowledge of the meaning of the quantifiers involved in the scale, and of their mutual position on that scale). In the case of ad hoc implicatures, instead, no prior linguistic knowledge is required, only the recognition of the relevant alternatives in the specific context;
iii. the recognition, on the hearer's side, of the epistemic state of the speaker, that is, their knowledge with respect to the context and the matter of the utterance.

Typically developing children have been found to struggle with conversational implicatures, especially scalar implicatures, at least up to age 6. Assessed by means of different tasks and methodologies, they fail to recognize when a sentence does not constitute an informative utterance in the discourse. For example, children at ages 4 and 5 accept underinformative sentences like "Some dwarfs went on a boat" (Foppolo et al., 2012), "Every space-guy took a strawberry or an onion ring" (Chierchia et al., 2004) or "The dog painted a heart" (Katsos & Bishop, 2011) in contexts in which all the dwarfs went on a boat (all is the most informative alternative on the scale), every space-guy took a strawberry and an onion ring (and is the most informative alternative on the scale), and the dog painted both a heart and a star.
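To make the reasoning concrete, the derivation of the scalar implicature in (2a) can be sketched in standard logical notation (the following formalization is illustrative only and deliberately simplified):

\[
\begin{array}{ll}
\text{Utterance } U\text{:} & \textit{I did some of the exercises} \text{ (literal meaning: } \exists x\,[\mathrm{exercise}(x) \wedge \mathrm{did}(x)]\text{, compatible with } \textit{all}\text{)}\\
\text{Alternative } A\text{:} & \textit{I did all of the exercises}\text{, where } A \models U \text{ (}\textit{all} \text{ entails } \textit{some}\text{)}\\
\text{Quantity step:} & \text{a cooperative, informed speaker would have uttered the stronger } A \text{ if it were true}\\
\text{Implicature:} & \neg A\text{, i.e., } \textit{some but not all} \text{ of the exercises were done}
\end{array}
\]

The ad hoc implicature in (2b) follows the same template, except that the set of alternatives (the exercises on pages 41 and 43) is supplied by the context rather than by a lexical scale.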
In the remainder of this chapter, we will take a closer look at the main experimental work on conversational implicatures in clinical linguistics over the last decades, attempting to identify commonalities across results from populations with different types of communication disorders in the domains of language, socio-pragmatics, and communication.
2.2 Conversational Implicatures in Clinical Linguistics

The ability to cope with conversational inferences in clinical populations has been the subject of a stream of research in both the linguistic and psychological fields over the past decade, although studies are still scarce and results are mixed. Nonetheless, it is important to investigate
impairments in conversational implicatures' comprehension, since such impairment is linked with failure in everyday social relationships – resulting in a sense of frustration for people suffering from the impairment – but is also a mirror of more general deficits in linguistic competence, which can offer cues for early intervention. Moreover, given the steps required to derive conversational implicatures, and the traits that identify populations with communication disorders, understanding how these inferences are dealt with in a population with a communicative impairment will offer a privileged perspective from which to learn more about the phenomenon of pragmatic inference on the one side, and about communication disorders on the other, also with respect to diagnosis and intervention.

As we have seen in (4), the derivation of an implicature requires different steps and engages different cognitive abilities. First, it requires the recognition of the speaker's communicative intentions, their mental state, and the shared background of information. These, in turn, require what in the literature is referred to as Theory of Mind (ToM), that is, the ability to attribute mental states (such as beliefs, desires, intentions, emotions, knowledge) to other people. Second, one needs to retrieve and/or access the alternatives that are relevant in the discourse, a process that might be resource demanding. We define "cognitive load" as the amount of resources used to perform a task in terms of executive functions, which are the higher cortical functions involved in controlling and coordinating other cognitive skills and behaviors, such as planning our actions, switching tasks, or inhibiting irrelevant information. Finally, linguistic abilities may also play a role, as suggested by research on typically developing children (Foppolo et al., 2021; Wilson & Katsos, 2022), particularly with respect to scalar implicatures, which rely on the knowledge of a lexicalized scale.

Following the criteria for the diagnosis of Communication Disorder in the 2013 Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) – particularly, the recognized impairment in the ability to adapt communication to the context or the listener, as well as the difficulties in understanding non-literal, implicit or context-dependent meanings – it is reasonable to expect difficulties in the derivation of conversational implicatures in populations with impairments in the linguistic and/or the socio-pragmatic domains. In the next sections, we will discuss up-to-date experimental findings on different types of neurodevelopmental disorders (Section 2.2.1) and on communicative disorders in adult clinical populations (Section 2.3).
2.2.1 Neurodevelopmental Disorders

2.2.1.1 Implicatures in Autism Spectrum Disorders

Research on Autism Spectrum Disorders (ASD) is probably the most prolific in the field of conversational implicatures in clinical linguistics, and this is undoubtedly due to the fact that pragmatic skills constitute a core area of weakness in ASD. However, this flourishing experimental field has led to apparently mixed results that we will attempt to reconcile. In their pioneering work, Surian et al. (1996) addressed the question "Are children with autism deaf to Gricean maxims?". Their answer was "yes," since their results showed that autistic children find violations of conversational maxims difficult to detect; ToM also seemed to predict their behavior. Subsequent work focused on older children, adolescents, and adults, and no difference was found between autistic individuals and age-matched peers, showing that some pragmatic abilities are preserved in ASD. Pijnacker et al. (2009) assessed adults in the spectrum with underinformative sentences like "Some sparrows are birds" (modeled after Noveck, 2001) and found that the autistic population was as sensitive as the neurotypical adult controls to pragmatic violations.
Chevallier et al. (2010) assessed autistic adolescents with underinformative sentences like "There is a sun or a train" in a context in which both a sun and a train were present, and their results replicated those of Pijnacker et al.: participants in the spectrum produced rates of pragmatic interpretation of the disjunction similar to those of their neurotypical peers. In both studies, verbal intelligence correlated with a higher rate of pragmatic interpretations. Similarly, Su and Su (2015) found no difference in the interpretation of scalar implicatures between autistic children, whose ages ranged from 4 to 15 years, and typically developing peers. While verbal intelligence seems to play an important role in conversational implicatures, it is important to note that this set of studies assessed only linguistically driven implicatures (i.e., scalar implicatures) and used an explicit binary judgment (true/false, agree/disagree). Moreover, these conclusions cannot be taken as definitive, because ToM was not assessed in these later studies and the assessed groups were heterogeneous in age and diagnosis. Particularly relevant to this last point: at the time, the distinction between "High-Functioning Autism" and "Asperger syndrome" was still in place; some of the autistic individuals in those studies had been diagnosed with Asperger syndrome, following DSM-IV (American Psychiatric Association, 2000). Further work considered the mentalizing skill component and demonstrated that autistic people might succeed in computing scalar implicatures by relying on compensatory strategies that do not require reasoning about another's mental state (Hochstein et al., 2018). Other studies demonstrate that the kind of task used matters: when non-binary judgments are required, autistic children seem to show more difficulty in rejecting underinformative utterances (Pastor-Cerezuela et al., 2018; Schaeken et al., 2018), and this might also explain the variability in results across studies. In a recent contribution, Mazzaggio et al. (2021) assessed autistic children with a Picture Selection task similar to that of Hochstein et al. (2018) on both scalar and ad hoc implicatures. As noted above, the former is lexically driven while the latter is contextually driven, so the latter requires only the "pragmatic" component of the derivation of the implicature. Results showed that autistic children performed worse than their peers on both types of implicatures, although their performance improved with age. These results are compatible with a pragmatic impairment in autistic children which is not limited to linguistically driven scales (cf. also Whyte & Nelson, 2015). Furthermore, general cognitive abilities were found to affect the performance of autistic children on both kinds of implicatures, and ToM reasoning skills were found to be linked to their performance on scalar, but not ad hoc, implicatures. Despite the variability observed, we can conclude that autistic individuals show a delay in the emergence of their ability to compute scalar implicatures. Also, Mazzaggio and Surian (2018) report an association between autistic traits (particularly, ToM) in the broader phenotype and the ability to compute scalar implicatures. As some researchers have argued, when no differences are found between autistic individuals and neurotypical controls, this might be the result of a strategy in which ToM skills are bypassed to solve the task (Kissine, 2021; van Tiel & Kissine, 2018).
For this reason, it is advisable to include ToM measures when assessing people in the spectrum.
2.2.1.2 Social Pragmatic Communication Disorder (SPCD)

Under the macro-category of Communication Disorders in DSM-5, Social Pragmatic Communication Disorder (SPCD) has been introduced. It is described as a neurodevelopmental disorder characterized by persistent difficulties in the social use of verbal and nonverbal communication; these deficits lead to functional limitations in everyday social relationships and school or job performance. Despite the official recognition of the disorder,
little is known about it and research remains scarce, probably due to the difficulty clinicians face in differentiating this diagnosis from ASD (without intellectual disability). Apparently, the only major difference between the two diagnoses is the stereotyped and rigid behavior of the latter group. In a recent contribution, Svindt and Surányi (2021) compared the performance of autistic and SPCD children in evaluating sentences that trigger systematic implicit content via linguistic cues (such as the presuppositional triggers only and too, and cleft constructions). The authors label these inferences grammaticalized because these constructions are linguistically driven and their derivation does not rely on context or on the speakers' perspectives: the sentence "only the princess fainted" conveys the implicit meaning that no one else fainted, besides the princess, independently of contextual information; analogously, hearing "it is the king who sat down" implicitly conveys that only the king, and no one else, did so. Their aim was to investigate whether the distinction between the two disorders is truly qualitative or, instead, quantitative in nature and, secondly, to explore the role of receptive grammatical competence and ToM abilities in performance. Overall, results show that both children with SPCD and autistic children differed from typically developing peers in the comprehension of linguistically triggered non-literal meanings, presenting lower pragmatic scores. However, different cognitive skills predicted children's performance, depending on their diagnosis: accurate responses were better predicted by ToM in autistic children and by grammatical skills in children with SPCD. As the authors themselves point out, "uncovering differences in the underlying cognitive sources are critical both for clarifying the status of SPCD as a distinct disorder, and also for the improvement of the accuracy of SPCD children's early diagnosis and timely therapeutic intervention in clinical work" (Svindt & Surányi, 2021, p. 1153). This result is also interesting considering the fact that children with pragmatic disorders were categorized in the past as a subgroup of the so-called Specific Language Impairment (SLI), now relabeled as Developmental Language Disorders (DLD). Indeed, it has long been debated whether DLD and SPCD are two different disorders (Bishop, 2000), given the difficulties, in both populations, in language comprehension and use. In the next section, we turn to experimental work on conversational implicature in DLD, in which language is the predominantly impacted area.
2.2.1.3 Developmental Language Disorders (DLD)

Children with Developmental Language Disorders (DLD) have persistent difficulties in the acquisition and use of language, due to deficits in comprehension and/or production, with difficulties also in discourse and ineffective communication (APA, 2013). To receive a diagnosis of DLD, a child's difficulties should not be attributable to other medical or neurological conditions, like hearing or other sensory impairment, motor dysfunction, intellectual disability, or developmental delay. Typically, children with DLD predominantly show difficulties with phonology, vocabulary and/or morphosyntax, although a subgroup might also show pragmatic difficulties (Ryder et al., 2008). In the same study cited above, Surian et al. (1996) also tested children with DLD on their ability to detect violations of conversational maxims (the Conversational Violation Test, CVT) and argued for no pragmatic delay in the DLD group with respect to TD children. On closer inspection of the children's performance, however, the DLD group scored above chance only on statements that violated expectations of truthfulness, relation, or politeness, and behaved at chance when statements violated the first and second maxims of Quantity, which are the ones involved in scalar implicatures. Indeed, when testing children with a classic Truth Value Judgment task for scalar implicature, in which they had to judge underinformative sentences involving "some," Katsos et al. (2011) report that Spanish
children with DLD performed worse than their age-matched peers, and comparably to younger TD children matched for linguistic abilities on standardized measures of receptive grammar and expressive vocabulary. This led the authors to conclude that the difficulty with conversational implicatures in children with DLD might be proportionate to their overall language difficulties. As we have seen, the nature and impact of pragmatic impairment in DLD are still debated, and experimental results are mixed. As anticipated, some studies tried to identify, within the DLD group, those children who do and do not have pragmatic impairment. For example, Ryder et al. (2008) administered a variety of verbal and non-verbal tasks (including a task for scalar implicature), comparing the performance of children with DLD to that of age-matched or language-matched TD children (aged 7–11 and 5–6 years, respectively). Results showed that children with DLD performed similarly to the younger TD group, and significantly less well than their peers. A correlation between pragmatic tasks and scores of receptive grammar was found, replicating previous findings. Additionally, children diagnosed as having pragmatic difficulties by their speech and language therapists showed more difficulty on all pragmatic tasks than children with DLD without specific pragmatic impairment.
2.2.1.4 Specific Learning Disorders

According to the DSM-5, Specific Learning Disorders (SLD) consist of general difficulties in learning and using academic skills and, more specifically, difficulties in reading aloud and spelling, understanding the meaning of what is read, and processing complex grammar, as well as difficulties with mastering numbers and mathematical reasoning. Based on the area of difficulty, we can broadly distinguish between dyslexia (impairment in reading and/or writing skills) and dyscalculia (impairment in mathematical skills). To receive a diagnosis, the learning difficulties should not be better accounted for by intellectual disabilities, uncorrected visual or auditory acuity, other mental or neurological disorders, psychosocial adversity, lack of proficiency in the language of academic instruction, or inadequate educational instruction. While impairments in language and mathematical abilities have been extensively studied in children with SLD, pragmatic abilities have scarcely been investigated. Recently, Cardillo et al. (2018) found deficits in pragmatic skills and ToM in children with dyslexia, particularly in their understanding of metaphors. Moreover, since SLD is believed by some authors to be associated with impaired verbal working memory, researchers have wondered whether this aspect might cause difficulties specifically with scalar implicatures (Vender, 2017). However, the results are mixed. On the one hand, Arosio et al. (2016) tested 24 children with dyslexia with a Truth Value Judgment task for scalar implicatures and compared their performance with two groups of TD children matched for chronological or linguistic age. Results showed an at-ceiling performance in all three groups, leading to the conclusion that children with dyslexia do not seem to display specific difficulties with the derivation of scalar implicatures. On the other hand, Vender (2009), Stoicescu et al. (2011), and Hu et al. (2019) found contrasting results. Vender (2009) tested children with dyslexia on different types of scalar implicatures related to different scales (quantifiers, connectives, and adverbs). Children with dyslexia performed poorly compared to age-matched children and adults, failing to reject underinformative sentences; instead, they performed comparably to younger children. In one of the experiments, Vender tested the hypothesis that the difficulties shown by children with dyslexia when deriving scalar implicatures are due to processing limitations. To this end, the task was simplified to reduce the processing load by explicitly and verbally presenting the two alternative interpretations of or. In this case, children with dyslexia performed like the control groups. All these experiments led Vender to conclude that children
with dyslexia are impaired in their ability to compute scalar implicatures, and that this is due to a lack of processing resources (working memory) and to the high cost of accessing/retrieving alternatives, a necessary step in the derivation of implicatures. Stoicescu et al. (2011) and Hu et al. (2019) confirmed the difficulties of children with dyslexia with scalar implicatures based on some, assessing Romanian and Chinese children, respectively. All authors attribute this impairment to processing difficulties that might be related to the reduced cognitive resources available to children with dyslexia and/or to lexical difficulties in accessing scale members.
2.3 Communicative Disorders in Adult Populations

2.3.1 Schizophrenia Spectrum and Psychotic Disorders (SSPD)

In DSM-5, schizophrenia spectrum (SS) and psychotic disorders (PD) are defined as presenting abnormalities in one or more of the following five domains: delusions, hallucinations, disorganized thinking (speech), grossly disorganized or abnormal motor behavior (including catatonia), and negative symptoms. Moreover, some people with this diagnosis show verbal communication impairments that can be defined as "positive" (disorganized discourse) or "negative" (poverty of speech). Pragmatic difficulties in these patients seem, however, unrelated to thought disorder, that is, a disorganized way of thinking that results in abnormal language: for example, "pragmatic rules, such as be relevant and be sufficient (Grice, 1975), may be inappropriately violated by tangential or incoherent speech and poverty of speech, respectively" (Linscott, 2005, p. 226). Abu-Akel (1999) analyzed the speech of two patients with disorganized schizophrenia to check for violations of the maxims of Relation and Quantity in recorded spontaneous interviews: one or both maxims were violated 66% of the time by one patient and 57% of the time by the other. Tényi et al. (2002) analyzed the ability of 26 people with paranoid schizophrenia to decode violations of conversational implicatures (related to the Gricean maxim of Relevance). These patients had more difficulty decoding the violated maxim than neurotypical adults (see also Kuperberg, 2010). One study on comprehension demonstrates difficulties with figurative language in people with SSPD: with proverbs, for example, they tend to prefer concrete interpretations, with strong correlations between their comprehension ability and ToM, executive functioning, and intelligence (Brüne & Bodenstein, 2005). Abilities with conversational implicatures do not seem to be spared in individuals with SSPD either, as experimental work on scalar implicatures shows pragmatic impairments in this population. Wampers et al. (2018) assessed adults diagnosed with schizophrenia (Experiment 1) and young psychotic patients between 16 and 31 years old (Experiment 2) with tasks for scalar implicatures. In the first experiment, participants completed a classical binary Statement Evaluation task (modeled after Noveck, 2001) in which half of the items were underinformative sentences with some (e.g., "Some oaks are trees") and participants had to say whether the sentence was "true" or "false." Patients rejected fewer underinformative items than the control group. To check whether these results were due to a higher tolerance for underinformativeness, in the second experiment participants were given a ternary response scale (i.e., in addition to "true" and "false," a third, intermediate judgment was added: "both true and false"); additionally, they were also assessed for ToM skills. Again, participants with psychosis gave many more logical answers than the control group, and a significant negative correlation between the number of logical answers and ToM scores was observed in the clinical group. The authors concluded that the observed differences between patients and controls might be explained by differences in ToM skills.
Schaeken et al. (2021) tried to replicate the results of Wampers et al. (2018) with participants diagnosed with SSPD according to the DSM-5, testing different scalar terms: quantifiers, modals, connectives, and adjectives. They created a questionnaire based on van Tiel et al. (2016) and Zevakhina (2012) in which a character utters a statement with a weak scalar item like "Some theater performances are interesting." Participants were then asked to evaluate (on a five-point Likert scale) whether it could be deduced from the statement that, according to the speaker, the stronger scalar alternative did not hold (in the example, "Not all theater performances are interesting"). Surprisingly, they failed to replicate Wampers et al.'s (2018) results: although their clinical group performed less pragmatically than the control group, the difference was not significant. The authors suggest that the difference between their results and previous studies might be explained by procedural differences, concluding that more research is needed to advance our understanding of pragmatic difficulties, specifically with conversational implicatures, in people with SSPD.
2.3.2 Brain Damage and Aphasia

Aphasia is an acquired disorder characterized by the inability to articulate and/or understand words; it is caused by damage to specific areas of the brain, mainly due to strokes, head trauma, brain tumors, or degenerative diseases. Patients with aphasia can present heterogeneous linguistic behaviors depending on the severity of the damage and on whether the affected area lies in the language-dominant hemisphere. Several studies have demonstrated pragmatic impairments both in left-hemisphere-damaged (LHD) patients, who usually have aphasia, and in right-hemisphere-damaged (RHD) patients (Parola et al., 2016; Tompkins et al., 2002). The few studies that have targeted conversational implicatures in brain-damaged individuals show mixed results, probably due to the variability among the tested populations. Using a qualitative methodology, Ahlsén (1993) described the difficulties of people with aphasia in adhering to conversational principles and maxims, particularly the Quantity and Manner maxims. Kasher et al. (1999) experimentally assessed participants with either LHD or RHD on the understanding of conversational implicatures based on all maxims, using both a verbal and a non-verbal test. Interestingly, results showed that damage to either cerebral hemisphere led to impairments in the computation of implicatures. Despite similar behavioral results, the authors suggest that this does not imply that the two groups process implicatures in the same way, since performance with implicatures correlated with different linguistic and non-linguistic measures in the two groups. Furthermore, "performance correlated most highly with tests other than Spontaneous Speech or Auditory Verbal Comprehension, confirming that the pragmatic deficit is not due to simple loss of basic language functions" (1999, p. 587). In more recent work, Kennedy et al. (2015) did not confirm these results; they assessed nine individuals with aphasia on scalar implicatures and did not find any difference compared with typical adults, even though the same patients showed difficulties with presuppositions. These results may be related to the small number of participants or to differences in the clinical population. Spotorno et al. (2015) ran a neuroimaging study to investigate the cognitive and neural basis of the computation of scalar implicatures in 17 patients with a neurodegenerative disease associated with progressive frontal and anterior temporal atrophy (bvFTD), but no aphasia. This population is known to present ToM and executive-function impairments (Pardini et al., 2013), which led the authors to hypothesize difficulties with scalar implicatures. People with bvFTD participated in two experiments. Experiment 1 was a Truth Value Judgment task in which participants had to evaluate sentences like "Some of the
cats are in the box" with respect to a visual scene in which either only some of the cats or all the cats were in the box. Experiment 2 used similar materials in a Picture Selection task in which participants had to select the best matching picture, choosing between one that satisfied the logical interpretation of some (some and possibly all cats) and one that satisfied its pragmatic interpretation (some but not all cats). Results showed difficulties in the first experiment, with a tendency in these patients to provide responses compatible with a logical interpretation of some. The same participants performed well in the second experiment, selecting the picture that was compatible with the informative use of some. Moreover, neuroimaging data show a correlation between patients' performance in the first experiment and atrophy in the ventromedial prefrontal cortex; due to the at-ceiling performance, it was not possible to perform similar analyses for the second experiment. The authors conclude that the different performance in the two experiments might be related to the lower resource demands of the second one: patients might have difficulty generating (linguistic) alternative interpretations in the first experiment; when these alternatives are provided visually, as in the second experiment, the difficulty disappears. These results seem to diverge from the data described above on typically developing children (Foppolo et al., 2021) and autistic children (Mazzaggio et al., 2021), in which having the alternative visually presented did not result in at-ceiling performances (furthermore, a high correlation between the two tasks was found in Foppolo et al.'s (2021) Experiment 1 with preschoolers). Due to the small number of experiments on individuals with brain damage, their methodological heterogeneity, and their discordant conclusions, more studies are needed to reach conclusive results.
2.4 Conclusions and Future Directions

A recent study on Nicaraguan Sign Language (NSL) suggested that quantifiers are linguistic fundamentals and universals (Kocab et al., 2022). NSL is a sign language that was – almost spontaneously – created in Nicaragua when deaf children were brought together for schooling for the first time in 1979. Prior to the opening of this school, these children mainly communicated with their families through home sign systems. NSL is of particular interest to linguists because it is the only known example of a language whose emergence has been studied and described from its origin and through its early evolution. In later years, NSL has been passed on to the new children who entered the community. This led Kocab et al. (2022) to test first, second and third generations of NSL signers to ask whether quantificational concepts are learned through access to a language or are basic properties of the human mind. If the first hypothesis were correct, the authors expected that, in a group of children without previous access to another language, like the first NSL signers, it would take several generations for these terms to emerge; otherwise, they expected them to emerge quite rapidly. Kocab et al.'s work investigated, with an elicitation study, whether quantifiers like some, all, and none are used by signers of NSL to describe pictures of events in which some, all or none of the characters performed an action. For example, given a set of bears, signers had to describe a picture in which some of the bears, but not all, were swimming. They tested 17 NSL signers divided into three cohorts: signers who entered the signing community before 1983 (first cohort); signers who entered the signing community between 1986 and 1990 (second cohort); and signers who entered the signing community between 1993 and 1999 (third cohort). They found dedicated lexical forms that functioned like classical quantifiers at all stages of NSL. Nonetheless, they also found that existential quantifiers like some and many tended to be produced in contexts that were best described by the universal quantifier (all), that is, in contexts
in which all the bears are swimming. Interestingly, this happened more in cohort 1 than in cohorts 2 and 3, suggesting a different lexicalization and use of these forms across stages of language evolution. According to the authors, these results show that the basic semantics of quantification is readily available in the human mind, although the ability to use quantifiers pragmatically might depend on the availability of processing resources or on the lexicalization of the scale, providing an explanation that aligns with those advocated to explain typically developing children's difficulties with scalar implicatures. If this is indeed the case, what is the purpose of studying conversational implicatures in clinical populations? Mostly theoretical and practical reasons: (1) the theoretical understanding of the cognitive, neurological, and linguistic underpinnings of implicature processing and, consequently, (2) the implementation of communication interventions and educational support to improve pragmatic skills in people with impairments in this area of communication. Research is increasingly demonstrating that training can genuinely improve communicative skills. For example, Bambini et al. (2020) developed a training intervention grounded in the Gricean model of communication: the PragmaCom. Through a meta-pragmatic strategy, this intervention "prompts reasoning about the maxims by presenting exercises based on story contexts where communicative mismatches happen (misunderstanding of figurative meanings or inappropriate discourse production) and encouraging the discussion on the pragmatic mechanisms that were violated" (Bambini et al., 2020, p. 4). When applied to older adults (Bambini et al., 2020) and individuals with schizophrenia (Bambini et al., 2022), it has been shown to improve pragmatic skills. Treatments aimed at improving pragmatic skills are particularly diverse in nature. Despite the wide variation in methods, techniques, and goals, four main approaches to pragmatic language intervention can be distinguished (Cummings, 2016): (1) intervention on incorrect conversational exchanges (often based on Conversation Analysis techniques; e.g., the Conversation Analysis Profile for People with Aphasia (CAPPA), Whitworth et al., 1997); (2) social-communicative skills training (e.g., role-playing or a metapragmatic approach to the remediation of pragmatics, addressing conversational conventions, topic management, speech acts, turn-taking, etc.); (3) pragmatic skills training; and (4) ToM training. Pragmatic training targeting autistic children has demonstrated that group interventions appear to be more effective than those provided individually, and the inclusion of typically developing peers may have the potential to increase the effectiveness of group interventions. Targeting skills like emotion recognition, prosody, joint attention, initiating and maintaining a conversation, attending to people, and following social cues has proved useful for enhancing pragmatic skills (for a review, see Parsons et al., 2017). Research still needs to progress to understand whether these interventions have long-term effects. Furthermore, future studies should consider that the clinical population constitutes a large and heterogeneous target group. Even when language seems not to be impaired, pragmatic difficulties can still be a significant and permanent barrier to effective communication and interpersonal relations.
When assessing clinical populations' ability to communicate cooperatively, it is increasingly necessary to consider carefully the commonalities and differences among the participants that are selected. For better comparison across groups, it is fundamental to assess them with the same tests in different domains, which should include ToM and language tests. Tests on conversational implicatures can also provide a sensitive measure of an individual's ability to adapt their linguistic exchanges to specific contexts and conversational partners, which is the key to success in social relationships.
REFERENCES
Abu-Akel, A. (1999). Impaired theory of mind in schizophrenia. Pragmatics & Cognition, 7(2), 247–282.
Ahlsén, E. (1993). Conversational principles and aphasic communication. Journal of Pragmatics, 19(1), 57–70.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Author.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Pub.
Arosio, F., Pagliarini, E., Perugini, M., Barbieri, L., & Guasti, M. T. (2016). Morphosyntax and logical abilities in Italian poor readers: The problem of SLI under-identification. First Language, 36(3), 295–315.
Bambini, V., Agostoni, G., Buonocore, M., Tonini, E., Bechi, M., Ferri, I., Sapienza, J., Martini, F., Cuoco, F., Cocchi, F., Bischetti, L., Cavallaro, R., & Bosia, M. (2022). It is time to address language disorders in schizophrenia: A RCT on the efficacy of a novel training targeting the pragmatics of communication (PragmaCom). Journal of Communication Disorders, 97, 106196.
Bambini, V., Tonini, E., Ceccato, I., Lecce, S., Marocchini, E., & Cavallini, E. (2020). How to improve social communication in aging: Pragmatic and cognitive interventions. Brain and Language, 211, 104864.
Bishop, D. V. M. (2000). Pragmatic language impairment: A correlate of SLI, a distinct subgroup, or part of the autistic continuum? In L. B. Leonard & D. V. M. Bishop (Eds.), Speech and language impairments in children: Causes, characteristics, intervention and outcome (pp. 113–128). Psychology Press.
Brüne, M., & Bodenstein, L. (2005). Proverb comprehension reconsidered – 'theory of mind' and the pragmatic use of language in schizophrenia. Schizophrenia Research, 75(2–3), 233–239.
Cardillo, R., Garcia, R. B., Mammarella, I. C., & Cornoldi, C. (2018). Pragmatics of language and theory of mind in children with dyslexia with associated language difficulties or nonverbal learning disabilities. Applied Neuropsychology: Child, 7(3), 245–256.
Chevallier, C., Wilson, D., Happé, F., & Noveck, I. (2010). Scalar inferences in autism spectrum disorders. Journal of Autism and Developmental Disorders, 40(9), 1104–1117.
Chierchia, G., Guasti, M. T., Gualmini, A., Meroni, L., Crain, S., & Foppolo, F. (2004). Semantic and pragmatic competence in children's and adults' comprehension of or. In I. A. Noveck & D. Sperber (Eds.), Experimental pragmatics (pp. 283–300). Palgrave Studies in Pragmatics, Language and Cognition. Palgrave Macmillan.
Cummings, L. (2016). Clinical pragmatics. In Y. Huang (Ed.), Oxford handbook of pragmatics (pp. 346–361). Oxford University Press.
Foppolo, F., Guasti, M. T., & Chierchia, G. (2012). Scalar implicatures in child language: Give children a chance. Language Learning and Development, 8(4), 365–394.
Foppolo, F., Mazzaggio, G., Panzeri, F., & Surian, L. (2021). Scalar and ad-hoc pragmatic inferences in children: Guess which one is easier. Journal of Child Language, 48(2), 350–372.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics. Volume 3: Speech acts (pp. 41–58). Academic Press.
Hochstein, L., Bale, A., & Barner, D. (2018). Scalar implicature in absence of epistemic reasoning? The case of autism spectrum disorder. Language Learning and Development, 14(3), 224–240.
Horn, L. R. (1972). On the semantic properties of the logical operators in English. PhD dissertation, UCLA.
Hu, S., Zhou, P., Foppolo, F., Vender, M., & Delfitto, D. (2019). Scalar implicatures in Chinese children with reading difficulties. First Language, 39(5), 479–507.
Kasher, A., Batori, G., Soroker, N., Graves, D., & Zaidel, E. (1999). Effects of right- and left-hemisphere damage on understanding conversational implicatures. Brain and Language, 68(3), 566–590.
Katsos, N., & Bishop, D. V. (2011). Pragmatic tolerance: Implications for the acquisition of informativeness and implicature. Cognition, 120(1), 67–81.
Katsos, N., Roqueta, C. A., Estevan, R. A. C., & Cummins, C. (2011). Are children with Specific Language Impairment competent with the pragmatics and logic of quantification? Cognition, 119(1), 43–57.
Kennedy, L., Bill, C., Schwarz, F., Folli, R., & Romoli, J. (2015). Scalar implicatures vs. presuppositions: The view from Broca's aphasia. In T. Bui & D. Özyıldız (Eds.), NELS 45: Proceedings of the forty-fifth annual meeting of the North East Linguistic Society, 2, 97–110. Springer.
Kissine, M. (2021). Autism, constructionism, and nativism. Language, 97(3), e139–e160.
Kocab, A., Davidson, K., & Snedeker, J. (2022). The emergence of natural language quantification. Cognitive Science, 46(2), e13097.
Kuperberg, G. R. (2010). Language in schizophrenia part 1: An introduction. Language and Linguistics Compass, 4(8), 576–589.
Linscott, R. J. (2005). Thought disorder, pragmatic language impairment, and generalized cognitive decline in schizophrenia. Schizophrenia Research, 75(2–3), 225–232.
Mazzaggio, G., Foppolo, F., Job, R., & Surian, L. (2021). Ad-hoc and scalar implicatures in children with autism spectrum disorder. Journal of Communication Disorders, 90, 106089.
Mazzaggio, G., & Surian, L. (2018). A diminished propensity to compute scalar implicatures is linked to autistic traits. Acta Linguistica Academica, 65(4), 651–668.
Noveck, I. A. (2001). When children are more logical than adults: Experimental investigations of scalar implicature. Cognition, 78(2), 165–188.
Pardini, M., Gialloreti, L. E., Mascolo, M., Benassi, F., Abate, L., Guida, S., Viani, E., Dal Monte, O., Schintu, S., Krueger, F., & Cocito, L. (2013). Isolated theory of mind deficits and risk for frontotemporal dementia: A longitudinal pilot study. Journal of Neurology, Neurosurgery & Psychiatry, 84(7), 818–821.
Parola, A., Gabbatore, I., Bosco, F. M., Bara, B. G., Cossa, F. M., Gindri, P., & Sacco, K. (2016). Assessment of pragmatic impairment in right hemisphere damage. Journal of Neurolinguistics, 39, 10–25.
Parsons, L., Cordier, R., Munro, N., Joosten, A., & Speyer, R. (2017). A systematic review of pragmatic language interventions for children with autism spectrum disorder. PloS One, 12(4), e0172242.
Pastor-Cerezuela, G., Tordera Yllescas, J. C., González-Sala, F., Montagut-Asunción, M., & Fernández-Andrés, M. I. (2018). Comprehension of generalized conversational implicatures by children with and without autism spectrum disorder. Frontiers in Psychology, 9, 272.
Pijnacker, J., Hagoort, P., Buitelaar, J., Teunisse, J. P., & Geurts, B. (2009). Pragmatic inferences in high-functioning adults with autism and Asperger syndrome. Journal of Autism and Developmental Disorders, 39(4), 607–618.
Ryder, N., Leinonen, E., & Schulz, J. (2008). Cognitive approach to assessing pragmatic language comprehension in children with specific language impairment. International Journal of Language & Communication Disorders, 43(4), 427–447.
Schaeken, W., Van de Weyer, L., De Hert, M., & Wampers, M. (2021). The role of working memory in the processing of scalar implicatures of patients with schizophrenia spectrum and other psychotic disorders. Frontiers in Psychology, 12, 635724.
Schaeken, W., Van Haeren, M., & Bambini, V. (2018). The understanding of scalar implicatures in children with Autism Spectrum Disorder: Dichotomized responses to violations of informativeness. Frontiers in Psychology, 9, 1266.
Spotorno, N., McMillan, C. T., Rascovsky, K., Irwin, D. J., Clark, R., & Grossman, M. (2015). Beyond words: Pragmatic inference in behavioral variant of frontotemporal degeneration. Neuropsychologia, 75, 556–564.
Stoicescu, I., Sevcenco, A., & Avram, L. (2011). The acquisition of scalar implicatures: A clinical marker of developmental dyslexia in Romanian? In Topics in language acquisition and language learning in a Romanian context. Selected papers from the Bucharest Colloquium of Language Acquisition (BUCLA) (pp. 15–16).
Su, Y. E., & Su, L. Y. (2015). Interpretation of logical words in Mandarin-speaking children with autism spectrum disorders: Uncovering knowledge of semantics and pragmatics. Journal of Autism and Developmental Disorders, 45(7), 1938–1950.
Surian, L., Baron-Cohen, S., & Van der Lely, H. (1996). Are children with autism deaf to Gricean maxims? Cognitive Neuropsychiatry, 1(1), 55–72.
Svindt, V., & Surányi, B. (2021). The comprehension of grammaticalized implicit meanings in SPCD and ASD children: A comparative study. International Journal of Language & Communication Disorders, 56(6), 1147–1164.
Tényi, T., Herold, R., Szili, I. M., & Trixler, M. (2002). Schizophrenics show a failure in the decoding of violations of conversational implicatures. Psychopathology, 35(1), 25–27.
Conversational Implicature and Communication Disorders 27 Tompkins, C. A., Fassbinder, W., Lehman-Blake, M. T., & Baumgaertner, A. (2002). The nature and implications of right hemisphere language disorders: Issues in search of answers. In A. Hillis (Ed.), Handbook of adult language disorders: Integrating cognitive neuropsychology, neurology, and rehabilitation (pp. 429–448). Psychology Press. van Tiel, B., & Kissine, M. (2018). Quantity-based reasoning in the broader autism phenotype: A web-based study. Applied Psycholinguistics, 39(6), 1373–1403. van Tiel, B., van Miltenburg, E., Zevakhina, N., & Geurts, B. (2016). Scalar diversity. Journal of Semantics, 33(1), 137–175. Vender, M. (2009). Scalar implicatures and developmental dyslexia. Proceedings of the 2nd International Clinical Linguistics Conference, Universidad Autónoma de Madrid, Universidad Nacional de Educación a Distancia, and Euphonia Ediciones. Vender, M. (2017). Disentangling dyslexia: Phonological and processing impairment in developmental dyslexia. Peter Lang.
Wampers, M., Schrauwen, S., De Hert, M., Gielen, L., & Schaeken, W. (2018). Patients with psychosis struggle with scalar implicatures. Schizophrenia Research, 195, 97–102. Whitworth, A., Perkins, L., & Lesser, R. (1997). Conversation Analysis Profile for People with Aphasia (CAPPA). Whurr Publishers. Whyte, E. M., & Nelson, K. E. (2015). Trajectories of pragmatic and nonliteral language development in children with autism spectrum disorders. Journal of Communication Disorders, 54, 2–14. Wilson, E., & Katsos, N. (2022). Pragmatic, linguistic, and cognitive factors in young children’s development of quantity, relevance and word learning inferences. Journal of Child Language, 49(6), 1065–1092. Zevakhina, N. (2012). Strength and similarity of scalar alternatives. Proceedings of Sinn und Bedeutung, 16(2), 647–658.
3 Relevance Theory and Communication Atypicalities
ELLY IFANTIDOU AND TIM WHARTON

3.1 Introduction

The early 1980s witnessed increasing interest among clinicians in assessing and treating people who, while possessing relatively intact structural language, nonetheless exhibited atypical communicative skills. The decade saw the first serious attempts at the characterization of pragmatic disorders, paving the way for the development of theories addressing pragmatic difficulties in adults and children (for a thorough review, see Cummings, 2017). In those first attempts to identify symptoms of pragmatic impairment, clinicians readily drew on key concepts introduced by the new, so-called ordinary language philosophies of Austin, Searle, and Grice. Their enormous influence is best seen in one of the early studies, by McTear (1985), in which a 10-year-old boy's poor conversational skills were analyzed in terms of his failure to understand the interlocutor's indirect speech acts and presuppositions, and his violation of the conversational maxim of quality (by contributing inconsistent and misleading information to the exchange). As practical eclecticism became more the norm among clinical practitioners, the seeds of a more sophisticated integration of practice with theoretical developments were sown. Since the mid-1990s, the dominant frameworks have been Sperber and Wilson's (1995) Relevance Theory and approaches based around Theory of Mind (ToM) theories (see Premack & Woodruff, 1978; also Langdon & Coltheart, 1999; Tager-Flusberg, 2000). These two dominant cognitive frameworks have been increasingly associated with developmental pragmatics in children and adults with Autism Spectrum Disorder (ASD). Pragmatic language impairments are commonly reported features of clinical populations with acquired disorders, too (for a review of earlier relevance-theoretic studies on right hemisphere damage and traumatic brain injury, see Leinonen & Ryder, 2008; for a recent account, see Jagoe & Wharton, 2021). Over the last three decades, relatively few studies, compared to the developmental work, have used relevance theory's model of language comprehension to investigate meaning communicated verbally and non-verbally by people with aphasia or schizophrenia. Early studies were inspired by developmental work: Mitchley et al. (1998), following Happé's (1993) pioneering work on irony comprehension in young people with autism, concluded that patients with schizophrenia, too, are impaired in their appreciation of the mental states of others.
In this chapter, we first offer a brief introduction to relevance theory (Section 3.2), with emphasis on recent developments that attempt to integrate non-propositional, yet elemental, aspects of verbal and non-verbal communication (Section 3.4). We then review early and more recent studies on developmental and acquired disorders that have used the relevance-theoretic framework, in Sections 3.3 and 3.4. One advantage of using a more rigorous theoretical structure to describe behavioral problems in an explicit way is that it can explain why certain strengths and competencies have grown in some populations and not in others, and why certain abilities have expanded while others have stagnated or declined. Integrating material from clinical pragmatics can, in turn, push the boundaries of the theory and broaden the domain to which its insights can be applied.
3.2 Introducing Relevance Theory

Relevance theory builds on the inferential view of communication first presented in Grice (1957) and developed in his later work (1968, 1969). However, Sperber and Wilson were heavily influenced by the cognitive revolution of the late 1960s and, rather than postulating more maxims, they turned to a range of questions concerning how Grice's insights can be dealt with in psychological terms. How, they ask, can Grice's insight that the very act of communication creates expectations in an audience be developed? If speaker meaning really can be dealt with in terms of the beliefs, desires, and intentions of communicators, then understanding words must, at some stage, involve attributing mental states to communicators. What does this mean in psychologically realistic terms? Turning first to the question of the audience's expectations, relevance theory proposes that the human cognitive system is geared to look out for relevant information, which will interact with information that is already mentally represented and lead to positive cognitive effects (in the form of true implications, warranted strengthenings, or contradictions of existing assumptions): this is the Cognitive Principle of Relevance. That the search for relevance plays a central role in human cognition is now widely accepted in cognitive science. Relevant information is information that improves an individual's representation of the world. This disposition to search for relevance is routinely exploited in human communication. It is a disposition we all share, and speakers know that listeners will pay attention only to stimuli that are relevant enough. Therefore, in order to attract and hold an audience's attention, speakers should make their communicative stimuli appear at least relevant enough to be worth processing. More precisely, according to the Communicative Principle of Relevance, anyone overtly displaying an intention to inform, by producing an utterance or other intentional (in relevance theory, ostensive) stimulus, creates a presumption that the stimulus is at least relevant enough to be worth processing and, moreover, the most relevant one compatible with the communicator's own abilities and preferences. Relevance theory is an attempt to flesh out the notion of what makes communicated information worthwhile. Turning next to the attribution of mental states, Sperber and Wilson show how these two principles interact with the mechanisms that mediate ToM. Speakers and hearers use the ability to read minds to build a shared cognitive environment and, using this as the base from which to draw mutually shared contextual assumptions, express and infer meaning. In this, relevance theory differs from the traditional code model, in which a linguistic utterance is a signal encoding the message a communicator wishes to communicate. In order for a hearer to retrieve the speaker's meaning, all they need do is decode the signal the speaker has provided into an identical thought or message. Viewed in this way, linguistic communication works according to broadly the same principles as semaphore, or Morse code. Human communication, Sperber and Wilson contend, does not work like this.
Consider Mary's response to Peter's invitation below:

(1) Peter: Would you like to come to the cinema this evening?
    Mary: I have to finish my paper.

Mary's response is certainly indirect and, on the face of it, rather ambivalent. An analysis of this exchange using a broadly Gricean framework would go something along these lines. Mary's response appears not to be relevant. However, Peter assumes that despite this apparent irrelevance she is still co-operating. As a result, he will process her utterance in such a way that this assumption is preserved – that is, in a manner which assumes it is relevant to his question. Nowadays, most people working on meaning believe that the gap between what is linguistically encoded and what is communicated is bridged not by more coding, but by reasoning, or inference. So, yes, hearers do decode words, but they only use the decoded meaning as a point of departure from which to work out what the speakers of those words mean by them. Relevance theory provides an answer to the question of not only why speakers and hearers aim at understanding each other, but also how. From the wide variety of inference-making processes, relevance theorists have singled out two major kinds: those operating on imagistic elements and those operating on affective elements. These are taken together with decoded, propositional meaning because they are intertwined with it and closely tied to human natural language. Moreover, they too are basic ingredients of verbal and non-verbal communication. Consider the mundane metaphor in (2):

(2) Bob's a bit of a bulldozer.

The statement in (2) is openly false and, on the face of it, uninformative, or ambiguous. On Grice's simile account of metaphor, the speaker of (2) conversationally implicates a proposition to the effect that Bob resembles a bulldozer. However, wider repercussions on the interpretation of the propositional content of utterances are nowadays being explored. Ifantidou (2021) addressed the complexity of interpretations retrieved from metaphors, illustrated by the images in (3):

(3) a. Bob is taking part in a meeting and doesn't care to hear what other people think …
    b. My previous supervisor never listened to any opinion other than his own.
    c. A man hitting his hand on a table.

At the same time, readers of (2) above perceived a range of emotions, such as surprise, annoyance, alarm, and disappointment, to name a few. In focusing on inference as the route to working out what speakers of certain words mean, relevance theory aims at integrating a greater diversity of affective, imagistic, and conceptual components into a coherent account of how speakers and hearers understand each other.
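To make the inferential route concrete, the sketch below renders the relevance-guided comprehension heuristic – test interpretations in order of accessibility, following a path of least effort, and stop once expectations of relevance are satisfied – as toy, runnable Python applied to Mary's reply in (1). The sketch is ours, not part of relevance theory's formal apparatus: the candidate glosses, effect counts, and effort scores are invented purely for illustration, since the theory offers no numerical calculus of effects and effort.

```python
# A minimal sketch of relevance-guided comprehension: candidate
# interpretations are tried in order of accessibility (approximated
# here by a toy effort score) and the first one that yields enough
# cognitive effects is accepted. All numbers are invented.
from dataclasses import dataclass

@dataclass
class Interpretation:
    gloss: str              # candidate interpretation of the utterance
    cognitive_effects: int  # toy count of contextual implications derived
    effort: int             # toy processing-effort score (higher = costlier)

def comprehend(candidates, effect_threshold=2):
    """Follow a path of least effort and stop at the first candidate
    that makes the utterance relevant enough."""
    for interp in sorted(candidates, key=lambda c: c.effort):
        if interp.cognitive_effects >= effect_threshold:
            return interp  # expectation of relevance satisfied: stop
    return None

# Mary's reply in (1), "I have to finish my paper", heard as an answer
# to Peter's cinema invitation:
candidates = [
    Interpretation("Mary has a paper to finish", 1, 1),
    Interpretation("Mary cannot come to the cinema tonight", 3, 2),
    Interpretation("Mary dislikes the cinema in general", 1, 4),
]

print(comprehend(candidates).gloss)
# -> Mary cannot come to the cinema tonight
```

On this toy model, the bare explicature is tried first but yields too few effects in the context of Peter's question, so the hearer proceeds to the implicated refusal and stops there.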
3.3 Developmental Disorders

Since the early 1990s, relevance theory has offered the theoretical grounding to explain pragmatic development in typically developing children and in children with deficits in language abilities, with and without pragmatic difficulties. Since the early 1980s, the most common label used to describe children whose pace and course of language development are not the characteristic ones has been "specific language impairment" (SLI). Within SLI, a subgroup with "pragmatic language difficulties" (PLD) emerged, those difficulties attributed specifically to the retrieval
of implicatures and an impaired ability to integrate relevant information in context (see Ryder & Leinonen, 2014). Before presenting the contribution of relevance theory to these areas of scholarly work, we first introduce and answer two fundamental questions: (a) how do children with SLI differ from typically developing children? And (b) why is it important for a theory of language development to take their atypicalities into consideration? As it turns out, SLI symptoms are experienced primarily in terms of altered developmental processes (e.g., very late emergence, asynchronies across language domains), rather than in terms of violations of natural language properties. In effect, children with SLI will obey the characteristics of their input language regardless of the difficulties they face (e.g., when repeating nonwords, which are closely related to the length of real words in their language; see Leonard, 2014). Moreover, children with SLI will face the same difficulties as typically developing (TD) children in more demanding types of task, for example, with the continuous imperfective in Modern Greek compared to the habitual imperfective (see Dosi, 2019). As suggested by a series of studies, their poorer linguistic abilities pertaining to grammar and morphosyntax seem to be the outcome of cognitive deficits, such as their limited verbal working memory capacity (see Dosi et al., 2018; Dosi & Koutsipetsidou, 2019; Tsimpli et al., 2016). As a result, after a period of late emergence, the pace of subsequent lexical and grammatical development does not appear to be appreciably slower than that seen in typical development. At the same time, non-linguistically driven deficits remain, for example, an inability to handle the processing demands that language test items place on them (for an example of a sentence comprehension experiment involving four pictures of similar and potentially interfering scenarios to be held in memory, see Leonard et al., 2013). It is worth mentioning that the debate as to whether the genetic factors which contribute to SLI are different from those responsible for differences in language ability in the normal range is still inconclusive. Studies in this area seem to have identified genetic factors that distinguish better from weaker language skills, but it is not yet clear whether this line of research has specifically revealed factors that distinguish impairment from non-impairment (Leonard, 2014). For example, Dollaghan (2004) has shown that children with SLI differ from their peers primarily in the degree, rather than the kind, of their language symptoms. This supports the idea that the genetic basis may not be the (only) factor differentiating children with SLI from children exhibiting typical language development. For the above reasons, the language attainments of people with SLI can help us determine how essentially normal processes can and must build an adaptive capacity to cope with significant language difficulties, in conformity with known biological principles and the typology of the input language. Thus, with a wider range of abilities represented in the participant pool, new insights of theoretical and practical import might emerge. Next, we trace back to the early 1990s the first works within relevance theory which attempted to describe and explain normal development and impairment of pragmatic comprehension in adults and children. At the same time, we review several recent studies which pose new challenges within relevance theory.
In doing so, we are assuming that a pragmatic theory can help us understand learner behavior at any stage of language acquisition (first or second), or ability level (adult or child). The benefit of relying on this initial hypothesis is that the hypothesis itself can be explored further by assuming that learners are operating with the same system as competent adults and that, when something goes wrong or is differently interpreted, its causes can be traced to the pace (i.e., maturity) of developing cognitive processes rather than to the nature of the mechanisms involved. To illustrate, consider the case of reference resolution in pronouns. Foster-Cohen (1994) argued that the difficulty children experienced in rejecting sentences which do not match a picture is due to the fact that the experiment was designed without taking into consideration the presumption of optimal relevance: that is, the children were presented with redundant or contradictory information, which results in increased processing demands when faced
with a situation where relevance is not immediately obvious. Early applications of the theory to language acquisition include those by Smith (1989), Smith and Tsimpli (1995), Watson (1995), Carroll (1995), Ying (1996), Foster-Cohen (1997), Bezuidenhout and Sroda (1998), and Wolf (1999), among others. Those dealing with impaired cases, too, attributed developmental difficulties to high computational costs and/or to interpretive, or "metarepresentational" second-order, abilities (e.g., in ironies and rhetorical questions) (see Smith & Tsimpli, 1995, for a single case study of a savant, Christopher, with exceptional language abilities). Setting aside the role that theory of mind plays in pragmatic language interpretation, more recently, relevance theory has been applied by Ryder and Leinonen (2014) in designing tests on recovering implicatures, to assess children's use of the given context (rather than other world knowledge). Their results suggested a greater developmental delay in children with SLI or PLD in utilizing relevant context, highlighting the importance of discouraging children from stopping at the first interpretation (or from relying on a key word, as is often reported in the responses of children with SLI – that is, relying on semantic meaning or on knowledge from memory). Of the few studies that have applied a relevance-theoretic lens to autistic communication (Happé, 1993; Leinonen & Kerbel, 1999; Leinonen & Ryder, 2008; Loukusa et al., 2007; Papp, 2006; Wearing, 2010), all have been driven by theories of cognitive architecture interacting with human perception and communication, and by autistic people's impaired ToM abilities in particular. In Happé's first study, a very able autistic group (second-order ToM) was compared to normal adults' performance in justifying a character's utterance by attributing mental states (in a joke, lie, sarcasm, pretence, emotions). It was found that although the autistic participants did not use significantly fewer mental state justifications overall than the control groups, their failure to use the appropriate mental state terms revealed the true range of their capacities: to recognize that these puzzling stories require answers in the realm of mental state language, and that the literal meaning of a speaker's utterance does not make sense. In that early work, Happé (1993) acknowledged that what seemed to outweigh a cognitive deficit was their inability to integrate the given story context, process global information, and apply what social knowledge they may have to everyday life situations (rather than an inability to attribute mental states alone). Recently, relevance theorists have pointed out that differences in what facts and assumptions are mutually manifest among interlocutors cause breakdowns in communication, especially when autistic and non-autistic individuals interact. Williams et al. (2021) provided evidence that matched-dispositional conversations, even when strangers are involved, reveal a significant increase in flow, rapport, and intersubjective attunement, as well as increased social motivation. These indications challenge the belief that impaired theory of mind is a defining trait of autism, on the grounds that autistic children follow a complete, but atypical, sequence of ToM stage progression.
Those atypicalities in the sequence and speed of processing and merging information from various sources are claimed to be the central cause of the social interaction style and repetitive behaviors associated with symptoms of autism (see Williams et al., 2021).
3.4 Acquired Disorders

Many communicative atypicalities, of course, are not developmental at all. Right hemisphere damage (RHD) to the brain, suffered later in life (perhaps as the result of a stroke), can severely affect an individual's ability to perform a range of pragmatic tasks. It might, for example, affect a person's ability to perform the kind of inferences that are necessary for communication as described in relevance theory. This may have a range of consequences, for example, causing problems in the understanding of non-literal language. In a series of studies, McDonald (1999, 2000)
explores the problems individuals with RHD have in successfully interpreting utterances as being ironic. This she explains by noting that their acquired atypicality renders them unable to infer that a particular utterance is being used echoically. This is a central notion in the relevance theory account of irony and again, she claims, demonstrates how relevance theory can shed light on how cognitive breakdown relates to communicative atypicalities. Other key references here are Bihrle et al. (1986), Brownell et al. (1983), and Roman et al. (1987). Dipper et al. (1997) also applied relevance theory in an investigation into the pragmatic difficulties experienced by individuals with RHD. They found that these individuals performed less well than neurotypicals in a range of tasks involving so-called bridging inferences (inferences that are required for a particular stretch of discourse to be rendered coherent). They suggest this might be explained by hypothesizing that the relevance-based "logical deductive device" might be affected by the brain damage these individuals have suffered. But it should be noted that since the late nineties, there have been many developments in relevance theory. In an earlier version of this chapter, Leinonen and Ryder (2008) write that "Sperber and Wilson (1995) argue that inferential comprehension involves central cognitive processes rather than specialized mechanisms." (This is the all-purpose logical deductive device to which Dipper et al. (1997) refer.) But this is no longer the case. Sperber and Wilson (2002) present arguments to suggest that there is much more to the interpretive processes that underlie verbal comprehension than general mind-reading abilities of the type evoked by Grice (and others). Their proposal is that the processes underlying comprehension might be performed by a domain-specific "comprehension" mechanism or module (Sperber, 1994, 2000). The function of such a mechanism would be to interpret ostensive stimuli using a relevance-driven heuristic. This proposed dissociation between the broader capacity known as ToM and a domain-specific comprehension module has allowed researchers to develop a more nuanced view of a range of communicative atypicalities. If, for example, ToM actually comprises a whole range of different sub-modules, it might be possible to chart more clearly in cognitive terms which specific structures relate to which comprehension tasks. This, in turn, will shed light on how damage to such structures relates to specific atypicalities. To use an example from the developmental literature, the fact that autistic children have more difficulty interpreting cognitive emotions (where an emotion is caused by, for example, a belief – such as surprise) than simple emotions (an emotion caused by a situation – such as happiness) may indicate that while meta-cognitive abilities are atypical, the coding–decoding mechanisms responsible for the interpretation of natural signals (such as smiles and other non-verbal behaviors) are intact (Wharton, 2014). Indeed, the right hemisphere is relatively dominant in the interpretation of emotional prosody, and the left hemisphere in the interpretation of linguistic prosody (Baum & Pell, 1999; Pell, 2002; Ross et al., 1988). Thus, RHD in Parkinson's patients affects the interpretation of emotional and sentential prosody but not lexical prosody.
While RHD may cause difficulties with emotional prosody, it does not necessarily lead to problems with the identification of emotional facial expressions. The relevance-theoretic framework suggests two possible explanations that might be worth investigating: selective impairment of the ability to interpret natural signals; or selective impairment of the ability to interpret natural prosodic indicators of mental rather than physical state. Since its inception, relevance theory has taken the domain of pragmatics to be somewhat broader than have other post-Gricean theories. While most of these equate the domain of pragmatics with Gricean non-natural meaning (meaningNN), relevance theory has consistently endorsed an approach within which there is a continuum of cases between (indirect) cases of Gricean meaningNN and cases of "showing." The latter category can be distinguished by the fact that the evidence provided for what is being pointed out is relatively direct. If you ask one of us the time, we might utter something in a linguistic code (in which case the evidence
is indirect – you have to know the code). But, alternatively, we might point at a clock on the wall. This idea that there is a continuum of cases has implications for the domain of pragmatic principles or maxims, for it suggests that they are best seen as applying to the domain of intentional communication as a whole, rather than to the domain of meaningNN, as is generally assumed in Gricean accounts. Among other things, the idea that there is a continuum of cases allows relevance theory to better accommodate natural communicative behaviors within a pragmatic theory: the "showing-meaningNN" continuum provides a snapshot of the types of evidence used in intentional communicative acts. Such acts are typically a composite of inter-related behaviors which fall at various points along the continuum. At one extreme of the continuum lie clear cases of spontaneous, natural display; at the other extreme lie clear cases of linguistic coding, where all the evidence provided for the first, basic layer of information is indirect. In between lie a range of cases in which more or less direct "natural" evidence and more or less indirect coded evidence mix to various degrees (natural signals, for example). Gussenhoven (2002) proposes that the meaning inherent in intonation may be either arbitrary or based on universal "biological codes." In the framework proposed here, intonation (indeed, prosodic elements generally) would occupy various positions along the continuum (see Wharton, 2009; Wilson & Wharton, 2006). Sperber and Wilson (2015) rework the showing-meaningNN continuum by incorporating another axis, along which communicated information might be determinate or indeterminate (see Figure 3.1). The horizontal axis reflects the nature of the information that is being pointed out ostensively, whether that information is being shown or meantNN. So when someone points to a particularly salient object in the environment, or utters the name of that object, what is being shown or meantNN is highly determinate. Think of the example above where someone responds to being asked the time by pointing at a clock. Whether it has been shown or meantNN, the intended import in cases such as this is easily paraphrased in propositional terms. In cases of descriptively ineffable utterances, say, a metaphor, or cases where rather than clearly pointing at something someone simply waves their hand in a general direction, what is being shown or meantNN is not easily paraphrased at all.
[Figure 3.1 near here: a three-by-three grid of nine numbered cells, with a horizontal axis running from meaning to showing and a vertical axis running from determinate to indeterminate.]

Figure 3.1 The bi-dimensional continuum (redrawn from Sperber & Wilson, 2015). Reproduced with permission of the Croatian Journal of Philosophy, Vol. XV, number 44, page 123 (from Sperber, D., & Wilson, D., "Beyond Speaker's Meaning").
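As a reading aid for the figure, the toy sketch below represents ostensive acts as points in this two-dimensional space, using the examples discussed in the text. The coordinate values are ours and purely illustrative: Sperber and Wilson propose a conceptual grid, not a numerical scale.

```python
# Toy representation of the bi-dimensional continuum: each ostensive
# act gets a showing/meaning coordinate and a determinacy coordinate,
# both on an invented 0-1 scale (0 = showing / determinate,
# 1 = meaning(NN) / indeterminate).
from dataclasses import dataclass

@dataclass
class OstensiveAct:
    description: str
    meaning_nn: float     # 0 = pure showing, 1 = pure meaning(NN)
    indeterminacy: float  # 0 = fully determinate, 1 = highly indeterminate

acts = [
    OstensiveAct("pointing at a clock when asked the time", 0.1, 0.1),
    OstensiveAct("saying 'It is five o'clock'", 0.9, 0.1),
    OstensiveAct("vaguely waving a hand in a general direction", 0.1, 0.9),
    OstensiveAct("a descriptively ineffable metaphor", 0.9, 0.9),
]

for act in acts:
    print(f"{act.description}: "
          f"showing-to-meaning = {act.meaning_nn}, "
          f"indeterminacy = {act.indeterminacy}")
```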
This new theoretical development has not only allowed relevance theorists to explore the vaguer aspects of communication with more rigor and exactitude, but also has clinical applications. There is, for example, a rich literature on the interpretation of gesture by people with aphasia, typically as a result of stroke (Kistner et al., 2019; Rose et al., 2013; Sekine et al., 2013). Most of this work, however, uses a model of gestural "meaning" in which form and function are coded according to strict principles of categorization (Kong et al., 2015). Jagoe and Wharton (2021) explore a data set of gestural use by people with aphasia (AphasiaBank – MacWhinney et al., 2011) from a relevance theory perspective with two main aims: firstly, to use the theory and the bi-dimensional continuum above in order to explore the potential to present a unified account of ostensive communicative gesture in people with aphasia; secondly, to explore ways in which the mere existence of this continuum allows for a first exploration of what they call its "neglected corners": instances of communication in which propositional content is difficult to specify and what is conveyed is either vague or highly indeterminate. They demonstrate that indeterminacy and impressions, which are perfectly common among typical speakers but not even visible in the code model of communication, may be intentionally and effectively communicated by a person with aphasia. This analysis supports the bi-dimensional continuum as a framework which brings together the dimensions of meaning/showing as well as graded considerations of determinacy. Beyond this, relevance theory offers a conceptually unified approach with which to explore and explain ostensive-inferential communication, including gesture, in people with aphasia. The analysis of the exemplars presented in Jagoe and Wharton (2021) demonstrates how people with aphasia may use gesture to communicate an impression. Crucially, vagueness or indeterminacy is not pathological – a feature of those who are communicatively "deprived" in some way – but rather a feature of human communication. Moving beyond the code model may also provide a framework in which to address the intentional use of "vague" gesture, recognizing that human communication is not consistently determinate. A focus on precise, or determinate, "meaning" in the existing literature on the use of gesture by people with aphasia obscures the productive use of vague communication, which, according to relevance theory, forms an important part of human communication.
3.5 Conclusions

A wide range of studies have established the plausibility of the analytic tools offered by relevance theory to account for language disorders. They have shown, for instance, that the pace of development and the degree of language difficulties, rather than their type or order, create atypicalities in human communication, in both developmental and acquired disorders. In the same vein, appealing to mutually manifest facts and assumptions to explain and predict behaviors in autistic individuals has challenged the belief that impaired theory of mind is a defining trait of autism. Generally, the theory has shifted responsibility for communicative success and failure to the experiential dimension of human communication: for example, to interlocutors' dispositions – preferably "matched," as in autistic–autistic pairs – and to language charged with affective meanings or mental images, which are crucial for successful participation in interaction. How so-called contextual information is tightly linked to people's affective dispositions, how the meanings retrieved are often less than a collection of fully determinate propositions, and how these testable predictions help us understand pragmatic language difficulties have only begun to be explored. As it develops, relevance theory becomes a more promising theory for this daunting endeavor, allowing us to move beyond mere surface description and to get to grips with precisely what causes communicative atypicalities.
REFERENCES

Baum, S., & Pell, M. (1999). The neural basis of prosody: Insights from lesion studies and neuroimaging. Aphasiology, 13(8), 581–608. Bezuidenhout, A., & Sroda, M. S. (1998). Children's use of contextual cues to resolve ambiguity: An application of relevance theory. Pragmatics & Cognition, 6(1/2), 255–290. Bihrle, A. M., Brownell, H. H., Powelson, J. A., & Gardner, H. (1986). Comprehension of humorous and non-humorous materials by left and right brain-damaged patients. Brain and Cognition, 5(4), 399–411. Brownell, H. H., Michel, D., Powelson, J., & Gardner, H. (1983). Surprise but not coherence: Sensitivity to verbal humor in right-hemisphere patients. Brain and Language, 18(1), 20–27. https://doi.org/10.1016/0093-934X(83)90002-0 Carroll, S. (1995). The irrelevance of verbal feedback to language learning. In L. Eubank, L. Selinker, & M. Sharwood Smith (Eds.), The current state of interlanguage: Studies in honor of William E. Rutherford (pp. 73–88). John Benjamins. Cummings, L. (2017). Clinical pragmatics. In G. Yueguo, A. Barron, & G. Steen (Eds.), Routledge handbook of pragmatics (pp. 419–432). Routledge. Dipper, L. T., Bryan, K. L., & Tyson, J. (1997). Bridging inference and relevance theory: An account of right hemisphere inference. Clinical Linguistics and Phonetics, 11(3), 213–228. Dollaghan, C. (2004). Taxometric analyses of specific language impairment in 3- and 4-year-old children. Journal of Speech, Language, and Hearing Research, 47(2), 464–475. Dosi, I. (2019). Aspectual and cognitive asymmetries in Greek-speaking children with Specific Language Impairment (SLI). Selected Papers of ISTAL, 23, 122–140. https://ejournals.lib.auth.gr/thal/article/viewFile/7325/7074 Dosi, I., Andreou, M., & Peristeri, E. (2018, June 2–3). Task effects on the production of grammatical aspect in Greek-speaking children with Specific Language Impairment. Presentation at Language Disorders in Greek 7 Conference, Athens, Greece. Dosi, I., & Koutsipetsidou, E. C. (2019). Measuring linguistic and cognitive abilities by means of a sentence repetition task in children with developmental dyslexia and
developmental language disorder. European Journal of Research in Social Sciences, 7(4), 10–19. Foster-Cohen, S. (1994). Exploring the boundary between syntax and pragmatics: Relevance and the binding of pronouns. Journal of Child Language, 21(1), 237–255. Foster-Cohen, S. (1997). “If you’d like to burn your mouth, feel free!”: A relevance-theoretic account of conditionals used to children. In M. Groefsema (Ed.), Proceedings of the University of Hertfordshire Relevance Theory Workshop (pp. 140–148). Peter Thomas and Associates. Grice, H. P. (1957). Meaning. Philosophical Review, 66(3), 377–388. Grice, H. P. (1968). Utterer’s meaning, sentence meaning and word-meaning. Foundations of Language, 4(3), 225–242. Grice, H. P. (1969). Utterer’s meaning and intentions. Philosophical Review, 78(2), 147–177. Gussenhoven, C. (2002). Intonation and interpretation: Phonetics and phonology. In B. Bel & I. Marlien (Eds.), Proceedings of the Speech Prosody 2002 Conference (pp. 47–57). Happé, F. (1993). Communicative competence and theory of mind in autism: A test of relevance theory. Cognition, 48(2), 101–119. Ifantidou, E. (2021). Non-propositional effects in verbal communication: The case of metaphor. In T. Wharton & C. Jagoe (Eds.), Journal of Pragmatics, 181(2), 6–16. Jagoe, C., & Wharton, T. (2021). Meaning non-verbally: The neglected corners of the bi-dimensional continuum communication in people with aphasia. In T. Wharton & C. Jagoe (Eds.), Journal of Pragmatics, 178(5), 21–30. Kistner, J., Dipper, L. T., & Marshall, J. (2019). The use and function of gestures in word-finding difficulties in aphasia. Aphasiology, 33(11), 1372–1392. Kong, A. P. H., Law, S. P., Kwan, C. C. Y., Lai, C., & Lam, V. (2015). A coding system with independent annotations of gesture forms and functions during verbal communication: Development of a database of speech and gesture (DoSaGE). Journal of Nonverbal Behavior, 39(1), 93–111. Langdon, R., & Coltheart, M. (1999). Mentalising, schizotypy, and schizophrenia. Cognition, 71(1), 43–71. Leinonen, E., & Kerbel, D. (1999). Relevance theory and pragmatic impairment.
International Journal of Language & Communication Disorders, 34(4), 367–390. Leinonen, E., & Ryder, N. (2008). Relevance theory and language disorders. In M. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), Handbook of clinical linguistics (pp. 49–60). Blackwell. Leonard, L. B. (2014). Children with specific language impairment and their contribution to the study of language development. Journal of Child Language, 41(1), 38–47. Leonard, L., Deevy, P., Fey, M., & Bredin-Oje, S. (2013). Sentence comprehension in specific language impairment: A task designed to distinguish between cognitive capacity and syntactic complexity. Journal of Speech, Language, and Hearing Research, 56(2), 577–589. Loukusa, S., Leinonen, E., Kuusikko, S., Jussila, K., Mattila, M. L., Ryder, N., Ebeling, H., & Moilanen, I. (2007). Use of context in pragmatic language comprehension by children with Asperger syndrome or high-functioning autism. Journal of Autism and Developmental Disorders, 37(6), 1049–1059. MacWhinney, B., Fromm, D., Forbes, M., & Holland, A. (2011). AphasiaBank: Methods for studying discourse. Aphasiology, 25(11), 1286–1307. McDonald, S. (1999). Exploring the process of inference generation in sarcasm: A review of normal and clinical studies. Brain and Language, 68(3), 486–506. McDonald, S. (2000). Putting communication disorders in context after traumatic brain injury. Aphasiology, 14(4), 339–347. McTear, M. F. (1985). Pragmatic disorders: A case study of conversational disability. British Journal of Disorders of Communication, 20(2), 129–142. Mitchley, N. J., Barber, J., Gray, J. M., Brooks, D. N., & Livingston, M. G. (1998). Comprehension of irony in schizophrenia. Cognitive Neuropsychiatry, 3(2), 127–138. Papp, S. (2006). A relevance-theoretic account of the development and deficits of theory of mind in normally developing children and individuals with autism. Theory & Psychology, 16(2), 141–161. Pell, M. (2002). Evaluation of nonverbal emotion in face and voice: Some preliminary findings on a new battery of tests. Brain and Cognition, 48(2–3), 499–514. Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a "theory of mind"? Behavioral and Brain Sciences, 1(4), 515–526. Roman, M., Brownell, H. H., Potter, H. H., Seibold, M. S., & Gardner, H. (1987). Script
knowledge in right hemisphere-damaged and in normal elderly adults. Brain and Language, 31(1), 151–170. Rose, M. L., Raymer, A. M., Lanyon, L. E., & Attard, M. C. (2013). A systematic review of gesture treatments for post-stroke aphasia. Aphasiology, 27(9), 1090–1127. Ross, E., Edmonson, J., Seibert, G., & Homan, R. (1988). Acoustic analysis of affective prosody during right-sided Wada test: A within-subjects verification of the right hemisphere’s role in language. Brain & Language, 33(1), 128–145. Ryder, N., & Leinonen, E. (2014). Pragmatic language development in language impaired and typically developing children: Incorrect answers in context. Journal of Psycholinguistic Research, 43(1), 45–58. Sekine, K., Rose, M. L., Foster, A. M., Attard, M. C., & Lanyon, L. E. (2013). Gesture production patterns in aphasic discourse: In-depth description and preliminary predictions. Aphasiology, 27(9), 1031–1049. Smith, N. (1989). The twitter machine. Blackwell. Smith, N., & Tsimpli, I. (1995). The mind of a savant. Blackwell. Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 39–67). Cambridge University Press. Sperber, D. (2000). Metarepresentations in an evolutionary perspective. In D. Sperber (Ed.), Metarepresentations: A multidisciplinary perspective (pp. 117–137). Oxford University Press. Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition. Blackwell. Sperber, D., & Wilson, D. (2002). Pragmatics, modularity and mindreading. Mind & Language, 17(1–2), 3–23. Sperber, D., & Wilson, D. (2015). Beyond speaker’s meaning. Croatian Journal of Philosophy, XV(44), 117–149. Tager-Flusberg, H. (2000). Language and understanding minds: Connections in autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (pp. 124–149). Oxford University Press. Tsimpli, I. M., Peristeri, E., & Andreou, M. (2016). Narrative production in monolingual and bilingual children with specific language impairment. Applied Psycholinguistics, 37(1), 195–216.
Watson, R. (1995). Relevance and definition. Journal of Child Language, 22(1), 211–222. Wearing, C. (2010). Autism, metaphor and relevance theory. Mind & Language, 25(2), 196–216. Wharton, T. (2009). Pragmatics and non-verbal communication. Cambridge University Press. Wharton, T. (2014). What words mean is a matter of what people mean by them. Linguagem em (Dis)curso, 14(3), 473–488. Williams, G. L., Wharton, T., & Jagoe, C. (2021). Mutual (mis)understanding: Reframing autistic pragmatic "impairments" using
relevance theory. Frontiers in Psychology, 12, 210–229. Wilson, D., & Wharton, T. (2006). Relevance and prosody. Journal of Pragmatics, 38(10), 1559–1579. Wolf, A. (1999). Context and relevance theory in language teaching: An exploratory approach. International Review of Applied Linguistics and Language Teaching, 37(2), 95–109. Ying, I. (1996). Multiple constraints on processing ambiguous sentences: Evidence from adult L2 learners. Language Learning, 46(4), 681–711.
4 Neuropragmatics
LUCA BISCHETTI, FEDERICO FRAU, AND VALENTINA BAMBINI

4.1 Defining Neuropragmatics

Neuropragmatics can be defined as the study of the neurocognitive basis of pragmatic processes as they are accounted for in the Gricean tradition, that is, with a focus on the inferential activities supporting the comprehension of intended meanings and the production of discourse and conversation. Because of its connection with both language and socio-cognitive functioning, neuropragmatics stands out as a domain of interdisciplinary research that straddles the border between neurolinguistics, theoretical and experimental pragmatics, and other areas of cognitive neuroscience. The field took shape in the early 2000s, building on two decades of studies of pragmatic difficulties in clinical populations, and rapidly incorporated methods from cognitive neuroscience, including the use of neuroimaging and electrophysiological recordings (Bambini & Bara, 2012; Bara & Tirassa, 2000; Hagoort & Levinson, 2014; Stemmer, 2008). Nowadays, neuropragmatics encompasses different research strands, from those focused on the study of pragmatics in clinical groups (clinical pragmatics; Cummings, 2017, 2021) to those based on neuroscience approaches (Canal & Bambini, 2023; Reyes-Aguilar et al., 2018). Currently, key topics in neuropragmatics revolve around the relationship between pragmatic abilities and other cognitive skills, as this issue might contribute to solving important theoretical questions related to the status of pragmatics in the architecture of the mind, such as the hypothesis that pragmatics is a dedicated mind-reading module (Bosco, Tirassa et al., 2018; Hauptman et al., 2023; Spotorno et al., 2012). In addition, neuropragmatics carries relevant societal implications related to how to classify, diagnose, and promote pragmatic abilities in individuals who have difficulties in pragmatic tasks (Bambini et al., 2022; Turkstra et al., 2017). In particular, given the centrality of communication in human life, an informed neuropragmatic approach can offer a better grounding for assessing and treating pragmatic language disorders, contributing to promoting well-being.
4.2 The Contribution of Clinical Pragmatics and the Pervasiveness of Pragmatic Language Disorders

The investigation of the neural correlates of pragmatic ability was initiated by neurological observations of anatomo-clinical correlations between brain lesions in the right hemisphere and difficulties in high-order language skills. By the early 1960s, clinicians had noted that individuals with right-hemisphere damage, while not exhibiting common aphasic symptoms, had difficulties in understanding "superordinate" language aspects, such as concrete and abstract formulations (Critchley, 1962; Eisenson, 1959). More refined studies over the 1970s and 1980s confirmed that these individuals are often impaired in a range of phenomena that later fell under the label of pragmatics, including the comprehension of figurative expressions such as idioms, metaphors, and humor (Brownell et al., 1983; Kempler et al., 1999; Winner & Gardner, 1977), the production of appropriate discourse and the building of a coherent representation of a story (Joanette & Goulet, 1990), and emotional prosody (Weintraub et al., 1981). This literature contributed to shaping the hypothesis of a specialization of the right hemisphere for pragmatics (the so-called right-hemisphere hypothesis), which represented the dominant paradigm in neuropragmatics up to the early 2000s (Dronkers et al., 2000), based on the idea of a strict division of functions in the brain between structural language skills (in the left hemisphere) and more interpretative and creative ones (in the right hemisphere). Another adult clinical condition in which pragmatic disorders have been extensively described since at least the 1990s is traumatic brain injury, with affected individuals displaying a typical pattern of cognitive and communicative deficits (Togher et al., 2013). In particular, individuals suffering from traumatic brain injury were shown to exhibit difficulties in discourse production and comprehension (Büttner-Kunert et al., 2022), conversational appropriateness (Body et al., 1999), and grasping the intended meaning of figurative expressions (Arcara et al., 2020). Studies in this area contributed to fostering the idea of a strict relationship between pragmatic functioning and non-linguistic domains such as theory of mind (ToM) and executive functions (Martin & McDonald, 2003). A further key domain in clinical pragmatics deals with neurodevelopmental disorders, among which autism spectrum disorder stands out as one of the most affected (and studied) conditions, characterized by difficulties in managing turns in conversation, adapting register and referential information (e.g., through pronouns and deixis) to the interlocutors' needs, and using the context to derive the meaning of figurative expressions (Reindal et al., 2021; Volden, 2017). Furthermore, the introduction in the DSM-5 (American Psychiatric Association, 2013) of the category of "social (pragmatic) communication disorder," a pure pragmatic disorder without autism-related manifestations and language or intellectual disabilities, strengthened attention to the pragmatic dimension in atypical learning, giving rise to a lively debate that is still ongoing (Saul et al., 2022).
The last two decades have been characterized by a marked expansion of the clinical groups assessed for pragmatic skills, revealing a diffuse deficit that might manifest not only in people with right-hemisphere damage or traumatic brain injury, but also in neurological conditions traditionally not assessed for language disorders, such as Alzheimer's disease (Amanzio et al., 2008), frontotemporal dementia (Luzzi et al., 2020), amyotrophic lateral sclerosis (Bambini, Bischetti et al., 2020), multiple sclerosis (Carotenuto, Arcara et al., 2018), and Parkinson's disease (Montemurro et al., 2019). Furthermore, well-known difficulties with language in psychiatric populations began to be described using pragmatic categories. This is what happened in the case of schizophrenia (see Bambini, Arcara, Bechi et al., 2016; Colle et al., 2013). Well-known clinical features such as tangentiality and verbosity in speech are now described in terms of poor coherence (Marini et al., 2008), while so-called concretism (clinically defined as a lack of abstract thinking manifesting in concrete interpretations of proverbs, such as "Gold goes in at any gate
except heaven's" interpreted as "There's gold in churches," Harrow & Quinlan, 1985) is now spelled out in its pragmatic features as a difficulty with figurative language (Bambini, Arcara et al., 2020). Finally, recent literature has pointed out several still underserved populations that might suffer from pragmatic disorders across the life span, including maltreated children and adults in the prison population (for an overview, see Cummings, 2021). The development of tools assessing pragmatic skills is expanding in parallel with the growing literature on different clinical conditions. In addition to classic instruments such as the Profile of Communicative Appropriateness (Penn, 1985), the Right Hemisphere Communication Battery (Gardner & Brownell, 1986), the Montreal Protocol for the Evaluation of Communication (Joanette et al., 2004), and the Children's Communication Checklist (Bishop, 1998), new tools with a stronger theoretical basis have been proposed and applied in a wide spectrum of clinical conditions. Among these, we mention the Assessment Battery for Communication (Bosco et al., 2012), built upon the foundation of Cognitive Pragmatics (Bara, 2010), and the Assessment of Pragmatic Abilities and Cognitive Substrates (Arcara & Bambini, 2016), developed following Gricean and post-Gricean pragmatics to assess discourse and figurative language, which also inspired a short version for remote administration (Bischetti et al., 2023). Thanks to the broad use of these refined tools, we have gained a better idea of how diffuse pragmatic disorders can be across populations (see Table 4.1 for a non-exhaustive summary). A promising research line in neuropragmatics concerns the development of theoretically sound treatment programs addressing pragmatic disorders in different profiles. These include the Cognitive Pragmatic Treatment, inspired by Cognitive Pragmatics, and the Gricean-inspired Pragmatics of Communication (PragmaCom) treatment, successfully applied to promote pragmatic skills in individuals with schizophrenia (Bambini et al., 2022; Bosco et al., 2016) and in traumatic brain injury (Bosco, Parola et al., 2018). The effectiveness of these programs offers in turn important insights into the foundation of theoretical pragmatic models and their psychological reality. Furthermore, the importance of developing effective pragmatic interventions is motivated by the evidence of the impact of pragmatic impairment on several facets of daily functioning and well-being: from childhood to adult life, communication difficulties impact children's social integration and school success rate and are considered a significant predictor of occupational achievement and relational satisfaction in adults with neurological or psychiatric disorders, who are often exposed to social isolation and, in turn, to anxiety and depression (Agostoni et al., 2021; Snow & Douglas, 2017).
4.3 Neuroimaging and the Bilateral Networks for Pragmatic Processing

The introduction of brain imaging techniques to examine the metabolic activity of populations of neurons while engaged in cognitive tasks allowed for great advances in the description of the neuroanatomy of language. Since the 1990s, methodologies such as PET (Positron Emission Tomography) and fMRI (functional Magnetic Resonance Imaging) have been used to describe the brain regions implicated in phonological, grammatical, and semantic tasks (Price, 2012), and more recently also in pragmatic functioning. Pioneering studies run before the 2000s supported the right-hemisphere hypothesis for pragmatics as put forward in the literature on clinical groups, with investigations, for instance, on metaphor (Bottini et al., 1994) and discourse comprehension (St George et al., 1999). Nowadays, however, there is a consensus about the bilateral organization of pragmatics in the brain. Dozens of studies from the last two decades and several meta-analyses emphasized the role of both hemispheres in processing pragmatic phenomena, especially fronto-temporal regions (Bohrn et al., 2012; Rapp et al., 2012; Reyes-Aguilar et al., 2018). The contribution of other areas, especially in the parietal lobes, including the temporo-parietal junction – as part of the ToM network for understanding others' mental states, intentions, and communicative goals – has also been highlighted in relation to specific pragmatic phenomena, such as indirect replies, irony, and humor (e.g., Bašnáková et al., 2015; Farkas et al., 2021).
Table 4.1 Frequency of pragmatic impairment in different populations as evaluated with different standardized tools focusing on (or including) pragmatic ability.

Clinical population | Sample size | Mean age (SD) | % below cut-off in pragmatic assessment | Assessment tool | Reference
Right-hemisphere damage | 112 | 62.28 (12.86) | 78% | MEC | Ferré et al. (2012)
Traumatic brain injury | 39 | 40.56 (17.79) | 59% | APACS | Arcara et al. (2020)
Traumatic brain injury | 19 | 38.50 (10.80) | 42% | ABaCo | Bosco, Parola et al. (2018)
Parkinson's disease | 47 | 72.00 (7.36) | 53% | APACS | Montemurro et al. (2019)
Multiple sclerosis | 42 | 42.00 (10.80) | 55% | APACS | Carotenuto, Arcara et al. (2018)
Amyotrophic lateral sclerosis | 33 | 63.30 (9.64) | 36% | APACS | Bambini, Arcara, Martinelli et al. (2016)
Amyotrophic lateral sclerosis | 30 | 67.63 (5.99) | 53% | APACS | Bambini, Bischetti et al. (2020)
Schizophrenia | 47 | 39.74 (10.54) | 77% | APACS | Bambini, Arcara, Bechi et al. (2016)
Schizophrenia | 17 | 36.30 (9.90) | 80% | ABaCo | Colle et al. (2013)
Adult dyslexia | 19 | 21.00 (1.76) | 36% | APACS | Cappelli et al. (2018)
Autism spectrum disorder | 177 | 12.30 (3.30) | 52–76%* | CCC-2 (pragmatic subscales) | Reindal et al. (2021)
Down syndrome | 24 | 15;11 (44 months) | 79% | CCC (pragmatic composite) | Laws and Bishop (2004)
Williams syndrome | 19 | 14;10 (75 months) | 50% | CCC (pragmatic composite) | Laws and Bishop (2004)

Notes: ABaCo = Assessment Battery for Communication; APACS = Assessment of Pragmatic Abilities and Cognitive Substrates; CCC = Children's Communication Checklist (first or second edition); MEC = Protocole Montréal d'Évaluation de la Communication [Montreal Protocol for the Evaluation of Communication]. Mean age is reported in years, or years;months for children. * Numbers refer to the range reported for the pragmatic subscales in CCC-2 (i.e., from E to H).
To illustrate the neuroimaging literature on pragmatics, we will consider the case of metaphor comprehension. Studies typically employed paradigms where nominal metaphors or word pairs were presented along with literal equivalents (e.g., "Do you know what that dancer is? A dragonfly" vs. "Do you know what that insect is? A dragonfly," Bambini et al., 2011; "pearl tears" vs. "water drops," Mashal et al., 2007) while subjects lay in the scanner. The literature consistently reports greater activations for metaphorical vs. literal items in both hemispheres, especially in typical language areas such as the inferior frontal and temporal gyri. Studies also often reported the involvement of regions related to executive functions, such as the anterior cingulate cortex bilaterally (e.g., Bambini et al., 2011; Mashal et al., 2007), possibly linked to inhibitory control in filtering out irrelevant information at the level of the literal meaning. Furthermore, greater activations in the ToM network, including the left inferior parietal lobule, the angular gyrus, and the right superior temporal sulcus (e.g., Bambini et al., 2011; Lee & Dapretto, 2006), often emerged, indicating the effort involved in inferring the intended meaning conveyed via metaphors. Another interesting phenomenon at the interface of pragmatics and other cognitive systems is verbal humor. By studying the elaboration of short stories ending with punchlines vs. plausible completions (e.g., "Why did the golfer wear two sets of pants? … He got a hole in one" vs. "… It was a very cold day," Goel & Dolan, 2001), the literature highlighted the involvement of language-related fronto-temporal areas, including the inferior/middle frontal gyri and the middle temporal gyrus, as well as the bilateral temporo-parietal junction, linked with the mechanisms of incongruity detection-resolution and the identification of others' (comic) intentions (e.g., Goel & Dolan, 2001; Vrticka et al., 2013). Moreover, verbal humor understanding correlates with increased blood flow in areas of the mesocorticolimbic system, such as prefrontal areas and the amygdala, functionally associated with the appreciation of comic aspects (Farkas et al., 2021). The understanding of intentions is pivotal also in the processing of irony: besides the activation of language-related areas in the left inferior frontal gyrus, processing ironic utterances vs. literal statements (e.g., imagine a lazy vs. a fruitful day being described as "really productive," Spotorno et al., 2012) recruited regions implicated in reasoning about others' mental states, such as the precuneus, the temporo-parietal junction bilaterally, and the medial prefrontal cortex (e.g., Shibata et al., 2010). The fMRI literature on pragmatics has also largely explored text and discourse processing. Overall, there is strong evidence of the contribution of extended networks in both hemispheres in processing language "beyond the sentence given" (Ferstl et al., 2008; Hagoort & van Berkum, 2007).
Bilateral effects in fronto-temporal or fronto-parietal regions are reported, for instance, for establishing coherence in text comprehension (Ferstl & von Cramon, 2001) and for inference-making processes related to logical connectives (Prado et al., 2015).
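To make the logic of these subtraction designs concrete, the sketch below shows, in a toy Python computation (our own illustration, not taken from any of the studies cited; all variable names, regressors, and values are hypothetical), how a per-voxel general linear model yields a metaphor-minus-literal contrast of the kind underlying the activations reported above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design: 100 scans, two box-car regressors (metaphor, literal) plus an intercept.
n_scans, n_voxels = 100, 5
metaphor = (np.arange(n_scans) % 20 < 5).astype(float)                      # hypothetical metaphor blocks
literal = (np.arange(n_scans) % 20 >= 10).astype(float) * (np.arange(n_scans) % 20 < 15)
X = np.column_stack([metaphor, literal, np.ones(n_scans)])

# Simulated BOLD signal: voxel 0 responds more strongly to metaphors than to literal items.
betas_true = np.array([[2.0, 0.5, 0.2, 0.1, 0.0],   # metaphor response per voxel
                       [0.5, 0.5, 0.2, 0.1, 0.0],   # literal response per voxel
                       [10.0, 10.0, 10.0, 10.0, 10.0]])  # baseline signal
Y = X @ betas_true + rng.normal(0, 0.5, (n_scans, n_voxels))

# Ordinary least squares fit per voxel, then the metaphor > literal contrast.
betas_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
contrast = np.array([1.0, -1.0, 0.0])
effect = contrast @ betas_hat    # positive values = greater activation for metaphor

print(np.round(effect, 2))       # voxel 0 should stand out
```

In real analyses the regressors would be convolved with a hemodynamic response function and the contrast assessed statistically across participants; the sketch keeps only the core subtraction step.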
4.4 Neurophysiology and the Time Course of Discourse and Figurative Language Processing

Since the 1920s, the recording of brain electrical activity at the level of the scalp via electroencephalography (EEG) has been one of the fundamental tools of neurology, and from the 1980s it also became one of the preferred approaches for exploring the time course of language processing. A key property of the EEG is its excellent time resolution, which allows researchers to appraise changes in the electrical activity of the brain induced by experimental manipulations – known as event-related potentials (ERPs) – with millisecond precision. By testing the elaboration of meaningful vs. anomalous sentences (e.g., “He spread the warm bread with butter” vs. “socks”), Kutas & Hillyard (1980) showed a difference in amplitude – a negative-going peak for the incongruous words – about 400 ms after stimulus presentation over centro-parietal electrodes (the so-called N400 component), which is taken to reflect the recognition of semantic incongruities based on contextual expectancies. Since the discovery of the N400, psycho- and neurolinguistics have greatly expanded the exploration of the ERP components guiding language processing. The N400 turned out to be a key signature of pragmatic processing, involved in the elaboration of figurative language and discourse, often paired with other, later components (for a comprehensive review see Canal & Bambini, 2023). For instance, the literature consistently reported N400 effects in the understanding of metaphorical vs. literal word pairs (e.g., “conscience storm” vs. “problem resolution,” Arzouan et al., 2007), metaphorical utterances vs. literal statements (e.g., “Unemployment is a plague” vs. “Cholera is a plague,” De Grauwe et al., 2010), literary metaphors vs. literal expressions (e.g., “grass of velvet” vs. “throne of velvet,” Bambini et al., 2019), and between metaphor types (e.g., physical, “Boxers are pandas,” vs. mental, “Teachers are books,” Canal et al., 2022). Especially when stimuli are embedded in sentential contexts, a number of studies revealed positive ERP effects at longer latencies (about 600 ms after word onset), associated with the P600 or with the late positive complex (LPC), for metaphorical vs. literal conditions (e.g., Bambini, Bertini et al., 2016; De Grauwe et al., 2010; Weiland et al., 2014). Notably, this bi-phasic pattern is modulated by the linguistic context: Bambini, Bertini et al. (2016) showed that metaphors in supportive contexts elicit no N400 effects, signaling effortless lexical access to metaphor terms, whilst the inferential machinery deriving the intended figurative meaning still operates, as reflected by significant P600 effects compared to literal items. Context is indeed the greatest factor influencing the N400. Studies showed that the N400 is greater when words are less expected in the pragmatic context, be it intended as previous discourse (Nieuwland & Van Berkum, 2006), world knowledge (Hagoort et al., 2004), or expectations about the speaker’s identity and attitudes (Van Berkum et al., 2008). Similarly, accommodating new information in the discourse model engages the N400, along with later components, as reported for the elaboration of presupposition triggers (e.g., Domaneschi et al., 2018; Schumacher & Hung, 2012). In the last decade, the literature has started exploring the time-frequency domain of the EEG associated with language processing (Meyer, 2018). By decomposing the psychophysiological signal into different frequency bands (i.e., theta 4–8 Hz, alpha 8–12 Hz, beta 12–35 Hz, gamma 35–80 Hz), this approach enables the study of induced changes in EEG oscillations, that is, event-related rhythmic changes in electrical activity consisting of either synchronizations or desynchronizations of different populations of neurons. For instance, a well-known effect is the increase in power of the alpha band for storing syntactic structures in verbal working memory. Nowadays, it is not rare to report, along with effects in the ERP domain, the time-frequency correlates of pragmatic processing as well (see Figure 4.1 for a conceptual representation of both the ERP and the frequency domain of the EEG).
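Before turning to specific phenomena, the N400 measurement logic just described can be illustrated with a minimal simulation. The following sketch (a toy example in plain Python, independent of any specific study or EEG toolbox; all names and amplitudes are hypothetical) averages single-trial epochs time-locked to the critical word and compares mean amplitudes in the canonical 300–500 ms window, the standard operationalization of an N400 effect.

```python
import numpy as np

rng = np.random.default_rng(1)
sfreq = 500                                # sampling rate in Hz
times = np.arange(-0.2, 0.8, 1 / sfreq)    # epoch from -200 to 800 ms around word onset

def simulate_epochs(n_trials, n400_amp):
    """Simulate single-trial EEG epochs with a negative deflection peaking at 400 ms."""
    n400 = -n400_amp * np.exp(-((times - 0.4) ** 2) / (2 * 0.05 ** 2))
    noise = rng.normal(0, 2.0, (n_trials, times.size))
    return n400 + noise                    # broadcasting adds the component to every trial

congruent = simulate_epochs(40, n400_amp=1.0)   # e.g., "...bread with butter"
anomalous = simulate_epochs(40, n400_amp=4.0)   # e.g., "...bread with socks"

# ERPs = across-trial averages, time-locked to the critical word.
erp_congruent = congruent.mean(axis=0)
erp_anomalous = anomalous.mean(axis=0)

# Mean amplitude in the canonical N400 window (300-500 ms post-onset).
win = (times >= 0.3) & (times <= 0.5)
n400_effect = erp_anomalous[win].mean() - erp_congruent[win].mean()
print(f"N400 effect (anomalous - congruent): {n400_effect:.2f}")  # should be negative
```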
For instance, the processing of humor is associated with a prolonged time course as reflected in the ERP, which includes early negative effects such as the N400 (or sometimes the Left Anterior Negativity) and later positive effects such as the P600/LPC (Canal et al., 2019; Coulson & Kutas, 2001; Feng et al., 2014), but also with a desynchronization in the beta band. While the N400–LPC pattern can be linked to the incongruity detection and resolution mechanisms typical of humor, the beta drop may reflect the abandonment of the status quo and the wrapping up of a novel (comic) interpretation of the context (Canal et al., 2019). As for irony, studies showed that it capitalizes especially on late positive effects such as the P600 (Regel et al., 2014; Spotorno et al., 2013), with a power decrease in the alpha band and increases in the theta and gamma ranges (Regel et al., 2014; Spotorno et al., 2013). While the P600 stresses the role of pragmatic inferencing in irony processing, the effects in the time-frequency domain point to the cognitive effort spent in integrating linguistic material and contextual information.
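As for the time-frequency domain itself, the core computation is to isolate a frequency band and track its power over time. The sketch below (again a hypothetical toy example, loosely mimicking the beta desynchronization reported for humor; the simulated signal, intervals, and values are invented) band-passes a signal and derives instantaneous power from the Hilbert envelope.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

rng = np.random.default_rng(2)
sfreq = 500
t = np.arange(0, 2, 1 / sfreq)             # 2-second epoch, stimulus at t = 1 s

# Simulated signal: ongoing beta rhythm (20 Hz) that weakens after the stimulus,
# loosely mimicking the beta desynchronization reported for humor comprehension.
beta_envelope = np.where(t < 1.0, 1.0, 0.4)
signal = beta_envelope * np.sin(2 * np.pi * 20 * t) + rng.normal(0, 0.3, t.size)

def band_power(x, low, high, fs):
    """Band-pass the signal and return instantaneous power via the Hilbert envelope."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, x)
    return np.abs(hilbert(filtered)) ** 2

power = band_power(signal, 12, 35, sfreq)        # beta band, 12-35 Hz
pre = power[(t > 0.2) & (t < 0.9)].mean()        # pre-stimulus baseline
post = power[(t > 1.1) & (t < 1.8)].mean()       # post-stimulus interval
print(f"Relative beta power change: {(post - pre) / pre:+.0%}")  # expect a drop
```

Real analyses typically use wavelet or multitaper decompositions across many bands and trials, but the band-pass-plus-envelope logic shown here is the conceptual core.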
Figure 4.1 Conceptual representation of the application of electroencephalography (EEG) to the study of pragmatic phenomena. From left to right, the EEG signal is acquired while a participant reads metaphors (e.g., “That lawyer is a shark”) vs. literal statements (e.g., “That fish is a shark”), presented word-by-word on the screen. EEG recordings are time-locked to the onset of the target word (i.e., “shark”). Event-related potentials (ERPs) and changes in power in the time-frequency domain can be analyzed as a function of scalp location and time: the bottom-right part of the figure shows a typical N400/P600 pattern associated with metaphors vs. literal statements; the upper-right part of the figure illustrates the output of the time-frequency analysis, with a fictitious representation of oscillatory changes in different frequency bands. Lighter regions in the theta frequency band (4–8 Hz) indicate neural synchronization, while darker regions in the beta (12–35 Hz) and gamma (35–80 Hz) frequency bands indicate neural desynchronization. © Luca Bischetti, Federico Frau & Valentina Bambini, 2023.
4.5 Conclusions, Outstanding Issues, and New Frontiers

After more than 20 years of investigation, neuropragmatics has gathered consolidated knowledge about how the brain processes pragmatic meanings and is currently facing a set of outstanding questions for the next era of studies. First, the use of neuroimaging has made an impressive contribution to describing the cortical nodes associated with pragmatic processing, highlighting bilaterally distributed pragmatic networks and taking our knowledge well beyond the right-hemisphere hypothesis derived from early clinical studies. While this was a fundamental step, we still struggle to understand the overall brain architecture supporting pragmatics. Since pragmatic processing very often also involves areas implicated in other functions (such as ToM and executive functions), what is its status? Can pragmatics still be considered a separate cognitive domain? Recent neuroimaging studies are tackling these sorts of questions, trying to disentangle the relative contribution of different systems to pragmatic tasks and to advance theoretical models of pragmatics (Hauptman et al., 2023; Paunov et al., 2022). Another way of approaching this issue is to look in more detail at brain networks, considering not only cortical regions but also white matter tracts. A pragmatics-specific pathway connecting brain regions involved in linguistic and ToM processing (the arcuate fasciculus, and especially its posterior segment) has been proposed in the literature (Catani & Bambini, 2014), but these are only the first steps in unraveling the complete neural architecture of pragmatics. Along the lines of considering pragmatics
together with other domains of experience, a very promising research strand is the relationship between pragmatic inferencing and sensorimotor systems (Cuccio, 2022). Embodiment and multimodality have become dominant paradigms in psycho- and neurolinguistic studies (Pulvermüller, 2012), and we know that motor areas are involved in processing communicative functions conveyed by speech prosody (Tomasello et al., 2022) and in understanding at least some instances of figurative language (see, for instance, the case of action verbs used metaphorically, as in the sentence “Matilde throws her sadness far away,” Romero Lauro et al., 2013). Yet studies are still limited (Yang & Shu, 2016), and it is up to future research to establish the actual contribution of bodily experience and simulation to pragmatic processing. Second, thanks to neurophysiological methods, we have tracked the processing of discourse and figurative language through its phases with great precision, showing that pragmatic processes often unfold in multiple steps. What remains to be discovered, however, are the deeper physiological roots of pragmatic elaboration, in terms of the cooperation of different populations of neurons and the intrinsic rhythms of the different brain areas supporting pragmatic computation. The investigation of the time-frequency domain could offer great insights into this matter, and it is important that future studies expand the range of phenomena described with this approach, in search of an “oscillatory signature” of pragmatic processing. Moreover, great promise lies in the use of more naturalistic paradigms than the usual lab-based experimental settings, such as continuous stimulation (Alday, 2019) or dyadic interaction (Kuhlen et al., 2012). Although the evidence is still very limited, we have learned that – when communication is embedded in more ecological contexts – other processes may come to support pragmatics, such as the P300 component in pronoun anaphor processing (Brilmayer & Schumacher, 2021) or the association between alpha desynchronization and the N400 in humorous exchanges recorded from speaker and listener simultaneously (so-called hyperscanning, Bridwell et al., 2018). In pursuing the goal of deepening neuropragmatic knowledge, it is also fundamental to advance research on clinical groups and to increase the body of studies linking the description of pragmatic disorders at the behavioral level to neurofunctional patterns (Carotenuto, Cocozza et al., 2018; Cavelti et al., 2018; Luzzi et al., 2020). Pragmatic communication deficits are not only remarkably widespread but also have a substantial impact on individuals’ and caregivers’ daily lives (Turkstra et al., 2017). A more precise description of pragmatic profiles across populations, along with their neurofunctional basis, is key to identifying difficulties, tailoring treatment programs, and improving the individual’s quality of life.
REFERENCES

Agostoni, G., Bambini, V., Bechi, M., Buonocore, M., Spangaro, M., Repaci, F., Cocchi, F., Bianchi, L., Guglielmino, C., Sapienza, J., Cavallaro, R., & Bosia, M. (2021). Communicative-pragmatic abilities mediate the relationship between cognition and daily functioning in schizophrenia. Neuropsychology, 35(1), 42–56. https://doi.org/10.1037/neu0000664
Alday, P. M. (2019). M/EEG analysis of naturalistic stories: A review from speech to language processing. Language, Cognition and Neuroscience, 34(4), 457–473. https://doi.org/10.1080/23273798.2018.1546882
Amanzio, M., Geminiani, G., Leotta, D., & Cappa, S. (2008). Metaphor comprehension in Alzheimer’s disease: Novelty matters. Brain and Language, 107(1), 1–10. https://doi.org/10.1016/j.bandl.2007.08.003
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425596
Arcara, G., & Bambini, V. (2016). A test for the assessment of pragmatic abilities and cognitive substrates (APACS): Normative data and psychometric properties. Frontiers in Psychology, 7, 70. https://doi.org/10.3389/fpsyg.2016.00070
Arcara, G., Tonini, E., Muriago, G., Mondin, E., Sgarabottolo, E., Bertagnoni, G., Semenza, C., & Bambini, V. (2020). Pragmatics and figurative language in individuals with traumatic brain injury: Fine-grained assessment and relevance-theoretic considerations. Aphasiology, 34(8), 1070–1100. https://doi.org/10.1080/02687038.2019.1615033
Arzouan, Y., Goldstein, A., & Faust, M. (2007). Brainwaves are stethoscopes: ERP correlates of novel metaphor comprehension. Brain Research, 1160(1), 69–81. https://doi.org/10.1016/j.brainres.2007.05.034
Bambini, V., Agostoni, G., Buonocore, M., Tonini, E., Bechi, M., Ferri, I., Sapienza, J., Martini, F., Cuoco, F., Cocchi, F., Bischetti, L., Cavallaro, R., & Bosia, M. (2022). It is time to address language disorders in schizophrenia: A RCT on the efficacy of a novel training targeting the pragmatics of communication (PragmaCom). Journal of Communication Disorders, 97, 106196. https://doi.org/10.1016/j.jcomdis.2022.106196
Bambini, V., Arcara, G., Bechi, M., Buonocore, M., Cavallaro, R., & Bosia, M. (2016). The communicative impairment as a core feature of schizophrenia: Frequency of pragmatic deficit, cognitive substrates, and relation with quality of life. Comprehensive Psychiatry, 71, 106–120. https://doi.org/10.1016/j.comppsych.2016.08.012
Bambini, V., Arcara, G., Bosinelli, F., Buonocore, M., Bechi, M., Cavallaro, R., & Bosia, M. (2020). A leopard cannot change its spots: A novel pragmatic account of concretism in schizophrenia. Neuropsychologia, 139, 107332. https://doi.org/10.1016/j.neuropsychologia.2020.107332
Bambini, V., Arcara, G., Martinelli, I., Bernini, S., Alvisi, E., Moro, A., Cappa, S. F., & Ceroni, M. (2016). Communication and pragmatic breakdowns in amyotrophic lateral sclerosis patients. Brain and Language, 153–154, 1–12. https://doi.org/10.1016/j.bandl.2015.12.002
Bambini, V., & Bara, B. G. (2012). Neuropragmatics. In J.-O. Östman & J. Verschueren (Eds.), Handbook of pragmatics (pp. 1–21). John Benjamins Publishing Company. https://doi.org/10.1075/hop.16.neu2
Bambini, V., Bertini, C., Schaeken, W., Stella, A., & Di Russo, F. (2016). Disentangling metaphor from context: An ERP study. Frontiers in Psychology, 7, 559. https://doi.org/10.3389/fpsyg.2016.00559
Bambini, V., Bischetti, L., Bonomi, C. G., Arcara, G., Lecce, S., & Ceroni, M. (2020). Beyond the motor account of amyotrophic lateral sclerosis: Verbal humour and its relationship with the cognitive and pragmatic profile. International Journal of Language & Communication Disorders, 55(5), 751–764. https://doi.org/10.1111/1460-6984.12561
Bambini, V., Canal, P., Resta, D., & Grimaldi, M. (2019). Time course and neurophysiological underpinnings of metaphor in literary context. Discourse Processes, 56(1), 77–97. https://doi.org/10.1080/0163853X.2017.1401876
Bambini, V., Gentili, C., Ricciardi, E., Bertinetto, P. M., & Pietrini, P. (2011). Decomposing metaphor processing at the cognitive and neural level through functional magnetic resonance imaging. Brain Research Bulletin, 86(3–4), 203–216. https://doi.org/10.1016/j.brainresbull.2011.07.015
Bara, B. G. (2010). Cognitive pragmatics: The mental processes of communication. MIT Press. https://doi.org/10.7551/mitpress/9780262014113.001.0001
Bara, B. G., & Tirassa, M. (2000). Neuropragmatics: Brain and communication. Brain and Language, 71(1), 10–14. https://doi.org/10.1006/brln.1999.2198
Bašnáková, J., van Berkum, J., Weber, K., & Hagoort, P. (2015). A job interview in the MRI scanner: How does indirectness affect addressees and overhearers? Neuropsychologia, 76, 79–91. https://doi.org/10.1016/j.neuropsychologia.2015.03.030
Bischetti, L., Pompei, C., Scalingi, B., Frau, F., Bosia, M., Arcara, G., & Bambini, V. (2023). Assessment of pragmatic abilities and cognitive substrates (APACS) brief remote: A novel tool for the rapid and tele-evaluation of pragmatic skills in Italian. Language Resources and Evaluation. https://doi.org/10.1007/s10579-023-09667-y
Bishop, D. V. M. (1998). Development of the children’s communication checklist (CCC): A method for assessing qualitative aspects of communicative impairment in children. Journal of Child Psychology and Psychiatry, 39(6), 879–891.
Body, R., Perkins, M. R., & McDonald, S. (1999). Pragmatics, cognition and communication in traumatic brain injury. In S. McDonald, L. Togher, & C. Code (Eds.), Communication disorders following traumatic brain injury (pp. 81–112). Psychology Press.
Bohrn, I. C., Altmann, U., & Jacobs, A. M. (2012). Looking at the brains behind figurative language – A quantitative meta-analysis of neuroimaging studies on metaphor, idiom, and irony processing. Neuropsychologia, 50(11), 2669–2683. https://doi.org/10.1016/j.neuropsychologia.2012.07.021
Bosco, F. M., Angeleri, R., Zuffranieri, M., Bara, B. G., & Sacco, K. (2012). Assessment Battery for Communication: Development of two equivalent forms. Journal of Communication Disorders, 45(4), 290–303. https://doi.org/10.1016/j.jcomdis.2012.03.002
Bosco, F. M., Gabbatore, I., Gastaldo, L., & Sacco, K. (2016). Communicative-Pragmatic Treatment in schizophrenia: A pilot study. Frontiers in Psychology, 7, 166. https://doi.org/10.3389/fpsyg.2016.00166
Bosco, F. M., Parola, A., Angeleri, R., Galetto, V., Zettin, M., & Gabbatore, I. (2018). Improvement of communication skills after traumatic brain injury: The efficacy of the cognitive pragmatic treatment program using the communicative activities of daily living. Archives of Clinical Neuropsychology, 33(7), 875–888. https://doi.org/10.1093/arclin/acy041
Bosco, F. M., Tirassa, M., & Gabbatore, I. (2018). Why pragmatics and theory of mind do not (completely) overlap. Frontiers in Psychology, 9, 1453. https://doi.org/10.3389/fpsyg.2018.01453
Bottini, G., Corcoran, R., Sterzi, R., Paulesu, E., Schenone, P., Scarpa, P., Frackowiak, R. S. J., & Frith, D. (1994). The role of the right hemisphere in the interpretation of figurative aspects of language: A positron emission tomography activation study. Brain, 117(6), 1241–1253. https://doi.org/10.1093/brain/117.6.1241
Bridwell, D. A., Henderson, S., Sorge, M., Plis, S., & Calhoun, V. D. (2018). Relationships between alpha oscillations during speech preparation and the listener N400 ERP to the produced speech. Scientific Reports, 8(1), 12838. https://doi.org/10.1038/s41598-018-31038-9
Brilmayer, I., & Schumacher, P. B. (2021). Referential chains reveal predictive processes and form-to-function mapping: An electroencephalographic study using naturalistic story stimuli. Frontiers in Psychology, 12, 623648. https://doi.org/10.3389/fpsyg.2021.623648
Brownell, H., Michel, D., Powelson, J. A., & Gardner, H. (1983). Surprise but not coherence: Sensitivity to verbal humor in right-hemisphere patients. Brain and Language, 18(1), 20–27. https://doi.org/10.1016/0093-934X(83)90002-0
Büttner-Kunert, J., Blöchinger, S., Falkowska, Z., Rieger, T., & Oslmeier, C. (2022). Interaction of discourse processing impairments, communicative participation, and verbal executive functions in people with chronic traumatic brain injury. Frontiers in Psychology, 13, 892216. https://doi.org/10.3389/fpsyg.2022.892216
Canal, P., & Bambini, V. (2023). Pragmatics electrified. In M. Grimaldi, Y. Shtyrov, & E. Brattico (Eds.), Language electrified: Techniques, methods, applications, and future perspectives in the neurophysiological investigation of language (Vol. 202, pp. 583–612). Humana. https://doi.org/10.1007/978-1-0716-3263-5_18
Canal, P., Bischetti, L., Bertini, C., Ricci, I., Lecce, S., & Bambini, V. (2022). N400 differences between physical and mental metaphors: The role of theories of mind. Brain and Cognition, 161, 105879. https://doi.org/10.1016/j.bandc.2022.105879
Canal, P., Bischetti, L., Di Paola, S., Bertini, C., Ricci, I., & Bambini, V. (2019). “Honey, shall I change the baby? – Well done, choose another one”: ERP and time-frequency correlates of humor processing. Brain and Cognition, 132, 41–55. https://doi.org/10.1016/j.bandc.2019.02.001
Cappelli, G., Noccetti, S., Arcara, G., & Bambini, V. (2018). Pragmatic competence and its relationship with the linguistic and cognitive profile of young adults with dyslexia. Dyslexia, 24(3), 294–306. https://doi.org/10.1002/dys.1588
Carotenuto, A., Arcara, G., Orefice, G., Cerillo, I., Giannino, V., Rasulo, M., Iodice, R., & Bambini, V. (2018). Communication in multiple sclerosis: Pragmatic deficit and its relation with cognition and social cognition. Archives of Clinical Neuropsychology, 33(2), 194–205. https://doi.org/10.1093/arclin/acx061
Carotenuto, A., Cocozza, S., Quarantelli, M., Arcara, G., Lanzillo, R., Brescia Morra, V., Cerillo, I., Tedeschi, E., Orefice, G., Bambini, V., Brunetti, A., & Iodice, R. (2018). Pragmatic abilities in multiple sclerosis: The contribution of the temporo-parietal junction. Brain and Language, 185, 47–53. https://doi.org/10.1016/j.bandl.2018.08.003
Catani, M., & Bambini, V. (2014). A model for social communication and language evolution and development (SCALED). Current Opinion in Neurobiology, 28, 165–171. https://doi.org/10.1016/j.conb.2014.07.018
Cavelti, M., Kircher, T., Nagels, A., Strik, W., & Homan, P. (2018). Is formal thought disorder in schizophrenia related to structural and functional aberrations in the language network? A systematic review of neuroimaging findings. Schizophrenia Research, 199, 2–16. https://doi.org/10.1016/j.schres.2018.02.051
Colle, L., Angeleri, R., Vallana, M., Sacco, K., Bara, B. G., & Bosco, F. M. (2013). Understanding the communicative impairments in schizophrenia: A preliminary study. Journal of Communication Disorders, 46(3), 294–308. https://doi.org/10.1016/j.jcomdis.2013.01.003
Coulson, S., & Kutas, M. (2001). Getting it: Human event-related brain response to jokes in good and poor comprehenders. Neuroscience Letters, 316(2), 71–74. https://doi.org/10.1016/S0304-3940(01)02387-4
Critchley, M. (1962). Speech and speech loss in relation to duality of the brain. In V. B. Mountcastle (Ed.), Interhemispheric relations and cerebral dominance (pp. 208–213). Johns Hopkins Press.
Cuccio, V. (2022). The figurative brain. In A. M. García & A. Ibáñez (Eds.), The Routledge handbook of semiosis and the brain (pp. 130–144). Routledge. https://doi.org/10.4324/9781003051817-11
Cummings, L. (Ed.). (2017). Research in clinical pragmatics. Springer. https://doi.org/10.1007/978-3-319-47489-2
Cummings, L. (Ed.). (2021). Handbook of pragmatic language disorders. Springer. https://doi.org/10.1007/978-3-030-74985-9
De Grauwe, S., Swain, A., Holcomb, P. J., Ditman, T., & Kuperberg, G. R. (2010). Electrophysiological insights into the processing of nominal metaphors. Neuropsychologia, 48(7), 1965–1984. https://doi.org/10.1016/j.neuropsychologia.2010.03.017
Domaneschi, F., Canal, P., Masia, V., Lombardi Vallauri, E., & Bambini, V. (2018). N400 and P600 modulation in presupposition accommodation: The effect of different trigger types. Journal of Neurolinguistics, 45, 13–35. https://doi.org/10.1016/j.jneuroling.2017.08.002
Dronkers, N. F., Pinker, S., & Damasio, A. (2000). Language and the aphasias. In E. R. Kandel, J. H. Schwartz, & T. Jessell (Eds.), Principles of neural science (4th ed., pp. 1169–1187). McGraw-Hill.
Eisenson, J. (1959). Language dysfunctions associated with right brain damage. American Speech and Hearing Association, 1, 107.
Farkas, A. H., Trotti, R. L., Edge, E. A., Huang, L.-Y., Kasowski, A., Thomas, O. F., Chlan, E., Granros, M. P., Patel, K. K., & Sabatinelli, D. (2021). Humor and emotion: Quantitative meta-analyses of functional neuroimaging studies. Cortex, 139, 60–72. https://doi.org/10.1016/j.cortex.2021.02.023
Feng, Y.-J., Chan, Y.-C., & Chen, H.-C. (2014). Specialization of neural mechanisms underlying the three-stage model in humor processing: An ERP study. Journal of Neurolinguistics, 32, 59–70. https://doi.org/10.1016/j.jneuroling.2014.08.007
Ferré, P., Fonseca, R. P., Ska, B., & Joanette, Y. (2012). Communicative clusters after a right-hemisphere stroke: Are there universal clinical profiles? Folia Phoniatrica et Logopaedica, 64(4), 199–207. https://doi.org/10.1159/000340017
Ferstl, E. C., Neumann, J., Bogler, C., & von Cramon, D. Y. (2008). The extended language network: A meta-analysis of neuroimaging studies on text comprehension. Human Brain Mapping, 29(5), 581–593. https://doi.org/10.1002/hbm.20422
Ferstl, E. C., & von Cramon, D. Y. (2001). The role of coherence and cohesion in text comprehension: An event-related fMRI study. Cognitive Brain Research, 11(3), 325–340. https://doi.org/10.1016/S0926-6410(01)00007-6
Gardner, H., & Brownell, H. (1986). Right hemisphere communication battery. Psychology Service, VAMC.
Goel, V., & Dolan, R. J. (2001). The functional anatomy of humor: Segregating cognitive and affective components. Nature Neuroscience, 4(3), 237–238. https://doi.org/10.1038/85076
Hagoort, P., Hald, L., Bastiaansen, M., & Petersson, K. M. (2004). Integration of word meaning and world knowledge in language comprehension. Science, 304(5669), 438–441. https://doi.org/10.1126/science.1095455
Hagoort, P., & Levinson, S. C. (2014). Neuropragmatics. In M. S. Gazzaniga & G. R. Mangun (Eds.), The cognitive neurosciences (5th ed., pp. 667–674). MIT Press.
Hagoort, P., & van Berkum, J. (2007). Beyond the sentence given. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 801–811. https://doi.org/10.1098/rstb.2007.2089
Harrow, M., & Quinlan, D. (1985). Disordered thinking and schizophrenic psychopathology. Gardner Press.
Hauptman, M., Blank, I., & Fedorenko, E. (2023). Non-literal language processing is jointly supported by the language and Theory of Mind networks: Evidence from a novel meta-analytic fMRI approach. Cortex, 162, 96–114. https://doi.org/10.1016/j.cortex.2023.01.013
Joanette, Y., & Goulet, P. (1990). Narrative discourse in right-brain-damaged right-handers. In Y. Joanette & H. H. Brownell (Eds.), Discourse ability and brain damage (pp. 131–153). Springer. https://doi.org/10.1007/978-1-4612-3262-9_6
Joanette, Y., Ska, B., & Côté, H. (2004). Protocole MEC, Protocole Montréal d’Évaluation de la Communication [Montreal protocol for the evaluation of communication]. Ortho Edition.
Kempler, D., Van Lancker, D., Marchman, V., & Bates, E. (1999). Idiom comprehension in children and adults with unilateral brain damage. Developmental Neuropsychology, 15(3), 327–349. https://doi.org/10.1080/87565649909540753
Kuhlen, A. K., Allefeld, C., & Haynes, J.-D. (2012). Content-specific coordination of listeners’ to speakers’ EEG during communication. Frontiers in Human Neuroscience, 6, 266. https://doi.org/10.3389/fnhum.2012.00266
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207(4427), 203–205. https://doi.org/10.1126/science.7350657
Laws, G., & Bishop, D. V. M. (2004). Pragmatic language impairment and social deficits in Williams syndrome: A comparison with Down’s syndrome and specific language impairment. International Journal of Language & Communication Disorders, 39(1), 45–64. https://doi.org/10.1080/13682820310001615797
Lee, S. S., & Dapretto, M. (2006). Metaphorical vs. literal word meanings: fMRI evidence against a selective role of the right hemisphere. NeuroImage, 29(2), 536–544. https://doi.org/10.1016/j.neuroimage.2005.08.003
Luzzi, S., Baldinelli, S., Ranaldi, V., Fiori, C., Plutino, A., Fringuelli, F. M., Silvestrini, M., Baggio, G., & Reverberi, C. (2020). The neural bases of discourse semantic and pragmatic deficits in patients with frontotemporal dementia and Alzheimer’s disease. Cortex, 128, 174–191. https://doi.org/10.1016/j.cortex.2020.03.012
Marini, A., Spoletini, I., Rubino, I. A., Ciuffa, M., Bria, P., Martinotti, G., Banfi, G., Boccascino, R., Strom, P., Siracusano, A., Caltagirone, C., & Spalletta, G. (2008). The language of schizophrenia: An analysis of micro and macrolinguistic abilities and their neuropsychological correlates. Schizophrenia Research, 105(1–3), 144–155. https://doi.org/10.1016/j.schres.2008.07.011
Martin, I., & McDonald, S. (2003). Weak coherence, no theory of mind, or executive dysfunction? Solving the puzzle of pragmatic language disorders. Brain and Language, 85(3), 451–466. https://doi.org/10.1016/S0093-934X(03)00070-1
Mashal, N., Faust, M., Hendler, T., & Jung-Beeman, M. (2007). An fMRI investigation of the neural correlates underlying the processing of novel metaphoric expressions. Brain and Language, 100(2), 115–126. https://doi.org/10.1016/j.bandl.2005.10.005
Meyer, L. (2018). The neural oscillations of speech processing and language comprehension: State of the art and emerging mechanisms. European Journal of Neuroscience, 48(7), 2609–2621. https://doi.org/10.1111/ejn.13748
Montemurro, S., Mondini, S., Signorini, M., Marchetto, A., Bambini, V., & Arcara, G. (2019). Pragmatic language disorder in Parkinson’s disease and the potential effect of cognitive reserve. Frontiers in Psychology, 10, 1220. https://doi.org/10.3389/fpsyg.2019.01220
Nieuwland, M. S., & Van Berkum, J. J. A. (2006). When peanuts fall in love: N400 evidence for the power of discourse. Journal of Cognitive Neuroscience, 18(7), 1098–1111. https://doi.org/10.1162/jocn.2006.18.7.1098
Paunov, A. M., Blank, I. A., Jouravlev, O., Mineroff, Z., Gallée, J., & Fedorenko, E. (2022). Differential tracking of linguistic vs. mental state content in naturalistic stimuli by language and Theory of Mind (ToM) brain networks. Neurobiology of Language, 3(3), 413–440. https://doi.org/10.1162/nol_a_00071
Penn, C. (1985). The profile of communicative appropriateness: A clinical tool for the assessment of pragmatics. South African Journal of Communication Disorders, 32(1), 18–23. https://doi.org/10.4102/sajcd.v32i1.329
Prado, J., Spotorno, N., Koun, E., Hewitt, E., Van der Henst, J.-B., Sperber, D., & Noveck, I. A. (2015). Neural interaction between logical reasoning and pragmatic processing in narrative discourse. Journal of Cognitive Neuroscience, 27(4), 692–704. https://doi.org/10.1162/jocn_a_00744
Price, C. J. (2012). A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. NeuroImage, 62(2), 816–847. https://doi.org/10.1016/j.neuroimage.2012.04.062
Pulvermüller, F. (2012). Meaning and the brain: The neurosemantics of referential, interactive, and combinatorial knowledge. Journal of Neurolinguistics, 25(5), 423–459. https://doi.org/10.1016/j.jneuroling.2011.03.004
Rapp, A. M., Mutschler, D. E., & Erb, M. (2012). Where in the brain is nonliteral language? A coordinate-based meta-analysis of functional magnetic resonance imaging studies. NeuroImage, 63(1), 600–610. https://doi.org/10.1016/j.neuroimage.2012.06.022
Regel, S., Meyer, L., & Gunter, T. C. (2014). Distinguishing neurocognitive processes reflected by P600 effects: Evidence from ERPs and neural oscillations. PLoS One, 9(5), e96840. https://doi.org/10.1371/journal.pone.0096840
Reindal, L., Nærland, T., Weidle, B., Lydersen, S., Andreassen, O. A., & Sund, A. M. (2021). Structural and pragmatic language impairments in children evaluated for autism spectrum disorder (ASD). Journal of Autism and Developmental Disorders, 53, 701–719. https://doi.org/10.1007/s10803-020-04853-1
Reyes-Aguilar, A., Valles-Capetillo, E., & Giordano, M. (2018). A quantitative meta-analysis of neuroimaging studies of pragmatic language comprehension: In search of a universal neural substrate. Neuroscience, 395, 60–88. https://doi.org/10.1016/j.neuroscience.2018.10.043
Romero Lauro, L. J., Mattavelli, G., Papagno, C., & Tettamanti, M. (2013). She runs, the road runs, my mind runs, bad blood runs between us: Literal and figurative motion verbs: An fMRI study. NeuroImage, 83, 361–371. https://doi.org/10.1016/j.neuroimage.2013.06.050
Saul, J., Griffiths, S., & Norbury, C. F. (2022). Prevalence and functional impact of social (pragmatic) communication disorders. Journal of Child Psychology and Psychiatry, 64, 376–387. https://doi.org/10.1111/jcpp.13705
Schumacher, P. B., & Hung, Y.-C. (2012). Positional influences on information packaging: Insights from topological fields in German. Journal of Memory and Language, 67(2), 295–310. https://doi.org/10.1016/j.jml.2012.05.006
Shibata, M., Toyomura, A., Itoh, H., & Abe, J. (2010). Neural substrates of irony comprehension: A functional MRI study. Brain Research, 1308, 114–123. https://doi.org/10.1016/j.brainres.2009.10.030
Snow, P., & Douglas, J. (2017). Psychosocial aspects of pragmatic disorders. In L. Cummings (Ed.), Research in clinical pragmatics (pp. 617–649). Springer. https://doi.org/10.1007/978-3-319-47489-2_23
Spotorno, N., Cheylus, A., Van Der Henst, J.-B., & Noveck, I. A. (2013). What’s behind a P600? Integration operations during irony processing. PLoS One, 8(6), e66839. https://doi.org/10.1371/journal.pone.0066839
Spotorno, N., Koun, E., Prado, J., Van Der Henst, J.-B., & Noveck, I. A. (2012). Neural evidence that utterance-processing entails mentalizing: The case of irony. NeuroImage, 63(1), 25–39. https://doi.org/10.1016/j.neuroimage.2012.06.046
St George, M., Kutas, M., Martinez, A., & Sereno, M. I. (1999). Semantic integration in reading: Engagement of the right hemisphere during discourse processing. Brain, 122(7), 1317–1325. https://doi.org/10.1093/brain/122.7.1317
Stemmer, B. (2008). Neuropragmatics. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 61–78). Blackwell Publishing. https://doi.org/10.1002/9781444301007.ch4
Togher, L., McDonald, S., Coelho, C. A., & Byom, L. (2013). Cognitive communication disability following TBI: Examining discourse, pragmatics, behaviour and executive function. In S. McDonald, L. Togher, & C. Code (Eds.), Social and communication disorders following traumatic brain injury (2nd ed., pp. 89–118). Psychology Press. https://doi.org/10.4324/9780203557198-4
Tomasello, R., Grisoni, L., Boux, I., Sammler, D., & Pulvermüller, F. (2022). Instantaneous neural processing of communicative functions conveyed by speech prosody. Cerebral Cortex, 32(21), 4885–4901. https://doi.org/10.1093/cercor/bhab522
Turkstra, L. S., Clark, A., Burgess, S., Hengst, J. A., Wertheimer, J. C., & Paul, D. (2017). Pragmatic communication abilities in children and adults: Implications for rehabilitation professionals. Disability and Rehabilitation, 39(18), 1872–1885. https://doi.org/10.1080/09638288.2016.1212113
Van Berkum, J. J. A., van den Brink, D., Tesink, C. M. J. Y., Kos, M., & Hagoort, P. (2008). The neural integration of speaker and message. Journal of Cognitive Neuroscience, 20(4), 580–591. https://doi.org/10.1162/jocn.2008.20054
Volden, J. (2017). Autism spectrum disorder. In L. Cummings (Ed.), Research in clinical pragmatics (pp. 59–83). Springer. https://doi.org/10.1007/978-3-319-47489-2_3
Vrticka, P., Black, J. M., & Reiss, A. L. (2013). The neural basis of humour processing. Nature Reviews Neuroscience, 14(12), 860–868. https://doi.org/10.1038/nrn3566
Weiland, H., Bambini, V., & Schumacher, P. B. (2014). The role of literal meaning in figurative language comprehension: Evidence from masked priming ERP. Frontiers in Human Neuroscience, 8, 583. https://doi.org/10.3389/fnhum.2014.00583
Weintraub, S., Mesulam, M.-M., & Kramer, L. (1981). Disturbances in prosody: A right-hemisphere contribution to language. Archives of Neurology, 38(12), 742. https://doi.org/10.1001/archneur.1981.00510120042004
Winner, E., & Gardner, H. (1977). The comprehension of metaphor in brain-damaged patients. Brain, 100(4), 717–729. https://doi.org/10.1093/brain/100.4.717
Yang, J., & Shu, H. (2016). Involvement of the motor system in comprehension of non-literal action language: A meta-analysis study. Brain Topography, 29(1), 94–107. https://doi.org/10.1007/s10548-015-0427-5
5 Pragmatic Impairment as an Emergent Phenomenon

MICHAEL R. PERKINS AND JAMIE H. AZIOS

5.1 Introduction

Transcript 1 is an extract from a conversation between John, a child with autistic spectrum disorder (ASD), aged 4;11, and Kate, a speech and language therapist. They are looking at pictures of different kinds of fruit.

Transcript 1

1 Kate: Could you eat that? [indicating picture of oranges]
2 John: No.
3 Kate: Why’s that?
4 John: Because the orange is hurting me.
5 Kate: How does it hurt you?
6 John: He won’t eat it.
7 Kate: You don’t eat oranges?
8 John: No.
9 Kate: Why not, John?
10 John: Because silly.
11 Kate: Why are they silly?
12 John: An orange.
Most of John’s contributions to the conversation don’t seem to connect well with what Kate says. One might describe them as inappropriate, irrelevant or just plain odd. Assuming that such exchanges are typical of John, would we be justified in describing his conversation as showing evidence of pragmatic impairment? If we analyze his utterances using certain categories derived from pragmatic theory the answer would appear to be “yes.” For example, John’s contributions are not particularly “cooperative” in the sense of Grice’s cooperative principle (see Ahlsén, 2011). More specifically, according to Grice’s theory of conversational implicature, John’s responses in lines 6 and 12 – from an outside observer’s viewpoint, though not necessarily from John’s – could be seen as breaking the maxim of relevance (i.e. they appear to have little to do with Kate’s preceding questions) and if his responses in lines 2 and 4 are indeed untrue, they break the maxim of quality. It is not clear, though, whether these “floutings” of the maxims are intended to trigger implicatures, and if so, what they might be. Other features of John’s conversation may be described using Speech Act
For example, John’s responses in lines 6 and 12 could be taken as evidence of a lack of “illocutionary uptake”; that is, as Blank et al. (1979) put it when describing a similar child, he seems to find it hard to “interpret the … intent of others” (p. 351). Kate likewise appears to find some of John’s utterances hard to interpret – for example, in line 7 she tries to get John to verify whether a re-explicated version of his preceding utterance is in fact what he meant. In terms of Relevance Theory (see Ifantidou & Wharton, Chapter 3 in this volume), this could be construed as both Kate and John having to make a significant commitment in terms of processing effort, with relatively little to show for it by way of “contextual effects”, including enhanced mutual understanding. The concepts and categories provided by pragmatic theory thus provide us with a ready means of describing atypical communicative behavior. John would also be labeled as pragmatically impaired according to various formal assessment procedures. For example, to take just two items from Bishop’s Children’s Communication Checklist (2003), John “uses terms like ‘he’ or ‘it’ without making it clear what he is talking about” (cf. line 6) and “it is sometimes hard to make sense of what he is saying because it seems illogical or disconnected”. Likewise, according to Penn’s Profile of Communicative Appropriateness (1985), John’s conversation might be described as manifesting inappropriate “reference”, “idea sequencing” and “topic adherence”. These ways of characterizing pragmatic impairment are common in clinical practice and research and have given rise to a wide range of clinical pragmatic tests, assessments and checklists. However, while providing a useful means of describing anomalous communicative behavior, most tests are less successful at explaining such behavior in a way that provides clinicians with clear targets for intervention. For example, a lack of illocutionary uptake could be an indirect consequence of a range of factors, including difficulties with inferential reasoning, a syntactic parsing problem, an attention deficit, problems with short-term verbal memory or impaired auditory processing. Thus, labeling the behavioral symptom is only a first step; the likely underlying cause also needs to be identified. Furthermore, it is unlikely that pragmatic impairment is a unitary condition with a single underlying cause (Perkins, 2014). Separate tests that purport to capture the same pragmatic impairments do not routinely or accurately identify the same children as pragmatically impaired (Volden & Phillips, 2010). Additionally, discrete behavioral symptoms linked to pragmatic impairment have been noted across various communication disabilities such as aphasia, TBI, and autism, but the underlying cause of the outward symptomatology varies across conditions. For example, difficulties in topic initiation and maintenance have been attributed to linguistic processing deficits in aphasia (e.g. Leaman et al., 2022), impaired social perception or executive dysfunction in TBI (e.g. Byom & Turkstra, 2012), and deficits in social-emotional reciprocity in autism (Friedman et al., 2019). As pointed out by Perkins (2014), such a wide range of conditions makes a single etiology for pragmatic impairment impossible.
In this chapter we outline a perspective antithetical to the reductionist approach alluded to above, which remains the predominant view of pragmatic impairment in medical and clinical models. We suggest that pragmatic impairment be regarded as an “emergent” phenomenon. That is to say, rather than treating pragmatics as a discrete component of communicative processing like syntax, phonology or lexis, the emergentist perspective views it as an indirect, or “epiphenomenal,” consequence of the way such components are used and interact. Furthermore, rather than viewing pragmatic competence as being solely to do with language use, the “emergentist” approach regards it as resulting from the interaction of multiple factors including language, cognition, and more besides. The emergentist account of pragmatic ability and disability has its roots in the “interactionist” approach pioneered by
Elizabeth Bates, Carol Prutting, and Claire Penn, among others (see, for example, Gallagher, 1991; Penn, 1999). The version presented here, which has been developed over the last decade or so (e.g. Perkins, 1998, 2005, 2007), draws in addition on insights from cognitive science (e.g. A. Clark, 1997), social psychology (H. H. Clark, 1996) and conversation analysis (Wilkinson, Chapter 6 in this volume).
5.2 An Emergentist Model of Pragmatic Ability and Disability

John’s pragmatic problems as illustrated in the transcript above stem at least partly from being unable to work out others’ states of mind, including their intentions, feelings and knowledge. For meaning that is linguistically encoded, this may not pose much of a problem. However, any meaning which is left unsaid, on the assumption that the hearer will be able to infer it, is bound to be problematic in cases where there is inadequate access to others’ mental states. An inability to “read” others’ minds in this way is commonly described as having an impaired “theory of mind” (ToM) (see Bambini et al., Chapter 4 in this volume) – that is, a cognitive deficit – and the link between ToM competence and pragmatic impairment is now generally accepted in research on ASD, right-hemisphere brain damage (RHD) and traumatic brain injury (TBI) (e.g., Camia et al., 2022), although this view has been challenged as theoretically inadequate by several authors (e.g., Andrés-Roqueta & Katsos, 2017; Kissine, 2016). ToM, though, is not the only aspect of cognition to contribute to pragmatic ability. From a speaker’s perspective, pragmatics may be seen as getting the balance right between what is said and what may reasonably be left to be inferred, and the hearer’s role is to work this out. This interpersonal balancing act is dependent not only on ToM but also on the capacity to encode and decode what is expressed linguistically. If a speaker has a language-encoding problem, the hearer may be left with a difficult or even impossible inferential task. For example, people with aphasia are often unable to encode enough detail in their conversational turns because of problems with lexical retrieval and/or grammatical formulation, leaving interlocutors very little to go on when trying to infer intended meaning (Perkins, 2014). Transcript 2, adapted from Beeke et al. (2020), is a conversation involving Doris, a 75-year-old woman with Wernicke’s aphasia who has problems with lexical retrieval. Specifically, Doris has trouble with person reference, including gender-mismatched pronouns and kinship association terms, and is unable to supply enough linguistic content for a referent to be established. As a result, she cannot encode sufficient and accurate information linguistically to express what she means. The two speakers do have shared knowledge of the persons mentioned in the conversation, but the lexical retrieval problems still create an imbalance between explicit and implicit meanings. Underlined content is emphasized and spoken louder than the words around it.

Transcript 2 (Beeke et al., 2020, p. 945)

Doris: no we’re talking about him
Pam: you’re talking about Harry (long pause) your son
Doris: no not my son…the father…Mariu- maaa maaa…what’s her name
Pam: Margaret
Doris: hm ((nods))
Pam: the mother
Doris: hm ((nods))
In this particular case the underlying problem happens to be one of lexical access, but difficulties with phonology, syntax or prosody have similar consequences for the explicit–implicit balance. If pragmatic competence is seen as effective language use, the ability to make the right encoding choices clearly draws not only on cognitive factors but also on linguistic ability. Linguistic encoding ability can in turn be indirectly affected by motor speech problems, as found in conditions as different as dysarthria, cleft palate and cerebral palsy, where access to phonological, syntactic and semantic form is obscured by poor articulation; the end result in terms of additional inferential processing for the hearer is the same. Linguistic decoding ability also plays a significant role in pragmatic processing. If one is unable to parse incoming utterances in order to arrive at an accurate representation of their propositional content, any additional implicit meaning will be more difficult to access. Language is thus one type of input system which the inferential reasoning system draws on, though it is not the only one relevant to pragmatics. Visual impairment, for example, can affect the detection of irony via facial expression, and young blind children have been shown to perform as poorly as children with autism on ToM tasks (Hobson & Bishop, 2003). Hearing impairment, too, has been shown to have adverse effects on conversational turn-taking and initiation (Pajo & Laakso, 2020). Inferential reasoning also draws on a range of cognitive capacities. ToM plays a particularly important role here, as noted above, but so do other areas of cognition. The conversational extract in Transcript 3, spoken by a man with traumatic brain injury, exhibits sudden topic shifts which leave the hearer unable to work out the links and see the overall coherence of what is being said.

Transcript 3 (Perkins et al., 1995, p. 305)

I have got faults and . my biggest fault is . I do enjoy sport . it’s something that I’ve always done . I’ve done it all my life . I’ve nothing but respect for my mother and father and . my sister . and basically sir . I’ve only come to this conclusion this last two months . and . as far as I’m concerned . my sister doesn’t exist
This appears to be linked to problems with short-term memory – that is, the speaker forgets what he has just been talking about – and with “executive function” – that is, he has problems with planning and monitoring what he is saying. So far, it has been tacitly assumed that the sole way of making meaning explicit is via language, and indeed such an assumption is widespread in both theoretical and clinical linguistics. Semiotic systems such as prosody, gesture, gaze, facial expression and posture are often seen as secondary to spoken language, and even as “pragmatic” insofar as they enable the hearer to infer meaning not expressed linguistically. In recent years, however, a number of research studies have suggested that all of these systems have a certain equivalence, in that they provide alternative ways of making meaning explicit. Furthermore, they appear to function together as a single, mutually dependent and integrated signaling system across which meaning is orchestrated (McNeill, 2000). In the example below, a man with severe aphasia and apraxia (Rudy) who has very little intelligible speech is trying to initiate a topic in conversation with his wife, Lila. She misinterprets Rudy’s intent as “hummingbird” and has just written down that word on a piece of paper in front of them. Rudy indicates that Lila’s guess of hummingbird is wrong and is able to repair meaning using a combination of single words, artifacts, gesture, and prosody. In the transcript below, XXX indicates unintelligible speech and actions are enclosed between double parentheses.
Transcript 4 (adapted from Azios et al., 2022, p. 230)

Rudy: XXX XXX day XXX duh day XXX XXX everyday ((points to “hummingbird”)) different.
Rudy: wayyyyy up ((arm up, moves hand forward)) XXX see it XXX see see
Lila: are you talking about your martin house?
Rudy: yeah

Similarly, Simmons-Mackie and Damico (1996) have shown how individuals with aphasia are able to make use of posture, gesture, repetition and neologisms to signal discourse functions such as turn initiation and termination which would normally be done linguistically. Such a multimodal approach to communication muddies the waters of the explicit/implicit distinction made in traditional pragmatics. In the emergentist approach, on the other hand, it simply leads to the recognition of a wider range of choices which are implicated in decisions about what meanings are to be made explicit. The ability for a speaker to maintain, and for a hearer to work out, the precise relationship between what is explicitly conveyed and what is meant can thus be seen to be dependent on a range of underlying factors, some of which are shown in Table 5.1. The semiotic elements provide alternative ways of representing meaning which may be encoded motorically and decoded sensorily. The various cognitive elements are responsible for what is and is not encoded and decoded, and how, why, when, where and whether these processes take place. Seen in this way, pragmatics is an inherent property of the communicative spectrum as a whole, rather than being exclusively subserved by a single cognitive system, that is, ToM, in conjunction with a single semiotic system, that is, language, as is more commonly assumed to be the case. From a clinical perspective, such an approach has the advantage of allowing a focus on the disparate range of factors which can lead to pragmatic impairment, and thus provides the opportunity to focus on, and treat, underlying causes in addition to behavioral symptoms.

Table 5.1 Some semiotic, cognitive and sensorimotor elements of pragmatics.

Semiotic: Language (phonology, prosody, morphology, syntax, semantics, discourse); Gesture; Gaze; Facial expression; Posture
Cognitive: Inference; Theory of mind; Executive function; Memory; Emotion; Attitude
Motor: Vocal tract; Hands; Arms; Face; Eyes; Body
Sensory: Hearing; Vision

Source: Perkins, 2011.
This permits a detailed typology of different pragmatic impairments, rather than forcing a reliance on a single generic, but uninformative, label such as pragmatic impairment/disability/difficulties (Perkins, 2000). Table 5.2 represents a starting point for such a taxonomy.

Table 5.2 A simple taxonomy of pragmatic impairments. (Type of pragmatic impairment: areas of underlying deficit.)

Cognitive: inference; theory of mind; executive function; memory; emotion and attitude
Linguistic: phonology; morphology; syntax; lexis; prosody; discourse
Non-verbal: gesture; gaze; facial expression; posture
Sensorimotor: auditory perception; visual perception; motor/articulatory ability

Even this, though, is still something of a simplification, as it leaves out a crucial dimension of pragmatic impairment that we have so far not touched upon.
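Purely as an illustration of how such a taxonomy might be operationalized – say, in a simple clinical look-up tool – the sketch below encodes Table 5.2 as a Python mapping and inverts it to retrieve the impairment type for an observed area of deficit. The data structure mirrors the table; the function and its usage are our own hypothetical additions, not part of Perkins’s framework.

```python
# Table 5.2 encoded as a mapping from impairment type to areas of underlying deficit.
TAXONOMY = {
    "cognitive": ["inference", "theory of mind", "executive function",
                  "memory", "emotion and attitude"],
    "linguistic": ["phonology", "morphology", "syntax", "lexis",
                   "prosody", "discourse"],
    "non-verbal": ["gesture", "gaze", "facial expression", "posture"],
    "sensorimotor": ["auditory perception", "visual perception",
                     "motor/articulatory ability"],
}

def impairment_type(deficit: str) -> str:
    """Return the type of pragmatic impairment a given underlying deficit falls under."""
    for kind, deficits in TAXONOMY.items():
        if deficit in deficits:
            return kind
    raise KeyError(f"unknown deficit: {deficit}")

print(impairment_type("theory of mind"))   # -> cognitive
print(impairment_type("prosody"))          # -> linguistic
```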
5.3 Compensatory Adaptation

Most approaches to communication impairment assume a direct link between an underlying linguistic or cognitive deficit and the set of behaviors or symptoms to which it gives rise. So, for example, aphasic agrammatism and specific language impairment (SLI) are often seen as a direct consequence of damage to a grammar “module”. An alternative view is that behavioral symptoms are often only indirectly linked to an underlying deficit, and may in fact result from compensatory adaptation. So, for example, some now see agrammatism as “message simplification on the part of the aphasic speaker in an attempt to prevent computational overload” (Kolk, 1995, p. 294), and SLI as a compensatory adaptation to a procedural memory deficit whereby linguistic rules are learned explicitly via declarative memory, as is typically the case in adult second language learners (Ullman & Pierpont, 2005). In a similar vein, the emergentist account of pragmatic impairment sees pragmatic behavior as resulting from complex interactions and trade-offs between the kinds of elements shown in Table 5.1. An individual is seen as an intrapersonal domain, comprising the sum total of all his or her interacting semiotic, cognitive and sensorimotor capacities. Any malfunctioning capacity will have consequences for the entire intrapersonal domain, and any subsequent
Pragmatic Impairment as an Emergent Phenomenon 61 adaptation will result in a redistribution of resources across the domain as a whole. Problems with phonological encoding, for example, may be offset by more extensive use of gesture, and syntactic comprehension difficulties may lead the hearer to rely more on contextually inferred meaning. Such adaptations and trade-offs are deemed pragmatic if they are motivated by the need to communicate with others. A group of two or more individuals is seen as an interpersonal domain in which the individuals’ capacities interact with those of the other individual(s). The interacting elements are still of the same type – that is, semiotic, cognitive and sensorimotor – but become a shared resource. A deficit within an individual may have interpersonal consequences, and any resulting adaptations will have an impact on the explicit-implicit meaning balance at an interpersonal level. This could lead, for example, to attitudinal and emotional meaning being encoded via facial expression rather than linguistically, and being decoded visually rather than auditorily. Some examples are provided in Table 5.3. To summarize: when we describe pragmatic ability and disability as emergent, we mean that pragmatics is not a discrete entity but the complex outcome of many interacting variables. When we communicate with others, we draw on a range of capacities including (1) signaling systems such as language, gesture and facial expression, (2) cognitive systems such as theory of mind, inference and memory, (3) motor output systems such as the vocal tract and hand movement and (4) sensory input systems such as hearing and vision. All of these “elements” exist within the individual, that is, they constitute an intrapersonal domain, but during communication they combine with those of other individuals to form an interpersonal domain. Interpersonal communication involves many choices: for example, which meanings are explicitly encoded, and which left implicit, which signaling systems are used, and which meanings are most salient and relevant. The exercise of such choices requires multiple interactions between the various underlying semiotic, cognitive and sensorimotor capacities both within and between individuals. Intrapersonal and interpersonal domains are dynamic systems whose integrity and equilibrium are maintained via a continuous
Table 5.3 Examples of interpersonal compensation for expressive and receptive communication impairments.

Impairment of expressive resources (with compensation by interlocutor):
• Semiotic, e.g. syntactic formulation problems: greater reliance on inference based on contextual clues and shared knowledge
• Cognitive, e.g. attention deficit: greater reliance on gesture, eye contact, linguistic repetition
• Sensorimotor, e.g. dysarthria, dyspraxia: repetition of what hearer thinks has been said for verification by speaker

Impairment of receptive resources (with compensation by interlocutor):
• Semiotic, e.g. poor parsing, word recognition: simplified syntax, use of gesture and visual clues
• Cognitive, e.g. poor short-term memory: frequent linguistic recapitulation and use of visual reminders
• Sensorimotor, e.g. hearing impairment: greater reliance on gesture, exaggerated articulation and other visual clues

Source: Perkins, 2011.
process of compensatory adaptation. The effect of this is most plainly seen when one or more individual elements malfunction and create an imbalance within the system as a whole.
5.4 Clinical and Theoretical Implications of an Emergentist Model of Pragmatics
Because of its holistic perspective, the emergentist account of pragmatics is much broader in scope than other approaches which focus on a single component of pragmatic processing such as intention, inference or ToM, and is effectively co-extensive with the entire spectrum of interpersonal communication. This does not mean, though, that specificity and rigor are sacrificed for comprehensiveness. Admittedly, labels such as "pragmatic impairment" and "pragmatic disability" are too vague to have much diagnostic value. For example, Prutting and Kirchner's Pragmatic Protocol (1987) includes items as disparate as variety of speech acts, topic maintenance, repair, pause time and feedback to speakers. Likewise, pragmatic impairment has been seen as an inherent property of a similarly disparate range of unrelated communication disorders including aphasia, Asperger's syndrome, autism, dementia, developmental language disorder, hearing impairment, visual impairment and schizophrenia (Perkins, 2003). However, by focusing on the entire range of underlying factors that determine the balance between explicit and implicit meaning, the emergentist approach is able to identify the different pragmatic consequences of all these conditions in terms of both their underlying causes and their communicative effects. Furthermore, in so doing it provides an explanation of the condition rather than just describing it, and makes it possible to direct intervention at causes rather than just symptoms. It is rarely the case, though, that anomalous behavior maps directly onto a single underlying cognitive, linguistic or sensorimotor deficit. It is quite common to find behavioral symptoms resulting from attempts to compensate for a deficit elsewhere in the intrapersonal domain. So, for example, Tarling et al. (2006) found that a child with Williams syndrome was able to partially mask syntactic formulation and lexical retrieval difficulties by effecting smooth and well-timed turn transitions and topic changes to give the overall impression of being an attentive and effective conversational partner. By viewing individuals and groups of individuals as dynamic organisms comprising complex interactions of cognitive, linguistic and sensorimotor processes, the emergentist approach moves away from the single deficit model of pathology and sees all communication disorders as potentially complex. An emergentist approach to pragmatic impairment may also reveal that behaviors commonly referred to as "disabling" and taken as evidence of impairment might actually represent "an interactional solution to competing demands on compromised communicative resources" (Perkins, 2014, p. 141). That is, the same behaviors seen as weaknesses in an individual with a pragmatic impairment may be viewed as strengths. This strength-based perspective is only revealed through careful analysis of the interactional domain, where all communicative resources, whether semiotic, cognitive, or sensorimotor, are examined in relation to the interpersonal balancing act between speaker and hearer. For example, using systemic functional linguistics, Keegan and McAdam (2016) demonstrate that swearing, a behavior overwhelmingly attributed to lack of control and increased automaticity, can function as a linguistic strength and an integral part of identity in a man with traumatic brain injury (TBI).
Conversation analysis-based studies have revealed similar findings for atypical, pragmatically unusual behaviors observed in persons with aphasia (Azios & Simmons-Mackie, 2022; Beeke et al., 2003; Simmons-Mackie & Damico, 1996), TBI (Azios & Archer, 2018; Denman & Wilkinson, 2011), and right hemisphere damage (RHD) (Barnes et al., 2019; Barnes & Armstrong, 2010). For example, Azios and Archer (2018) demonstrate how a young
man with a TBI uses singing during therapy sessions to negotiate topics of interest and to disalign from or disagree with his clinician during potentially face-threatening conversations about his disability. While on a surface level the singing appears unusual and perseverative, and therefore pragmatically inappropriate, the authors' analysis demonstrates that the behavior is skillfully used to move the interaction from a focus on the negative facets of his identity to a focus on more positive characteristics (i.e. his knowledge of music and relative skill as a singer). As pointed out by Damico and Nelson (2005), even when individuals do not have culturally appropriate or effective semiotic mediational capacities, they still have perceptual, social, and cognitive needs to create meaning relative to their world of social experience. Therefore, a person with a deficient semiotic system does the best they can to express themselves, even if these outward signs may be perceived by others as problematic or inappropriate. Thus, it is important for professionals involved in clinical aspects of care to conceptualize these external behaviors as a systematic but epiphenomenal response to the competing demands experienced by the individual with pragmatic impairment. Only then can interventions be directed at the underlying cause of the seemingly inappropriate behavior rather than at extinguishing the outward symptom. In a study of touching another person in everyday social interaction, Denman and Wilkinson (2011) demonstrated the strategic use of touching behavior by a man with TBI while interacting with his female carer. Touching a member of the opposite sex is viewed as a common type of inappropriate behavior and one that is often present following TBI (Simpson et al., 2013). However, the analysis here revealed that touching was not the random, impulsive act typically described in the TBI literature. Instead, touching was used when the man with the TBI was expected to provide a specific answer or response (especially if the question asked was face-threatening) or as a way of declining an offer after earlier attempts were ignored. Thus, in all of the touching examples, the female carer's behavior precipitated the touching behavior. These data have important implications for clinical management, as carers may require training to recognize the underlying cause of this seemingly inappropriate behavior and adjust their own actions accordingly. Models of typical and atypical pragmatic functioning tend to focus either on the capacities of the individual (e.g. ToM) with minimal reference to properties of the interaction in which the individual is a participant, or else on the interaction itself with little account being taken of the participants' underlying cognitive and linguistic capacities. The emergentist model, on the other hand, sees the intrapersonal and interpersonal domains as working in synergy, as is found in dynamic models of shared cognition (e.g. Clark, 1997) and joint action (e.g. Clark, 1996). These models have also been used to explain the nature of communication in aphasia groups and the alterations of the physical environment that both facilitators and participants with aphasia manipulate to ensure understanding between group members (Archer et al., 2018, 2019, 2021).
In a case study of Peter, a child with an original diagnosis of SLI, Perkins (2007) showed that a range of anomalous communicative behaviors could only be properly understood when seen simultaneously from the perspective of the individual and that of the communicating dyad. Some of Peter’s conversational problems were easily describable in traditional pragmatic terms – for example, referential inadequacy, lack of coherence, poor topic introduction and maintenance, being unclear (Grice’s maxim of manner), saying too little or too much (Grice’s maxim of quantity), and not always making clear the illocutionary force of his utterances. In other areas, though, Peter was clearly pragmatically skilled – for example, his use of conversational repair, gaze, prosody and gesture to manage turn-taking effectively and to coordinate his own behavior with that of the interlocutor. A single diagnostic
term such as "pragmatic impairment" is therefore clearly neither adequate nor sufficiently specific. Some of these behaviors were linked to problems with lexical retrieval and syntactic formulation, that is, a language encoding problem, which meant that his meaning was often insufficiently explicit. However, his language performance was also very variable. For example, lexical access improved when he was able to keep his syntax simple, and syntactically complex sentences were possible provided he used pro-forms such as "it" and "there" instead of more semantically specified forms. (These compensatory adaptations were in fact only part of a more complex picture, being linked to underlying difficulties with auditory verbal memory (i.e. remembering what he had already said, and what others had said to him) and auditory selective attention (i.e. being able to process language against background noise).) We can describe such trade-offs in intrapersonal terms (i.e. as interactions within and between Peter's linguistic, cognitive, and sensorimotor systems) but they are also clearly interpersonally motivated. In addition, some compensatory adaptations were exclusively interpersonal. For example, Peter would sometimes formulate a proposition gradually and incrementally across several conversational turns, and require evidence of understanding from his interlocutor after each increment before continuing. A simple example is shown in Transcript 5.

Transcript 5
1 Peter: you know the tickets?
2 Sara: yeah
3 Peter: they tell you where to go

Instead of producing the single sentence "the tickets tell you where to go" in one turn, the subject noun phrase is specified first, and then subsequently substituted by a pronoun which reduces the processing load. Syntactic formulation across turns in this way is only possible with appropriate input from the interlocutor, making it effectively a joint activity. A further example of interpersonal adaptation is the use of eye gaze by Peter to indicate when he requires assistance from his conversational partner to find a word. Peter's word searches can sometimes take many seconds, and he pauses frequently. Although conversational pauses are often treated by interlocutors as a place where they may take a turn, this only happens in Peter's case when in addition he re-establishes eye contact. In Transcript 6, there is a gap between "on" and "a ship" of about 2 seconds containing both filled and unfilled pauses (underlined in the transcript). During this time, Peter's gaze is averted, and his interlocutor does nothing to help him.

Transcript 6 (°hh = in-breath; (0.1) and (1.0) = length of pause in seconds)
Peter: know when it was a wa °hh we went on erm (0.1) [tuts] (1.0) a ship

On occasions when eye contact is re-established before Peter retrieves the word, on the other hand, the interlocutor either facilitates retrieval, for example by suggesting possible targets, or else produces the word herself. Lexical retrieval in conversations with Peter is therefore also a joint activity. It is only when the complex interplay between individual elements such as syntax, lexis, memory, attention and auditory processing is seen within intrapersonal and interpersonal domains simultaneously that we are able to grasp the systematicity in Peter's variable conversational performance. His communicative strengths and weaknesses, which are superficially
captured by pragmatic labels such as "self-repair" and "semantic underspecification," turn out to be the emergent tip of a complex iceberg. It is only through understanding this mesh of underlying variables that effectively targeted treatment becomes possible. In addition to its clinical relevance, the emergentist model also has implications for mainstream pragmatics and pragmatic theory. The underlying complexity of pragmatic impairment as illustrated above suggests that the study of normal pragmatic functioning might benefit from extending its scope and allocating a more central role to non-linguistic semiotic systems such as gesture, eye gaze and facial expression, to cognitive systems in addition to ToM, and to motor output and sensory input systems. Most work in pragmatics focuses exclusively on the use of language, and it is often assumed that linguistic pragmatics is all that there is. Likewise, the contribution of cognition to pragmatics is rarely seen as extending beyond inferential reasoning, and ToM in particular. As noted above, however, a typology of pragmatic abilities based on a comprehensive range of contributory factors offers a principled means of capturing both the breadth and the detail of pragmatics without being open to the charge, sometimes leveled at the discipline as a whole, of being nothing more than "a range of loosely related research programmes" (Sperber & Wilson, 2005, p. 468). The way in which language and other semiotic devices appear to work together as a single composite signaling system suggests that the notion of explicitness, normally seen as an exclusive property of language, could be usefully re-examined. Interestingly, this takes us back to Morris's original conception of pragmatics as "the study of the relation of signs [i.e. not exclusively linguistic signs] to interpreters" (Morris, 1938, p. 6). Finally, by seeing pragmatics as a fusion of intrapersonal and interpersonal domains, the emergentist program provides a framework for reconciling purely cognitively based approaches to pragmatics such as relevance theory (Ifantidou & Wharton, Chapter 3 in this volume) with purely ethnographic approaches such as conversation analysis (Wilkinson, Chapter 6 in this volume), which excludes any reference to cognitive states except insofar as they are indirectly reflected in empirically observable behaviors.
REFERENCES
Ahlsén, E. (2011). Conversational implicature and communication impairment. In M. J. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (1st ed., pp. 32–48). Wiley. Andrés-Roqueta, C., & Katsos, N. (2017). The contribution of grammar, vocabulary and theory of mind in pragmatic language competence in children with autistic spectrum disorders. Frontiers in Psychology, 8, 996. Archer, B., Azios, J. H., Gulick, N., & Tetnowski, J. (2021). Facilitating participation in conversation groups for aphasia. Aphasiology, 35(6), 764–782. Archer, B., Azios, J. H., Tetnowski, J., Damico, J., Freer, J., Schmadeke, S., & Christou-Franklin, E. (2019). Key wording practices in three aphasia conversation groups: A preliminary study. Aphasiology, 33(10), 1248–1269.
Archer, B., Tetnowski, J., Freer, J. C., Schmadeke, S., & Christou-Franklin, E. (2018). Topic selection sequences in aphasia conversation groups. Aphasiology, 32(4), 394–416. Azios, J. H., & Archer, B. (2018). Singing behaviour in a client with traumatic brain injury: A conversation analysis investigation. Aphasiology, 32(8), 944–966. Azios, J. H., Archer, B., & Lee, J. B. (2022). Understanding mechanisms of change after conversation-focused therapy in aphasia: A conversation analysis investigation. Journal of Interactional Research in Communication Disorders, 13(2), 220–243. Azios, J. H., & Simmons-Mackie, N. (2022). Clinical application of conversation analysis in aphasia. In C. Coelho, L. Cherney, & B. Shadden (Eds.), Discourse analysis in adults with and without communication disorders: A resource for
clinicians and researchers (pp. 109–130). Plural Publishing, Inc. Barnes, S., & Armstrong, E. (2010). Conversation after right hemisphere brain damage: Motivations for applying conversation analysis. Clinical Linguistics & Phonetics, 24(1), 55–69. Barnes, S., Toocaram, S., Nickels, L., Beeke, S., Best, W., & Bloch, S. (2019). Everyday conversation after right hemisphere damage: A methodological demonstration and some preliminary findings. Journal of Neurolinguistics, 52, 100850. Beeke, S., Wilkinson, R., & Maxim, J. (2003). Exploring aphasic grammar 1: A single case analysis of conversation. Clinical Linguistics & Phonetics, 17(2), 81–107. Beeke, S., Capindale, S., & Cockayne, L. (2020). Correction and turn completion as collaborative repair strategies in conversations following Wernicke's aphasia. Clinical Linguistics & Phonetics, 34(10–11), 933–953. Bishop, D. V. M. (2003). The children's communication checklist, version 2 (CCC-2). Psychological Corporation. Blank, M., Gessner, M., & Esposito, A. (1979). Language without communication: A case study. Journal of Child Language, 6(2), 329–352. Byom, L. J., & Turkstra, L. (2012). Effects of social cognitive demand on theory of mind in conversations of adults with traumatic brain injury. International Journal of Language & Communication Disorders, 47(3), 310–321. Camia, M., Benassi, E., Giovagnoli, S., & Scorza, M. (2022). Specific learning disorders in young adults: Investigating pragmatic abilities and their relationship with theory of mind, executive functions and quality of life. Research in Developmental Disabilities, 126, 104253. Clark, A. (1997). Being there: Putting brain, body, and world together again. MIT Press. Clark, H. H. (1996). Using language. Cambridge University Press. Damico, J. S., & Nelson, R. L. (2005). Interpreting problematic behavior: Systematic compensatory adaptations as emergent phenomena in autism. Clinical Linguistics & Phonetics, 19(5), 405–417. Denman, A., & Wilkinson, R. (2011). Applying conversation analysis to traumatic brain injury: Investigating touching another person in everyday social interaction. Disability and Rehabilitation, 33(3), 243–252. Friedman, L., Sterling, A., DaWalt, L. S., & Mailick, M. R. (2019). Conversational language
is a predictor of vocational independence and friendships in adults with ASD. Journal of Autism and Developmental Disorders, 49(10), 4294–4305. Gallagher, T. M. (Ed.). (1991). Pragmatics of language: Clinical practice issues. Chapman Hall. Hobson, R. P., & Bishop, M. (2003). The pathogenesis of autism: Insights from congenital blindness. Philosophical Transactions of the Royal Society, Series B, 358(1430), 335–344. Keegan, L. C., & McAdam, H. (2016). Swearing after traumatic brain injury: A linguistic analysis. Journal of Interactional Research in Communication Disorders, 7(1), 101. Kissine, M. (2016). Pragmatics as metacognitive control. Frontiers in Psychology, 6, 2057. Kolk, H. (1995). A time-based approach to agrammatic production. Brain and Language, 50(3), 282–303. Leaman, M. C., Archer, B., & Edmonds, L. A. (2022). Toward empowering conversational agency in aphasia: Understanding mechanisms of topic initiation in people with and without aphasia. American Journal of Speech-Language Pathology, 31(1), 322–341. McNeill, D. (Ed.). (2000). Language and gesture. Cambridge University Press. Morris, C. W. (1938). Foundations of the theory of signs. In O. Neurath, R. Carnap, & C. Morris (Eds.), International encyclopedia of unified science (pp. 77–138). University of Chicago Press. Pajo, K., & Laakso, M. (2020). Other-initiation of repair by speakers with mild to severe hearing impairment. Clinical Linguistics & Phonetics, 34(10–11), 998–1017. Penn, C. (1985). The profile of communicative appropriateness. South African Journal of Communication Disorders, 32(1), 18–23. Penn, C. (1999). Pragmatic assessment and therapy for persons with brain damage: What have clinicians gleaned in two decades? Brain and Language, 68(3), 535–552. Perkins, M. R. (1998). Is pragmatics epiphenomenal? Evidence from communication disorders. Journal of Pragmatics, 29(3), 291–311. Perkins, M. R. (2000). The scope of pragmatic disability: A cognitive approach. In N. Müller (Ed.), Pragmatics and clinical applications (pp. 7–28). John Benjamins. Perkins, M. R. (2003). Clinical pragmatics. In J. Verschueren, J.-O. Östman, J. Blommaert, & C. Bulcaen (Eds.), Handbook of pragmatics: 2001 installment (pp. 1–29). John Benjamins.
Perkins, M. R. (2005). Pragmatic ability and disability as emergent phenomena. Clinical Linguistics & Phonetics, 19(5), 367–377. Perkins, M. R. (2007). Pragmatic impairment. Cambridge University Press. Perkins, M. R. (2011). Pragmatic impairment as an emergent phenomenon. In M. J. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (1st ed., pp. 79–91). Wiley. Perkins, M. R. (2014). Pragmatics as interaction. In M. Ball, N. Müller, & R. Nelson (Eds.), Handbook of qualitative research in communication disorders (pp. 131–148). Psychology Press. Perkins, M. R., Body, R., & Parker, M. (1995). Closed head injury: Assessment and remediation of topic bias and repetitiveness. In M. R. Perkins & S. J. Howard (Eds.), Case studies in clinical linguistics (pp. 293–320). Whurr. Prutting, C. A., & Kirchner, D. M. (1987). A clinical appraisal of the pragmatic aspects of language. Journal of Speech and Hearing Disorders, 52(2), 105–119. Simmons-Mackie, N., & Damico, J. (1996). The contribution of discourse markers to communicative competence in aphasia.
American Journal of Speech-Language Pathology, 5(1), 37–43. Simpson, G. K., Sabaz, M., & Daher, M. (2013). Prevalence, clinical features, and correlates of inappropriate sexual behavior after traumatic brain injury: A multicenter study. The Journal of Head Trauma Rehabilitation, 28(3), 202–210. Sperber, D., & Wilson, D. (2005). Pragmatics. In F. Jackson & M. Smith (Eds.), Oxford handbook of contemporary analytic philosophy (pp. 468–501). Oxford University Press. Tarling, K., Perkins, M. R., & Stojanovik, V. (2006). Conversational success in Williams syndrome: Communication in the face of cognitive and linguistic limitations. Clinical Linguistics & Phonetics, 20(7–8), 583–590. Ullman, M. T., & Pierpont, E. I. (2005). Specific Language Impairment is not specific to language: The procedural deficit hypothesis. Cortex, 41(3), 399–433. Volden, J., & Phillips, L. (2010). Measuring pragmatic language in speakers with autism spectrum disorders: Comparing the Children's Communication Checklist–2 and the Test of Pragmatic Language. American Journal of Speech-Language Pathology, 19(3), 204–212.
6 Conversation Analysis and Communication Disorders
RAY WILKINSON
6.1 The Importance of Analysing Conversations Involving People with Communication Disorders
Over the last 30 years or so, there has been a significant amount of research carried out applying Conversation Analysis (CA) within the field of communication disorders. A major attraction of CA for researchers and clinicians working with communication disorders is that it provides a rigorous method for the analysis of naturally occurring interactive talk and non-verbal behavior. This includes various forms of talk-in-interaction, including institutional interaction, such as that between health professionals and clients, and conversation, that is, spontaneous informal dialogic talk, often between consociates, such as friends, colleagues, and family members. In particular, the analysis of conversation involving the person with the communication disorder and one or more significant others, such as a family member or friend, has been a focus of attention in the field, and has led to new insights into the impact of communication disorders on the everyday interactions and social worlds of the individual affected and those they regularly interact with (Wilkinson, 2019). Analysing data of the person with a communication disorder in conversation rather than in institutional interaction is seen as preferable for the purposes of understanding the impact of the communication disorder within everyday social life (Heeschen & Schegloff, 1999). Reasons for this include the fact that if the person with a communication disorder is analyzed in institutional interaction, for example, with a clinician or a researcher, the interactional role that the person with a communication disorder is made to adopt (e.g., as a patient or subject, primarily responding to questions and other initiations from the professional) means that the interaction may not clearly reflect their abilities and difficulties in everyday conversations with family members and friends. In addition, without data of conversations between the person with a communication disorder and their significant others, it is not possible to see how all the parties engage in conversation in the domestic sphere, and how, for example, they may adapt their interactional contributions in response to the impairments of the person with a communication disorder (e.g., Bauer & Kulke, 2004). Why is it so important to be able to analyze conversational data of people with communication disorders? One reason is that conversation constitutes the most common use of spoken language and non-verbal communication in daily life, so it is important that the researcher or clinician has a means of gaining information about it. Second, analysing conversation provides information
not just about the person with the communication disorder's contributions to the interaction, but also about those of their interlocutors, providing insights into how those interlocutors may facilitate (or not) the conversational participation of the person with the communication disorder. Third, engaging in conversation involves certain abilities of the person with a communication disorder which have generally not been assessed directly by communication disorder researchers and practitioners, but which are central to effective communication and participation in everyday social interaction. These include being able to:
• make the conversational contribution understandable, including in relation to the sequential context of prior talk within which it is produced (Robinson, 2016)
• design one's contribution for the particular recipient (or recipients) who is being addressed (i.e., recipient design: Sacks et al., 1974)
• launch one's turn-at-talk (or turn) at the right time (e.g., at a transition relevance place of another participant's talk, such that starting to talk here will not be heard as interruptive: Sacks et al., 1974)
• progress one's turn to its possible end without significant disruptions or delay (i.e., progressivity: Schegloff, 2007)
As such, contributions to conversation are inherently social and temporal phenomena, and an assessment of conversation should be able to reflect this. Typically, however, tools for the assessment of communication disorders have been developed for purposes other than the collection or analysis of conversational data. For example, assessments of speech or language often involve the elicitation and analysis of monologic talk through tasks where the client is asked to produce individual phonemes, name objects, describe pictures, narrate a well-known story or provide a description of how to do a daily task. Other seemingly more communicative and dialogic assessment activities, such as role play, lack the creativity inherent in natural conversation as well as conversation's extended series of back-and-forth exchanges where each contribution builds on, and responds to, the prior. Finally, interviews or questionnaires about communication can elicit quite general formulations about the communicative behavior of a person with a communication disorder or that person's significant other, but will not capture the specific details of the interaction as it unfolds moment-by-moment over time. In contrast to these methodological approaches, CA involves the collection of naturally occurring talk-in-interaction, either in the form of audio-recordings or, in the case of interactions where the participants have visual access to each other, preferably video-recordings. All of the recording, or the relevant parts of it, is transcribed using a system which captures features of the talk (including elements such as repetitions, restarts and silences, which are often left out of broader transcriptions) as well as, where appropriate, non-verbal conduct, such as gesture, eye gaze, and the use of objects (Hepburn & Bolden, 2017). Analysis is based on a combination of the recording and the transcription.
6.2 Conversation Analysis
The origins of CA lie in the work carried out in the 1960s and 1970s by the American sociologist Harvey Sacks and his colleagues Emanuel A. Schegloff and Gail Jefferson. Influenced by the studies of Erving Goffman on face-to-face interaction and, in particular, the sociological movement of Ethnomethodology inspired by the work of Harold Garfinkel, CA emerged as a procedure for analysing everyday talk-in-interaction and the institutionalized structural organizations, or social conventions, which underlie and inform the participants' behavior in any particular interaction (Robinson, 2016).
These conventions have a normative character, and it is by their (largely unconscious) orienting to these conventions that participants in an interaction produce talk and non-verbal behavior which can be seen by recipients to be orderly, coherent, and meaningful. As such, CA starts from the assumption (an assumption borne out by a large body of empirical findings) that naturally occurring talk-in-interaction is an orderly activity which can be analyzed rigorously and in its own right in order to uncover linguistic and other practices. In this regard, it differs markedly from other approaches which have influenced clinical linguistics, including linguistic work in the tradition of Chomsky and cognitive (neuro)psychological approaches to normal and impaired language. In these latter traditions, naturally occurring conversational data is seen as too disorderly to form the starting point for an investigation of language and language use. CA investigations have focused on the procedures by which speakers use their turns-at-talk to produce social actions (questioning, requesting, news-telling etc.) and recipients display an understanding of, and response to, these actions in subsequent turns. Within this analytic perspective, grammar, lexis, and other aspects of language production are investigated in terms of their use as resources for turn/action construction, including how their deployment at a certain point within the turn or series of turns contributes to how that turn will be heard and responded to by its recipient(s). A turn is constructed and interpreted in relation to the point in the interaction at which it is being produced. To understand an utterance, recipients interpret it within the context of its immediately prior utterances, as it is assumed an utterance is constructed in relation to what has immediately preceded it unless the speaker displays otherwise. This use of sequential context as a resource for understanding an utterance can be particularly important for recipients in attempting to understand the utterances of people with communication disorders, since these speakers' linguistic, phonetic or other limitations can regularly make understanding their utterances problematic. When this resource is less available for recipients to draw on, such as when the person with the communication disorder attempts to initiate a new topic, or even a new sequence, recipients may have more difficulty in understanding the utterance. For similar reasons the sequential context provided by the preceding turns can also be a useful resource for people with communication disorders in constructing their turns, since they may, to a greater or lesser extent, be able to compensate for their lack of linguistic resources by designing their utterances to exploit the contextual resources available, in particular the sequential context provided by preceding talk (see Goodwin, 1995). CA research has also highlighted how the temporality and projectability of talk are of central importance to participants in producing and interpreting utterances. For example, there is a preference for progressivity in talk (see Lerner, 1996) such that what has been projected by the talk at this juncture to occur next is expected to be produced at that point. The delay or absence of the projected item(s) at the point due is noticeable and accountable, and can open the speaker up to (often negative) inferences and can result in the production of certain actions by other participants.
For example, an initiating action, such as a question, launches a sequence of actions (Schegloff, 2007), and this action sets up an expectation that a corresponding responsive action, such as an answer, should be produced next by another participant. A delay or absence of the responsive action can trigger inferences about the speaker such as that s/he is unable or unwilling for some reason to answer the question, has not heard the question etc. Similarly, when a speaker is producing a turn, each part of the emerging turn projects, and is heard by the recipient as projecting, how that turn is progressing toward possible completion and thus toward the point where another participant might non-interruptively take over the floor (Sacks et al., 1974). Any delay in producing the next item due, particularly if there is a silence of over one second (Jefferson, 1989), is noticeable as such to recipients. One result can be that recipients may draw inferences as to the reason for the delay (inferring, for example, that the speaker is having difficulty in
accessing or producing the required item). Another result can be that a co-participant takes the opportunity afforded by the delay to enter the turn at that point and take over the floor (Lerner, 1996). The linguistic limitations of people with communication disorders mean that it is often difficult for them to act in accordance with the constraints and expectations involved in talk-in-interaction, in particular the time constraints inherent in certain conventions of conversation such as the preference for progressivity. It can be as a consequence of the inability to produce the required item in the required way at the required time that a speaker's communication impairment may become particularly highlighted and exposed in interaction and take over as the focus of the conversational activity. Indeed, it can be in this way that a speaker may be exposed in everyday interaction as communication impaired (e.g., the block of a person with a stammer which delays or makes him/her unable to produce an item in response to a question). On a more positive note, it has been argued that certain features of the linguistic behavior of people with communication disorders and their interlocutors can be understood as attempts to adapt to the demands of talk-in-interaction in the light of the communication disorder (Heeschen & Schegloff, 1999). The interactional and collaborative nature of naturally occurring talk is shown within CA investigations to be an integral aspect of how it is produced and understood. Recipients of talk, by their next turn response to a speaker's turn, are crucial, for example, in how the content of that turn is registered and taken up and how an understanding of it (or not) is displayed within the interaction. Similarly, the establishment of a new topic within the interaction regularly relies on how a recipient responds to a speaker's attempt to generate the new topic (Schegloff, 2007). Unlike many other approaches to speech and language production which explicitly or implicitly treat spoken language as the product of a single speaker putting his/her thought or intention into verbal form, work within CA has argued that even the output of a single speaker can be shown to be an interactional and collaboratively co-constructed achievement, due to the fact that how a speaker constructs his/her emerging turn can be seen to be affected by various aspects of recipients' behavior (Schegloff, 1982). This co-constructional feature of talk can be particularly important for interactions involving people with communication disorders, since these speakers may often rely on their co-participants to assist, for example, in searching for a word, or in clarifying what the speaker with the communication disorder was trying to say. The organization of repair in talk has been an area of CA which has been particularly drawn upon by those investigating interactions involving people with communication disorders such as aphasia, dysarthria, or hearing impairment. Repair refers to the mechanisms used by participants in dealing with troubles in talk-in-interaction and can be broken down into three parts: the repair initiation, the repair completion, and the trouble source in the talk which is being treated by the participants as engendering the repair and which may or may not be an error (Schegloff et al., 1977).
Both the initiation and completion of repair can be carried out by "self" (the participant whose trouble source is being dealt with) or "other" (a participant other than the one whose trouble is being dealt with), thus giving various repair types such as "self-initiated self-repair" or "other-initiated self-repair". Repair can bring various issues concerning incompetence to the surface of the interaction. This is part of the reason why, if repair is done at all, self-initiated self-repair is the most common form in "typical" talk-in-interaction, since the speaker both initiates and completes the repair him/herself, usually within the same turn. This lessens the disruption caused by the repair to the topic talk which was in progress and avoids the need for others to be involved in solving the speaker's trouble. Repair is usually quick and successful in typical talk (Schegloff et al., 1977), with one or two repair tries usually proving sufficient to deal with the trouble. The analysis of repair is important for clinical linguistics in that it opens up for investigation how interactions involving people with communication disorders may
be disrupted by repair, what distinctive forms the repair can take, and how it may differ in, for example, quantity or length from that seen in typical talk. It also allows investigation of how participants work together in attempting to achieve repair, what methods they use and whether they are successful or not on any particular occasion. Finally, a focus on trouble sources (i.e., what the participants in the interaction treat as being in need of repair) rather than on errors can be useful in relation to the analysis of communication disorders in talk. While the talk of people with communication disorders may contain a significant number of errors, many of these may be "let pass" by the participants as being unproblematic in terms of the business at hand. An analysis of trouble sources, on the other hand, allows insights into what errors or other features of the talk the participants themselves treat as problematic and worthy of remedial action in the interaction. As such, an analysis of trouble sources and repair in general can provide the clinician with particularly useful information when attempting to target therapy at the particular problems the participants are experiencing in everyday life.
6.3 Conversation Analysis of Communication Disorders
A typical CA study in the field of communication disorders focuses on one disorder (e.g., aphasia) and often one subtype of that disorder (e.g., Broca-type aphasia). Commonly, several people with that disorder/sub-type are analyzed in conversation with one or more significant others to uncover recurring features of how that disorder/sub-type impacts on conversation (see, e.g., Bauer & Kulke, 2004; Pajo, 2013). Drawing on more than 50 years of CA studies of "typical" conversations (i.e., those involving persons without communication disorders), a form of comparative analysis (Drew & Heritage, 1992) is used to uncover how these "atypical interactions" (Wilkinson et al., 2020) differ in systematic ways. Differences can be evident in relation to one or more areas of conversational organization, including turns, sequences, the use of actions, repair, and topic. Analysis can not only display how a particular type of disorder can result in conversations which differ in systematic ways from those involving speakers without communication disorders, but can also highlight some broad similarities and differences between different types of communication disorder and how they impact on conversation (Wilkinson, 2019). In this section, I will discuss CA findings about communication disorders in terms of four groupings of disorders: speech and hearing impairments, fluency impairments, language impairments, and cognitive impairments. These groupings are not meant to be seen as rigid, but rather as loose categories which allow some broad similarities and differences across communication disorders to be noted. It should also be acknowledged that while the aim here is to highlight some general patterns of interaction within and between disorders, there can be significant heterogeneity across people with the same communication disorder due to the presence of sub-types of that disorder as well as different levels of severity of the impairments, and, in some types of communication disorder, significant changes over time.
6.3.1 Speech or Hearing Impairments
While dysarthria and hearing impairment are in many ways quite different types of communication disorder (or communication difficulty), CA studies have highlighted some similar recurring features across both forms of atypical interaction. In each case, one way in which the disorder can be seen to impact regularly on conversation is in the form of what Schegloff et al. (1977, p. 367) term other-initiations of repair (Bloch & Wilkinson, 2009; Pajo, 2013). In this
type of sequence, one participant treats a prior turn (most commonly the immediately prior turn) produced by another participant as a source of trouble which that other participant should remedy before an adequate hearing or understanding of the turn can be achieved. The other-initiation of repair turn can highlight some element of the prior turn as the trouble to be repaired (e.g., "you saw who?") or can be in an open-class form (e.g., "pardon?" or "huh?": Drew, 1997). In the case of dysarthria, it is recurrently the recipient of the talk of the person with dysarthria who initiates repair, indexing that they are having some problem with the intelligibility or understandability of that talk. In the case of hearing impairment, it is the person with the impairment who may regularly initiate repair on another speaker's talk, displaying their difficulty in hearing what was said. While among participants without communication disorders one other-initiated repair sequence (i.e., an other-initiation of repair by one participant followed by a self-repair of the trouble source turn by the other) is usually sufficient to resolve the trouble and allow for the onward progress of the current topic, in conversations involving a participant with dysarthria or a hearing impairment it is not uncommon that two or more other-initiated repair sequences are needed to resolve the trouble (Griffiths et al., 2015; Pajo & Laakso, 2020). The relative frequency and possible prolonged nature of these repair sequences is thus one way in which the communicative impairments associated with dysarthria or hearing impairment affect conversation, both in terms of delaying its onward progress while the troubles are resolved, as well as raising potentially delicate issues concerning who is primarily at fault for necessitating the repair (Ekberg et al., 2020). In addition, while a rule of conversation is that in general only one participant speaks at a time (Sacks et al., 1974), there is evidence that in conversations involving people with dysarthria or hearing impairment there may be more frequent overlapping talk and incursions by one participant into another participant's turn (Griffiths et al., 2012). This includes the production of other-initiations of repair, which, unlike in conversations involving participants without communication disorders (Schegloff, 2000), can be produced while the particular part of the turn they are targeting (i.e., the turn-constructional unit (TCU): Sacks et al., 1974), such as a sentence, is still in progress (Griffiths, 2014; Pajo & Laakso, 2020). There is also some evidence that certain types of other-initiation of repair may be more common than others in at least some sub-groups of people with these disorders; open-class forms of other-initiation of repair have been reported to be common in both people with hearing impairment (Pajo, 2013) and people with dysarthria (here, dysarthria secondary to Parkinson's Disease: Griffiths, 2014). In addition, the use of embodied forms of other-initiation of repair (e.g., leaning toward the speaker) may be particularly commonly used by people with severe hearing impairment (Pajo & Laakso, 2020). Finally, CA studies have explored the methods people with dysarthria or hearing impairment may use to adapt their interactional contributions, and thus bypass or mitigate the effect of their impairments within conversation.
At one end of the continuum are conventionalized methods such as the use of sign language (Girard-Groeber, 2020) or augmentative and alternative communication (AAC) devices (Bloch & Wilkinson, 2004; Engelke & Higginbotham, 2013). In addition, more ad hoc methods have been analyzed, such as the co-construction of turns by a person with severe dysarthria and their interlocutor (Bloch, 2005).
6.3.2 Fluency Impairments
A major way in which stammering (or stuttering) impacts on conversation is that it delays or disrupts the progressivity of TCUs, such as sentences, in ways which are not typically seen in the talk of speakers who do not stammer. For example, the person who stammers may have silences of significantly more than one second within their turn (cf. Jefferson, 1989) or may repeat (or block on) the same phoneme several times (Wilkinson & Morris, 2020). This marked
(Robinson, 2016) pattern of turn production makes it noticeable to others, drawing attention to the identity of the speaker as someone with a stammer (or at least with a difficulty in "getting the words out"). In addition, and as is the case in conversation involving speakers without communication disorders (Jefferson, 1983), disruptions to progressivity can be one place in the turn where other participants regularly enter the turn space of the dysfluent speaker and start to talk. Wilkinson and Morris (2020) discuss three such types of turn incursion within their data set of interactions involving a person who stammers phoning a service provider, such as a restaurant. The first is completion of the ongoing TCU of the person who stammers by another participant, which can be done as a query to be confirmed by the person who stammers. The second is an other-initiation of repair, which, as well as being incursive into the trouble source turn, may (as is the case with other communication disorders, such as the speech and hearing impairments noted above, but unlike "typical" conversation: Schegloff, 2000) be incursive into the trouble source TCU. The third is a check by the other participant that the person who stammers is still on the call and has not, for example, been cut off. This can be linked, for instance, to the presence of a long silence in the turn of the person who stammers. In summary, then, there is evidence that the turns-at-talk of people who stammer may be particularly vulnerable and "permeable" (Lerner, 1996) to the talk of other participants. At the same time, CA research has identified some methods that people who stammer and those they interact with may use to mitigate some of the interactional issues linked to stammering. For example, Tetnowski and Damico (2001) describe the interactional behavior of a man with a moderate stammer who displayed a pattern of regularly shifting his gaze from his co-participant at the point where he was dysfluent. The authors suggest this behavior may be an interactional method which assists him in keeping the turn. Interactional achievements such as maintaining the turn despite dysfluency may also involve the conversation partner. Tetnowski and Damico (2001) also describe a pattern in the interaction of another dyad where, when the person with a stammer was dysfluent, the conversation partner regularly responded with an acknowledgement in the form of a vocalization such as "mmhm" and/or a head nod. Tetnowski and Damico (2001) note that this behavior by the co-participant is hearable as an encouragement to the dysfluent speaker to continue with the turn, as well as implicitly displaying to the dysfluent speaker that the co-participant is not going to attempt to take over the turn at that point. Wilkinson and Morris (2020) analyze another method used by people who stammer when in interaction with someone who is not familiar with them, that is, announcing, usually near the start of the conversation, that they are someone who stammers (a practice sometimes referred to as "advertising"). This self-presentation may be accompanied by an explicit stating of a consequence of this social identity, that is, that the interlocutor will have to be "patient" or allow more time for the person who stammers to talk.
This method is interesting in that it is one place where people with a communication disorder can be seen to be attempting to alter the conventions of conversation, which are typically implicit and followed by people without much conscious thought or discussion. By announcing that they stammer, the speaker can be heard as implicitly requesting that the interlocutor alter their usual expectations of how the timeline of the interaction will proceed (since the person who stammers may need more time than a typical speaker) and possibly alter their conduct accordingly.
6.3.3 Language Impairments
Impairments which impact on areas of language such as lexis and grammar can be part of acquired disorders such as aphasia or developmental disorders such as developmental language disorder. CA studies have so far focused more on the former than the latter, and the impact of children's language impairments on interaction is thus an area for further investigation (though see, for example, Tykkyläinen, 2010).
Conversations involving a speaker with a language impairment regularly display patterns of repair which differ from those involving participants without a communication disorder. For example, self-initiations of repair (Schegloff et al., 1977) can be more frequent, due to word searches (Helasvuo et al., 2004) or attempts to deal with lexical errors. An inability to produce a sought-for word can result in repair attempts which are prolonged (Wilkinson, 2007). Regularly, the preferred outcome of self-initiated self-repair (Schegloff et al., 1977) will not be achieved, and another participant may enter the self-initiated repair attempt and complete it (Laakso & Godt, 2016). Other-initiations of repair can also be frequent, particularly in the case of certain sub-types of aphasia, such as Broca-type aphasia, where a recipient may regularly use an other-initiation of repair to display a difficulty in understanding the person with aphasia (Wilkinson, 2019; see also Goodwin, 1995). The result is that conversations involving people with aphasia may contain noticeably more repair activity than is present in conversations involving speakers without a communication disorder. One consequence of this is that these conversations can display delayed progressivity both within turns (due to self-initiations of repair) and at the level of sequences (with other-initiations of repair delaying the turn which should have been produced next). Displays of emotion, such as frustration, by the person with aphasia are not uncommon as methods for dealing with these manifestations of linguistic incompetence and the resulting threats to face experienced by the person with aphasia (Laakso, 2014). CA studies have also highlighted ways in which participants adapt typical methods of forming contributions, such as methods of turn construction, in order to lessen the impact of linguistic impairments (often in the form of repair) on the conversation. For example, speakers with aphasia may develop idiosyncratic forms of constructing turns, such as relying on direct reported speech and embodied communication in order to depict, or act out, an event rather than attempting to verbally describe it in conventional ways (Wilkinson et al., 2010). Heeschen and Schegloff (1999) argue that telegraphic speech, a feature of some speakers with Broca-type aphasia, may best be seen as a form of adapted turn construction rather than (as is traditionally the case) as a direct result of a linguistic impairment. They show how this form of turn construction can have the consequence of mobilizing the participation of the interlocutor, such that the contribution that the person with aphasia may have struggled to produce alone becomes co-constructed by both participants. Other ways in which interlocutors may adapt the way they talk in conversation include adopting a pattern of asking yes-no questions to people with aphasia whose verbal output is very limited (Goodwin, 1995). They may also deploy types of verbal social actions, such as test questions (i.e., where a speaker asks a question to which they already know the answer: Bauer & Kulke, 2004), which are not normally used among familiars in conversation but are instead more common in the talk of those in a position of power or authority, such as teachers or parents of young children.
6.3.4 Cognitive Impairments
There is a wide, and heterogeneous, range of disorders where there is typically evidence of cognitive impairments (such as deficits in memory, attention or executive functions) which can impact on communication and social interaction. These disorders include dementia, autism, traumatic brain injury, and learning disability, among others. While the talk of people with these disorders may display the impact of one or more of the impairments discussed above, such as speech or language impairments, one common feature of their talk or other conduct is the atypical use of a verbal social action, such as a question (Wilkinson, 2019). That is, the action may be linguistically well-formed in terms of, for example, lexis and grammar, but the use of that action may be in some way unusual and not in line with the usual norms governing the use of social actions within interaction (Robinson, 2016).
For example, in conversations involving speakers with traumatic brain injury (TBI), the speaker may engage in perseveration in the form of telling the recipient something that that speaker has in fact already told that recipient (Frankel & Penn, 2007). This violates a norm of social interaction whereby a speaker should not inform a recipient about some state of affairs where they have grounds to believe that recipient already has this information (Stivers et al., 2011). Similarly, in the case of conversations involving a speaker with dementia, it has been shown how the speaker with dementia may ask a question to which they have already been given the answer (Kindell et al., in press). In cases such as these, the recipient may display through their talk or conduct that they are treating these actions by the person with the communication disorder as in some way inappropriate (for example, in the latter case, by reminding the person with dementia that they were just recently given the information they have just asked for: Kindell et al., in press). This atypical use of actions may extend to other levels of conversation, such as recurrently returning to the same topic (as seen in people with TBI: Body & Parker, 2005). Non-verbal behavior too may take forms which can be seen to be inappropriate. For example, Denman and Wilkinson (2011) describe a man with TBI who engaged in regular touching of his female carer, a behavior which the carer treated as inappropriate through, for example, removing his hand or verbally reminding him of appropriate behavior concerning touching. In some cases where the impairments are more severe, the usual sense of a shared social world, and the notion that the participants are each drawing on a shared set of social norms, may be threatened. In these cases, there may be a dilemma for the recipient regarding how to react. This has been discussed, for example, in relation to people with dementia who engage in confabulations, that is, producing statements while being unaware of their falsity (Lindholm, 2015). Somewhat similarly, Wootton (1999) discusses an 11-year-old boy with severe autism who engaged in delayed echoing, that is, producing echoes that appear to have a basis not in some recent talk or event but in some previous occasion. These echoes appeared uncommunicative in that they did not seem fitted to what had just recently been said or done in the interaction. In this situation, where it was not clear what interactional function, if any, the echo was hearable as having at this point in the interaction, one recurrent reaction from the co-participant was not to respond to it, thus in effect treating it interactionally as if it had never happened.
6.4 Conclusion

Over the last 30 years or so, CA has been applied to a wide range of communication disorders. A recent edited collection (Wilkinson et al., 2020) provides an introductory overview and a series of empirical studies which cover a range of communication disorders involving people with cognitive, linguistic, fluency, speech, or hearing impairments.

In terms of traditional clinical linguistic concerns, it can be argued that a limitation of a CA approach is that it does not provide an account of the underlying causes of communication disorders and their “symptoms”. A CA approach could therefore be viewed as providing complementary information to that provided by, for example, psycholinguistic or neurolinguistic approaches. However, another perspective is to view CA as providing an alternative, interactional, approach within the field of communication disorders, a field which has traditionally been dominated by individualistic approaches which take the individual’s brain, mind, body, or “competence” as the conceptual starting point. From this vantage point, instead of starting with the individual and then attempting to understand how the individual functions in a social context, a contribution of a CA approach is that it starts from the social world. As such, it aims to understand how the person with a communication disorder may function as regards the social and interactional organizations, such as the organization of turn-taking or repair, which channel and shape all individuals’ communicative contributions. Ultimately, it can be argued that it is naturally occurring talk (and other conduct) within social interaction, rather than the production of, for example, single words or sentences under experimental conditions, for which explanatory models or theories of communication disorders and their symptoms should be aiming to provide accounts. For this, an approach such as that of CA, with its methods of working with naturally occurring interaction and its substantive findings about the actions and practices of “typical” interaction, can be seen to be of central importance. In addition, while it has not been a focus of this overview, it can be noted that this type of interactional perspective has also generated new approaches to intervention which aim to directly target and improve the conversations of people with communication disorders and their everyday interlocutors (see Wilkinson, 2014).
REFERENCES

Bauer, A., & Kulke, F. (2004). Language exercises for dinner: Aspects of aphasia management in family settings. Aphasiology, 18(12), 1135–1160. https://doi.org/10.1080/02687030444000570
Bloch, S. (2005). Co-constructing meaning in acquired speech disorders: Word and letter repetition in the construction of turns. In K. Richards & P. Seedhouse (Eds.), Applying conversation analysis (pp. 38–55). Palgrave Macmillan UK.
Bloch, S., & Wilkinson, R. (2004). The understandability of AAC: A conversation analysis study of acquired dysarthria. Augmentative and Alternative Communication, 20(4), 272–282. https://doi.org/10.1080/07434610400005614
Bloch, S., & Wilkinson, R. (2009). Acquired dysarthria in conversation: Identifying sources of understandability problems. International Journal of Language & Communication Disorders, 44(5), 769–783.
Body, R., & Parker, M. (2005). Topic repetitiveness after traumatic brain injury: An emergent, jointly managed behaviour. Clinical Linguistics & Phonetics, 19(5), 379–392. https://doi.org/10.1080/02699200400027189
Denman, A., & Wilkinson, R. (2011). Applying conversation analysis to traumatic brain injury: Investigating touching another person in everyday social interaction. Disability and Rehabilitation, 33(3), 243–252. https://doi.org/10.3109/09638288.2010.511686
Drew, P. (1997). “Open” class repair initiators in response to sequential sources of troubles in conversation. Journal of Pragmatics, 28(1), 69–101. https://doi.org/10.1016/S0378-2166(97)89759-7
Drew, P., & Heritage, J. (1992). Analyzing talk at work: An introduction. In P. Drew & J. Heritage (Eds.), Talk at work: Interaction in institutional settings (pp. 3–65). Cambridge University Press.
Ekberg, K., Hickson, L., & Lind, C. (2020). Practices of negotiating responsibility for troubles in interaction involving people with hearing impairment. In R. Wilkinson, J. P. Rae, & G. Rasmussen (Eds.), Atypical interaction: The impact of communicative impairments within everyday talk (pp. 409–433). Palgrave Macmillan. https://doi.org/10.1007/978-3-030-28799-3_14
Engelke, C. R., & Higginbotham, D. J. (2013). Looking to speak: On the temporality of misalignment in interaction involving an augmented communicator using eye-gaze technology. Journal of Interactional Research in Communication Disorders, 4(1), 95. https://doi.org/10.1558/jircd.v4i1.95
Frankel, T., & Penn, C. (2007). Perseveration and conversation in TBI: Response to pharmacological intervention. Aphasiology, 21(10–11), 1039–1078. https://doi.org/10.1080/02687030701198395
Girard-Groeber, S. (2020). Swiss German and Swiss German Sign Language resources in repair initiations: An examination of two types of classroom. In R. Wilkinson, J. P. Rae, & G. Rasmussen (Eds.), Atypical interaction: The impact of communicative impairments within everyday talk (pp. 435–464). Palgrave Macmillan. https://doi.org/10.1007/978-3-030-28799-3_15
Goodwin, C. (1995). Co-constructing meaning in conversations with an aphasic man. Research on Language and Social Interaction, 28(3), 233–260. https://doi.org/10.1207/s15327973rlsi2803_4
Griffiths, S. (2014). Managing everyday participation in Parkinson’s disease: A conversation analytic study. Unpublished PhD thesis. Universities of Exeter and Plymouth.
Griffiths, S., Barnes, R., Britten, N., & Wilkinson, R. (2012). Potential causes and consequences of overlap in talk between speakers with Parkinson’s disease and their familiar conversation partners. Seminars in Speech and Language, 33(1), 27–43. https://doi.org/10.1055/s-0031-1301161
Griffiths, S., Barnes, R., Britten, N., & Wilkinson, R. (2015). Multiple repair sequences in everyday conversations involving people with Parkinson’s disease. International Journal of Language & Communication Disorders, 50(6), 814–829. https://doi.org/10.1111/1460-6984.12178
Heeschen, C., & Schegloff, E. A. (1999). Agrammatism, adaptation theory, conversation analysis: On the role of so-called telegraphic style in talk-in-interaction. Aphasiology, 13(4/5), 365–406. https://doi.org/10.1080/026870399402145
Helasvuo, M. L., Laakso, M., & Sorjonen, M. L. (2004). Searching for words: Syntactic and sequential construction of word search in conversations of Finnish speakers with aphasia. Research on Language and Social Interaction, 37(1), 1–37. https://doi.org/10.1207/s15327973rlsi3701_1
Hepburn, A., & Bolden, G. B. (2017). Transcribing for social research. Sage.
Jefferson, G. (1983). Notes on some orderliness of overlap onset. Tilburg Papers in Language and Literature, 28(1), 1–28.
Jefferson, G. (1989). Preliminary notes on a possible metric which provides for a “standard maximum” silence of approximately one second in conversation. In D. Roger & P. Bull (Eds.), Conversation: An interdisciplinary perspective (pp. 166–196). Multilingual Matters.
Kindell, J., Keady, J., & Wilkinson, R. (in press). On the use of tag questions by co-participants of people with dementia in talk-in-interaction: Asymmetries of knowledge, power and interactional competence. In P. Muntigl, C. Plejert, & D. Jones (Eds.), Interaction and dementia: From diagnosis to daily discourse. Cambridge University Press.
Laakso, M. (2014). Aphasia sufferers’ displays of affect in conversation. Research on Language and Social Interaction, 47(4), 404–425. https://doi.org/10.1080/08351813.2014.958280
Laakso, M., & Godt, S. (2016). Recipient participation in conversations involving participants with fluent or non-fluent aphasia. Clinical Linguistics & Phonetics, 30(10), 770–789. https://doi.org/10.1080/02699206.2016.1221997
Lerner, G. (1996). On the “semi-permeable” character of grammatical units in conversation: Conditional entry into the turn space of another speaker. In E. Ochs, E. A. Schegloff, & S. A. Thompson (Eds.), Interaction and grammar (pp. 238–276). Cambridge University Press.
Lindholm, C. (2015). Parallel realities: The interactional management of confabulation in dementia care encounters. Research on Language and Social Interaction, 48(2), 176–199. https://doi.org/10.1080/08351813.2015.1025502
Pajo, K. (2013). The occurrence of “what”, “where”, “what house” and other repair initiations in the home environment of hearing-impaired individuals. International Journal of Language & Communication Disorders, 48(1), 66–77. https://doi.org/10.1111/j.1460-6984.2012.00187.x
Pajo, K., & Laakso, M. (2020). Other-initiation of repair by speakers with mild to severe hearing impairment. Clinical Linguistics & Phonetics, 34(10–11), 998–1017. https://doi.org/10.1080/02699206.2020.1724335
Robinson, J. D. (2016). Accountability in social interaction. In J. D. Robinson (Ed.), Accountability in social interaction (pp. 1–44). Oxford University Press.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50(4), 696–735.
Schegloff, E. A. (1982). Discourse as an interactional achievement: Some uses of “uh huh” and other things that come between sentences. In D. Tannen (Ed.), Georgetown University roundtable on languages and linguistics (pp. 71–93). Georgetown University Press.
Schegloff, E. A. (2000). When “others” initiate repair. Applied Linguistics, 21(2), 205–243. https://doi.org/10.1093/applin/21.2.205
Schegloff, E. A. (2007). Sequence organization in interaction. Cambridge University Press.
Schegloff, E. A., Jefferson, G., & Sacks, H. (1977). The preference for self-correction in the organization of repair in conversation. Language, 53(2), 361–382.
Stivers, T., Mondada, L., & Steensig, J. (2011). Knowledge, morality and affiliation in social interaction. In T. Stivers, L. Mondada, & J. Steensig (Eds.), The morality of knowledge in conversation (pp. 3–24). Cambridge University Press.
Tetnowski, J. A., & Damico, J. S. (2001). A demonstration of the advantages of qualitative methodologies in stuttering research. Journal of Fluency Disorders, 26, 1–26. https://doi.org/10.1016/S0094-730X(01)00094-8
Tykkyläinen, T. (2010). Child-initiated repair in task interactions. In H. Gardner & M. Forrester (Eds.), Analysing interactions in childhood: Insights from conversation analysis (pp. 227–248). Wiley-Blackwell.
Wilkinson, R. (2007). Managing linguistic incompetence as a delicate issue in aphasic talk-in-interaction: On the use of laughter in prolonged repair sequences. Journal of Pragmatics, 39(3), 542–569. https://doi.org/10.1016/j.pragma.2006.07.010
Wilkinson, R. (2014). Intervening with conversation analysis in speech and language therapy: Improving aphasic conversation. Research on Language and Social Interaction, 47(3), 219–238. https://doi.org/10.1080/08351813.2014.925659
Wilkinson, R. (2019). Atypical interaction: Conversation analysis and communicative impairments. Research on Language and Social Interaction, 52(3), 281–299. https://doi.org/10.1080/08351813.2019.1631045
Wilkinson, R., Beeke, S., & Maxim, J. (2010). Formulating actions and events with limited linguistic resources: Enactment and iconicity in agrammatic aphasic talk. Research on Language and Social Interaction, 43(1), 57–84. https://doi.org/10.1080/08351810903471506
Wilkinson, R., & Morris, S. (2020). “My own space in this world”: Stammering, telephone calls, and the progressivity and permeability of turns-at-talk. In R. Wilkinson, J. P. Rae, & G. Rasmussen (Eds.), Atypical interaction: The impact of communicative impairments within everyday talk (pp. 319–344). Palgrave Macmillan.
Wilkinson, R., Rae, J. P., & Rasmussen, G. (Eds.). (2020). Atypical interaction: The impact of communicative impairments within everyday talk. Palgrave Macmillan.
Wootton, A. (1999). An investigation of delayed echoing in a child with autism. First Language, 19(57), 359–381. https://doi.org/10.1177/014272379901905704
7 Clinical Sociolinguistics

BRENT ARCHER, ELEANOR GULICK, JACK S. DAMICO, AND MARTIN J. BALL

7.1 Introduction

The interaction between language and society has been one of the major concerns of linguistic science over the last 40 years, but until recently the findings of sociolinguistics were not applied to speech and language disorders. In this chapter we outline some of the areas of research subsumed under the heading of sociolinguistics, and show how they have been applied to communication disorders in recent times. The area of sociolinguistic concern is a broad one, however, covering language variation and change at the micro- and macro-levels, language planning, bilingualism, discourse, and pragmatics. Some of these topics are dealt with in other chapters in this volume (see Chapters 9 by Hua & Wei, 1 by Müller, Guendouzi, & Wilson, 5 by Perkins, and 6 by Wilkinson); this chapter is therefore more narrowly focused, mainly on the variationist paradigm developed in the early work of researchers such as Labov (e.g. 1963, 1966, 1972a, 1972b) and Trudgill (e.g. 1972, 1974), among many others.

Variationist sociolinguistics developed partly out of the long-standing dialectology tradition (concerned with preserving the older forms of regional speech), and partly in reaction to the dominant paradigm of generative linguistics, with its emphasis on the “ideal speaker-listener” and on the exclusion of variation in linguistic output in preference for describing the invariant underlying linguistic competence. The first major studies in this new field of sociolinguistics appeared in the 1960s (for example, Labov, 1963, 1966). These scholars investigated linguistic variation at various levels (although phonology has been the main area of study) and looked for correlations between the patterns of variation found and both linguistic and non-linguistic factors. In order to do this, sociolinguists devised the unit of analysis termed the variable (see Wardhaugh, 1998 for further details). A linguistic variable has two or more variants; for example, in many dialects of English there is a variable (h) which has the variants [h] and [Ø] (i.e., the [h] may be pronounced or omitted). The use of these variants can be correlated with non-linguistic variables, for example, style, gender, sexual orientation, race, ethnicity, socio-economic status, or age. Each of these social variables will also consist of variants: style can be divided into varying degrees of formality or casualness; socio-economic status into categories such as lower, middle, and upper; gender into cis men, cis women, trans men, trans women, agender, gender expansive, non-binary, or other gender identities; and age into different bands according to the focus of investigation. Findings, therefore, give the degree of correlation between the usage of a linguistic variant (e.g. prestige [h] or non-prestige [Ø] for the (h) variable described above) and the speaker’s socio-economic status, the style of the speaker’s interaction, or their gender or age. Such correlations, of course, are not causative, but may be considered predictive.

Sociolinguists have also taken the study of language variation to a more macrolinguistic level, including bi- and multilingualism. Bilingualism as a term covers both societal bilingualism (a society where two languages are spoken, but where speakers themselves are not necessarily bilingual) and individual bilingualism (see further in Edwards, 2005). The study of individual bilingualism encompasses measures of speakers’ degree of proficiency and dominance in the relevant languages, patterns of code switching (i.e. switching in and out of different languages for stylistic or other effects), interaction between languages (e.g. using the grammatical structure of one language with lexis from another, or borrowing a single lexical item, perhaps to fill a word gap in one of the languages), and an investigation of the domains of usage of the two languages (e.g., one language may be restricted to family use rather than extended to wider or official usage). Clearly, all these features may be of importance for a speech-language pathologist, and we return to issues of assessment with bilinguals later.

Many other aspects of sociolinguistic research have had to be omitted from this introduction for reasons of space, but readers unfamiliar with the area are recommended to consult contributions to Ball (2005). Sociophonetic variation is covered in more detail by Docherty and Khattab (Chapter 37 in this volume).
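The logic of a variationist tabulation can be shown with a small worked sketch. The Python fragment below uses invented toy data and invented band labels purely for illustration (a real study would code observations from recorded speech); it counts how often the prestige variant [h] of the (h) variable occurs in each socio-economic band:

```python
# A minimal sketch of a variationist tabulation, using invented toy data:
# each observation records which variant of the (h) variable a speaker
# produced ("h" or "zero") together with one social variable (class band).
from collections import Counter

# Hypothetical observations: (socio-economic band, variant produced)
observations = [
    ("working", "zero"), ("working", "h"), ("working", "zero"),
    ("middle", "h"), ("middle", "h"), ("middle", "zero"),
    ("upper", "h"), ("upper", "h"), ("upper", "h"),
]

counts = Counter(observations)
for band in ("working", "middle", "upper"):
    total = sum(n for (b, _), n in counts.items() if b == band)
    h_full = counts[(band, "h")]
    print(f"{band:>8}: {100 * h_full / total:.0f}% [h] (n={total})")
```

A real analysis would, of course, also test whether the differences between bands are statistically reliable before treating them as predictive.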
7.2 SLPs and Linguicism/Accentism I: Accent Modification

Linguicism and accentism, biases against people who speak a particular language or dialect (Reagan & Gabrielli, 2022), are forms of oppression that have been studied by sociolinguists. Different varieties of a language (and indeed different languages, in the case of bilingualism) have differing degrees of prestige in a speech community (Orelus, 2020; Sener, 2021). Consequently, people who speak certain dialects may face discrimination when working to meet basic needs, such as applying for jobs (Cerrato, 2017; Deprez-Sims & Morris, 2010) or seeking housing (Baugh, 2018; Massey & Lundy, 2001). Moreover, people who speak with less prestigious accents must daily contend with negative listener judgements about personality, physical appearance, and attire (Chakraborty et al., 2019); the perception that people with non-prestige accents are less intelligent, trustworthy, or lively (Fuertes et al., 2012); a lack of feelings of belonging (Gluszek & Dovidio, 2010); assumptions that speakers with non-native accents are poorer communicators (Hosoda et al., 2007); and discrimination within higher education (Johnson & VanBrackle, 2012) and the criminal justice system (Dixon et al., 2002; Rickford & King, 2016).

The relative prestige ranking of different languages and dialects tends to align closely with hierarchies related to other social attributes: the varieties used by powerful groups (white, educated, middle-class people) are usually accorded respect, while the varieties used by less powerful groups (Black people, people living in poverty, people with less education) are often denigrated, or even actively suppressed (Hiraga, 2005; Luhman, 1990; Wheeler, 2019). The analytical framework of intersectionality (Crenshaw, 1991) may help us to understand the relationship between linguicism/accentism and other forms of bigotry. This approach to studying oppression is based on the idea that different forms of discrimination are interdependent and support one another. In the case of non-prestigious dialects, stereotypes about speakers (for example, people who speak with X accent are lazy) may be used to provide explanations for resource distribution (which is why people who speak with accent X also tend to be poor) which do not focus on broader societal systems of privilege and oppression. In this way, linguicism/accentism helps to obscure the real reasons for disparities between different groups, which in turn ensures that these disparities are unlikely to be resolved.

Within speech-language pathology, accent modification has been identified as a practice area which may contribute to linguicism (Grover et al., 2022; Yu, Nair et al., 2022). People who receive services of this kind are taught to speak with a new accent which differs from their current accents. First language clients may wish to speak with accents other than those they have already acquired, while second language clients may want to adopt an accent associated with first language speakership. Critics of accent modification have highlighted the problematic aspects of this form of intervention. First, while accent modification could be designed to help clients master any accent, Ennser-Kananen et al. (2021) analyzed 26 accent modification services offered by US universities to students born in other countries and found that all of the programs aimed to equip students with a General or Mainstream American accent. Riuttanen’s (2019) analysis of 40 websites built to advertise accent reduction services similarly found that all of the offerings were designed to teach students to speak with a Mainstream English accent. In general, accent modification interventions attempt to teach clients to speak English with the first language accents associated with white speakers in the “inner circle” (Kachru, 1992) of English-majority countries or regions such as the US, the UK, Australia, Anglophone Canada, and New Zealand (as opposed to the accents associated with first and second language speakers of color in India, Pakistan, Singapore, and Nigeria). Second, many publications which deal with accent modification services frame these interventions in terms of increasing intelligibility or comprehensibility (e.g. Burda et al., 2022), without acknowledging that all parties to an interaction are responsible for ensuring that mutual understanding and successful communication occur.

Participating in accent modification is thus a complex issue for sociolinguistically aware, justice-oriented SLPs. On the one hand, when SLPs provide services designed to help everybody sound like white, first language speakers from rich, former colonial powers, we endorse the idea that some accents and varieties of English are superior to others. Because linguicism intersects with and sustains other forms of discrimination, our practices in the clinic room can have wider ramifications and contribute to the maintenance of systems and ideologies of oppression such as white supremacy. On the other hand, our clients inhabit a world in which systems of injustice may have a significant impact on their lives, especially if they are less privileged than we are. If they choose to avail themselves of accent modification services in order to secure employment or make their lives easier in other ways, it seems inappropriate for us to refuse to support them (especially if we do not face the same forms of discrimination as they do).
Perhaps the way out of this dilemma lies in looking at diglossic situations, in which groups of speakers are proficient in two language varieties and where institutions and authorities support and value both forms. The bilingual-bicultural approach to educating Deaf children, which stipulates that Deaf children learn a sign language as their first language and written forms of spoken language as their second (LaSasso & Lollis, 2003), and the research program of Rebecca Wheeler, which focuses on affirming and social-justice-oriented methods for teaching children who speak non-prestigious varieties (see Wheeler, 2008, 2009, 2019), are both cognizant of the importance of certain languages and dialects for functioning in society. However, these frameworks describe how people in positions of power and authority can combat stigma by valuing all the languages and varieties in a given speaker’s repertoire. Additionally, both frameworks emphasize code switching; students are supported in their use of their preferred varieties by teachers and clinicians who demonstrate respect for what are viewed societally as non-prestige forms, while students also learn to use the prestige form. Students can use the prestige form as they see fit, such as when communicating in the wider world, where unjust distributions of power continue to exist.

As SLPs, we can learn lessons from these two approaches. We can provide the services that clients who want to speak with a different accent desire, while at the same time explicitly and repeatedly communicating to them that the accents, dialects, or languages they currently use are useful in many situations, and that we respect both these varieties and the communities that use them. Including code-switching-focused elements in our practice should not be the limit of our efforts to combat injustice, especially if we are white, middle class, cis-gender, heterosexual, men, not disabled, or members of other relatively privileged groups. If we do nothing more than equip clients with prestige accents, even while demonstrating respect for their other varieties in the therapy room, we are tacitly supporting the idea that their current accents, dialects, or languages are inferior to other varieties or are otherwise problematic. Moreover, focusing only on accents that will help our clients code switch, without focusing on broader systems of oppression, suggests that the victims of discrimination should change themselves in order to be treated justly. Alongside our efforts to help our clients develop agency in varying their speech to sound more like members of dominant hegemonic classes, we are morally obligated to work towards the destruction of hegemony.

In a recent publication, Yu, Nair et al. (2022) describe three elements that might be included in a social-justice-oriented approach to accent modification (and SLP more generally):

1. Equity mindedness, a stance which requires us to be race conscious and to acknowledge that linguicism is deeply intertwined with racism and other forms of discrimination. Further, equity-minded practitioners continuously interrogate their own biases and blind spots, examine ways in which they are personally complicit in sustaining injustice, and examine how they participate in institutions that shore up privilege and injustice.

2. Culturally sustaining pedagogy, a construct which celebrates clients’ prior knowledge and experiences. In our view, this philosophy enjoins us to work toward building a world in which the accents and dialects used by minoritized communities such as people of color are valued, and the need for accent modification eventually disappears. We can begin by addressing our own practices and the practices and policies of our workplaces, including schools, hospitals, clinics, and university training programs.

3. Emancipatory practice, an approach to providing services which requires us to ground our profession in critical theory, an analytical framework that identifies, critiques, and challenges oppressive power structures. Critical enquiry can help us evaluate how the institution of SLP creates unjust downstream effects.
We should be conscious of our role in building oppressive social and political structures and actively work to be agents of social justice. Horton (2021) provides an overview of concepts such as equity, privilege, social justice, and linguicism, and highlights connections between these constructs and a wide range of policies, issues, and practices that SLP clinicians and researchers might encounter. For readers interested in pursuing the social justice agenda laid out by Yu, Nair et al. (2022), Horton’s chapter can provide valuable background knowledge.
7.3 SLPs and Linguicism/Accentism II: African-American English

Discrimination against speakers of varieties within the African-American English (AAE) family is one prominent example of injustice which has been highlighted in the sociolinguistics literature (e.g. Lippi-Green, 2021). Washington and Seidenberg (2022) provide a high-quality overview of this dialect family, the factors which influence its use amongst school-age children, and the literature related to literacy development in speakers of AAE, while Baugh (2022) discusses how various individuals and groups actively campaigned to prevent “Ebonics” (an alternative name for AAE) from being recognized as a legitimate and valued variety of English.

SLPs helped to establish and perpetuate this form of oppression. For decades, many white SLPs were ignorant of the existence of varieties of AAE as dialects that were widely used and accepted in many communities. We opted instead to use other, more prestigious forms of English as the standard against which clients should be measured. Essentially, we equated difference with disorder (Taylor & Peters-Johnson, 1986).

Even as SLPs began to view AAE as a fully fledged variety worthy of respect and study, some authors have raised concerns about the problematic relationship that might exist between AAE users and researchers. Newkirk-Turner and Morris (2021) point out that many investigators gather data from communities that they then use to advance their careers. Once research projects end, the researchers lose contact with their participants, and play little to no role in providing services to people who need them. The authors argue for a different approach to research, in which academics build long-term relationships with communities and offer services in return for access. Moreover, researchers function as allies who support marginalized groups’ efforts to achieve greater justice.

Although professional SLP organizations now acknowledge the importance of AAE, and mandate that clinicians should adopt more culturally responsive attitudes, data indicate that anti-AAE biases may run deep. In a recent study by Hendricks et al. (2021), 73 mostly white SLP students from 46 university programs in the United States completed an online survey in which they were asked to indicate the degree to which they agreed with statements regarding the validity of AAE. The students were then asked to listen to audio samples recorded by speakers of different dialects (mainstream American English and AAE) and assign ratings to the speakers. Participants responded positively to the statements on the validity of AAE. However, the socio-intellectual (literacy, social status of the speaker, etc.), aesthetic (how pleasing or beautiful the sample was, etc.), and dynamism (how aggressive or active the speaker was, etc.) ratings were less favorable for speakers of AAE. Linguistic biases such as these play an important role in marginalizing speakers of less prestigious varieties and interfere with our ability to provide sound assessment and treatment services to people who do not speak “mainstream” forms of English (Easton & Verdon, 2021; Hendricks & Diehm, 2020; Hendricks et al., 2021). Within school settings, children who speak varieties of AAE (the majority of whom are Black) may be misdiagnosed as having a communication disorder because (white) SLPs lack an understanding of the features of these varieties (Fox & Jacewicz, 2021; Hendricks & Diehm, 2020).
Since school-based SLPs are important gatekeepers who help determine which learners should be enrolled in remedial programs, sociolinguistic blind spots of this sort contribute to the over-representation of Black children in special education programs (as documented by Maydosz (2014), Grindal et al. (2019), O’Quin (2021), and others). Moreover, Black children in special education are more likely to be disciplined and to face juvenile justice sanctions than white children, leading some authors to argue that remedial programs are an integral part of the school-to-confinement pipeline (Annamma, 2017). Pathologizing AAE thus contributes to broader downstream effects, such as Black men being incarcerated at rates vastly out of proportion to the demographic make-up of the United States.

We turn our attention next to issues of assessment and normative variation. We will focus on how knowledge drawn from the field of sociolinguistics can be used to inform practices within speech-language pathology that are less likely to exacerbate current inequities and the oppression of minoritized groups.
7.4 Sociolinguistic Sensitivity in Assessment

We can illustrate the dangers of ignoring the sociolinguistic characteristics of a speech community at various levels of linguistic structure (taking English as our example). At the phonetic realization level, we can note the heavy affrication of fortis stops (i.e., /p/, /t/, /k/) in Liverpool English (though subject to social class differentiation), or the diphthongization of front lax vowels in certain phonetic environments in Southern US English, as examples of realizations that could be deemed disordered by a clinician lacking sociolinguistic awareness. At the phonological level, the dental fricatives (i.e., /θ/ and /ð/) are either totally absent or stylistically variable in several varieties of English (AAE, Black British English, Multicultural London English, Caribbean English, etc.), and onset clusters with /s/+C+/r/ (especially /str-/) are realized with initial /ʃ/ in the speech of younger persons in many English speech communities (see Ball & Rutter, 2005). Lack of sociolinguistic awareness could lead to many of these non-standard variants being judged as incorrect in the assessment of clients with potential speech disorders.

At the morphophonological level, the progressive aspect marker -ing has two stylistically controlled variant realizations, [ɪŋ] and [ɪn], in many regional varieties of the language. Inflectional morphology is also sociolinguistically variable to some extent: AAE varieties may variably omit the -s morpheme that marks third person singular present tense on lexical verbs; in derivational morphology, we can note that the -ly de-adjectival adverb marker is virtually absent in many vernacular forms of English. Syntactically, a wide range of variation may be encountered, including, for example, double negatives (“not seen no one”), zero relativizer (“he’s a lad likes his black pudding”), double modals (“I might could do it”), and the omission of the infinitive “to be” (“The car needs washed”), among many others. Use of these forms is often correlated with both social class and style but, as with the other levels discussed above, they are all liable to misinterpretation as incorrect forms by assessors or assessments that are not sociolinguistically sensitive.

As a final point, we need also to consider that lexical variation is common, and that the words used for concepts may differ across varieties (for example, different varieties of American English use terms such as “soda,” “pop,” or “Coke” to refer to fizzy soft drinks). Picture-naming assessments (for example of phonology) are often problematic in this area, as lexical items that are common in one variety may not be in another. One example is the fact that squirrels are absent in Australia, and so pictures of squirrels (as in the Goldman-Fristoe Test of Articulation; Goldman & Fristoe, 2000) may elicit “wombat” or “possum” from Australian children.
Clearly, a solution which does not pathologize human variety, and which aligns more closely with an orientation toward social justice, is to provide assessments that cover ranges of sociolinguistically acceptable target forms for specific dialects or groups of dialects; in this regard, we can note that considerable research has been undertaken on African American English (see review in Wolfram, 2005). Other dialects divergent from standard forms that should be considered include Appalachian English, Cajun English, and Southern States English in the US, Lowland Scots in the UK, Newfoundland English in Canada, and forms of English used by indigenous peoples in North America, Australia, and New Zealand. National standards of English used in countries such as India, Pakistan, Bangladesh, Sri Lanka, Singapore, and the Hong Kong Special Administrative Region of China, among others, might also be candidates for variety-specific assessments.

Oetting (2005), Oetting et al. (2013), and Oetting et al. (2022) describe one step along this path: the Diagnostic Evaluation of Language Variation (DELV), devised by Seymour et al. (2003). This was designed to assess children with a range of American English dialects (including those noted above), and was standardized on over a thousand children, 63 percent of whom were speakers of non-standard varieties. Test items cover phonology, syntax, semantics, and pragmatics, and its goal is to allow clinicians to note which variety of American English the client speaks, and to allow classification of the client as impaired or not impaired in speech and/or language.

Oetting also looked at a set of three measures often used in language analysis: mean length of utterance (MLU; Brown, 1973), Developmental Sentence Score (DSS; Lee, 1974), and the Index of Productive Syntax (IPSyn; Scarborough, 1991). She notes that these language sample measures are often avoided with non-standard dialect speakers, as there is a lack of data to show normative patterns outside the mainstream variety of English. She reports on an earlier study (Oetting et al., 1999) which used language samples from 31 children speaking a rural variety of Southern US English, and analyzed the data using the three measures just described. IPSyn does not require the scoring of individual utterances; rather, the analyst searches the sample for examples of 56 prescribed structures. As these structures occur in most varieties, the IPSyn score was not adversely affected in Oetting, Cantrell and Horohov’s study. Oetting et al. (2013) suggest that experimental probes can be developed that avoid the differences between standard and non-standard versions of a language. One example is the non-word repetition task, where children hear and repeat nonsense words of varying length. Studies reported by Oetting show that using these tasks reduces differences in scores between standard and non-standard dialect speakers, but clearly tasks such as this are limited in their evaluative potential.

Moving beyond varieties of a single language, sociolinguistic sensitivity in assessment is also important with bi- and multilingual clients. For example, De Lamo White and Jin (2011) compare a range of assessment frameworks that clinicians might employ when working with multilingual children and find that the sociocultural approach holds the most promise.
Instead of using the behavior and mores of privileged classes as a metric against which children’s abilities should be measured, the clinician endeavors to understand the extent to which children are able to communicate effectively within specific sociocultural contexts. As Cheng (1997) points out, by observing the child in real-world contexts and soliciting in-depth input from people who share these contexts with the child (parents, teachers, teacher’s assistants), clinicians can reduce the effects of bias on assessment and treatment practices.

Points to be stressed about bilingual clients in the training of clinicians include the facts that bilingual clients need to be compared with similar bilinguals rather than with monolinguals; that code switching and mixing is normal, and may be used playfully by bilingual children; and that bilinguals may show evidence of “errors” in the non-dominant language that should not be considered examples of language disorders. Wei et al. (2005) also list points suggestive of disorder in bilinguals and points suggestive of imperfect acquisition (perhaps of a non-dominant language). Among the former are the inability to produce sounds which are common in the speech of children of the relevant age irrespective of their target language, inabilities in the production or comprehension of words familiar to children of the relevant age irrespective of their target language, and an inability to produce grammatical sentences irrespective of the language the child is trying to speak. Among the latter are an unbalanced vocabulary between the languages, speech errors in one language while the same or similar target sounds are correct in the other, and the ability to produce grammatical sentences in only one language.

Sociolinguistic awareness in assessment should be coupled (as Wolfram (1993) noted, and as referred to above) with a similar awareness in remediation. This should include not only sociolinguistically relevant targets, but also an ability to distinguish between transitional error patterns as the client moves toward a relevant target, and the ability to know when a client has reached a realization that is acceptable in their variety of the language even if not in the standard form.
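To make the idea of variety-specific target ranges concrete, the sketch below scores a transcribed response against a set of forms acceptable in the client’s variety rather than against a single standard target. The variety labels and target sets are hypothetical illustrations, not items drawn from the DELV or any other published instrument:

```python
# A minimal sketch of dialect-sensitive scoring: a response counts as
# on-target if it matches ANY form acceptable in the client's variety,
# not only the "mainstream" citation form. All data here are invented.

# Hypothetical acceptable realizations of "street" for two invented
# variety profiles (cf. the /str-/ realized as /ʃtr-/ variation above).
ACCEPTABLE = {
    "mainstream_US": {"strit"},                  # /str-/ only
    "shtr_variety": {"strit", "ʃtrit"},          # /ʃtr-/ also acceptable
}

def score_response(response: str, variety: str) -> bool:
    """Return True if the transcribed response is acceptable in the
    client's variety; unknown variety labels raise KeyError."""
    return response in ACCEPTABLE[variety]

# The same response is an "error" against one norm but not the other.
print(score_response("ʃtrit", "mainstream_US"))  # False
print(score_response("ʃtrit", "shtr_variety"))   # True
```

The design point is simply that the norm is parameterized by variety, so the same production is never scored against a target the client was not aiming for.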
7.5 Language and Power in the Clinic

As interventionists, SLPs are concerned with how things are accomplished during therapy and what variables act upon the therapeutic context to drive the necessary (and successful) social actions. Therapy is a complex social enterprise wherein specific goals are established and the methods for approaching these goals must be initially determined and then implemented (e.g., Isaksen, 2018; Lahey, 2004). For this “clinical business” to occur, it is essential that someone takes the lead in the planning and execution of therapy. That is, someone must have some form of directive influence and responsibility. Additionally, since there is always a complex and dynamic negotiation of roles, responsibilities, and obligations during the therapeutic session, there must be an understanding of how the social and interactional negotiations are handled from the perspective of both clinician and client. This is where sociolinguistics contributes some necessary insight.

Since sociolinguistics has a long history of focusing on linguistic variation and interaction as it is juxtaposed with social action within various interactive encounters, both the methodologies and the knowledge base of sociolinguistics can be effectively employed to better understand the therapeutic context. Given the importance of the directive function in therapy (e.g., Ulichny & Watson-Gegeo, 1989), sociolinguists’ work on the influence of interactional power upon the actions and reactions of participants in interactive contexts is especially salient. Sociolinguistics helps our understanding of the ways that this complex power negotiation is implemented, and it is through the investigation of interactional power within therapy that sociolinguistics effectively informs our clinical arena in communication disorders. Simply put, by employing sociolinguistic research on interactional power within the therapeutic context, we can see our remedial encounters more clearly and make them more effective.

Sociolinguistic research has assisted our understanding of interactional power in therapeutic contexts, enabling us to see this operational construct from a robust rather than a naive perspective. As discussed by Tannen (1987), interactional power is a complicated phenomenon. It is a social construct that is more relational than discrete; rather than existing as a separate and definable social trait, it exists only as the emergent outgrowth of interactive processes between two or more interactants. In this sense, interactional power is a dynamic reflection of intersecting attitudes, expectations, and behaviors across individuals; it is not a simple or direct extension of an external reality. These points are forcefully demonstrated in the sociolinguistic literature, where interactional power is revealed to be multimodal and multidimensional in manifestation (e.g., Brown & Gilman, 1960; Fairclough, 1989), culturally influenced (Schiffrin, 1987), and contextually relative (Hymes, 1967). These characteristics and the work done in defining this interactional concept have attempted to account for the complexity of the phenomenon and, consequently, interactional studies of clinician–client dyads have benefited from this more circumspect viewpoint (Panagos, 1996). For extended discussions of the complexity of interactional power and its characteristics, the reader is directed to Damico, Simmons-Mackie et al. (2005).

Another aspect of clinical interactions that sociolinguistic frameworks and knowledge have enabled researchers to study is the power differential which exists in many therapy contexts, and the ways in which clinicians typically manipulate this differential to guide the therapeutic enterprise (e.g., Horton, 2007; Isaksen, 2018; Kovarsky & Duchan, 1997; Panagos, 1996). Given our understanding of sociolinguistics, it should not be surprising that there are a number of culturally conventionalized signals that assist in the interpretation of the power dynamic during interactions. Within certain caveats (see Damico, Simmons-Mackie et al., 2005), variables like forms of address (e.g., Brown & Ford, 1961; Brown & Gilman, 1960), negotiation of speaking turns (e.g., Brown & Levinson, 1987; Fairclough, 1989; Archer et al., 2021; Leaman & Archer, 2022), topic selection and maintenance (Archer et al., 2018; Horton, 2007; Leaman et al., 2022; Shuy, 1987; Walker, 1987), questioning (Azios & Archer, 2018; Denman & Wilkinson, 2011; Muskett et al., 2010; Tannen, 1987), the structuring of interaction via discourse markers (Kovarsky, 1990; Simmons-Mackie & Damico, 1996), response structures (Simmons-Mackie et al., 1999), the use of evaluative statements (e.g., Cazden, 1988; Mehan, 1979; Ulichny & Watson-Gegeo, 1989), and the strategic deployment of humor (Archer et al., 2019) are just some of the issues related to power differentials that have been examined. Attention to these and other emergent manifestations of the power differential in therapeutic contexts has created greater awareness and beneficial discussions regarding this important dimension of social/therapeutic encounters.

Finally, the discipline of sociolinguistics furnished researchers interested in therapeutic activities with a set of methodologies that were ideally suited to studying complex social and interactional phenomena. In an excellent overview of the emergence of “speech therapy discourse,” John Panagos (1996) mentions how Prutting and colleagues (Prutting et al., 1978), Bobkoff (1982), Ripich (1982), and Panagos himself (e.g., Panagos & Fry, 1976; Panagos & Griffith, 1981) were influenced by the work of Hymes, Labov, and others who focused on the concept of “communicative competence,” on models of interaction in social life (e.g., Hymes, 1967), or on therapeutic discourse itself (Labov & Fanshel, 1977).
By providing both methods and a context for investigation, the discipline of sociolinguistics influenced these “pioneers” in the study of clinical discourse from a sociolinguistic perspective. Their work, in turn, gave rise to a generation of research that can point to the confluence of the ethnography of communication with sociolinguistics as an essential catalyst for much of the work done in this clinical area. As mentioned above, several of the earliest investigations of therapeutic discourse in the discipline of communicative disorders were influenced by the work of Hymes (1967) and Labov and Fanshel (1977). These early studies obtained similar results to some of the studies of interaction in general conversation or in targeted teaching/learning encounters (Cazden, 1988; Mehan, 1979). Indeed, many of the same social dimensions were found to be operative,
with one individual within the dyad typically being more dominant than the other. For example, in perhaps the earliest published study from a sociolinguistic perspective, Prutting and colleagues (1978) investigated therapy discourse to determine how clinician and client communicated during therapy. They found a definite asymmetrical pattern in which the clinician dominated the conversational space by approximately a 2:1 ratio. Influenced by sociolinguistic research, Prutting and colleagues also noted how this asymmetry was constructed. They found that the specific types of interactions constructed, the speech acts employed (e.g., “request type communicative acts”), the way topic selection was controlled, and the clinician’s evaluative statements following client responses all operated to shift a large power differential in favor of the clinician.

A number of investigations subsequent to the Prutting study have focused on how the interactional power of the clinician was established and maintained during the therapeutic encounter. Letts (1985), for example, explicitly discussed the clinician’s agenda for conducting therapy and defended the need for clinician therapeutic control. She found that there were a number of rules or guidelines around which clinicians seemed to organize therapy to create therapeutic control. For example, in her investigation, the clinician controlled the activities during therapy, how long each activity ran, and the feedback provided. Additionally, this feedback was oriented more to evaluative functions than to pedagogical ones. Further, the flow of information about the client’s performance and about how to modify that performance was also used as “interactional currency” to establish and maintain therapeutic control. Letts emphasized that the way the therapy session was organized is one mechanism for creating and manipulating interactional power. Much of the work of Panagos and his students also focused on the structure of the therapeutic encounter and the therapy agenda as it was formulated and advanced (e.g., Panagos et al., 1986).

In another study specifically influenced by sociolinguistic research, Kovarsky (1989, 1990) employed Schiffrin’s (1987) description of discourse markers as elements of speech that act to bracket units of talk, and he investigated how the discourse markers that he identified (e.g., okay, oh, so, well, now) were employed to organize the actual therapeutic interactions at the local level (Goodwin & Heritage, 1990). In his analysis, Kovarsky found three purposes for the identified discourse markers: control, evaluation, and response to informative interactions. Without question, however, the control function predominated in both the analysis and the therapeutic context; while only one function was explicitly described as control, to some degree the other two functions (evaluation, the acceptance of information) also pivot on control and are functions of interactional power. Damico and Damico (1997) borrowed from the sociolinguistic insights of Ulichny and Watson-Gegeo (1989) to demonstrate how clinicians employ another interactional device – the dominant interpretive framework (DIF) – to establish and maintain both evaluative control and the impact of learning in the therapy session.
Consistent with Ulichny and Watson-Gegeo, this study detailed how clinicians used various interactional strategies to shift client responses toward preferred and expected types of answers and belief systems. In doing so, the clinicians were able to force their own interpretation of the most appropriate and acceptable answers onto responses that were actually correct. Finally, Leaman and Archer (2022) conducted a study of repair sequences in conversations between people with aphasia and clinicians that was clearly informed by sociolinguistic concepts. Trouble sources (speech errors) are a common occurrence in conversation, especially during interactions involving people with aphasia. Instead of simply stepping back and allowing people with aphasia to repair their own errors, conversation-level aphasia interventions focus on supportive techniques whereby the clinician provides directive help to support repairs. The authors evaluated the benefits of an alternative approach that shifted
Clinical Sociolinguistics 91 the emphasis to self-repair; in dyadic conversations with ten people with aphasia, SLPs were instructed to wait when people with aphasia encountered speaking difficulties, thereby affording the people with aphasia the chance to resolve their own speech errors. Findings indicated that this less directive approach ensured people with aphasia exercised greater power within the conversation. The people with aphasia used edited turns in which selfrepair occurred as opportunities for talking about their personal histories and for introducing topics that were relevant and of interest to them. Within the clinical realm, this sociolinguistic focus on the relationship between language and power has had a pervasive impact. Much of what we understand about therapeutic interaction in terms of its complexity, its systematicity, and its impact on both learning and social management has been generated by sociolinguistic influences. This, in turn, has spawned greater attention to this facet of clinical activity. The resultant research and its applications have benefited clinicians and clients alike.
7.6 Conclusion

Language is the most complex of human abilities, enabling us to act upon the social and physical worlds while also being a part of these worlds. If we are going to be successful as clinical and remedial agents for speech and language disorders, we must be able to focus effectively on the authentic needs of our clients and students, and we must strive to make a difference in their symbolic lives. Perhaps there is no better source of information to help us fulfill our obligations than sociolinguistics. As this brief discussion has demonstrated, this area of linguistics, focusing on the interaction between language and society, offers the clinician a way to address the complexity of language in context. Indeed, from the beginnings of this subdiscipline the focus was on actual language users in real and embedded contexts. Whether we employ the idea of variation in our linguistic code as driven by social factors, whether we employ the complex methodologies that have so effectively described elaborate psycho-social phenomena like power, authority, and identity, or whether we employ the conceptualizations of this subdiscipline to address complicated linguistic manifestations like literacy, bilingualism, or compensatory adaptation to impairment (Perkins, 2002), we can rely on sociolinguistics to highlight both the complexity and some of the ways to address that complexity.

Further, the murder of George Floyd at the hands of police in Minneapolis in 2020 touched off a series of protests in the United States and around the world. This event led to much greater awareness on the part of privileged people of racism, sexism, homophobia, transphobia, white supremacy, and other forms of bigotry. Individuals, organizations, professions, and nations have continued the long overdue, often painful process of acknowledging past harms and are debating how best to dismantle systems of oppression. In our own field, a number of writers, clinicians, and researchers have called for speech-language pathologists who enjoy social privilege to acknowledge and address the ways in which our actions have helped to create and perpetuate systems of oppression (Aguilar, 2021; Easton & Verdon, 2021; Ellis & Kendall, 2021; Grover et al., 2022; Hendricks et al., 2021; Khamis-Dakwar & Randazzo, 2021; Whitfield, 2022; Winn et al., 2022; Yu, Horton et al., 2022; Yu, Nair et al., 2022). There is no doubt that the profession of speech-language pathology and clinicians who are members of privileged social classes have fostered discrimination on the basis of language, dialect, or accent.
Several clinical implications follow from a focus on sociolinguistics. First, in our efforts as professionals we must strive to address the functional and authentic speech and language behaviors of our clients and students and the implications of these behaviors in the real world. When conducting assessments or planning and implementing interventions, we should avoid the construction of convenient and simplistic phenomena as reflected by decontextualized test performances and sanitized therapy activities. Second, we should recognize the complexity of language in context and respect the fact that what we do and how we do it in the clinical context will never be easy. Rather, we should strive to become clinical linguists who are not afraid to address complexity and the many facets of linguistic and social phenomena. The brief descriptions and references provided within this chapter will serve as an excellent starting point.

Finally, speech-language pathologists are college educated, and the vast majority of us do not have significant disabilities. Many of us are white, cis, middle-class, first language speakers of English. In many institutions we are accorded professional respect, and decision makers in school districts, hospitals, and other settings may draw upon our expertise when creating language-related policies. Those of us from privileged backgrounds have the power and professional respect to join the fight against accentism, linguicism, and the other forms of discrimination they support and depend upon. Sociolinguistics provides us with lenses and concepts that can enable us to understand our culpability, and it furnishes us with tools we can use to help build more socially just workplaces, schools, and communities.
AUTHOR POSITIONALITY STATEMENTS BRENT ARCHER: I am a white, middle-class, cis-gender man with no disabilities who speaks English as a first language. People who speak with an accent similar to mine do not face discrimination because of the way we speak. I attended a predominantly white university program while pursuing a doctoral degree. I have benefitted from the privileges afforded by my intersecting identities within the education system and professionally within the field of communication sciences and disorders. Given my privileges I also see it as my responsibility to disrupt systemic oppression through my positions in academic, clinical, and research settings. ELEANOR GULICK: I am a white, monolingual, English-speaking, middle-class, cis-gender woman. I attended predominantly white university programs while earning my bachelor's and master's degrees, and currently while pursuing a doctoral degree. I have benefitted from the privileges afforded by my intersecting identities within the education system and professionally within the field of communication sciences and disorders. Given my privileges I also see it as my responsibility to disrupt systemic oppression through my positions in academic, clinical, and research settings. MARTIN J. BALL: I am a Welsh-speaking Welshman from Merionethshire. I support independence for Wales and for the other Celtic countries (Scotland, Brittany, and Cornwall). I am in favor of a united Ireland. I am a member of Plaid Cymru, of Cymdeithas yr Iaith, and of Yes Cymru, and I look forward eagerly to the dissolution of the United Kingdom. In the words of Dafydd Iwan: despite everyone and everything, we are still here ("Yma o Hyd")!
REFERENCES Aguilar, Y. (2021). Exploration of speech-language pathology from a social justice and critical race theory perspective [Doctoral dissertation]. California State University San Marcos. Annamma, S. A. (2017). The pedagogy of pathologization: Dis/abled girls of color in the school-prison nexus. Routledge. Archer, B., Azios, J. H., Gulick, N., & Tetnowski, J. (2021). Facilitating participation in conversation groups for aphasia. Aphasiology, 35(6), 764–782. Archer, B., Azios, J. H., & Moody, S. (2019). Humour in clinical–educational interactions between graduate student clinicians and people with aphasia. International Journal of Language & Communication Disorders, 54(4), 580–595. Archer, B., Tetnowski, J., Freer, J. C., Schmadeke, S., & Christou-Franklin, E. (2018). Topic selection sequences in aphasia conversation groups. Aphasiology, 32(4), 394–416. Azios, J. H., & Archer, B. (2018). Singing behaviour in a client with traumatic brain injury: A conversation analysis investigation. Aphasiology, 32(8), 944–966. Ball, M. J. (Ed.) (2005). Clinical sociolinguistics. Blackwell. Ball, M. J., & Rutter, B. (2005). Is /str-/ a cluster in contemporary English? Presented at the American Speech-Language-Hearing Association Convention, San Diego. Baugh, J. (2018). Linguistics in pursuit of justice. Cambridge University Press. Baugh, J. (2022). Class backwards: Linguistic racism and educational malpractice in American schooling. Aula de Encuentro, (1), 90–116. Bobkoff, K. (1982). Analysis of verbal and non-verbal components of clinician–client interaction [Unpublished doctoral dissertation]. Kent State University. Brown, P., & Levinson, S. (1987). Politeness: Some universals in language usage (2nd ed.). Cambridge University Press. Brown, R. (1973). A first language. Harvard University Press. Brown, R., & Ford, M. (1961). Address in American English. Journal of Abnormal and Social Psychology, 62(2), 375–385. Brown, R., & Gilman, A. (1960). The pronouns of power and solidarity. In T. Sebeok (Ed.), Style in language (pp. 253–276). MIT Press.
Burda, A., Squires, L., Krupke, D., Arthur, A., Bahia, M., Bernard, K., Easley, M., English, M., Hicks, J., Johnson, V., Lancaster, M., O’Loughlin, E., & Skaar, S. (2022). Effectiveness of intense accent modification training with refugees from Burma. American Journal of Speech-Language Pathology, 31(6), 2688–2706. Cazden, C. (1988). Classroom discourse: The language of teaching and learning. Heinemann. Cerrato, L. (2017). Accent discrimination in the US: A hindrance to your employment and career development? [Unpublished master’s thesis]. Helsinki Metropolia University of Applied Sciences. Chakraborty, R., Schwartz, A. L., & Vaughan, P. (2019). Speech language pathologists’ perceptions of nonnative accent: A pilot study. Perspectives of the ASHA Special Interest Groups, 4(6), 1601–1611. Cheng, L. (1997). Diversity: Challenges and implications for assessment. Journal of Children’s Communication Development, 19(1), 55–62. Crenshaw, K. (1991, July). Mapping the margins: Intersectionality, identity politics, and violence against women of color. Stanford Law Review, 43(6), 1241–1299. Damico, J. S., & Damico, S. K. (1997). The establishment of a dominant interpretive framework in language intervention. Language, Speech, and Hearing Services in Schools, 28, 288–296. Damico, J. S., Simmons-Mackie, N., & Hawley, H. (2005). Language and power. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 63–73). Blackwell. De Lamo White, C., & Jin, L. (2011). Evaluation of speech and language assessment approaches with bilingual children. International Journal of Language & Communication Disorders, 46(6), 613–627. Denman, A., & Wilkinson, R. (2011). Applying conversation analysis to traumatic brain injury: Investigating touching another person in everyday social interaction. Disability and Rehabilitation, 33(3), 243–252. Deprez-Sims, A. S., & Morris, S. B. (2010). Accents in the workplace: Their effects during a job interview. International Journal of Psychology, 45(6), 417–426. Dixon, J. A., Mahoney, B., & Cocks, R. (2002). Accents of guilt? Effects of regional accent, race, and crime type on attributions of guilt.
Journal of Language and Social Psychology, 21(2), 162–168. Easton, C., & Verdon, S. (2021). The influence of linguistic bias upon speech-language pathologists’ attitudes toward clinical scenarios involving nonstandard dialects of English. American Journal of Speech-Language Pathology, 30(5), 1973–1989. Edwards, J. (2005). Bilingualism and multilingualism. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 36–48). Blackwell. Ellis, C., & Kendall, D. (2021). Time to act: Confronting systemic racism in communication sciences and disorders academic training programs. American Journal of Speech-Language Pathology, 30(5), 1916–1924. https://doi.org/10.1044/2021_AJSLP-20-00369 Ennser-Kananen, J., Halonen, M., & Saarinen, T. (2021). “Come join us, and lose your accent!”: Accent modification courses as hierarchization of international students. Journal of International Students, 11(2), 322–340. Fairclough, N. (1989). Language and power. Longman. Fox, R. A., & Jacewicz, E. (2021). Cultural and multilingual sources of phonetic variation: Implications for clinical practice. In M. Ball (Ed.), Manual of clinical phonetics (pp. 89–100). Routledge. Fuertes, J. N., Gottdiener, W. H., Martin, H., Gilbert, T. C., & Giles, H. (2012). A meta-analysis of the effects of speakers’ accents on interpersonal evaluations. European Journal of Social Psychology, 42(1), 120–133. Gluszek, A., & Dovidio, J. F. (2010). Speaking with a nonnative accent: Perceptions of bias, communication difficulties, and belonging in the United States. Journal of Language and Social Psychology, 29(2), 224–234. Goldman, R., & Fristoe, M. (2000). Goldman–Fristoe test of articulation 2. AGS Publishing. Goodwin, C., & Heritage, J. (1990). Conversation analysis. Annual Review of Anthropology, 19(1), 283–307. Grindal, T., Schifter, L. A., Schwartz, G., & Hehir, T. (2019). Racial differences in special education identification and placement: Evidence across three states. Harvard Educational Review, 89(4), 525–553. Grover, V., Namasivayam, A., & Mahendra, N. (2022). A viewpoint on accent services: Framing and terminology matter. American Journal of Speech-Language Pathology, 31(2), 639–648.
Hendricks, A. E., & Diehm, E. A. (2020). Survey of assessment and intervention practices for students who speak African American English. Journal of Communication Disorders, 83(1), 105967. Hendricks, A. E., Watson-Wales, M., & Reed, P. E. (2021). Perceptions of African American English by students in speech-language pathology programs. American Journal of Speech-Language Pathology, 30(5), 1962–1972. Hiraga, Y. (2005). British attitudes towards six varieties of English in the USA and Britain. World Englishes, 24(3), 289–308. Horton, R. (2021). Introduction. In R. Horton (Ed.), Critical perspectives on social justice in speech-language pathology (pp. 130–150). IGI Global. Horton, S. (2007). Topic generation in aphasia language therapy sessions: Issues of identity. Aphasiology, 21(3–4), 283–298. Hosoda, M., Stone-Romero, E. F., & Walter, J. N. (2007). Listeners’ cognitive and affective reactions to English speakers with standard American English and Asian accents. Perceptual and Motor Skills, 104(1), 307–326. Hymes, D. (1967). Models of the interaction of language and social setting. Journal of Social Issues, 23(2), 8–28. Isaksen, J. (2018). Well, you are the one who decides. Topics in Language Disorders, 38(2), 126–142. Johnson, D., & VanBrackle, L. (2012). Linguistic discrimination in writing assessment: How raters react to African American “errors,” ESL errors, and standard English errors on a state-mandated writing exam. Assessing Writing, 17(1), 35–54. Kachru, B. B. (1992). Models for non-native Englishes. In B. Kachru (Ed.), The other tongue: English across cultures (pp. 48–74). University of Illinois Press. Khamis-Dakwar, R., & Randazzo, M. (2021). Deconstructing the three pillars of evidence-based practice to facilitate social justice work in speech language and hearing sciences. In R. Horton (Ed.), Critical perspectives on social justice in speech-language pathology (pp. 130–150). IGI Global. Kovarsky, D. (1989). An ethnography of communication in child language therapy [Unpublished doctoral dissertation]. University of Texas at Austin. Kovarsky, D. (1990). Discourse markers in adult-controlled therapy: Implications for child centered intervention. Journal of Childhood Communication Disorders, 13(1), 29–41.
Kovarsky, D., & Duchan, J. F. (1997). The interactional dimensions of language therapy. Language, Speech, and Hearing Services in Schools, 28(3), 297–307. Labov, W. (1963). The social motivation of a sound change. Word, 19(3), 273–309. Labov, W. (1966). The social stratification of English in New York City. Center for Applied Linguistics. Labov, W. (1972a). Language in the inner city: Studies in the Black English vernacular. University of Pennsylvania Press. Labov, W. (1972b). Sociolinguistic patterns. University of Pennsylvania Press. Labov, W., & Fanshel, D. (1977). Therapeutic discourse. Academic Press. Lahey, M. (2004). Therapy talk: Analyzing therapeutic discourse. Language, Speech, and Hearing Services in Schools, 35(1), 70–81. LaSasso, C., & Lollis, J. (2003, Winter). Survey of residential and day schools for deaf students in the United States that identify themselves as bilingual–bicultural programs. Journal of Deaf Studies and Deaf Education, 8(1), 79–88. Leaman, M. C., & Archer, B. (2022). “If you just stay with me and wait … You’ll get an idea of what I’m saying”: The communicative benefits of time for conversational self-repair for people with aphasia. American Journal of Speech-Language Pathology, 31(3), 1264–1283. Leaman, M. C., Archer, B., & Edmonds, L. A. (2022). Toward empowering conversational agency in aphasia: Understanding mechanisms of topic initiation in people with and without aphasia. American Journal of Speech-Language Pathology, 31(1), 322–341. Lee, L. (1974). Developmental sentence analysis. Northwestern University Press. Letts, C. (1985). Linguistic interaction in the clinic: How do therapists do therapy? Child Language Teaching and Therapy, 1(3), 321–331. Lippi-Green, R. (2021). That’s not my language: The struggle to (re)define African American English. In R. Duenas Gonzalez & I. Melis (Eds.), Language ideologies (pp. 230–247). Routledge. Luhman, R. (1990). Appalachian English stereotypes: Language attitudes in Kentucky. Language in Society, 19(3), 331–348. Massey, D. S., & Lundy, G. (2001). Use of Black English and racial discrimination in urban housing markets: New methods and findings. Urban Affairs Review, 36(4), 452–469. Maydosz, A. S. (2014). Disproportional representation of minorities in special
education. Journal for Multicultural Education, 8(2), 81–88. Mehan, H. (1979). Learning lessons. Harvard University Press. Muskett, T., Perkins, M., Clegg, J., & Body, R. (2010). Inflexibility as an interactional phenomenon: Using conversation analysis to re-examine a symptom of autism. Clinical Linguistics & Phonetics, 24(1), 1–16. Newkirk-Turner, B. L., & Morris, L. R. (2021). An unequal partnership: Communication sciences and disorders, Black children, and the Black speech community. In R. Horton (Ed.), Critical perspectives on social justice in speech-language pathology (pp. 180–196). IGI Global. Norris, J. A., & Damico, J. S. (1990). The whole language movement in theory and practice: Implications for language intervention. Language, Speech, and Hearing Services in Schools, 21(4), 211–220. O’Quin, C. B. (2021). Exploring African American Vernacular English and disproportionality in special education [Doctoral dissertation]. Illinois State University. Oetting, J. (2005). Assessing language in children who speak a nonmainstream dialect of English. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 180–192). Blackwell. Oetting, J., Cantrell, J., & Horohov, J. (1999). A study of specific language impairment (SLI) in the context of non-standard dialect. Clinical Linguistics & Phonetics, 13(1), 25–44. Oetting, J. B., Berry, J. R., & Gregory-Martin, K. D. (2022). Sociolinguistics: Use of linguistic theory to inform clinical practice for children with Developmental Language Disorder within African American English. In N. Gurevich & C. Grindrod (Eds.), Clinical applications of linguistics to speech-language pathology (pp. 72–90). Routledge. Oetting, J. B., Lee, R., & Porter, K. L. (2013). Evaluating the grammars of children who speak nonmainstream dialects of English. Topics in Language Disorders, 33(2), 140. Orelus, P. W. (2020). Other people’s English accents matter: Challenging standard English accent hegemony. Excellence in Education Journal, 9(1), 120–148. Panagos, J. M. (1996). Speech therapy discourse: The input to learning. In M. Smith & J. S. Damico (Eds.), Childhood language disorders (pp. 41–63). Thieme Medical Publishers. Panagos, J. M., Bobkoff, K., & Scott, C. M. (1986). Discourse analysis of language intervention.
Child Language Teaching and Therapy, 2(2), 211–229. Panagos, J. M., & Fry, J. (1976). Code switching during language therapy communication. Paper presented at the Annual Convention of the American Speech-Language-Hearing Association, Houston, TX. Panagos, J. M., & Griffith, P. L. (1981). Okay, what do educators really know about language intervention? Topics in Learning Disabilities, 2(1), 69–82. Perkins, M. R. (2002). An emergentist approach to pragmatic impairment. In F. Windsor, M. L. Kelly, & N. Hewlett (Eds.), Investigations in clinical linguistics and phonetics (pp. 1–14). Lawrence Erlbaum. Prutting, C. A., Bagshaw, N., Goldstein, H., Juskowitz, S., & Umen, I. (1978). Clinician child discourse: Some preliminary questions. Journal of Speech and Hearing Disorders, 43(2), 123–139. Reagan, T., & Gabrielli, D. (2022). Identifying and responding to linguicism. In J. Schwieter, J. Rivera Flores, & P. Iida (Eds.), Engaging in critical language studies (pp. 1–21). Information Age Publishing. Rickford, J. R., & King, S. (2016). Language and linguistics on trial: Hearing Rachel Jeantel (and other vernacular speakers) in the courtroom and beyond. Language, 92(4), 948–988. Ripich, D. (1982). Children’s social perception of speech-language sessions: A sociolinguistic analysis of role-play discourse [Unpublished doctoral dissertation]. Kent State University. Riuttanen, S. (2019). “Neutralize your native accent”: The ideological representation of accents on accent reduction websites [Unpublished master’s thesis]. University of Jyväskylä. Scarborough, H. (1991). Index of productive syntax. Applied Psycholinguistics, 11(1), 1–22. Schiffrin, D. (1987). Discourse markers. Cambridge University Press. Sener, M. Y. (2021). English with a non-native accent as a basis for stigma and discrimination in the US. In J. Diab (Ed.), Dignity in movement: Borders, bodies, and rights (pp. 1–10). E-International Relations Publishing. Seymour, H., Roeper, T., & de Villiers, J. (2003). Diagnostic evaluation of language variation. Psychological Corporation. Shuy, R. (1987). Conversational power in FBI covert tape recordings. In L. Kedar (Ed.), Power through discourse (pp. 43–56). Ablex. Simmons-Mackie, N. N., & Damico, J. S. (1996). The contribution of discourse markers to
communicative competence in aphasia. American Journal of Speech-Language Pathology, 5(1), 37–43. Simmons-Mackie, N. N., Damico, J. S., & Damico, H. L. (1999). A qualitative study of feedback in aphasia treatment. American Journal of Speech-Language Pathology, 8(3), 218–230. Tannen, D. (1987). Remarks on discourse and power. In L. Kedar (Ed.), Power through discourse (pp. 3–10). Ablex. Taylor, O. L., & Peters-Johnson, C. (1986). Speech and language disorders in blacks. In O. L. Taylor & C. Peters-Johnson (Eds.), Nature of communication disorders in culturally and linguistically diverse populations. College Hill Press. Trudgill, P. (1972). Sex, covert prestige and linguistic change in the urban British English of Norwich. Language in Society, 1(2), 179–195. Trudgill, P. (1974). The social differentiation of English in Norwich. Cambridge University Press. Ulichny, P., & Watson-Gegeo, K. A. (1989). Interactions and authority: The dominant interpretive framework in writing conferences. Discourse Processes, 12(3), 309–328. Walker, A. G. (1987). Linguistic manipulation, power, and the legal setting. In L. Kedar (Ed.), Power through discourse (pp. 57–80). Ablex. Wardhaugh, R. (1998). An introduction to sociolinguistics (3rd ed.). Blackwell. Washington, J. A., & Seidenberg, M. S. (2022). Language and dialect of African American children. In E. Saiegh-Haddad, L. Laks & C. McBride (Eds.), Handbook of literacy in diglossia and in dialectal contexts (pp. 11–32). Springer. Wei, L., Miller, N., Dodd, B., & Hua, Z. (2005). Childhood bilingualism: Distinguishing difference from disorder. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 193–206). Blackwell. Wheeler, R. S. (2008). Code-switching. Educational Leadership. Wheeler, R. S. (2009). “Taylor cat is black”: Code-switch to add Standard English to students’ linguistic repertoires. In J. Scott, D. Straker & L. Katz (Eds.), Affirming students’ right to their own language: Bridging language policies and pedagogical practices (pp. 176–191). Routledge. Wheeler, R. S. (2019). Attitude change is not enough: Disrupting deficit grading practices to disrupt dialect prejudice. Proceedings of the Linguistic Society of America, 4(1), 10–11. Whitfield, J. A. (2022). Systemic racism in communication sciences and disorders academic programs: A commentary on trends in racial representation. American Journal of Speech-Language Pathology, 32(1), 1–10.
Winn, M. B., Tripp, A., & Munson, B. (2022). A critique and call for action, in response to sexist commentary about vocal fry. Perspectives of the ASHA Special Interest Groups, 7(6), 1903–1907. Wolfram, W. (1993). The sociolinguistic model in speech and language pathology. In M. M. Leahy & J. L. Kallen (Eds.), International perspectives in speech and language pathology (pp. 1–29). Trinity College. Wolfram, W. (2005). African American English. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 87–100). Blackwell. Yu, B., Horton, R., Munson, B., Newkirk-Turner, B. L., Johnson, V. E., Khamis-Dakwar, R., Munoz, M., & Hyter, Y. D. (2022). Making
race visible in the speech, language, and hearing sciences: A critical discourse analysis. American Journal of Speech-Language Pathology, 31(2), 578–600. Yu, B., Nair, V. K., Brea, M. R., Soto-Boykin, X., Privette, C., Sun, L., Khamis, R., Sheen Chiou, H., Fabiano-Smith, L., Epstein, L., & Hyter, Y. D. (2022). Gaps in framing and naming: Commentary to “A viewpoint on accent services.” American Journal of Speech-Language Pathology, 31(4), 1–6.
8 Systemic Functional Linguistics and Communication Disorders ELIZABETH SPENCER AND ALISON FERGUSON 8.1 Preamble Functional approaches to language assessment and intervention are recognized as important for both children and adult clients with communication disorders. Over the past 20 years, functional approaches to assessment, intervention, and the evaluation of functional or real-world outcomes have developed, especially in consideration of the application of the World Health Organization’s International Classification of Functioning, Disability, and Health (WHO, 2001) across the continuum of care. However, what is still developing in the field of communication disorders is a systematic way of formulating these approaches and a theoretical perspective to inform them. Of particular interest is the ability to provide both assessment and treatment that is ecologically valid and sensitive to the person’s needs and environment. That is, speech-language pathologists need to provide assessment and interventions that represent authentic communication experiences for individuals (Keegan et al., 2022b). Systemic Functional Linguistics (SFL) is a functional model of language in use that offers clinicians this theoretical perspective. Readers new to Systemic Functional Linguistics will find comprehensive information about its theory and methods of analysis at an introductory level in Butt et al. (2000) and Thompson (2013), and at an advanced level in Eggins and Slade (2004), Halliday and Matthiessen (2014), and Martin and Rose (2003); see also the glossary of terms in the Appendix to this chapter. Alongside this shift to a social paradigm of assessment and intervention in speech pathology practice, within the field of sociolinguistics there have been continued developments in “critical discourse analysis” (see Keegan, Guendouzi, & Müller, Chapter 1 in this volume), which have highlighted the close relationship between what is said or written and its social context, and argued for the need to critically analyze the language/power relationships between all interactants (including practitioners) and their sociocultural assumptions and discourses (Fairclough, 1995, 1997; Locke, 2004). Systemic Functional Linguistics has become one of the most widely adopted linguistic methodologies for “doing” critical linguistics, as it provides both theoretical rigor and methodological systematicity for dealing with both macro and micro aspects of language within social context (Pennycook, 2001; Young & Harrison, 2004). Matthiessen (2013) explored how systemic functional linguistics can be applied in healthcare settings, particularly what can be illuminated in terms of medical consultations and power imbalances, but also in relation to how power constructions at
the institutional level can impact on healthcare service delivery. Applications of this in speech pathology practice can be seen in recent work by Hersh and colleagues (2016), who explored nurses’ interactions with patients with and without aphasia in the acute hospital setting. Furthermore, Hersh et al. (2018) investigated how the partner of a man with aphasia negotiated his discharge from hospital and the power imbalance they experienced, which demonstrated misunderstandings by the hospital about the impact of aphasia on both people with aphasia and their families. In this chapter, we focus primarily on how SFL has been applied to speech-language pathology to date, but we also attempt to indicate where wider aspects of this sociolinguistic theory may have relevance for speech-language pathology (Armstrong et al., 2005).
8.2 Key Concepts Systemic Functional Linguistics (SFL) is a semantic perspective on language in use and has been recognized as having particular relevance for speech-language pathology since the 1980s (Gotteri, 1988). There are two key aspects in the SFL model: system and function (Halliday, 1994). System refers to the network of choices which language users have available to them; analysis of their choices allows us to investigate the dynamic (and non-deterministic) nature of discourse. As Halliday and Matthiessen explain, “What this means is that each system – each moment of choice – contributes to the formation of the structure. So, when we analyse a text, we show the functional organization of its structure; and we show what meaningful choices have been made, each one seen in the context of what might have been meant but was not” (Halliday & Matthiessen, 2014, p. 24). These choices are not consciously made (although metalinguistic awareness is possible for some choices), nor do they represent a prescriptive inventory of structures. Instead, the language user creates or “realizes” meaning through multiple series of choices. Function refers to the perspective’s orientation to language in use, so that it is function rather than form that is the focus of this grammar. From an SFL perspective, language is viewed as a resource with a system of options for making meaning. The system is organized stratally in terms of its content (semantics and lexicogrammar) and expression (including phonology). Meaning choices (semantics) are expressed (realized) by lexicogrammatical choices that are, in turn, realized by choices within the phonological system (Halliday & Matthiessen, 2014). The term lexicogrammar refers to the level of wording, lexis referring to specific word meanings and grammar to more generalized structural meanings. The relationship between lexicogrammar and phonology is formal, whereas that between semantics and lexicogrammar is functional, and probabilistic rather than deterministic. It is possible therefore to realize a specific meaning in more than one way, just as the same wording can yield different meanings or interpretations. One of the advantages for speech-language pathologists using SFL is that the approach allows for consideration of all levels of language, ranging from social and interactional use of language (often referred to within speech-language pathology as “pragmatic” features) to wordings and grammatical features, within the one unified theoretical framework (Thompson, 2013). Importantly, it also allows us to describe how individuals use language in any communication context: the patterns they produce, the resources they draw on, and the opportunities they have for using them. This is possible because the interaction can be described both at the lexicogrammatical level (i.e. word choices and grammar) and at the level of discourse, by analyzing the exchange, how it is structured, and the opportunities afforded within the context of the communication situation. Within SFL, analysts focus on how texts come to make meaning in context. A text is a unit of language in use, and it is the unity of meaning that defines a text rather than length or any
structural features. For the speech-language pathologist this provides for flexibility in sampling depending on the area of clinical interest. So, for example, the speech-language pathologist might sample a narrative of stroke embedded in a larger unit of conversation. If the whole conversation is analyzed, then the exchange between participants will be of interest as well as how the narrative fitted into the conversation, for example, who initiated what and when. Alternatively, the focus of analysis might be on the narrative itself and how the important parts of the story were drawn together through lexical and grammatical resources for cohesion, for example.
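Returning to the notion of system introduced above: for readers who think computationally, the following minimal sketch (our own illustration in Python; the network fragment and feature labels are simplified for exposition and are not drawn from any published network) treats a small fragment of the English mood system as a set of choice points, where selecting one option may open up further, more delicate choices.

```python
# A minimal, illustrative fragment of a system network: each entry point
# offers a set of options, and an option may itself open a further system
# (increasing delicacy). A toy sketch only, not a full SFL mood network.
MOOD_NETWORK = {
    "clause": ["indicative", "imperative"],          # first choice point
    "indicative": ["declarative", "interrogative"],  # more delicate choices
    "interrogative": ["polar", "wh-"],
}

def selection_paths(feature="clause", path=()):
    """Enumerate every path of choices through the network fragment."""
    options = MOOD_NETWORK.get(feature)
    if options is None:              # terminal feature: one complete selection
        yield path + (feature,)
        return
    for option in options:
        yield from selection_paths(option, path + (feature,))

for p in selection_paths():
    print(" -> ".join(p))
# clause -> indicative -> declarative
# clause -> indicative -> interrogative -> polar
# clause -> indicative -> interrogative -> wh-
# clause -> imperative
```

Traversing the network from least to most delicate choices mirrors what the analyst does when describing which options a speaker has taken up, each choice seen against the background of what might have been chosen but was not.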
8.3 Language in Context 8.3.1 Context of Situation Halliday (1994) proposed that there are three aspects of the context of situation that matter most to our understanding of how language is produced and understood: the Field of discourse (what is being talked about), the Tenor of discourse (the speaker’s relationship to the listener and the message), and the Mode of discourse (the part language is playing in the discourse). Hasan (Halliday & Hasan, 1985) suggested that we can characterize the contextual configuration of any text by describing its Field, Tenor, and Mode, and this provides a succinct description of any text we select for analysis. However, this relationship is bidirectional: we can describe texts in terms of these three parameters, and we can also analyze the aspects of Field, Tenor, and Mode to know what is required to produce a text. At both the macro and micro levels, this tells us what is required in the specific situations that would be classified as a particular type of text. For example, a casual conversation between friends over the phone or via text has different parameters of Field, Tenor, and Mode compared to a formal presentation at a job interview. In the context of assessments and interventions, we can consider the contexts in which a client needs to communicate, with whom, what the communication is about (e.g. technical work terms), and how (e.g. written, spoken communication). Further, we can use the contextual configuration as a guide when considering what language samples to select in order to ensure a range of sampling across contexts (Ferguson, 2000a; Keegan et al., 2022b; Spencer et al., 2005). A detailed description of the contextual configuration provides for a systematic way to identify the relationships between the real-world social context and the linguistic text, and thus helps the speech-language pathologist capture points where speakers may evidence social or pragmatic difficulties, for example, where mismatches occur between use of polite forms and the social power or distance relationships between interactants (Spencer et al., 2009; Togher & Hand, 1998).
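As a rough illustration of how a contextual configuration can be documented systematically, the sketch below (our own schematic in Python, not a published protocol; all labels are invented for illustration) records Field, Tenor, and Mode for the two situations contrasted above, and shows how such records might prompt a clinician to broaden a sampling plan.

```python
from dataclasses import dataclass

@dataclass
class ContextualConfiguration:
    """Field, Tenor, and Mode for one sampled text (illustrative labels only)."""
    field: str  # what is being talked about
    tenor: str  # role relationship between interactants
    mode: str   # the part language is playing (channel, spontaneity, ...)

samples = [
    ContextualConfiguration(
        field="everyday topics, shared experiences",
        tenor="equal power, close social distance (friends)",
        mode="spoken or texted, dialogic, spontaneous"),
    ContextualConfiguration(
        field="own work history, technical work terms",
        tenor="unequal power, formal social distance (interview panel)",
        mode="spoken, face-to-face, partly rehearsed"),
]

# A simple sampling check: flag when all samples share the same Tenor,
# since difficulties may surface only in some role relationships.
if len({s.tenor for s in samples}) == 1:
    print("Consider sampling across a wider range of Tenor relationships.")
```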
8.3.2 Context of Culture The cultural context in which an interaction occurs affects the particular instance of language use or “register” (made up of the register variables Field, Tenor, and Mode, which delineate the contextual configuration as described above); for example, we may adopt a more or less formal register in a particular social context (Eggins, 1994). However, cultural context is even more systematically mapped onto discourse than such particular instances, through text types or “genres” of discourse which are likely to occur in different cultural contexts. Genres refer to texts that share common structural elements inextricably tied to the contextual configuration (Butt et al., 2000). Examples of written genres are text messages, blogs, letters of complaint, recipes, instructions, and novels (with subgenres), and examples of spoken genres are personal narratives, recounts, instructions, and therapy
sessions. Each genre can be seen as having a uniquely defining generic structure potential (GSP) (Eggins, 1994, pp. 25–48), which is made up of a set of obligatory and optional elements, each of which has a distinct contextual configuration in an ordered sequence. So, for example, therapy sessions are a type of discourse with which an experienced speech-language pathologist is highly familiar, and the medical-therapeutic cultural features increase the probability with which particular registers will be used within that genre. The cultural presumptions embedded in this genre become more visible when we observe people new to therapy interactions; for example, Ferguson argued that one of the roles of the clinical education process is to “acculturate” speech-language pathology students to the potential resources available within therapy sessions (Ferguson & Armstrong, 2004; Ferguson & Elliot, 2001), and Simmons-Mackie has discussed what happens when our clients make different assumptions regarding allowable contributions to the therapy session (Simmons-Mackie & Damico, 1999).
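The idea of a generic structure potential lends itself to a simple formal sketch. The toy example below (our own; the element labels for a therapy-session-like genre are invented for illustration and are not taken from the literature) represents a GSP as an ordered list of obligatory and optional elements, and checks whether a text’s observed sequence of elements is consistent with that potential.

```python
# Toy generic structure potential (GSP): ordered elements, some optional.
# Element labels are invented for illustration.
GSP = [
    ("Greeting", True),         # (element, obligatory?)
    ("SocialChat", False),
    ("TaskSetting", True),
    ("TherapyActivity", True),
    ("Feedback", False),
    ("Closure", True),
]

def instantiates(elements, gsp=GSP):
    """True if the observed element sequence fits the GSP's ordering,
    with every obligatory element present."""
    i = 0
    for name, obligatory in gsp:
        if i < len(elements) and elements[i] == name:
            i += 1               # element present in its expected slot
        elif obligatory:
            return False         # obligatory element missing or out of order
    return i == len(elements)    # no leftover, out-of-order elements

print(instantiates(["Greeting", "TaskSetting", "TherapyActivity", "Closure"]))  # True
print(instantiates(["Greeting", "TherapyActivity", "TaskSetting", "Closure"]))  # False
```

As the next section emphasizes, such a specification is a potential rather than a prescription: real interactants negotiate, reorder, and shift these options as the discourse unfolds.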
8.3.3 Text and Genre An understanding of how texts relate to genre is important in deciding which genres to sample, and helps us avoid concentrating our observations and treatment on one particular genre (e.g. narrative) at the expense of others that might also be of importance to clients (e.g. writing an essay, providing a report on a science experiment at school). Language learning from childhood through adolescence requires increasing mastery of a range of genres both in terms of control of the lexicogrammatical resources and the understanding of textual resources and generic structure required. For example, in the first few years of formal education, children/students focus on narratives, while in the middle-school and high-school years there is a demand for mastery over a wide range of genres including exposition, argument, and report. These resources are developed to fulfill the purpose of the text, for example to persuade an audience or to provide specific instructions. In order to master these genres, the student is learning how to make use of the distinct linguistic resources called upon within each genre, while at the same time the student’s mastery of the genre-specific language resources enables their access to the learning in the knowledge domain to which the genre contributes (Rothery, 1996); for example, the genre of “report” plays a major role in the domain of science. We can go beyond just acknowledging that texts are located in a context of situation and culture, in recognizing that different ethnic cultures have different genres; for example, the narrative genre in Indigenous, Arabic, and Asian languages will have a different set of obligatory elements than a “Western” narrative genre, and different cultures may have different expectations about the possibility of social chat with the clinician in a therapy session. Further still, we can begin to recognize that texts create contexts. For example, the client might begin to interview the clinician (about the clinician’s qualifications and experience, say), thus shifting the genre. In other words, the specification and description of register and generic structure potential are not a prescriptive set of requirements for language use; rather, they are a set of options through which speakers chart their own course to make meanings, and they are negotiated throughout the discourse. When using these parts of the SFL framework in speech-language intervention, then, we avoid setting up a predetermined checklist of elements and sequences, but instead ask ourselves what texts our clients are able to produce or understand, and in what contexts, as well as asking how they shape and use the resources from the genre and culture in which they are situated. When this lens is applied to people with communication disorders, we can view their skills as a resource which can be built upon to facilitate more effective communication based on an individual’s contextual needs for communication.
8.3.4 Metafunctions Halliday (1994) proposed that there are three main functions of language: to convey something about the Field of information, to create or maintain the Tenor of interpersonal relationships, and to use the resources of language (Mode) to enable this to happen. As can be seen, the three main functions of language are closely related to the three main aspects seen to be most relevant in the context of situation. But it is not that some utterances express information (Field), while others express relationships (Tenor). Instead, each and every use of language expresses each of the three main functions simultaneously – and hence these functions are called metafunctions. The metafunction expressing Field is called the Experiential metafunction, the metafunction expressing Tenor is called the Interpersonal metafunction, and the metafunction expressing Mode is called the Textual metafunction. In other words, every utterance tells something, establishes a relationship between interactants, and uses language to do it. The importance of this notion of metafunctions is that it provides the link between each of the main aspects of context of situation and the resources available in language to make meanings. Out of all the many resources of language that are available to speakers, SFL proposes that there are certain specific language resources that are the most visible or sensitive reflectors of each metafunction and its relationship to the context of situation. We can look at how the resources of language are used to make meanings at three main levels: semantics and lexicogrammar (the levels of content) and expression (including phonology/graphology, gestures, prosody).
8.4 Levels of Language 8.4.1 Content: Semantics SFL is, generally speaking, a “semantic” perspective; within the approach, a specific level of language is identified using the term “semantics,” often also described as “discourse-semantics.” This level will be commonly recognized by speech-language pathologists as consistent with their understanding of the level of “discourse,” in the sense that we are thinking about how meanings are made through the entire text (Halliday & Hasan, 1976), in other words, its unity of meaning. (In order to avoid confusion, we will use the term “discourse-semantics” to describe this level of language within this chapter.) When we analyze the text as a whole in terms of what it is about, we can look first at how meanings relate to what is being talked about in the external world (reference), and secondly at how the meaning choices relate to other options in the meaning system (lexical relations, e.g. synonymy, antonymy, meronymy, and so on). Both of these systems contribute to the cohesion of the text (Halliday & Hasan, 1976). While reference is considered to realize the Textual metafunction at the discourse-semantic level, lexical relations are considered to realize the Experiential metafunction at the discourse-semantic level. The potential of cohesion analysis as a clinical tool has been applied in speech-language pathology (Coelho et al., 1994; Ferguson, 1993; Fine et al., 1994; Jordan et al., 1991; Liles et al., 1995; Liles & Purcell, 1987; Mentis & Prutting, 1987; Müller & Wilson, 2008; Ripich & Terrell, 1988). Earlier applications of SFL explored how, at the discourse-semantic level, it could be a pathway to an expanded understanding of the relationship between spoken and written texts, and how it offers speech-language pathologists a range of analytic tools for assessment and planning for intervention for clients for whom written language is a high priority, for example adolescents and young adults with language-learning difficulties or acquired language disorders. The three most salient features of discourse which illuminate key aspects of spoken and written texts are considered to
be: the relative lexical density and grammatical intricacy, the use of grammatical metaphor, and rhetorical structure. For adults, spoken language is typically more grammatically intricate (it has a higher average number of clauses per clause complex or per sentence) and less lexically dense (it has fewer lexical content words per clause) than written language, which is conversely typically more lexically dense and less grammatically intricate. One of the main ways to increase lexical density as a resource for meaning in written texts is through the use of grammatical metaphor. Grammatical metaphor is a resource for meaning which involves a process of rank shifting, moving from clause to phrase or clause complex to clause level, for example. The most apparent example in written texts is the use of “nominalization,” in which clauses shift to the rank of phrase level; for example, while a speaker might say, “The school term ended,” a writer might write, “The ending of the school term.” This ability to use the resource of grammatical metaphor marks the development toward the mature writer (Christie, 2002), as does the use of rhetorical structure. Rhetorical structure generally describes the typically observed pattern or sequence of “moves” associated with particular genres. Mortensen (2003) demonstrated the difficulties experienced in using rhetorical structure by writers with acquired language disorder when attempting to write an argument or narrative. At the discourse-semantic level in a conversational interaction, the fundamental shifts between roles of giving and receiving information or goods and services structure the exchange, and these role shifts determine the choices made within the Interpersonal metafunction, reflecting the Tenor of the interpersonal relationship between interactants. In speech-language pathology, analyses of Tenor have been used, for example, to investigate interactions between clients with traumatic brain injury and their everyday speaking partners (Keegan, Hoepner et al., 2022a; Togher, 2000; Togher et al., 1997a, 1997b, 1999; Togher, McDonald et al., 1999), in the autistic population (Bartlett et al., 2005), in children with ADHD (Mathers, 2006), and in developmentally disordered populations (Fine, 1991); more recently, SFL has been used to analyze how humor is expressed to assist with engagement (Keegan et al., 2021).
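Both measures are straightforward to compute once a sample has been segmented. The sketch below is a minimal illustration under simplifying assumptions: the clauses, clause complexes, and content words of a toy sample are supplied by hand, where a real analysis would apply principled criteria for clause segmentation and for what counts as a lexical item.

```python
def lexical_density(words, content_words, n_clauses):
    """Lexical content words per clause."""
    n_content = sum(1 for w in words if w.lower() in content_words)
    return n_content / n_clauses

def grammatical_intricacy(n_clauses, n_clause_complexes):
    """Average number of clauses per clause complex."""
    return n_clauses / n_clause_complexes

# Hand-segmented toy sample: "When the term ended, the children went home."
# (two clauses forming one clause complex)
words = "when the term ended the children went home".split()
content = {"term", "ended", "children", "went", "home"}

print(lexical_density(words, content, n_clauses=2))              # 2.5
print(grammatical_intricacy(n_clauses=2, n_clause_complexes=1))  # 2.0
```

On these toy figures, a written reworking such as “The ending of the term sent the children home” would pack the same content words into a single clause, raising lexical density while lowering intricacy, which is exactly the trade-off described above.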
8.4.2 Content: Lexicogrammar As previously mentioned, SFL proposes that certain lexicogrammatical resources are quite specific to the realization of different metafunctions and how they reflect the context of situation. At the level of the lexicogrammar, the Field of discourse is reflected in the network of choices within the Transitivity system. Simply put, the Transitivity system is the expression of who is doing what to whom, under what conditions: Participants, Processes, and Circumstances, and how they relate to each other (Armstrong, 2001; Mathers, 2001). At the level of the lexicogrammar, the Tenor of discourse is reflected in the network of choices within the Mood system. This network involves the expression of the probabilities and obligations that arise and are negotiable between interactants in discourse. In English, these options include the ordering of Subject and Finite (e.g., inverted in the case of Interrogatives), the form of the Finite (e.g. use of tense), and Mood Adjuncts (expressing the speaker’s attitude to their message, e.g. “unfortunately”) (Ferguson, 1992; Lee et al., 2015; Spencer et al., 2005; Togher & Hand, 1998). At the level of the lexicogrammar, the Tenor of discourse is also reflected in the network of choices in the Appraisal system (Eggins & Slade, 2004; Martin, 2000), which involves the expression of attitudes of the speaker, through the expression of appreciation (the expression of evaluation of an object or process, e.g. “the stroke education presentation was interesting/boring”), affect (the expression of feelings/emotions, e.g. “I’m happy/cross that I went along”), judgment (the expression of judgment about people’s behavior, e.g. “the presenter was skillful/incompetent”), and amplification (resources for grading appraisal, e.g. “very happy”, “just a bit sad”, and use of repetition). Appraisal analysis has been applied to the discourse of people with aphasia (Armstrong, 2005), people with non-dominant hemisphere language impairment (Sherratt, 2007), people who stutter (Lee et al., 2015, 2016a, 2016b), and, most recently, people with TBI (Keegan et al., 2021).
With regard to Mode of discourse, we have already seen that at the discourse-semantic level we have resources for building cohesion, but there are further resources available at the lexicogrammatical level which contribute to overall coherence for the listener, namely, the system of Theme. Theme involves the expression of priority or importance given to elements in a clause, and may strike a chord with those who have considered given/new relationships in texts (though there are important differences between these concepts). Thomson has applied the analysis of Theme to the narrative texts of children with specific language impairment (Thomson, 2005). In English, Theme occurs in the initial position in the clause, and the major points of interest for speech-language pathologists analyzing texts produced by individuals with language impairment are the use of multiple and marked Themes, and the analysis of Thematic progression. The use of multiple Themes reflects the language user’s stage of development and/or access to lexicogrammatical resources; so, for example, “The girl cried” thematizes just “the girl”, whereas “And, unfortunately, the girl cried” highlights a number of meanings. Marked Theme allows the language user to dramatize meaning; so, for example, “After her Mother’s death, the girl cried” highlights the precipitating event rather than the girl’s response, and reflects not only the language user’s access to lexical and grammatical resources, but also the user’s grasp of the situation (pragmatic understanding) and options for how to present an utterance to the listener to achieve specific purposes. Thematic progression through a text is of interest in showing how the language user tracks or draws attention to the unfolding development of main ideas, for example through iteration (“First open the door, then step through the door, and then sit down”), or linear progression (“zigzagging”): The boy approached the door. The door creaked open. Through the opening there was light shining.
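A crude sketch can make the two progression patterns computable. The heuristic below is ours, for illustration only: Themes and Rhemes are supplied by hand, and lexical overlap is approximated by shared four-letter word stems, where a real analysis would use proper cohesion analysis. It classifies each clause-to-clause transition in the “zigzag” example above.

```python
def stems(s):
    """Crude stems: first four letters of words longer than three letters."""
    return {w[:4].lower() for w in s.split() if len(w) > 3}

def progression(pairs):
    """Classify Theme progression between adjacent, hand-analyzed clauses.
    pairs: list of (Theme, Rheme) strings."""
    labels = []
    for (t1, r1), (t2, _) in zip(pairs, pairs[1:]):
        if stems(t2) & stems(t1):
            labels.append("iteration (constant Theme)")
        elif stems(t2) & stems(r1):
            labels.append("linear (Theme picks up the previous Rheme)")
        else:
            labels.append("other")
    return labels

# The "zigzag" example from the text, segmented by hand:
clauses = [("The boy", "approached the door"),
           ("The door", "creaked open"),
           ("Through the opening", "there was light shining")]
print(progression(clauses))
# ['linear (Theme picks up the previous Rheme)',
#  'linear (Theme picks up the previous Rheme)']
```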
As previously highlighted, each instance of language use realizes each of the three metafunctions simultaneously, and this can be exemplified at the level of the lexicogrammar fairly readily. In the example below, we have provided a snapshot of the Experiential metafunction (as realized through Transitivity), the Interpersonal metafunction (as realized through Mood), and the Textual metafunction (as realized through Theme), for just one clause. Example: Analysis at the level of content: lexicogrammar
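The snapshot is best thought of as three simultaneous rows of labels over the same clause. To give a flavor of it, the sketch below lays out one possible coarse analysis of the clause “Unfortunately, the girl cried” (adapting the Theme examples above); the labels are ours and deliberately simplified, and a fuller analysis would be considerably more delicate.

```python
# One clause, three simultaneous analyses (coarse, illustrative labels).
clause = ["Unfortunately,", "the girl", "cried"]

analysis = {
    # Experiential metafunction: Transitivity (who is doing what)
    "Transitivity": ["(no experiential role)", "Participant: Behaver", "Process: behavioral"],
    # Interpersonal metafunction: Mood (declarative: Subject before Finite)
    "Mood": ["Mood Adjunct", "Subject", "Finite + Predicator"],
    # Textual metafunction: Theme (clause-initial elements), then Rheme
    "Theme": ["interpersonal Theme", "topical Theme", "Rheme"],
}

for system, labels in analysis.items():
    print(f"{system:12}: " + " | ".join(
        f"{part} [{label}]" for part, label in zip(clause, labels)))
```

Read column by column, the same wording (“the girl”) is at once a Participant, the Subject, and the topical Theme, which is the sense in which the three metafunctions are realized simultaneously.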
As can be seen from the example, each metafunction is realized through lexicogrammatical choices which reflect each of the aspects of context. Also, the analysis of each metafunction allows us different lenses through which to view different lexicogrammatical resources, with some parts being more visible through one than through another lens. Combined analyses allow for a total picture to emerge as to how the speaker’s meanings are being expressed. These understandings of the lexicogrammatical resources for making meanings provide a framework that the speech-language pathologist can use to explicitly assist the client to consciously make use of these resources within a metalinguistic approach to intervention.
8.4.3 Expression Halliday and Matthiessen described the area of expression in the following way: We can divide the phonology into two regions of articulation and prosody. As a general principle, articulation is “arbitrary” (conventional), in the sense that there is no systematic relation between sound and meaning. Prosody, on the other hand, is “natural”: it is related systematically to meaning, as one of the resources for carrying contrasts in grammar.
(Halliday & Matthiessen, 2014, p. 11) As well as articulation and prosody, the level of expression also includes graphology and gestural expression. To date, there has been limited direct application of SFL approaches to expression within speech-language pathology, although Ferguson and Peterson (2002) have looked at the role of prosody in the expression of social meanings conveyed by the intonation used by communication partners of people with aphasia. They suggest that prosody provides a potential resource for speakers to draw attention to key information when talking with people with comprehension problems associated with aphasia. SFL pays particular attention to prosody, as indicated in the above quote, as a resource for grammatical contrasts, rather than seeing it as a paralinguistic feature separate from the linguistic system. For example, prosody is a major resource for indicating clause (and clause complex) boundaries, and for making given/new distinctions. Thus prosody provides speech-language pathologists with important signposts to assist in analysis. For clinical populations, prosody is potentially both an area of difficulty (for example, in traumatic brain injury) and a resource for meaning in the face of lexicogrammatical compromise (for example, in Wernicke’s aphasia). Müller, Ball, and Rutter (2008) also demonstrated that SFL, particularly prosodic system networks, could be applied to inform intervention targets in a case study of child phonological disorder.
8.5 Clinical Issues SFL is one approach amongst a number that speech-language pathologists are using to assess and develop interventions for children and adults with communication difficulties. SFL is a sociolinguistic perspective and so contrasts sharply with approaches to language analysis based on psycholinguistic explanation (Hand, 2005). As a sociolinguistic perspective, SFL seeks to describe and explain how language is used by speakers and is primarily concerned with understanding the relationship between the talk and the situations in which talking occurs. SFL does not theorize regarding the relationship between language and the brain, nor does it seek to establish universal abstract rules underlying language. It is, however, worth noting that emerging developments in cognitive linguistics, and in computational linguistics in the areas of neural networks and connectionist theories
(Cohen et al., 2000; Daniloff, 2002), are not inconsistent with SFL notions regarding the usefulness of probabilistic modeling (Halliday & Matthiessen, 1999). In relation to other sociolinguistic perspectives, SFL offers a semantic “lens” through which to view all aspects of language use. SFL shares with conversation analysis (CA) (see Wilkinson, Chapter 6 in this volume) its interest in naturalistic sampling, and the importance of co-text in providing resources for the dynamic interaction between speakers (Prevignano & Thibault, 2003), but differs from CA in relating observations back to a “top-down” explanatory theory and in its focus on detailed lexicogrammatical analysis (Ferguson, 2000b). SFL is close to a number of other related discourse theories which share its concerns with contextually based analysis and explanation, most notably the work of Sinclair and Coulthard (Coulthard, 1992), the ethnographic approach of Hymes (Hymes, 1995), and the interaction approach of Gumperz (Eerdmans et al., 2003). Arguably, SFL offers four main aspects of interest to speech-language pathologists beyond these other approaches. First, SFL’s detailed lexicogrammatical analyses allow the speech-language pathologist to comprehensively describe a client’s use of language, allowing consideration of an individual’s strengths and areas of difficulty. Secondly, given the focus on context and how communication is co-created, it allows speech-language pathologists to also focus on environmental factors such as communication partners and the use of supports to facilitate improvements in communication for people with impairments. Thirdly, SFL has been applied across educational, second language learning, and clinical domains (as well as across other applied fields such as stylistics and computational linguistics), and these applications provide a rich resource for speech-language pathologists working with diverse caseloads. And finally, SFL’s characterization of the relationship between culture, context, and text has provided both theoretical and methodological rigor to critical discourse analysis, seeking to explore and question relationships of power and language. Issues of critical literacy, for example of social class, ethnicity, and access to literacy (Damico et al., 2005), and issues of access to print and on-line materials for people with communication difficulties (Ghidella et al., 2005; Rose et al., 2003) are just two of the areas of current concern to speech-language pathologists which can be informed by critical discourse analysis in general, and Systemic Functional Linguistics in particular. Throughout this chapter we have attempted to provide examples of applications of SFL to speech-language pathology. What we hope is clear from these examples is that SFL is not, in itself, a specific approach to treatment, in that the theory is not a theory of learning or of behavioral change. Nor does SFL provide a “recipe” or “checklist” for assessment or treatment targets, as the notion that language use is dynamic and involves choices in the expression of meaning is essential to the theoretical perspective. For the speech-language pathologist, SFL involves a very fundamental shift in thinking, so that rather than thinking in terms of what clients cannot do, or what errors they make, the speech-language pathologist asks what meanings are being expressed and what resources of language are available (or potentially available) to assist their expression.
Müller and Mok (2012) demonstrated strengths in linguistic resources for people with dementia, and Keegan and colleagues have explored how aspects of personality, humor, and emotions are expressed following traumatic brain injury (Keegan, Müller et al., 2022b; Keegan et al., 2017, 2021). The assessment protocols and treatment regimes which emerge from this perspective are highly individualized, and at the same time very detailed, descriptive, and measurable in terms of intra-individual change over time. To date, the majority of published research in speech-language pathology using SFL has provided description of linguistic resources at a range of levels of analysis for different clinical populations. Collectively, this work has highlighted how individuals with communication impairment can communicate effectively despite limitations resulting from aphasia, traumatic brain injury, stuttering, dysarthria, dementia, and autism spectrum disorder.
For example, Müller and colleagues investigated speech functions, initiation, cohesion, and reference in people with dementia. Resources of modality (Lee et al., 2015; Spencer et al., 2009) and appraisal (Lee et al., 2015, 2016a) were explored in people who stutter, both in comparison with non-stuttering adults and post-treatment for stuttering. In relation to traumatic brain injury, recent work has described how identity and emotions are expressed post-injury, using analyses at the discourse-semantic level (appraisal, speech function, and exchange structure analysis) and at the lexicogrammatical level (modality and Transitivity) (Keegan et al., 2017, 2021, 2022a, 2022b). In terms of applications for assessment, Whitworth and colleagues demonstrated how SFL can be applied for assessment across a range of clinical populations, including people with cognitive-communication difficulties, using their narrative-based assessment protocol, the Curtin University Discourse Protocol (CUDP) (Whitworth et al., 2020). Meulenbroek and Cherney (2019) created an authentic workplace language assessment task to evaluate politeness in voicemail messages to assist people with TBI in applying for jobs. This work has also focused on how specific factors such as the context of situation can impact communication, therefore highlighting the need to assess communication in a range of situations and with a range of communication partners that are relevant to the individual person, to ensure that patterns of strengths and difficulties can be documented and intervention can be made relevant to the individual’s needs in order to communicate effectively in everyday situations. The use of SFL for intervention is currently an area of growth. Whitworth and colleagues have developed a narrative-based treatment program (NARNIA) which incorporates aspects of SFL theory in relation to discourse macro-structure with people with aphasia (Whitworth et al., 2015), and have recently piloted this with people with cognitive-communication disorders (Whitworth et al., 2020). There are many challenges for the future in the ongoing application of SFL to speech-language pathology, not the least of which is making the theoretical perspective more readily accessible to speech-language pathologists in the field. More detailed case illustrations with description of therapy applications will be needed, along with greater specification of the clinical decision-making processes involved in the development of individualized assessment protocols and treatment regimes. Analytic methodologies currently well established for research purposes need to be refined, so that subsets of them can be developed that are both valid and reliable for routine clinical use. At the same time, it will be important for speech-language pathologists to maintain close dialogue with systemic functional linguists, as pathological language presents an important crucible in which to test and develop the theory itself. Speech-language pathologists typically find that SFL’s basic concepts of strata, levels of language, and aspects of context (Field, Tenor, Mode) sit comfortably within their other understandings of language. However, SFL offers speech-language pathologists an important series of conceptual challenges through the constructs of the metafunctions of each and every use of language (Experiential, Interpersonal, Textual), and systemic networks.
Work is being done to further speech-language pathologists’ understanding of SFL and its applications, particularly in the area of assessment of communication disorders. Keegan, Hoepner, and colleagues (2022a) have recently operationalized sociolinguistic assessment for clinical use with people with cognitive-communication disorders. They argue that “the tools and techniques outlined here can and should be applied now to truly understand the contextual, social, environmental, and personal influences at play and facilitate the optimal functional impact on the social communication of individuals with cognitive-communication difficulties” (p. 8).
This will facilitate contextually embedded understandings of communication disorders in children, adolescents, and adults, and of the potential for enabling the exchange of meaning.
GLOSSARY OF SFL TERMS
channel: what speech-language pathologists often refer to as the “modality” of communication, e.g. spoken, written, signed
clause complex: more than one clause that exist in some type of structural dependency relationship (parataxis – coordination, hypotaxis – subordination); the spoken equivalent of a written “sentence”
coherence: the perception of unity and sense by the listener
cohesion: the linguistic resources by which a text achieves unity
context: the non-verbal, non-linguistic environment of the use of language
context of culture: the ideological and ethnic environment of the use of language
context of situation: the main aspects of the non-linguistic environment seen to affect the use of language, namely Field, Tenor, Mode
contextual configuration: the unique combination of Field, Tenor and Mode for any use of language
co-text: the linguistic environment of the use of language, for example, surrounding parts of the text
delicacy: the depth of the analysis of choices being made in the linguistic system
discourse: any connected use of language, whether written or spoken, involving one or more interactants, hence including conversation
discourse-semantics: level of language involving systems of meaning which run through the text as a whole
Experiential: the metafunction of language use to be about something
Field: what is being talked or written about
Generic Structure Potential: the obligatory and optional elements in a genre and their sequence
genre: type of discourse, culturally determined
Interpersonal: the metafunction of language use to express and create the relationship between interactants
level: refers to the series of strata of meaning, in which each stratum is “realized” by the level below: extralinguistic levels of context of culture and context of situation, and linguistic levels involving discourse-semantics, lexicogrammar, and expression
lexical relations: how the words used relate to the Field and to each other in the text and in the language system
lexicogrammar: level of language involving systems of meaning expressed in wordings in the clause
metafunction: one of the functions of every use of language (Experiential, Interpersonal, Textual)
Mode: the part language is playing in the discourse
Mood: the lexicogrammatical system of expressing the relationship between the speaker and what is being said, and the relationship between the interactants, at the clause level, involving modality (e.g. declarative, interrogative, imperative), polarity (e.g. negation), and other resources for modulating meaning
move: a semantic unit, reflecting one act of meaning by the speaker, akin to turn-taking in conversation, after which a speaker change could occur without being seen as an interruption. For written texts, moves are signaled through the use of conventions such as sentence punctuation and paragraphing
rank: language is seen as comprising constituents which when combined form meanings at different “ranks”: thus word constituents combine to form noun and verb phrases, which combine to form clauses, which combine to form clause complexes
realized by: each level of language simultaneously reflects or expresses the meanings at the level(s) above it (and each realization constructs the meanings in a similar fashion). For example, a particular culture gives rise to (is realized by) certain genres, a particular genre gives rise to (is realized by) certain registers or contextual configurations, and a particular configuration of Field, Tenor, and Mode will give rise to (is realized by) particular aspects of Experiential, Interpersonal, and Textual meanings respectively, and they in turn will be realized by particular resources in the lexicogrammatical system
reference: how participants are introduced and tracked through the discourse
register: the way an individual speaker has used the contextual configuration of Field, Tenor, and Mode in a particular instance of language use
system network: the choices available to the speaker from the options in the linguistic system, diagrammatically represented
Tenor: the role relationship between interactants
text: some use of language that forms some sort of meaningful unit, has “textuality”
Textual: the metafunction of language use to organize meaning
Theme: the lexicogrammatical system of organizing message salience, into starting points (Theme) and the remainder (Rheme)
Transitivity: the lexicogrammatical system of expressing who is doing what to whom
REFERENCES

Armstrong, E. (2001). Connecting lexical patterns of verb usage with discourse meanings in aphasia. Aphasiology, 15(10–11), 1029–1046. Armstrong, E. (2005). Expressing opinions and feelings in aphasia: Linguistic options. Aphasiology, 19(3–5), 285–296.
Armstrong, E., Ferguson, A., Mortensen, L., & Togher, L. (2005). Acquired language disorders: Some functional insights. In R. Hasan, J. Webster, & C. Matthiessen (Eds.), Continuing discourse on language (Vol. 1, pp. 384–412). Equinox.
Bartlett, S., Armstrong, E., & Roberts, J. (2005). Linguistic resources of individuals with Asperger Syndrome. Clinical Linguistics and Phonetics, 19(3), 203–213. Butt, D., Fahey, R., Feez, S., Spinks, S., & Yallop, C. (2000). Using functional grammar: An explorer’s guide (2nd ed.). National Centre for English Language Teaching and Research. Christie, F. (2002). Classroom discourse analysis. Continuum. Coelho, C. A., Liles, B. Z., Duffy, R. J., Clarkson, J. V., & Elia, D. (1994). Longitudinal assessment of narrative discourse in a mildly aphasic adult. Clinical Aphasiology, 22, 145–155. Cohen, G., Johnston, R. A., & Plunkett, K. (Eds.). (2000). Exploring cognition: Damaged brains and neural networks. Psychology Press. Coulthard, M. (Ed.). (1992). Advances in spoken discourse analysis. Routledge. Damico, J. S., Nelson, R. L., & Bryan, L. (2005). Literacy as a sociolinguistic process for clinical purposes. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 242–249). Blackwell. Daniloff, R. G. (Ed.). (2002). Connectionist approaches to clinical problems in speech and language. Lawrence Erlbaum. Eerdmans, S. L., Prevignano, C. L., & Thibault, P. J. (Eds.). (2003). Language and interaction: Discussions with John J. Gumperz. John Benjamins. Eggins, S. (1994). An introduction to systemic functional linguistics. Pinter. Eggins, S., & Slade, D. (2004). Analysing casual conversation. Equinox. Fairclough, N. (1995). Critical discourse analysis: The critical study of language. Longman. Fairclough, N. (1997). Discourse across disciplines: Discourse analysis in researching social change. In A. Mauranen & K. Sajavaara (Eds.), Applied linguistics across disciplines (AILA Review 12). Association Internationale de Linguistique Appliquée. Ferguson, A. (1992). Interpersonal aspects of aphasic communication. Journal of Neurolinguistics, 7(4), 277–294. Ferguson, A. (1993). Conversational repair of word-finding difficulty. In M. L. Lemme (Ed.), Clinical aphasiology (Vol. 21, pp. 299–310). Pro-Ed. Ferguson, A. (2000a). Maximising communicative effectiveness. In N. Müller (Ed.), Pragmatic approaches to aphasia (pp. 53–88). John Benjamins. Ferguson, A. (2000b). Understanding paragrammatism: Contributions from conversation analysis and systemic functional linguistics. In M. Coulthard (Ed.), Working with Dialogue: Proceedings of the 7th Biennial Congress of the International Association for Dialogue
Analysis, Birmingham, April 8–10 (pp. 264–274). John Benjamins. Ferguson, A., & Armstrong, E. (2004). Reflections on speech-language therapists’ talk: Implications for clinical practice and education. International Journal of Language and Communication Disorders, 39(4), 469–477. Ferguson, A., & Elliot, N. (2001). Analysing aphasia treatment sessions. Clinical Linguistics and Phonetics, 15(3), 229–243. Ferguson, A., & Peterson, P. (2002). Intonation in partner accommodation for aphasia: A descriptive single case study. Journal of Communication Disorders, 35(1), 11–30. Fine, J. (1991). The static and dynamic choices of responding: Toward the process of building social reality by the developmentally disordered. In E. Ventola (Ed.), Functional and systemic linguistics: Approaches and uses (pp. 213–234). Mouton de Gruyter. Fine, J., Bartolucci, G., Szatmari, P., & Ginsberg, G. (1994). Cohesive discourse in pervasive developmental disorders. Journal of Autism and Developmental Disorders, 24(3), 315–329. Ghidella, C. L., Murray, S. J., Smart, M. J., McKenna, K. T., & Worrall, L. (2005). Aphasia websites: An examination of their quality and communicative accessibility. Aphasiology, 19(12), 1134–1146. Gotteri, N. (1988). Systemic linguistics in language pathology. In R. P. Fawcett & D. Young (Eds.), New developments in systemic linguistics: Theory and application (Vol. 2, pp. 219–225). Pinter. Halliday, M. A. K. (1994). An introduction to functional grammar (2nd ed.). Arnold. Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. Longman. Halliday, M. A. K., & Hasan, R. (1985). Language, context, and text: Aspects of language in a social-semiotic perspective. Deakin University. Halliday, M. A. K., & Matthiessen, C. M. I. M. (1999). Construing experience through meaning: A language-based approach to cognition. Continuum. Halliday, M. A. K., & Matthiessen, C. M. I. M. (2014). An introduction to functional grammar (4th ed.). Routledge. Hand, L. (2005). Some comparison and contrast between systemic-functional analyses and traditional clinical linguistic analyses of the discourse of children with specific language impairment. Paper presented at the International Systemic Functional Congress 32: Discourses of Hope: Peace, Reconciliation, Learning and Change. Sydney, July 17–22.
Hersh, D., & Armstrong, E. (2021). Information, communication, advocacy, and complaint: How the spouse of a man with aphasia managed his discharge from hospital. Aphasiology, 35(8), 1067–1083. Hersh, D., Godecke, E., Armstrong, E., Ciccone, N., & Bernhardt, J. (2016). “Ward talk”: Nurses’ interaction with people with and without aphasia in the very early period poststroke. Aphasiology, 30(5), 609–628. Hersh, D., Wood, P., & Armstrong, E. (2018). Informal aphasia assessment, interaction and the development of the therapeutic relationship in the early period after stroke. Aphasiology, 32(8), 876–901. Hymes, D. (1995). Ethnography, linguistics, narrative inequality: Toward an understanding of voice. Taylor & Francis. Jordan, F. M., Murdoch, B. E., & Buttsworth, D. L. (1991). Closed-head-injured children’s performance on narrative tasks. Journal of Speech and Hearing Research, 34, 572–582. Keegan, L. C., Hoepner, J. K., Togher, L., & Kennedy, M. (2022a). Clinically applicable sociolinguistic assessment for cognitive communication disorders. American Journal of Speech-Language Pathology, 32(2S), 1–11. Keegan, L. C., Müller, N., Ball, M. J., & Togher, L. (2022b). Anger and aspirations: Linguistic analysis of identity after traumatic brain injury. Neuropsychological Rehabilitation, 32(8), 2029–2053. Keegan, L. C., Suger, C., & Togher, L. (2021). Discourse analysis of humor after traumatic brain injury. American Journal of Speech-Language Pathology, 30(2S), 949–961. Keegan, L. C., Togher, L., Murdock, M., & Hendry, E. (2017). Expression of masculine identity in individuals with traumatic brain injury. Brain Injury, 31(12), 1632–1641. Lee, A., van Dulm, O., Robb, M. P., & Ormond, T. (2015). Communication restriction in adults who stutter. Clinical Linguistics & Phonetics, 29(7), 536–556. Lee, A., van Dulm, O., Robb, M. P., & Ormond, T. (2016a). Communication restriction in adults who stutter: Part II. Clinical Linguistics & Phonetics, 30(7), 546–547. Lee, A., van Dulm, O., Robb, M. P., & Ormond, T. (2016b). Communication restriction in adults who stutter: Part III. Clinical Linguistics & Phonetics, 30(11), 911–924. Liles, B., Duffy, R., Merritt, D., & Purcell, S. (1995). Measurement of narrative discourse in children with language disorders. Journal of Speech and Hearing Disorders, 38(2), 415–425. Liles, B., & Purcell, S. (1987). Departures in the spoken narratives of normal and language-
disordered children. Applied Psycholinguistics, 8(2), 185–202. Locke, S. (2004). Critical discourse analysis. Continuum. Martin, J. R. (2000). Beyond exchange: Appraisal systems in English. In S. Hunston & G. Thompson (Eds.), Evaluation in text (pp. 143–175). Oxford University Press. Martin, J. R., & Rose, D. (2003). Working with discourse: Meaning beyond the clause. Continuum. Mathers, M. (2001). Language use in Attention Deficit Hyperactivity Disorder: A preliminary report. Asia Pacific Journal of Speech, Language and Hearing, 6(1), 47–52. Mathers, M. E. (2006). Aspects of language in children with ADHD: Applying functional analyses to explore language use. Journal of Attention Disorders, 9(3), 523–533. Matthiessen, C. M. I. M. (2013). Applying systemic functional linguistics in healthcare contexts. Text & Talk, 33(4–5), 437–467. Mentis, M., & Prutting, C. A. (1987). Cohesion in the discourse of normal and head-injured adults. Journal of Speech and Hearing Research, 30(1), 88–98. Mortensen, L. (2003). Reconstructing the writer: Acquired brain impairment and letters of community membership. Unpublished PhD thesis, Macquarie University. Meulenbrook, P., & Cherney, L. (2019). The voicemail elicitation task: Functional workplace language assessment for persons with traumatic brain injury. Journal of Speech, Language, and Hearing Research, 62(9), 3367–3380. Müller, N., Ball, M. J., & Rutter, B. (2008). An idiosyncratic case of /r/ disorder: Application of principles from systemic phonology and systemic functional linguistics. Asia Pacific Journal of Speech, Language and Hearing, 11(4), 269–281. Müller, N., & Mok, Z. (2012). Applying systemic functional linguistics to conversations with dementia: The linguistic construction of relationships between participants. Seminars in Speech & Language, 33(1), 5–15. Müller, N., & Wilson, B. T. (2008). Collaborative role construction in a conversation with dementia: An application of systemic functional linguistics. Clinical Linguistics & Phonetics, 22(10–11), 767–774. Pennycook, A. (2001). Critical applied linguistics: A critical introduction. Lawrence Erlbaum. Prevignano, C. L., & Thibault, P. J. (Eds.). (2003). Discussing conversation analysis. John Benjamins. Ripich, D., & Terrell, B. (1988). Patterns of discourse cohesion and coherence in
Alzheimer’s Disease. Journal of Speech and Hearing Disorders, 53(1), 8–15. Rose, T. A., Worrall, L. E., & McKenna, K. T. (2003). The effectiveness of aphasia-friendly principles for printed health education materials for people with aphasia following stroke. Aphasiology, 17(10), 947–964. Rothery, J. (1996). Making changes: Developing an educational linguistics. In R. Hasan & G. Williams (Eds.), Literacy in society. Longman. Sherratt, S. (2007). Right brain damage and the verbal expression of emotion: A preliminary investigation. Aphasiology, 21(3–4), 320–339. Simmons-Mackie, N., & Damico, J. S. (1999). Social role negotiation in aphasia therapy: Competence, incompetence, and conflict. In D. Kovarsky, J. F. Duchan, & M. Maxwell (Eds.), Constructing (in)competence: Disabling evaluations in clinical and social interaction (pp. 313–342). Lawrence Erlbaum. Spencer, E., Packman, A., Onslow, M., & Ferguson, A. (2005). A preliminary investigation of the impact of stuttering on language use. Clinical Linguistics and Phonetics, 19(3), 191–201. Spencer, E., Packman, A., Onslow, M., & Ferguson, A. (2009). The effect of stuttering on communication: A preliminary investigation. Clinical Linguistics & Phonetics, 23(7), 473–488. Thompson, G. (2013). Introducing functional grammar. Taylor & Francis. Thomson, J. (2005). Theme analysis of narratives produced by children with and without Specific Language Impairment. Clinical Linguistics and Phonetics, 19(3), 175–190. Togher, L. (2000). Giving information: The importance of context on communicative opportunity for people with traumatic brain injury. Aphasiology, 14, 365–390. Togher, L., & Hand, L. (1998). Use of politeness markers with different communication
partners: An investigation of five subjects with traumatic brain injury. Aphasiology, 12, 755–770. Togher, L., Hand, L., & Code, C. (1997a). Analysing discourse in the traumatic brain injury population: Telephone interactions with different communication partners. Brain Injury, 11, 169–189. Togher, L., Hand, L., & Code, C. (1997b). Measuring service encounters in the traumatic brain injury population. Aphasiology, 11, 491–504. Togher, L., Hand, L., & Code, C. (1999). Exchanges of information in the talk of people with traumatic brain injury. In S. McDonald, L. Togher, & C. Code (Eds.), Communication disorders following traumatic brain injury (pp. 113–145). Psychology Press. Togher, L., McDonald, S., Code, C., & Grant, S. (1999). Can training communication partners of people with TBI make a difference? Paper presented at the 22nd Annual Brain Impairment Conference, Sydney. Whitworth, A., Leitão, S., Cartwright, J., Webster, J., Hankey, G. J., Zach, J., Howard, D., & Wolz, V. (2015). NARNIA: A new twist to an old tale. A pilot RCT to evaluate a multilevel approach to improving discourse in aphasia. Aphasiology, 29(11), 1345–1382. Whitworth, A., Ng, N., Timms, L., & Power, E. (2020). Exploring the viability of NARNIA with cognitive-communication difficulties: A pilot study. Seminars in Speech & Language, 41(01), 83–98. WHO (2001). ICF: International classification of functioning, disability and health. World Health Organization.
Young, L., & Harrison, C. (Eds.). (2004). Systemic functional linguistics and critical discourse analysis. Continuum.
9 Multimodal Analysis of Interaction

SCOTT BARNES AND FRANCESCO POSSEMATO

9.1 Introduction

Human communication is multimodal. When people gather together, they talk, position their bodies, shape their hands, contort their faces, direct their gaze, inter alia, in the course of communicative acts. Few would question that these behaviors are important for communication, but recognizing and embracing multimodality as a defining property of human communication has far-reaching consequences for its study. This chapter will introduce theoretical and methodological resources for exploring multimodality in real-time, interactive human communication. Although there is a growing body of research exploring multimodal interaction instrumentally and experimentally (see, e.g., Holler, 2022), the concepts and analytic units that form the basis of these studies are commonly derived from observational studies of interaction, and particularly those employing conversation analysis (see Chapter 6). As such, we will principally draw on multimodal conversation analysis (and related work) in this chapter, and set out methodological strategies suited to observational research. Importantly, too, we will argue that observational studies employing systematic, principled approaches to multimodality and interaction are essential for deepening our understanding of communication disability. The unique constellations of semiotic resources that people with communication disabilities use in their interactions require careful consideration with reference to the foundational meaning-making processes that are intrinsic to, and enabling of, all forms of human communication.
9.2 Signs, Modalities, and Social Activity

Human communication relies on successfully creating and making sense of signs. As such, to begin this chapter, we will briefly outline the neo-Peircean model put forward by Kockelman (2005) and Enfield (2013), and demonstrate how it underpins and enables the semiotic processes (i.e., the activities involved with producing and attributing meaning) that are essential for human communication. This model posits that semiosis is public, dynamic, and relational, and it sets out the dependencies between a Sign, an Object, an Agent, and an Interpretant. Signs are perceptible phenomena that can be related to an
Object; specifically, as standing for that Object. The semiotic process is mediated by two other elements: an Agent who senses the Sign, and an Interpretant, which orients to the Sign as standing for an Object (Enfield, 2013). Within this model, meaning is ascribed in-and-as an Agent instigates an Interpretant. Through its orientation towards the Object, the Interpretant displays an attribution of meaning to the Sign. (It may subsequently become its own Sign, and give rise to its own Interpretants). For instance, a gardener (Agent) who perceives their spouse as saying ‘leaves’ (Sign) may take this as referring to some leaves they failed to rake up (Object). As a result, they may walk to the leaves and rake them (Interpretant). However, a performer who hears their director utter the same word in a rehearsal may simply exit the stage. This highlights the flexible relationship between Signs and Objects, and the essential relevance relationships between Signs and Interpretants, i.e., Interpretants are understood as relevantly arising from Signs. In summary, this model of semiosis reveals the mechanisms enabling meaningful communication regardless of the modalities being utilized. The sensorimotor systems of the human body furnish important infrastructure for creating signs. The configuration and properties of these systems provide a basis for distinguishing the modalities supporting human communication. For instance, following Enfield (2005), Stivers and Sidnell (2005) draw a distinction between the vocal-aural modality and the visuospatial modality and their distinctive channels therein. For the vocal-aural modality, Stivers and Sidnell (2005) identify lexico-syntactic and prosodic channels. For the visuospatial modality, they note channels for gaze, facial expression, manual gestures, and overall body posture / positioning. There is also growing research on the haptic modality, i.e., touch (e.g., Cekaite & Mondada, 2020). Alongside differences in the parts of the body and sensorimotor systems that mediate them, the vocal-aural, visuospatial, and haptic modalities may be variously differentiated through their temporalities (Deppermann & Streeck, 2018). For instance, in typical communication, talk is delivered via clearly bounded, ephemeral bursts, whereas gaze and body positioning are continuously available for conveying meaning (provided that people remain oriented to one another). Moving outside the body, too, the visuospatial (and haptic) persistence of the material environment and its artifacts provides a range of enduring semiotic potentials that may be engaged via the semiotic resources of the body (e.g., gazing to an object, moving to a different space, leaning against a tree). We will return to the relationships between modalities in Section 9.3. For the moment, it is enough to note that people will ordinarily have a range of modalities and channels to draw upon in designing communicative acts. Recognizing that human communication is multimodal does not require us to treat all modalities as equipotent. Instead, we may explore this matter empirically, and examine the ways in which participants themselves dynamically and interactively assign meaning to different semiotic resources (and their combination) to create social activities.
We will adopt an approach to this process that is grounded in the theoretical perspectives advanced by ethnomethodology and conversation analysis, which propose that participants’ observable behaviors pervasively and unavoidably reproduce a shared social world (e.g., Garfinkel, 2002; Heritage, 1984). By acting in particular ways, people demonstrate their understanding of the social activity at hand (e.g., queuing at a coffee shop, gossiping with a friend, interviewing for a job, being a member of an audience) while reproducing that same activity through their unfolding conduct. This commitment to sense-making (and making-sense) is driven by the accountable nature of social action (e.g., Enfield & Sidnell, 2017). That is, people are treated as responsible for their conduct, and are evaluated by others with reference to relevant normative expectations. The public, interactive nature of this sense-making process means that it is available for study in ways that are aligned with its native scene. The implication of this stance for the investigation of multimodality is that, if people are
co-present when communicating, then analysts must engage with the full array of behaviors that participants employed to accomplish the social activity (as well as their relations to the material environment). Moreover, because these behaviors were designed in the first place for others, the subsequent conduct of these same others (i.e., the Interpretants they create) should shape how we come to analyze the significance of the behavior. We will take up these implications in more detail in the sections that follow.
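For readers who find a schematic rendering helpful, the dependencies set out in the neo-Peircean model above can be sketched in a few lines of code. The sketch below is ours and purely illustrative: the model itself is not a computational proposal, and every class, attribute, and value name here is invented for exposition. It simply restates the gardener/performer example, showing how one Sign can relevantly give rise to different Interpretants for different Agents.

from dataclasses import dataclass

# Illustrative sketch only: the neo-Peircean model (Kockelman, 2005;
# Enfield, 2013) is not computational; all names here are invented.

@dataclass
class Sign:
    form: str              # a perceptible phenomenon, e.g. the utterance 'leaves'

@dataclass
class SemioticObject:
    referent: str          # what the Sign is taken to stand for

@dataclass
class Interpretant:
    conduct: str           # behavior orienting to the Sign as standing for the Object
    object: SemioticObject

class Agent:
    """An Agent senses a Sign and instigates an Interpretant; meaning is
    ascribed in-and-as this happens, so the Sign-Object relation is
    flexible and locally contingent."""

    def __init__(self, name, local_context):
        self.name = name
        self.local_context = local_context  # the Agent's situated Sign-Object mappings

    def interpret(self, sign):
        obj = self.local_context[sign.form]
        return Interpretant(conduct=f"{self.name} acts with respect to {obj.referent}",
                            object=obj)

# The same Sign yields different Interpretants for different Agents.
leaves = Sign(form="leaves")
gardener = Agent("gardener", {"leaves": SemioticObject("unraked leaves in the yard")})
performer = Agent("performer", {"leaves": SemioticObject("the cue to exit the stage")})
print(gardener.interpret(leaves).conduct)   # e.g., raking the leaves
print(performer.interpret(leaves).conduct)  # e.g., exiting the stage

Even this toy version makes visible a key property of the model: the Interpretant, not the Sign alone, is where an attribution of meaning is publicly displayed.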
9.3 Realizations of Multimodality

The ethnomethodological perspective on human social life brings with it various kinds of frightening ‘tyrannies’ (cf. Enfield & Sidnell, 2017; Rawls, 2002, p. 56). One such tyranny is that of accountability, which, as we previously noted, refers to the pervasive application of normative expectations to behavior (Enfield & Sidnell, 2017). Here, we will invoke another tyranny: the tyranny of embodiment. That is, when co-present and communicating with one another, people are always in and bound by their bodies. Together, the tyrannies of accountability and embodiment guarantee that people’s bodies will be understood as meaningful / communicative, and encourage people to use and regulate their bodies in disciplined ways. This means that, when communicating with one another, people are committed to presenting their bodies in ways that align with the social activity at hand. However, this does not mean that each modality will be carrying an equivalent semiotic burden. The vocal-aural modality has a privileged role in accomplishing social activity in most communities. This (minimally) reflects both the semiotic potential of language, and the fact that human cultures are overwhelmingly biased towards talk-based ways of managing social life. For example, social activities may mandate that participants contribute by speaking (e.g., in courtrooms, arguments, classrooms, wedding ceremonies), and participants may be held accountable for not delivering a communicative act via talk (e.g., ‘Don’t just nod, you have to say it!’). It is also clear that there are many intersecting normative expectations for speaking in interaction, a number of which have been captured in research on turn-taking, sequence, and repair organizations (Schegloff, 2006). So, the vocal-aural modality and its products are uniquely and demonstrably privileged resources for accomplishing human communication, and they have distinctive and robust internal organizational features. In recognizing this, however, we need to be mindful that talk is typically embedded in moments that are richly multimodal. As well, whatever systemic characteristics talk may have unto itself, there are a range of systemic and flexible ways for it to converge with semiotic resources delivered through other modalities. We will now briefly explore a relationship between a talk-based system for organizing interaction and other relevant modalities, and then consider environments where different modalities can be supporting different activities. Turn-taking is a foundational aspect of communication, and provides a way of managing participation through talking (Levinson, 2016). Despite the quintessentially vocal-aural nature of turn-taking, it has long been recognized that turn-taking must be interwoven with semiotic resources delivered through the visuospatial modality; particularly, the direction of gaze (e.g., Lerner, 2003; Sacks et al., 1974). This is perhaps most visible in the area of turn allocation. The turn-taking system provides for a number of outcomes around transition-relevance places, i.e., moments when another speaker may take the floor (but need not). The current speaker may select the next speaker, another speaker may self-select for the next turn, or the current speaker may take the floor once more, and add another turn-constructional unit. The current speaker is in a structurally advantageous position in
that they can (and often do) design their turn to make clear who should take the floor next (Auer, 2021). One highly explicit talk-based means they have available to do so is to utter the name of the selected person. Interestingly, however, it is rare for current speakers to employ this strategy; moreover, it appears to be reserved for particular, marked interactional circumstances (see Hamdani et al., 2022). Instead, the default, unmarked method for managing next speaker selection blends talk and embodiment, with features of the turn (e.g., use of pronouns, morphosyntax, word choice) combining with visuospatial resources (e.g., gaze, body positioning) to demonstrate that someone in particular is being selected, and what they are being selected to do (see Blythe et al., 2018). In summary, the successful operation of turn-taking relies on the systemic coordination of behavior across modalities, with talk and visuospatial resources dynamically blended to manage the participatory contingencies of the moment. The independence of modalities can also enable different tracks of social activity to develop simultaneously, with particular modalities (temporarily or persistently) aligned with each activity. This is especially salient in activities that are reliant on artifact-rich scenes, like interactions at meal-times, in supermarkets, while driving cars, or undertaking air traffic control, for example (e.g., Arminen et al., 2014; Keisanen & Rauniomaa, 2012; Mondada, 2019). The requirements and affordances of such scenes allow for separate trajectories to emerge across modalities, with utilization of native artifacts supporting some aspects of the social activity while talk fosters others (e.g., preparing a meal while having a conversation) (Mondada, 2019, pp. 50–51). Of course, the streams of activity carried out via separate modalities can converge – and, in some cases, must converge – for their normative achievement. As a result, there is much potential for the development of conduct that necessarily spans modalities; both simultaneously and sequentially. Arminen et al. (2014) offer an interesting demonstration of these phenomena by exploring the multi-activity environment of air traffic control training. Here, the trainee must concurrently interact with their trainer, the aircraft they are controlling, the air traffic control systems, and other air traffic controllers. As well, they must dynamically prioritize one or more of these interactions as they engage in the training task. As a result of these layers of activity, trainees must seamlessly shift between modalities, with inspection and manipulation of particular screens and controls implicating mediated and non-mediated talk with various parties, and vice versa. This creates patterns of sequential activity that structurally integrate talk and embodied activity in distinctive trajectories of action. Along more mundane lines, Keisanen and Rauniomaa (2012) and Rauniomaa and Keisanen (2012) examine vocal requests for action (i.e., recruitments, see Floyd et al., 2020) and their responses in everyday interactions, focusing on their timing and composition. Although less exotic-seeming than air traffic control, everyday interactions also provide artifact-rich sites of interaction (e.g., homes, cars), and occasion myriad social identities (e.g., friend-friend, mother-daughter) and purposeful courses of action (e.g., topical talk, gossiping, giving advice).
This means that requests routinely arise in moments laden with environmental, identity-based, and/or activity-based contingencies that must be negotiated to successfully complete these trajectories of action. Keisanen and Rauniomaa (2012) observe that, prior to the vocalization of a request, participants position their bodies and artifacts in ways that demonstrate orientation towards the target of the incipient request (e.g., pointing to or gazing at an object). In some instances, these displays occur in overlap with talk, allowing the activities taking place through talk to progress, uninterrupted, until there is space to produce the vocal request foreshadowed by the embodied display. Responses to these requests also offer interesting modality-specific patterns. Rauniomaa and Keisanen (2012) demonstrate that, when there are few contingencies impeding compliance with the request, recipients tend not to provide vocal responses, and instead proceed with embodied fulfilment of the request (e.g., by passing an object). However, when there are barriers to immediate fulfilment (but the response is still favourable), request recipients produce a vocal
Multimodal Analysis of Interaction 119 acceptance of the request (e.g., yeah, okay) while making progress towards its fulfilment. In summary, the composition of requests and their responses is patently multimodal, and the selection of modality is carefully fitted to its moment of deployment, and the nature of action being undertaken (see, e.g., Pekarek Doehler et al., 2021, on other action sequences). In Section 9.3, we have discussed the synchronous and sequential interweaving of modalities to effect social action. A number of the examples explored are consistent with the notion of multimodal gestalt, which describes the ephemeral convergence of different semiotic resources to create discrete signs and actions (e.g., Mondada, 2018). Stukenbrock (2021) d istinguishes multimodal gestalts that are designed and delivered in stable, sedimented forms – like the essentially grammatical relationships between demonstratives and visuospatial configurations – from multimodal gestalts that are the result of ad-hoc adaptation, and assembled to meet local contingencies. Moreover, she describes a pathway from the latter to the former, with ad-hoc gestalts having the potential to develop into more stable, normatively-implicated constructions as they are reproduced and distributed through communities of people and their social activities (see also Streeck, 2018, for a more expansive perspective). Stukenbrock’s (2021) contrast aligns with the semiotic process itself; that is, it invokes people’s effectiveness at making sense of individual moments of communication, while also being able to employ normative expectations about communication across moments and over time. From a more linguistic perspective, too, it also foreshadows a broadening and deepening of what we might consider to be synchronic relationships in a language, offering a pathway for incorporating grammar and the body (see Keevallik, 2018). But, in order to foster these fascinating lines of inquiry, a robust research methodology is required.
9.4 Analytic Strategies for Multimodality

The challenges that multimodality poses for observational research on human communication are substantial. This is perhaps particularly so for traditions like conversation analysis that have a history of transcription and – both rightly and wrongly – talk-centricity (cf. Mondada, 2018). In this section, we will introduce and critically discuss transcription and analysis practices relevant for multimodal analysis of interaction. We will argue that, although they have a number of limitations, these practices are suited to pursuing communicative/interactional phenomena on their own terms, with a focus on the temporal unfolding of participants’ conduct. Transcribing is a defining practice of conversation analysis. Its transcription conventions were originally established to represent and describe primarily talk-based interactional practices, i.e., to ‘reanimat[e] the talk’ (Hepburn & Bolden, 2017, p. 4). By capturing what, how, and when people contribute in interaction, transcriptions make the video- and audio-recorded data available for analytic inspection. This enables an ‘emic’ (i.e., participant-relevant) account of the sense-making processes captured in recordings. The apparent detail of conversation-analytic transcription conventions gives off a sense of comprehensiveness, but transcribing is a selective practice (Mondada, 2018, p. 88). Transcripts also embed a degree of analytic discernment, which is reflected, most notably, in the kinds of details that are included and excluded in them. Transcripts of talk-mediated interactions generally include aspects of speech delivery (e.g., pitch, tempo, loudness, duration, final intonation and overall voice quality), other features accompanying it (e.g., laughter, crying, and aspirations), and the temporal and sequential relationships between turns (e.g., overlaps, silences). Gestures and gaze were included in the transcriptions of some seminal conversation-analytic work (e.g., Goodwin, 1981; Schegloff, 1984), but recent advancements in video and audio-recording technology have sparked more intensive interest in the analysis of visible conduct across a variety of contexts (cf. Nevile, 2015, for an overview of the studies). As a result, both within conversation analysis and in cognate disciplines (e.g., gesture studies, ethnography, and psychology), there
120 Scott Barnes and Francesco Possemato has been a focus on creating conventions for transcribing conduct delivered via the visuospatial modality (e.g., Enfield, 2009; Kendon, 2004; Kita, 2003; McNeill, 1992; Streeck et al., 2011). In the conversation-analytic tradition, there remains no standard way of transcribing visible behavior in interaction (Hepburn & Bolden, 2017), with most researchers adapting and expanding the Jeffersonian transcription conventions originally developed for talkin-interaction (Jefferson, 2004). At the same time, there have been many new innovations in transcription directed at finding more fitting ways of capturing the layered dynamics of multimodality. Table 9.1 summarises a variety of multimodal transcription practices employed in selected multimodal conversation-analytic (and related) studies. Interested readers are encouraged to explore these studies and consider the transcription solutions that the researchers reached in order to capture and present their key phenomena. In the paragraphs that follow in this section, we will describe some of the principles and objectives
Table 9.1 Exemplary studies employing multimodal transcription.

Goodwin (2003)
Dataset and focus: A single segment of interaction from an everyday conversation involving a man with aphasia.
Key transcription features: Jeffersonian conventions accompanied by transcriber annotations and ‘mapped pictures’ (Tufte, 2006) to form ‘analytic sketches’.

Iwasaki et al. (2019)
Dataset and focus: A single segment of interaction from an everyday conversation involving Australian deafblind speakers.
Key transcription features: Tactile Auslan glosses and translation of signs accompanied by ‘Frames’ in the form of screenshots with explanatory captions.

Kendon (2004)
Dataset and focus: Gestures in everyday Neapolitan and Italian conversation.
Key transcription features: Conventional orthography marked with tone unit boundaries accompanied by symbols indicating the different gesture phases including preparation, stroke, hold and recovery. Transcripts are complemented by drawings of visible action.

Mondada (2007)
Dataset and focus: Turn-taking in workplace meetings.
Key transcription features: Jeffersonian conventions complemented by a system derived from Goodwin’s gaze conventions (1981) and Schegloff’s gesture notation (1984) capturing different gesture phases (e.g., preparation, stroke, hold, and retraction) and their temporal alignment with emerging talk.

Possemato et al. (2021)
Dataset and focus: Locational pointing practices in multiparty everyday conversations by long-term residents of remote Western Australia.
Key transcription features: Jeffersonian conventions accompanied by descriptors of point gesture including articulator, morphology, directionality, and motion. Use of annotated screenshots and overlaid graphics derived from Geographic Information System (GIS) data.

Rossano (2012)
Dataset and focus: Gaze direction and sequence organization in dyadic and triadic everyday Italian conversations.
Key transcription features: Jeffersonian conventions accompanied by iconic representations of faces (ovals) and gaze direction and shifts (arrows).
informing multimodal transcription, before offering some remarks on how this articulates with the analytic outcomes they are designed to achieve. A key starting point for multimodal transcription and analysis is to document each of the different semiotic resources mobilized by participants in the visuospatial modality (alongside any accompanying talk) (Stivers & Sidnell, 2005) and demonstrate how they incrementally unfolded in real-time. Practically, this often results in transcripts that are still organized in lines/rows, but have multiple sub-lines/-rows for selected non-talk modalities. Because, as we discussed in Section 9.3, the synchronous and sequential deployment of multimodal conduct is constitutive of social action, a fundamental objective of multimodal transcriptions is to preserve these relationships in their transcript-based representations (Mondada, 2018). Moreover, transcripts must capture participants’ public and accountable orientations towards embodied and material resources, i.e., they must reveal aspects of interactivity by capturing participants’ conduct relative to one another. As such, transcript lines may include conduct from multiple participants. Through these transcription practices, multimodal conversation analysis develops a transparent, practical basis for generating emic interpretations of interactional practices. The cost of this, however, is that transcripts can become very informationally dense and demanding of the reader. One recurrent challenge for transcribing multimodality surrounds the description of embodied conduct. Many of the terms used for describing this conduct (e.g., hand positions, head movements) bring with them a priori qualities and connotations that may be only partially consistent with the heavily contextualized way that bodily resources and artifacts are utilized by participants in interaction (Mondada, 2018, p. 95). Nonetheless, glosses are regularly used to describe salient visible conduct, such as gaze direction and shifts, gesture phases, morphology and direction, and body arrangement and movement in spaces. Although they can provide a reasonably granular rendering of the composition of multimodal conduct, textual descriptions cannot exhaust the richness of local ecologies in which it is assembled and organized into, for instance, multimodal gestalts. For this reason, more iconic ways of representing multimodal gestalts – such as video still frames – are regularly employed in documenting recordings. The integration of visual details of interaction into transcriptions that already include detailed transcription of multimodal conduct realises a hybrid textual-visual dimension of transcripts as analytic objects (Mondada, 2016) while enabling the researcher to achieve a ‘synthetic view’ on multimodal conduct (Mondada, 2018, p. 91), i.e., one which provides a layered analytic representation of its temporal-sequential courses. The inclusion of depictive representations of multimodal behavior becomes particularly important when analysing moments in which talk is absent (e.g., Hoey, 2020; Mondada, 2019; Vatanen, 2021), the social activity is mobile (Cekaite & Mondada, 2020), and participants’ intersubjective sensorial experiences are in focus (e.g., Mondada, 2021, 2022). As we have indicated so far, multimodal transcription practices are intended to foster analyses that follow participants’ own sense-making.
At a basic level, this means that researchers must show that and how participants’ conduct effects specific outcomes, as demonstrated in and through the behavior of other participants. Ultimately, analysis aims to develop two interrelated kinds of accounts: 1) accounts of momentary sense-making and its outcomes; and, 2) accounts of the normative infrastructure supporting sense-making and its outcomes. These different ways of working have been respectively termed ‘single episode’ analysis and ‘collection’ analysis in the wider conversation-analytic literature (see Schegloff, 1987; also Barnes et al., 2019). Single episode analysis seeks to reveal the ad-hoc ways that participants design and take up conduct on a moment-by-moment basis. Collection analysis is grounded in multiple accounts of single episodes, but narrows inquiry to a discrete domain in order to discover and describe the normative basis for its organization. In both ways of working, analyses of multimodal conduct are likely to reveal systematic convergences of semiotic resources (even if one or more are demonstrably in the foreground).
As Mondada (2018, 2019) observes, an analytic focus on talk sequesters evidence of participant sense-making to the next turn (cf. Sacks et al., 1974), but the ongoing simultaneities of the body, artifacts, and the surrounding environment provide an enduring and dynamic scene for participants to interactively regulate. This means that analytic evidence for claims about multimodal interaction is often distributed throughout modalities and across time scales (e.g., Stukenbrock, 2021).
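As a rough computational analogue of the line/sub-line transcript organization described in this section, consider the following sketch. It is our illustration only: the fragment, participants, timings, and tier labels are invented, and actual conventions such as Mondada’s are considerably richer and remain text-based. It treats a multimodal transcript as a set of time-aligned tiers, one per participant and modality, so that synchronous and sequential relationships are preserved rather than flattened.

from dataclasses import dataclass

# Toy representation of a time-aligned multimodal transcript; the
# fragment and its timings are invented for illustration.

@dataclass
class Annotation:
    start: float   # seconds from the start of the recording
    end: float
    value: str

# One tier per (participant, modality), mirroring the line/sub-line
# layout described above for talk and selected non-talk modalities.
transcript = {
    ("A", "talk"):    [Annotation(0.0, 1.2, "could you pass the-"),
                       Annotation(1.6, 2.0, "yeah that one")],
    ("A", "gesture"): [Annotation(0.4, 1.8, "points toward jug")],
    ("B", "gaze"):    [Annotation(0.2, 1.0, "to A"),
                       Annotation(1.0, 2.2, "to jug")],
    ("B", "action"):  [Annotation(1.3, 2.5, "reaches for and passes jug")],
}

def co_occurring(transcript, t0, t1):
    """Return, per tier, the annotations overlapping the window [t0, t1],
    e.g. to inspect what every participant is doing as a turn unfolds."""
    return {tier: [a for a in anns if a.start < t1 and a.end > t0]
            for tier, anns in transcript.items()}

# What converges with A's pointing gesture (0.4 to 1.8 seconds)?
for tier, anns in co_occurring(transcript, 0.4, 1.8).items():
    print(tier, [a.value for a in anns])

Even this minimal structure makes visible the simultaneities that a talk-only transcript would flatten, which is precisely what multimodal transcription conventions are designed to preserve.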
9.5 Communication Disability and Multimodality

There are a number of intersecting reasons why multimodal analysis of interaction is important for research and professional practice with communication disability. First, co-present communication is the primary medium through which we carry out our everyday lives, and improving the efficacy of this form of communication is the ultimate objective of much speech pathology practice for communication disability (Barnes & Bloch, 2019; Doedens & Meteyard, 2022). Second, many people who experience communication disability have reduced access to one or more semiotic modalities. One implication of this reduced access is that people with communication disability must recurrently engage in interactions in which there are strong asymmetries between them and their communication partners, including in the semiotic resources available to each party, the primary modalities employed for communication, or both. Finally, but not exhaustively, the real-time regulation of semiotic resources has substantial potential to inform accounts of the sensorimotor and cognitive systems supporting their typical and atypical functioning (cf. Barnes & Bloch, 2019; Holler, 2022). In sum, this means that improvements to our understanding of multimodality and communication disability are likely to have a range of theoretical and practical benefits. Despite broad recognition of its importance, there are some unhelpful ideologies about multimodality that are evident in scholarly and professional literature relevant for communication disability (Barnes, 2019). For instance, multimodality has often been regarded as a unique feature of communication involving people with severe restrictions on speech and/or who use augmentative and alternative communication (AAC), rather than an inherent feature of all communication and social life. In many instances, what is being topicalized with this association is the reapportionment of semiotic burden between modalities in interactions involving people with severe restrictions on speech, with another modality fulfilling a semiotic role that the vocal-aural modality would ordinarily – and, more specifically, would normatively – support. This implicit ideology, although understandable, can interfere with the conceptualization, study, and professional measurement of communication involving people with communication disability by detaching them from typical communicative expectations, i.e., by treating their communication as atypical or exceptional in ways that it is not. At the other end of the spectrum, this way of thinking has also encouraged researchers and clinicians to disattend to the multimodal nature of communication difficulties for people who experience impairments affecting speech, but still retain access to it as a robust semiotic resource. Instead, we suggest that researchers (and clinicians) should explicitly adopt a perspective on multimodality and communication disability that takes multimodality as an inherent feature of all communication, and embeds it in the foundational interactional tasks that mediate social life. This can provide a firm basis for exploring just how communication disabilities influence the shape of human communication.
As briefly outlined in Section 9.4, the “embodied turn” in research on language and social interaction (Nevile, 2015) has also been reflected in recent communication disability research (e.g., Barnes et al., 2022; Killmer et al., 2022; Korkiakangas, 2018; Merlino, 2018; Norén & Pilesjö, 2016; Savolainen et al., 2020), although perhaps to a lesser extent. There is also a
Multimodal Analysis of Interaction 123 strong history of research applying multimodally informed conversation analysis to communication disability (e.g., Bloch & Wilkinson, 2009; Goodwin, 2003; Mahon, 2009; Rhys, 2005; Wilkinson et al., 2010, 2011). In many of these studies, systematic engagement with multimodality was necessary because, without it, the specifics of the communicative phenomena would be rendered unavailable (e.g., gestures, gaze patterns, use of AAC systems). These studies have revealed a wide range of ways that people with the communication disabilities and their communication partners may employ multiple modalities to accomplish effective communication despite reduced access to vocal-aural resources, as well as challenges intrinsic to relying on atypical configurations of modalities, and semiotic resources that are less powerful than talk. As a whole, however, it is apparent that this body of research has been quite pragmatic, and guided by the topics, populations, and native phenomena under study. This is not necessarily problematic – and, in many ways, it is desirable – but we will suggest that there are opportunities to proceed more programmatically with observational research on multimodality in interactions involving people with communication disability (cf. Barnes & Bloch, 2019; Wilkinson, 2019). For many people with communication disability – and especially people with acquired communication disabilities – the semiotic resources and communicative practices they bring to bear in interaction are not (or not singularly) the result of processes commensurate with the typical transmission of these behaviors throughout a community (cf. Enfield, 2014; Streeck, 2018; Stukenbrock, 2021), e.g., one does not become a speaker with aphasia in precisely the same way that one becomes a speaker of Spanish. This means that there is unlikely to be unique multimodal constructions in interactions involving people with communication disabilities that have the same causal origins as the community-wide multimodal constructions normatively employed by typical speakers of a language. (Of course, people with communication disabilities may use – and perhaps adapt – these same multimodal constructions as a member of the same community). That said, it is certainly the case that communication disabilities will pose recurrent challenges for creating signs and regulating interaction, which people with communication disabilities and their communication partners must iteratively address in the course of their everyday lives. As well, individual people with communication disabilities will have a range of unique competencies, experiences, and preferences that will influence the form of their personal interactions. There is therefore potential for systematic ways of configuring modalities to emerge in their interactions, and the locus of this systematicity may be at the level of individuals, dyads, populations, and, in some cases, discrete communities. Together, this points toward at least two analytic targets for research on multimodality, interaction, and communication disability: (1) exploration of unique methods for creating signs and regulating interaction; and, (2) exploration of recurrent challenges for creating signs and regulating interaction. These two targets are analogous to the forms of multimodal gestalt described by Stukenbrock (2021), and are well-aligned with the analytic strategies for conversation-analytic research outlined in Section 9.4. 
For Target 1, the aim would be to explore how people with communication disability configure modalities to meet the occasioned demands of social activities and specific moments therein. In doing so, such analysis would reveal the situated work of the parties to the interaction, and how it achieves the action and/or social activity at hand (e.g., Goodwin, 2003; Killmer et al., 2022; Norén & Pilesjö, 2016). For Target 2, the aim would be to explore how more generic aspects of interaction are managed by people with communication disabilities, e.g., regulating participation/turn-taking, building sequences of action, initiating repair (e.g., Barnes et al., 2022; Savolainen et al., 2020; Wilkinson et al., 2010). This initiative implicates a greater degree of abstraction and would seek to identify stable, normatively implicated multimodal constructions. As such, it would benefit from data collection involving a substantial number of participants, and longitudinal study at various timescales (see, e.g., Stukenbrock, 2021,
pp. 3–4; Deppermann & Streeck, 2018). In doing so, such analysis would reveal the multimodal “practice(d) solutions” (Schegloff, 2006, p. 71) that people with communication disabilities and their communication partners develop as persisting methods for addressing the problems of meaning and participation characteristic of real-time interaction. This may also provide insight into population-specific methods (e.g., ways that people with aphasia take turns), and ones that are more widely distributed (i.e., arise across populations). For both targets, it is likely that bespoke ways of transcribing, describing, and depicting will be required to capture the effects of communication disability on key semiotic resources (e.g., participant body positioning, movement, talk, gaze) as well as its impacts on the interaction itself (e.g., complex sequencing, distorted timing). As we have noted in Section 9.4, there is great diversity in the transcription of non-talk modalities. We think that researchers exploring multimodality in interactions involving people with communication disabilities must continue to be creative on this front to ensure that they are meeting multimodal phenomena on their own terms, while still maintaining the ability to carefully capture their temporalities.
9.6 Conclusion

Human communication is multimodal and, with each moment of communication, people use these modalities to dynamically reproduce a shared social world. The interrelationships between modalities are multiple and deep, and, in interaction, they are woven together synchronously and sequentially to effect social action. Studies employing multimodal conversation analysis aim to capture how people make sense of multimodal communication behavior in real-time, and there is a burgeoning body of research on the configurations of semiotic resources that people with communication disabilities employ in interaction. As knowledge on multimodality in human communication continues to accumulate, it appears inevitable that speech pathology assessment and intervention will become more adept at conceptualizing it, capturing it, and engaging with it (cf. Pierce et al., 2019). With regard to speech pathology assessment in particular, the increasing accessibility and usability of instrumental methods for documenting human behavior (e.g., Holler, 2022) will likely be appealing to clinicians seeking robust measurement practices for multimodal human communication. However, such tools leave in place the task of determining the meaning of behaviors that are instrumentally detected. This challenge will require sustained effort on the part of researchers, and reflection on the part of clinicians (cf. Barnes & Bloch, 2019). Happily, however, many of the analytic strategies employed in observational research on multimodality in interaction are readily replicable in clinical practice (e.g., Beeke et al., 2007), even if they may require a different packaging for this professional purpose (e.g., transformation into a rating scale). Nonetheless, it is clear that rigorous analysis of multimodal interaction can bring researchers and clinicians closer to the specifics of communication in everyday life, and the competencies, challenges, and experiences of people with communication disabilities.
REFERENCES Arminen, I., Koskela, I., & Palukka, H. (2014). Multimodal production of second pair parts in air traffic control training. Journal of Pragmatics, 65, 46–62.
Auer, P. (2021). Turn-allocation and gaze: A multimodal revision of the “current-speaker-selects-next” rule of the turn-taking system of conversation analysis. Discourse Studies, 23(2), 117–140.
Barnes, S. (2019). Improving the ideas behind multimodal communication. Journal of Clinical Practice in Speech-Language Pathology, 21(3), 131–134. Barnes, S., Beeke, S., & Bloch, S. (2022). How is right hemisphere communication disorder disabling? Evidence from response mobilizing actions in conversation. Disability and Rehabilitation, 44(2), 261–274. Barnes, S., & Bloch, S. (2019). Why is measuring communication difficult? A critical review of current speech pathology concepts and measures. Clinical Linguistics & Phonetics, 33(3), 219–236. Barnes, S., Toocaram, S., Nickels, L., Beeke, S., Best, W., & Bloch, S. (2019). Everyday conversation after right hemisphere damage: A methodological demonstration and some preliminary findings. Journal of Neurolinguistics, 52, 1–19. Beeke, S., Maxim, J., & Wilkinson, R. (2007). Using conversation analysis to assess and treat people with aphasia. Seminars in Speech and Language, 28(2), 136–147. Bloch, S., & Wilkinson, R. (2009). Acquired dysarthria in conversation: Identifying sources of understandability problems. International Journal of Language & Communication Disorders, 44(5), 769–783. Blythe, J., Gardner, R., Mushin, I., & Stirling, L. (2018). Tools of engagement: Selecting a next speaker in Australian Aboriginal multiparty conversations. Research on Language and Social Interaction, 51(2), 145–170. Cekaite, A., & Mondada, L. (Eds.). (2020). Touch in social interaction: Touch, language and body. Routledge. Deppermann, A., & Streeck, J. (Eds.). (2018). Time in embodied interaction: Synchronicity and sequentiality of multimodal resources. John Benjamins Publishing Company. Doedens, W., & Meteyard, L. (2022). What is functional communication? A theoretical framework for real-world communication applied to aphasia rehabilitation. Neuropsychology Review, 32(4), 937–973. Enfield, N. J. (2005). The body as a cognitive artifact in kinship representations: Hand gesture diagrams by speakers of Lao. Current Anthropology, 46(1), 1–26. Enfield, N. J. (2009). The anatomy of meaning: Speech, gesture, and composite utterances. Cambridge University Press. Enfield, N. J. (2013). Relationship thinking: Agency, enchrony, and human sociality. Oxford University Press.
Enfield, N. J. (2014). Natural causes of language: Frames, biases and cultural transmission. Language Science Press. Enfield, N. J., & Sidnell, J. (2017). The concept of action. Cambridge University Press. Floyd, S., Rossi, G., & Enfield, N. J. (Eds.). (2020). Getting others to do things: A pragmatic typology of recruitments. Language Science Press. Garfinkel, H. (2002). Ethnomethodology’s program: Working out Durkheim’s aphorism. Rowman & Littlefield Publishers. Goodwin, C. (1981). Conversational organization: Interaction between speakers and hearers. Academic Press. Goodwin, C. (2003). Conversational frameworks for the accomplishment of meaning in aphasia. In C. Goodwin (Ed.), Conversation and brain damage (pp. 90–116). Oxford University Press. Hamdani, F., Barnes, S., & Blythe, J. (2022). Questions with address terms in Indonesian conversation: Managing next-speaker selection and action formation. Journal of Pragmatics, 200, 194–210. Hepburn, A., & Bolden, G. B. (2017). Transcribing for social research. Sage. Heritage, J. (1984). Garfinkel and ethnomethodology. Polity Press. Hoey, E. M. (2020). When conversation lapses: The public accountability of silent copresence. Oxford University Press. Holler, J. (2022). Visual bodily signals as core devices for coordinating minds in interaction. Philosophical Transactions of the Royal Society – Series B, 377, 1–15. Iwasaki, S., Bartlett, M., Manns, H., & Willoughby, L. (2019). The challenges of multimodality and multi-sensoriality: Methodological issues in analyzing tactile signed interaction. Journal of Pragmatics, 143, 215–227. Jefferson, G. (2004). Glossary of transcript symbols with an introduction. In G. H. Lerner (Ed.), Conversation analysis: Studies from the first generation (pp. 3–23). John Benjamins. Keevallik, L. (2018). What does embodied interaction tell us about grammar? Research on Language and Social Interaction, 51(1), 1–21. Keisanen, T., & Rauniomaa, M. (2012). The organization of participation and contingency in prebeginnings of request sequences. Research on Language and Social Interaction, 45(4), 323–351. Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press.
Killmer, H., Svennevig, J., & Beeke, S. (2022). Requests to children by parents with aphasia. Aphasiology, 37(9), 1–23. Kita, S. (2003). Pointing: Where language, culture, and cognition meet. Lawrence Erlbaum Associates. Kockelman, P. (2005). The semiotic stance. Semiotica, 157, 233–304. Korkiakangas, T. (2018). Communication, gaze and autism: A multimodal interaction perspective. Routledge. Lerner, G. H. (2003). Selecting next speaker: The context-sensitive operation of a context-free organization. Language in Society, 32(2), 177–201. Levinson, S. C. (2016). Turn-taking in human communication: Origins and implications for language processing. Trends in Cognitive Sciences, 20, 6–14. Mahon, M. (2009). Interactions between a deaf child for whom English is an additional language and his specialist teacher in the first year at school: Combining words and gestures. Clinical Linguistics & Phonetics, 23(8), 611–629. McNeill, D. (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press. Merlino, S. (2018). Assisting the client with aphasia in speech therapy: A sequential and multimodal analysis of cueing practices. Hacettepe University Journal of Education (HUJE), 33, 334–357. Mondada, L. (2007). Multimodal resources for turn-taking: Pointing and the emergence of possible next speakers. Discourse Studies, 9(2), 194–225. Mondada, L. (2016). Challenges of multimodality: Language and the body in social interaction. Journal of Sociolinguistics, 20(3), 336–366. Mondada, L. (2018). Multiple temporalities of language and body in interaction: Challenges for transcribing multimodality. Research on Language and Social Interaction, 51(1), 85–106. Mondada, L. (2019). Contemporary issues in conversation analysis: Embodiment and materiality, multimodality and multisensoriality in social interaction. Journal of Pragmatics, 145, 47–62. Mondada, L. (2021). Achieving the intersubjectivity of sensorial practices: Body, language, and the senses in tasting activities. In J. Lindström, R. Laury, A. Peräkylä, & M. L. Sorjonen (Eds.), Intersubjectivity in action (pp. 279–302). John Benjamins.
Mondada, L. (2022). Appealing to the senses: Approaching, sensing, and interacting at the market’s stall. Discourse & Communication, 16(2), 160–199. Nevile, M. (2015). The embodied turn in research on language and social interaction. Research on Language and Social Interaction, 48(2), 121–151. Norén, N., & Pilesjö, M. S. (2016). Supporting a child with multiple disabilities to participate in social interaction: The case of asking a question. Clinical Linguistics & Phonetics, 30(10), 790–811. Pekarek Doehler, S., Polak-Yitzhaki, H., Li, X., Stoenica, I. M., Havlik, M., & Keevallik, L. (2021). Multimodal assemblies for prefacing a dispreferred response: A crosslinguistic analysis. Frontiers in Psychology, 12, 689275. Pierce, J., O’Halloran, R., Togher, L., & Rose, M. L. (2019). What is meant by “multimodal therapy” for aphasia? American Journal of Speech-Language Pathology, 28(2), 706–716. Possemato, F., Blythe, J., de Dear, C., Dahmen, J., Gardner, R., & Stirling, L. (2021). Using a geospatial approach to document and analyse locational points in face-to-face conversation. Language Documentation and Description, 20, 313–351. Rauniomaa, M., & Keisanen, T. (2012). Two multimodal formats for responding to requests. Journal of Pragmatics, 44(6–7), 829–842. Rawls, A. W. (2002). Editor’s introduction. In H. Garfinkel (Ed.), Ethnomethodology’s program: Working out Durkheim’s aphorism (pp. 1–64). Rowman & Littlefield Publishers. Rhys, C. S. (2005). Gaze and the turn: A nonverbal solution to an interactive problem. Clinical Linguistics & Phonetics, 19(5), 419–431. Rossano, F. (2012). Gaze behavior in face-to-face interaction. Radboud University Nijmegen. Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735. Savolainen, I., Klippi, A., Tykkyläinen, T., Higginbotham, J., & Launonen, K. (2020). The structure of participants’ turn-transition practices in aided conversations that use speech-output technologies. Augmentative and Alternative Communication, 36(1), 18–30. Schegloff, E. A. (1984). On some questions and ambiguities in conversation. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in conversation analysis (pp. 28–52). Cambridge University Press.
Schegloff, E. A. (1987). Analyzing single episodes of interaction: An exercise in conversation analysis. Social Psychology Quarterly, 50(2), 101–114. Schegloff, E. A. (2006). Interaction: The infrastructure for social institutions, the natural ecological niche for language and the arena in which culture is enacted. In N. J. Enfield & S. C. Levinson (Eds.), Roots of human sociality: Culture, cognition and interaction (pp. 70–96). Berg. Stivers, T., & Sidnell, J. (2005). Introduction: Multimodal interaction. Semiotica, 156(1/4), 1–20. Streeck, J. (2018). Grammaticalization and bodily action: Do they go together? Research on Language and Social Interaction, 51(1), 26–32. Streeck, J., Goodwin, C., & LeBaron, C. D. (Eds.). (2011). Embodied interaction: Language and body in the material world. Cambridge University Press. Stukenbrock, A. (2021). Multimodal gestalts and their change over time: Is routinization also grammaticalization? Frontiers in Communication, 6, 1–17.
Tufte, E. R. (2006). Beautiful evidence. Graphics Press. Vatanen, A. (2021). Co-presence during lapses: On “comfortable silences” in Finnish everyday interaction. In J. Lindström, R. Laury, A. Peräkylä, & M. L. Sorjonen (Eds.), Intersubjectivity in action (pp. 251–276). John Benjamins. Wilkinson, R. (2019). Atypical interaction: Conversation analysis and communicative impairments. Research on Language and Social Interaction, 52(3), 281–299. Wilkinson, R., Beeke, S., & Maxim, J. (2010). Formulating actions and events with limited linguistic resources: Enactment and iconicity in agrammatic aphasic talk. Research on Language and Social Interaction, 43(1), 57–84. Wilkinson, R., Bloch, S., & Clarke, M. (2011). On the use of graphic resources in interaction by people with communication disorders. In J. Streeck, C. Goodwin, & C. LeBaron (Eds.), Embodied interaction: Language and body in the material world (pp. 152–168). Cambridge University Press.
10 Cross-Linguistic and Multilingual Perspectives on Communicative Competence and Communication Impairment: Pragmatics, Discourse, and Sociolinguistics ZHU HUA AND LI WEI 10.1 Introduction The last three decades have witnessed a significant increase in the number of cross-linguistic and multilingual studies in the field of language and communicative development and impairment. This is largely a result of better recognition of the extent of language contact, bilingualism and multilingualism as a consequence of migration, globalization and enhanced information and communication technologies, and of the need to see multilingual speakers as the norm rather than the exception in society. Importantly, these studies contribute to our understanding of communication and of the underlying processes and factors of communication impairment, and they inform clinical assessment and intervention for the target populations in the same way that research on English-speaking monolingual populations does. They also offer opportunities to evaluate and challenge theoretical claims about typical communication development and impairment proposed with reference to monolingual development, most often in English, reflecting the origin and development of the field of child language studies in general. They examine whether and how differences in specific languages or language combinations result in different patterns in the development of pragmatic and discourse skills or communication impairment, whether the same impairment manifests itself in different ways from one language to another or from monolingual speakers to multilingual speakers, and to what extent language differences, as opposed to individual differences, account for variance. These studies also provide much-needed baseline information for the purpose of clinical assessment and intervention. In this chapter, we review cross-linguistic and multilingual studies of communication development and impairment, focusing on pragmatics, discourse and sociolinguistics. Given that these terms have different meanings to different people, we first establish the conceptual focus before offering a review of the literature. It is also worth noting that the terminology and
concepts related to impairment are evolving. The key is to prioritize culturally respectful and empathetic communication with people who have diverse communication needs and styles.
10.2 Language Use: Pragmatics, Discourse, and Sociolinguistics In broad terms, pragmatics, discourse, and sociolinguistics are all about language use. Pragmatics is often understood as the study of meaning in context. It is about explaining how speakers produce language forms in specific ways so that their intended meanings are not only expressed in a context-appropriate manner but also understood by the hearer as intended. Concepts such as intentionality, form–function mapping, relevance, and appropriacy are central to the study of pragmatics. The acquisition of pragmatics, for example, would involve learning, at a micro-level, how to convey and interpret the meaning which cannot be expressed purely and entirely by means of the phonology, morphology, syntax, and semantics of a particular language, and, at a macro-level, how to use language in social interaction. Pragmatic development includes the mastery of communicative use of linguistic and non-linguistic expressions, the development of conversational skills, and the acquisition of various contextually or culturally determined rules governing linguistic interaction to achieve communication success. Discourse has been defined by many linguists as anything “beyond the sentence,” as discussed in Schiffrin et al. (2001). Broadly speaking, it covers two areas: at the conversational level, interactional patterns such as turn-taking, initiation of conversation exchanges, and recognition and repair of communicative breakdown; and, at the connected-speech level, narrative, argument, explanation, and definition. Inevitably there is an overlap between pragmatics and discourse. Some critical theorists use “discourse” to refer to a broader range of social practice that includes non-linguistic and non-specific instances of language (e.g. discourse of power). In this chapter, we confine ourselves to the traditional, narrower definition of discourse and focus on language use beyond the sentence level. Sociolinguistics is the study of stylistic, dialectal and cultural variations in language use. While it shares with pragmatics and discourse the interest in language use in context, sociolinguistics typically studies it from a speaker-oriented perspective, focusing on variables such as age, gender, and socioeconomic class. Sociolinguists tend to study language use by groups of speakers rather than by individuals, and are concerned with collective patterns of language behavior in social contexts. In other words, sociolinguistics is not only about language in use, but also about speakers in community. It is also interested in what societies do with their languages, that is, language policy, language planning, and language attitude. Taken together, pragmatics, discourse, and sociolinguistics explore the key components of communicative competence (Hymes, 1972; Saville-Troike, 1996), that is, knowledge and expectation of how to use the language appropriately in particular social settings. A closely related concept, one that places the bilingual and multilingual language user at its heart, is Cook’s multicompetence (Cook, 1991; Cook & Li, 2016). It was initially developed in the context of second language acquisition to describe “the compound state of a mind with two grammars”, and later modified to “the knowledge of more than one language in the same mind” and then to “the overall system of a mind or a community that uses more than one language” (Cook, 2016, pp. 2–3).
It refers to bilingual and multilingual language users’ capacity to use multiple languages and other semiotic and communicative resources in a coordinated, meaningful, and context-appropriate way. The concept has three premises: (1) multicompetence concerns the total system for all languages (L1, L2, Ln) in a single mind or community and their inter-relationships; (2) multicompetence does not depend on the monolingual native-speaker; and (3) multicompetence affects the whole mind, that is, all language and cognitive systems, rather than language alone. The concept also brings together different strands of work in generative linguistics, psycholinguistics, and sociolinguistics.
10.3 Cross-Linguistic Perspective 10.3.1 Development of Pragmatics and Discourse Skills While the study of the development of pragmatics and discourse skills has been predominantly focused on English (for developmental pragmatics in English, see Leinonen et al., 2000; McTear & Conti-Ramsden, 1992; Ninio & Snow, 1996; Ochs & Schieffelin, 1979; for pragmatics and discourse of the English-speaking elderly, see Coupland et al., 1991; Davis, 2005; Maxim, 1994), studies on other languages have started to emerge or become available in English in the last two decades. These studies have enriched our understanding of the development of pragmatics and discourse skills. For example, Huang (2014) investigated the pragmatic function of self/other reference among Mandarin-speaking children. Through a detailed analysis of recorded spontaneous interactional data, Huang found significant differences between the children’s and their mothers’ use of self/other reference. This, according to Huang, suggests that the Mandarin-speaking children under study were able to make creative use of their own linguistic resources to mark the pragmatic functions of agentivity and social control. This finding lends strong support to the argument that children may have developed, from an early age, some understanding of the role of language in bringing changes to their environments and an ability to manipulate linguistic forms and functions to achieve their goals. Another example is the study by Loukusa et al. (2007). They tested the ability of children aged 3 to 9 years to answer questions targeting the pragmatic processes of reference assignment, enrichment and implicature, as proposed by relevance theory. The results provide insights into age-related patterns of acquisition of different aspects of pragmatic comprehension, with the ages of 3 and 4 years seeing the most significant development. Studies also show that cross-cultural differences in pragmatics could lead to different expectations of what is normal and what is impaired. Early studies by Ochs (1988) of language socialization in a Samoan village in the Pacific Islands, for example, found that patterns of silence and overlapping speech were very different from those found in English-speaking cultures, and that they carried specific cultural meanings that needed to be interpreted differently. Guo (1995) and Ervin-Tripp et al. (1990) observed that Chinese and Japanese children followed culturally specific politeness rules in controlling the topic and flow of conversation. There have also been reports of avoidance of direct questions and apparent overuse of repetition in certain languages and cultures (see Schieffelin & Ochs, 1986). Cross-linguistic studies have successfully confirmed the link between child-directed talk and children’s language development, as well as the cultural impact on child–adult interaction. For example, Jing-Schmidt (2014) found significant differences in the relative frequency of positive and negative affective speech acts between American and Mandarin Chinese mothers. One area of pragmatics that has received some attention from cross-linguistic researchers is the communicative use of non-verbal behaviors (e.g. gestures such as pointing) in young children.
Recent examples of this work include Blake, Osborne, Cabral, and Gluck’s (2003) study of Japanese children’s use of gesture, Rodrigo, Gonzalez, Vega, Muneton-Ayala, and Rodriguez’s (2004) longitudinal study of Spanish children’s use of gestural and verbal deixis, and Guidetti’s (2005) study of young French children’s combined use of gestures and speech to signal their intention to agree or refuse. While research on English-speaking children also points to the importance of gestures in language acquisition, cultural differences in the meaning of gestures are an important issue for the developing child. At a discourse level, Meng and Schrabback (1999) looked at the acquisition of German interjections, in particular “hm” and “na”, in adult–child discourse. It was found that the children aged 2;8–3;4 (year;month) had already managed to acquire basic interjectional forms and functions, as well as some discourse-type constraints, but they seemed to fail to
understand the pluri-functionality of interjections. Perroni (1993) reported a longitudinal, observational study of the development of narrative discourse in two Brazilian Portuguese-speaking children and identified various types of strategies underlying narrative constructions. Aviezer (2003) investigated strategies of clarification in the face of miscommunication by Hebrew-speaking children. Corsaro and Maynard (1996) examined “format tying” (participants’ strategic use of phonological, syntactic, and semantic surface-structure features of prior turns at talk) in discussion and argument among Italian and American children. Korolija (2000) investigated the accomplishment of coherence in multiparty conversations amongst Swedish-speaking elderly people. Wong and Ingram (2003) looked at the patterns of acquisition of questions among Cantonese-speaking children. Jisa (1987) described French-speaking children’s use of high-frequency oral discourse connectors in their narratives. Tse and Li (2014) looked into how young Cantonese-speaking children express time.
10.3.2 Pragmatic and Discourse Skills of Children with Language and Communication Impairments There is much debate on the status of pragmatic skills in English-speaking children diagnosed with specific language impairment (SLI). This is partly to do with the lack of consensus on what pragmatics means in the first place. Shaeffer (2005, p. 90) argued that most studies of children with SLI seem to point to deficits in pragmatic abilities such as speech acts, conversational participation and discourse regulation (initiations, replies, topic maintenance, turn-taking, utterance repair, etc.). Other studies suggest that children with SLI tend to show poor participation in cooperative learning and poor negotiation skills (Brinton et al., 1998). Craig and Evans (1993) pointed out that children with SLI presenting expressive deficits and those presenting combined expressive-receptive deficits were found to vary from each other on specific measures of turn-taking and cohesion. This seems to suggest that, in addition to expressive language, receptive language ability needs to be considered in pragmatics research. Most of these studies are concerned with English-speaking children. An issue that needs to be considered here is the status of pragmatic impairment. There is controversy as to whether children with pure pragmatic impairment exist or whether the so-called pragmatic impairment is a secondary consequence of SLI or other dysfunctions such as autism spectrum disorder (Ketelaars & Embrechts, 2017). In categorizing subgroups of children with language and speech impairment, Conti-Ramsden and Botting (1999) and Conti-Ramsden et al. (1997) list pragmatic difficulties as either co-existing with semantic difficulties or existing as a separate category. In contrast, in a study on subgroups of language impairment among Dutch-speaking children, pragmatic impairment did not account for group variance and therefore was not listed as a subtype of impairment (Daal et al., 2004). The debate on whether pragmatics can be impaired independently has implications not only for clinical diagnosis and management and the development of reliable, ecologically valid instruments, but also for linguistic theory. Shaeffer (2005, p. 90) argued that “If pragmatics can be impaired independently, without affecting other components of language, this provides support for the modularity of language, i.e. for the hypothesis that there is an independent pragmatics module.” A different approach to pragmatic impairment is proposed by Michael Perkins (2002; see also Chapter 5, this volume). In this approach, pragmatic behavior is seen as an emergent consequence of interactions within and between linguistic systems (which include phonology, prosody, morphology, syntax, lexis and discourse), cognitive systems, and sensorimotor systems. Therefore, different underlying causes may result in different
types of pragmatic impairment: for example, cognitive dysfunction leads to primary pragmatic impairment; linguistic or sensorimotor dysfunction may result in secondary pragmatic impairment; and dysfunction in more than one of these systems may result in complex pragmatic impairment. Again, very little is known about pragmatic impairment in children speaking languages other than English. Pragmatic deficits also occur in various forms of autism. Individuals with Asperger syndrome or high-functioning autism are highly susceptible to pragmatic impairments such as inappropriate speech, non-compliance with rules of conversation, difficulty in dialogue management, and failures in communicative inference (for a review, see Volden, 2017). Oi (2005) looked at how non-autistic interlocutors respond to pragmatic impairments in Japanese children with Asperger syndrome. He found that the autistic participants adopted a greater number of compensation strategy types than the normally functioning adults when a breakdown occurred. Interestingly, adults’ judgments on whether there was communicative breakdown in the conversation and whether the interactant’s compensation strategy was effective seemed to differ between initial and second-round analyses of videotapes of the conversation. This finding, though based on Japanese children with autism, may have wider implications for clinical practice across different languages. Cummings’ edited volumes (2017, 2021) provide a review of other pragmatic abnormalities that occur in children with attention deficit hyperactivity disorder, intellectual disability, brain tumor, cerebral palsy, Fragile X syndrome, Down syndrome, Williams syndrome, 22q11.2 deletion syndrome, Tourette syndrome, sensory loss, and selective mutism. These reviews open up new frontiers for understanding pragmatic impairment across languages.
10.3.3 Pragmatic and Discourse Skills of People with Acquired Language and Communication Impairments Pragmatic deficits can occur as a consequence of brain damage or aphasia. Some studies document the pragmatic behaviors of English speakers with brain damage. Dennis and Barnes (1990) show that children and adolescents with closed-head injury have difficulties with certain pragmatic tasks, such as knowing the alternate meanings of an ambiguous word in context or bridging the inferential gap between events in stereotyped social institutions. Eisele et al. (1998) noted inferential deficits in the comprehension of implications and presuppositions in children with unilateral left- or right-hemisphere damage. Bara et al. (1999) argued that for young children with brain damage the resultant pragmatic impairment is less severe than for older children, probably because other brain areas are able to take over pragmatic abilities at early ages but not later. Aphasia often leads to pragmatic deficits. In one of the very few studies of pragmatic deficits in speakers of languages other than English, Pak-Hin and Law (2004) developed a Cantonese linguistic communication measure to quantify the narrative production of Cantonese speakers with aphasia. The measure contained eight indices reflecting the amount, efficiency, and rate of information conveyed, the grammaticality of and the extent of elaboration on sentences produced, as well as the degree of erroneous production and lexical diversity in the speech output. Cantonese speakers with aphasia displayed various deficits on these measures. Wulfeck et al. (1989) and Rizzi (1980) compared English, Italian, and German aphasia patients’ ability to differentiate the given/new contrast on several aspects of linguistic expression. Severity of aphasia, rather than structural differences between the languages, was found to account for the differences in the speakers’ pragmatic abilities. Studies of language degeneration in adults with dementia of the Alzheimer’s type (DAT) suggest that whereas phonology, morphology and syntax are relatively preserved, a deterioration of conceptual, semantic, and pragmatic aspects is usually evident. The patients’
discourse is characterized by a predominant lack of coherence (organization of ideas at the conceptual level) in spite of good preservation of cohesion (logical organization of syntactic elements at the linguistic level). St-Pierre et al. (2005) investigated the discourse of French-speaking DAT patients and argued that the lack of coherence in the narrative discourse of DAT patients is due to the lower proportion of relevant information it contains. A number of researchers have looked at the language impairment of people with schizophrenia. There seems to be a general agreement that the primary language deficit is manifested in the area of pragmatic performance. Based on data from Hebrew-speaking patients, Meilijson et al. (2004) showed that participants with schizophrenia had their most inappropriate performance in topic change, followed by topic maintenance. There is a growing awareness of pragmatic impairment present in complex populations in adulthood. Cummings (2021) provides a review of pragmatic impairments in a range of conditions including right-hemisphere language disorders, psychiatric disorders, dementia of the Alzheimer type, Parkinson’s disease, multiple sclerosis, amyotrophic lateral sclerosis, Huntington’s disease, and traumatic brain injury.
10.3.4 The Role of Culture The role of culture emerges as a key issue in cross-linguistic research on language and communication impairment. It is important to point out that linguistic practices are part and parcel of a specific cultural tradition. They are manifestations of cultural values. Speakers of different languages are socialized into different cultural values and traditions through engagement in linguistic practices, and they come to represent different cultures through their linguistic practices. Cross-linguistic studies can shed light on culture-specific appropriateness or norms, which are crucial to our understanding of pragmatics and discourse in the context of language and communication impairment. Nevertheless, how children acquire culture-specific or context-specific rules governing the appropriateness of interaction seems to be under-researched. These culture-specific rules, at a micro-level, involve how to use contextualized cues to interpret other people’s communicative intent and communicate one’s own and, at a macro-level, consist of cultural and social norms and conventions which are intertwined with interactional practices. For example, people from certain cultures may have longer gaps between turns; different cultures may have different rules of politeness in performing various speech acts; and different languages may employ different linguistic means to achieve the same pragmatic function or the same linguistic means for different pragmatic functions. Taylor (1986) and Taylor and Clarke (1994) proposed a cultural framework which attempts to demonstrate the impact of culture on communication disorders in terms of four central topics associated with the nature, causes, assessment and treatment of communication disorders. These topics are developmental issues (such as adult–child interaction within culture, and indigenous cognitive acquisition), precursors of communication pathology (such as cultural definitions of normal and pathological interaction), assessment (i.e. culturally valid assessment and diagnosis of communication), and diagnosis and treatment (i.e. application of culturally valid treatment procedures). An example of the importance of cross-linguistic, cross-cultural analysis in understanding interactional and language socialization processes is King and Melzi’s (2004) study, which explores the use of diminutives in everyday conversation between Spanish-speaking Peruvian mothers and their children and attempts to explain why and how diminutive imitation seems to promote greater overall use of diminutives in the Peruvian context. Diminutives have received little attention from language researchers, partly because English has a relatively impoverished and unproductive diminutive system, mainly relying on the suffix -y/-ie occurring with a
restricted set of common and proper nouns. However, in languages such as Spanish, diminutives have much richer semantic systems and pragmatic functions. In addition to “smallness,” diminutives in Spanish convey intimacy, playfulness, politeness or humor. They reflect the Peruvian cultural value of cariño, “which translates loosely as tenderness, endearment, fondness and positive affect” (p. 257). Diminutives have been found to be prevalent in female speech and in speech directed at children. Through imitation or repetition of their mothers’ use of diminutives, as King and Melzi argue, Spanish children are able to acquire the system of diminutives very early despite its semantic and pragmatic complexity. In the areas of pragmatics and discourse, where people from different cultural and language backgrounds may behave differently in interaction and have different norms towards what constitutes culturally appropriate behaviors, culture-specific expectations and procedures need to be followed in administering clinical assessment. The cross-cultural child socialization literature also suggests that children from some cultures may not be at ease in testing situations in clinics. Cheng (2004, p. 169) argues that the discourse styles of Asian-Pacific American populations may differ from those of American homes and schools. For example, this population may delay or hesitate in response, be less likely to ask questions or use discourse markers to acknowledge the interactant, and tend to use longer pauses between turns. It is important for clinicians not to interpret these differences as “deficient, disordered, aberrant and undesirable.” Barrenechea and Schmitt (1989) examined Spanish-speaking preschool children for the development of seven language functions and three discourse features. A set of preliminary guidelines for the development of normal pragmatics in Hispanic preschoolers was then developed.
10.3.5 Development of Sociolinguistic Competence As discussed earlier, sociolinguistics concerns stylistic, dialectal, and cultural variations in language use by different speaker groups. Cross-linguistic studies of sociolinguistics in the context of communication disorders, similar to those of pragmatics and discourse, are predominantly concerned with how normal speakers use linguistic means (specifically dialectal and social variations) to convey meaning. Two broad types of sociolinguistic studies can be identified in the literature: comparisons of group patterns and acquisition of dialectal and social variations. The first type – group comparisons – often overlaps with studies of pragmatics and discourse. Rice et al. (1991), for example, compared the patterns of social interaction among four groups of children (normally developing English speakers, children with specific language impairment, children with speech impairment, and children learning English as a second language). They found that children with limited communication skills were more likely than their normal-language peers to initiate interactions with adults (rather than with children) and to shorten their responses or use non-verbal responses. Children learning English as a second language were the least likely to initiate interactions and were the most likely to be avoided as the recipient of an initiation. Andersen et al. (1999) examined cross-linguistic data from American English, Lyonnais French, and Chicano Spanish on the use of discourse markers to indicate social relationships between interlocutors. Striking cross-linguistic parallels were found in the way children of different language backgrounds learn to use discourse markers both to convey social meaning and to manipulate the social situation where power relationships are not pre-established. For example, all groups were found to use more lexical discourse markers and more “stacks” (such as well, now then) to mark higher-status roles, with non-lexical variants (such as uh, euh, or eh) more frequent in the low-status roles. Amongst studies of children’s acquisition of dialectal and social variations, African American English (AAE) has received a considerable amount of attention. AAE is a
language variety whose key features closely approximate, at the surface level, those of American-English-speaking children with SLI (such as habitual be, copula absence, inflectional -s, and other grammatical, phonological and lexical features; Wolfram, 2005). There are an increasing number of studies on developing and evaluating assessment instruments and establishing expectations for the language performance of young African American children. Studies in this area include (the list is by no means exhaustive): Craig and Washington (2002), Qi et al. (2006), Thomas-Tate et al. (2006), and Washington and Craig (1992a, 1992b, 2004) (see Roberts, 2005 for a review). The special issue by Hyter and Rivers (2015) is dedicated to the pragmatic language development of African American children and adolescents and reports on the features of narrative discourse skills, emotion discourse, written narratives and expository language skills (Hyter et al., 2015; Kersting et al., 2015; Koonce, 2015). This work demonstrates the importance of recognizing pragmatic features that are characteristic of cultural discourse styles. Several studies also point out that children from low socio-economic strata tend to perform lower than expected on standardized tests of language abilities compared with children from middle or high socio-economic backgrounds (Qi et al., 2006). Cummings (2021) provides reviews of the pragmatic features of a number of “underserved” populations including infants and children adopted internationally, infants and children exposed to HIV and substance abuse, maltreated and traumatized children and young people, children and young people with written language disorders, adults in prison populations, etc. These works have resulted in significant breakthroughs in our understanding of the impact of dialect and of the potential educational and clinical significance of language differences associated with AAE in many respects. These include the following:
1. Consideration needs to be given to non-standard, regional and social cultural variations of a language in clinical assessment and diagnosis.
2. Cultural sensitivity and specificity of language-screening instruments need to be rigorously tested.
3. Both standardized assessment instruments and non-standardized, criterion-referenced assessments need to be developed and appropriately selected.
Oetting (2005) reviewed a list of newly developed and/or recently validated tools for assessing children who speak a non-mainstream dialect of English and discussed the challenges facing the clinical adaptation of these tools. Laing and Kamhi (2003) presented two procedures (processing-dependent measures and dynamic assessment measures) which they believed could provide unbiased assessment for culturally and linguistically diverse populations. Carter et al. (2005) identified the major issues in the cross-cultural adaptation of speech and language assessments and argued that awareness of cultural variation and bias, and cooperative efforts to develop and administer culturally appropriate assessment tools, are the foundation of effective, valid treatment programs. In a study of the reliability of identification of non-standard and non-native English-speaking children with speech-language delay and disorder, Gupta et al.
(1999) found that professionals such as doctors and teachers who have not had systematic training in sociolinguistics or speech and language therapy often shared with parents their perception of dialectal variations as a potential contributor to communication disorders. On the whole, they were more likely to refer children with strong dialectal and contact features in their English to speech and language therapists. Interestingly, professionals working in geographical areas where there are easily recognizable dialectal variations or close contacts between different language groups tended to under-refer children with speech-language problems, assuming that the problems were part of the non-standard and non-native features of English.
10.4 Multilingual Perspective 10.4.1 Multicompetence of Multilingual Speakers As mentioned above, there has been an increased awareness that the vast majority of the world’s population are bilingual or multilingual and that studies of language and communication impairment must take into account the speaker’s multilingual skills. There is a growing body of literature on the language development of multilingual children and the language use of multilingual adults and the elderly. Although some of the studies deal with specific linguistic features such as word order or gender assignment, most researchers recognize that bilingualism and multilingualism are essentially a language use issue. As Mackey (1962, p. 51) put it, “Bilingualism is not a phenomenon of language; but a characteristic of its use. It is not a feature of the code but of the message. It does not belong to the domain of ‘langue,’ but of ‘parole.’” To a multilingual speaker, the most important issue is the appropriate choice of which language to speak to whom and when (Fishman, 1965), a central question that concerns all the studies of pragmatics, discourse and sociolinguistics. There has been much debate over the notion of language differentiation in multilingual speakers. With regard to children, the issue is how and when the child develops representations of the different languages he or she is learning, as opposed to one undifferentiated system that combines them. With regard to the elderly, the issue becomes whether or not the speaker can maintain appropriate choice of language when certain aspects of his or her language and cognitive faculty have been impaired. Language differentiation occurs at different levels: phonological, lexical, morphosyntactic and, of course, whole language systems (see De Houwer, 1995; Meisel, 2004). Typically, though, multilingual speakers alternate between languages in their linguistic repertoires. This is known as “code switching.” Code switching can occur between words, phrases, clauses, sentences, and speaker turns. It assumes the speaker’s ability to differentiate languages. Studies have found that bilingual children as young as two years can switch from one language to another in contextually sensitive ways (e.g., Lanza, 1992). Multilingual children draw on elements from the different languages in their linguistic repertoire in a coordinated way (see Zhu & Li, 2005 for a review). Community norms of practice are crucial in identifying potential communication difficulties and disorders in children. In a study of preschool Mirpuri-English bilingual children in the north west of England, Pert and Letts (2006) found not only that every child in the sample produced utterances containing intrasentential mixing, but also that over 40 percent of multi-word utterances contained intrasentential mixes. The mean length of utterance for bilingual utterances was higher than for monolingual Mirpuri or English utterances. The bilingual utterances conformed to the grammatical constraints proposed in theoretical models such as the Matrix Language Frame model (Myers-Scotton, 1993). Pert and Letts argued, on the basis of the study, that a lack of language mixing in children in this population may in fact be an indicator of language delay or intrinsic disorder. Studies of this kind have wide-ranging implications for speech and language therapy in communities where language mixing is the norm.
A number of studies of multilingual adult and elderly speakers have investigated the pragmatics of language choice and use from an emotional and affective perspective. It has been suggested that multilinguals often associate different experiences with different languages. Feelings, emotions and attitudes are therefore coded with specific language tags (Altarriba & Soltano, 1996; Schrauf, 2000). Multilinguals have a choice as to what language to use and thereby have the ability to select the word that most clearly captures
the essence of what they are trying to communicate as part of their multicompetence. Appropriate use of language switching in therapeutic settings with bilingual and multilingual populations has effects on both the clients’ language and communication skills and their affective development.
10.4.2 Multilingual Speakers with Language and Communication Impairment Studies of bilingual and multilingual children with language and communicative impairment are scarce. Paradis et al. (2003, p. 14) point out that “there is a dearth of research on bilingual children with SLI, even though there are many bilingual children in North America, and even worldwide.” Of the published studies, few deal specifically with issues of pragmatics or discourse. A sizable body of literature does exist on the development of narrative abilities of bilingual and multilingual children, which includes samples of bilingual children with various language disorders. Gutiérrez-Clellen (2004), for example, looked at the narrative structures of Spanish-English bilingual children with language disorders. Their stories omitted specific links between events and lacked referential cohesion. For example, although appropriate noun phrases were used when new referents were first introduced, subsequent references were often ambiguous due to a lack of cohesive devices. However, the researcher argued that the problems were linked to the children’s limited syntactic complexity. Indeed, the children in this particular study were diagnosed as having SLI, and their difficulties with pragmatics and discourse were seen to be due more to SLI than to being bilingual. Studies of multilingual speakers with acquired language and communication disorders often include examples of the speakers’ inappropriate choice of language. Friedland (1998), for example, found that her four Afrikaans-English bilingual subjects with Alzheimer’s disease all had difficulties in making addressee-appropriate language choices. This was not simply a matter of word retrieval, but an issue of pragmatics. They knew which words to use but often found it difficult to decide which language should be chosen. Similarly, some bilinguals with aphasia have problems with language choice and are unable to switch from one language to another for repairs (see Ijalba et al., 2004 for a review). Arslan and Felser (2018) showed that the nature and extent of impairments in discourse skills among bilinguals with aphasia are impacted considerably by individual bilingualism profiles (e.g. onset of bilingualism, premorbid language dominance). Recently, there has been considerable scholarship on what has been termed “translanguaging” (García & Li, 2014), that is, bilingual and multilingual speakers’ flexible use of their communicative repertoire in an integrated and coordinated way without regard to named languages. Such practices are commonplace in bilingual and multilingual communities. Research shows that assessing the language development and abilities of those who engage in translanguaging in their everyday life cannot be done monolingually, in one language at a time, as these language users have not had a monolingual experience at all and do not separate their languages in their linguistic repertoire. This presents real challenges to language assessment, clinical communication, and speech and language therapy. A socially equitable, inclusive and culturally sensitive approach is much needed.
10.5 Conclusion Cross-linguistic and multilingual studies of pragmatics, discourse, and sociolinguistics have the potential to challenge the received wisdom about normal communication development, which tends to be based on English-speaking populations. These studies also inform the practices
of professionals working with speakers of languages other than English or with multilingual speakers. There is an urgent need for more culturally and linguistically sensitive and ecologically valid assessment of communicative profiles. Such assessment clearly needs to be based on sound research. It is hoped that more cross-linguistic and multilingual studies will become available, and that their insights become the baseline rather than the exception.
REFERENCES Altarriba, J., & Soltano, E. G. (1996). Repetition blindness and bilingual memory. Memory and Cognition, 24(6), 700–711. Andersen, E. S., Brizuela, M., DuPuy, B., & Gonnerman, L. (1999). Cross-linguistic evidence for the early acquisition of discourse markers as register variables. Journal of Pragmatics, 31(10), 1339–1351. Arslan, S., & Felser, C. (2018). Comprehension of wh-questions in Turkish–German bilinguals with aphasia: A dual-case study. Clinical Linguistics & Phonetics, 32(7), 640–660. Aviezer, O. (2003). Bedtime talk of three-year-olds: Collaborative repair of miscommunication. First Language, 23(1), 117–139. Bara, B., Bosco, F. M., & Bucciarelli, M. (1999). Developmental pragmatics in normal and abnormal children. Brain and Language, 68(3), 507–528. Barrenechea, L. I., & Schmitt, J. F. (1989). Selected pragmatic features in Spanish-speaking preschool children. Journal of Psycholinguistic Research, 18(4), 353–367. Blake, J., Osborne, P., Cabral, M., & Gluck, P. (2003). The development of communicative gestures in Japanese infants. First Language, 23(1), 3–20. Brinton, B., Fujiki, M., & McKee, L. (1998). Negotiation skills of children with specific language impairment. Journal of Speech, Language and Hearing Research, 41(4), 927–940. Carter, J., Lees, J., Murira, G. M., Gona, J., Neville, B., & Newton, C. (2005). Issues in the development of cross-cultural assessments of speech and language for children. International Journal of Language and Communication Disorders, 40(4), 385–401. Cheng, L.-R. L. (2004). Speech and language issues in children from Asian-Pacific backgrounds. In R. Kent (Ed.), The MIT encyclopedia of communication disorders (pp. 167–169). MIT Press.
Conti-Ramsden, G., & Botting, N. (1999). Classification of children with specific language impairment. In W. Yule & M. Rutter (Eds.), Language development and disorders (pp. 16–41). Mac Keith Press. Conti-Ramsden, G., Crutchley, A., & Botting, N. (1997). The extent to which psychometric tests differentiate subgroups of children with SLI. Journal of Speech, Language and Hearing Research, 40(4), 765–777. Cook, V. (1991). The poverty-of-the-stimulus argument and multicompetence. Second Language Research, 7(2), 103–117. Cook, V. (2016). Premises of multicompetence. In V. Cook & W. Li (Eds.), Cambridge handbook of linguistic multicompetence (pp. 1–25). Cambridge University Press. Cook, V., & Li, W. (Eds.). (2016). Cambridge handbook of linguistic multicompetence. Cambridge University Press. Corsaro, W., & Maynard, D. (1996). Format tying in discussion and argumentation among Italian and American children. In D. Slobin, J. Gerhardt, A. Kyratzis, & J. S. Guo (Eds.), Social interaction, social context, and language (pp. 157–174). Lawrence Erlbaum. Coupland, N., Coupland, J., & Giles, H. (1991). Language, society and the elderly. Blackwell. Craig, H. K., & Evans, J. (1993). Pragmatics and SLI. Journal of Speech and Hearing Research, 36(4), 779–789. Craig, H. K., & Washington, J. A. (2002). Oral language expectations for African American preschoolers and kindergartners. American Journal of Speech-Language Pathology, 11(1), 59–70. Cummings, L. (Ed.). (2017). Research in clinical pragmatics. Springer. Cummings, L. (Ed.). (2021). Handbook of pragmatic language disorders: Complex and underserved populations. Springer. Daal, J. V., Verhoeven, L., & Balkom, H. V. (2004). Subtypes of severe speech and language impairments. Journal of Speech, Language and Hearing Research, 47(6), 1411–1423.
Davis, B. H. (Ed.). (2005). Alzheimer talk, text and context: Enhancing communication. Palgrave Macmillan. De Houwer, A. (1995). Bilingual language acquisition. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 219–250). Blackwell. Dennis, M., & Barnes, M. A. (1990). Knowing the meaning, getting the point, bridging the gap, and carrying the message: Aspects of discourse following closed head injury in childhood and adolescence. Brain and Language, 39(3), 428–446. Eisele, J. A., Lust, B., & Aram, D. M. (1998). Presupposition and implication of truth: Linguistic deficits following early brain lesions. Brain and Language, 61(3), 335–375. Ervin-Tripp, S., Guo, J., & Lampert, M. (1990). Politeness and persuasion in children’s control acts. Journal of Pragmatics, 14(2), 307–332. Fishman, J. A. (1965). Who speaks what language to whom and when? La Linguistique, 1(2), 67–88. Friedland, D. (1998). Language loss in bilingual speakers with Alzheimer’s disease. Unpublished PhD thesis, University of Newcastle upon Tyne. García, O., & Li, W. (2014). Translanguaging: Language, bilingualism and education. Palgrave Macmillan. Guidetti, M. (2005). Yes or no? How young French children combine gestures and speech to agree and refuse. Journal of Child Language, 32(4), 911–924. Guo, J. (1995). The interactional basis of the Mandarin modal néng (can). In J. Bybee & S. Fleischman (Eds.), Modality in grammar and discourse (pp. 205–238). John Benjamins. Gupta, A. F., Wei, L., & Dodd, B. (1999). Reliability of identification of children with speech-language delay and disorder with particular reference to non-standard or non-native English speakers. End of award (R000 22 2307) report to ESRC, UK. Gutiérrez-Clellen, V. F. (2004). Narrative development and disorders in bilingual children. In B. A. Goldstein (Ed.), Bilingual language development and disorders in Spanish-English speakers (pp. 235–256). Paul Brookes. Horton-Ikard, R., Weismer, S. E., & Edwards, C. (2005). Examining the use of standard language production measures in the language samples of African-American toddlers. Journal of Multilingual Communication Disorders, 3(3), 169–182.
Huang, C.-C. (2014). The pragmatic function of self/other reference in Mandarin child language. In H. Zhu & L. Jin (Eds.), Development of pragmatic and discourse skills in Chinese-speaking children (pp. 13–34). John Benjamins. Hymes, D. H. (1972). On communicative competence. In J. B. Pride & J. Holmes (Eds.), Sociolinguistics: Selected readings (pp. 269–293). Penguin. Hyter, Y. D., & Rivers, K. O. (2015). “The road less traveled”: Pragmatic language development in African American children and adolescents. Topics in Language Disorders, 35(1), 1–7. Hyter, Y. D., Rivers, K. O., & DeJarnette, G. (2015). Pragmatic language of African American children and adolescents: A systematic synthesis of the literature. Topics in Language Disorders, 35(1), 8–45. Ijalba, E., Obler, L. K., & Chengappa, S. (2004). Bilingual aphasia. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook of bilingualism (pp. 71–89). Blackwell. Jing-Schmidt, Z. (2014). Maternal affective input in mother-child interaction: A cross-cultural perspective. In H. Zhu & L. Jin (Eds.), Development of pragmatic and discourse skills in Chinese-speaking children (pp. 57–89). John Benjamins. Jisa, H. (1987). Sentence connectors in French children’s monologue performance. Journal of Pragmatics, 11(5), 607–621. Kersting, J., Anderson, M. A., Newkirk-Turner, B., & Nelson, N. W. (2015). Pragmatic features in original narratives written by African American students at three grade levels. Topics in Language Disorders, 35(1), 90–108. Ketelaars, M. P., & Embrechts, M. T. (2017). Pragmatic language impairment. In L. Cummings (Ed.), Research in clinical pragmatics (pp. 29–57). Springer. King, K., & Melzi, G. (2004). Intimacy, imitation and language learning: Spanish diminutives in mother–child conversation. First Language, 24(2), 241–261. Koonce, N. (2015). When it comes to explaining: A preliminary investigation of the expository language skills of African American school-age children. Topics in Language Disorders, 35(1), 76–89. Korolija, N. (2000). Coherence-inducing strategies in conversations amongst the aged. Journal of Pragmatics, 32(4), 425–462. Laing, S., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally and linguistically diverse
Cross-Linguistic and Multilingual Perspectives 141 population. Language, Speech and Hearing Services in Schools, 34(1), 44–55. Lanza, E. (1992). Language mixing in infant bilingualism. Clarendon Press. Leinonen, E., Letts, C., & Smith, B. R. (2000). Children’s pragmatic communication difficulties. Whurr. Loukusa, S., Leinonen, E., & Ryder, N. (2007). Development of pragmatic language comprehension in Finnish-speaking children. First Language, 27(3), 279–296. Mackey, W. F. (1962). The description of bilingualism. Canadian Journal of Linguistics, 7(2), 51–85. Maxim, J. (1994). Language of the elderly: A clinical perspective. Whurr. McTear, M., & Conti-Ramsden, G. (1992). Pragmatic disability in children. Whurr. Meilijson, S., Kasher, A., & Elizur, A. (2004). Language performance in chronic schizophrenia: A pragmatic approach. Journal of Speech, Language and Hearing Research, 47(3), 695–713. Meisel, J. M. (2004). The bilingual child. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook of bilingualism (pp. 91–113). Blackwell. Meng, K., & Schrabback, S. (1999). Interjections in adult–child discourse: The cases of German HM and NA. Journal of Pragmatics, 31(10), 1263–1287. Myers-Scotton, C. (1993). Duelling languages: Grammatical structure of codeswitching. Clarendon Press. Ninio, A., & Snow, C. (1996). Pragmatic development. Westview Press. Ochs, E. (1988). Culture and language development: Language acquisition and language socialization in a Samoan village. Cambridge University Press. Ochs, E., & Schieffelin, B. (Eds.). (1979). Developmental pragmatics. Academic Press. Oetting, J. (2005). Assessing language in children who speak a nonmainstream dialect of English. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 180–192). Blackwell. Oi, M. (2005). Interpersonal compensation for pragmatic impairments in Japanese children with Asperger syndrome or high-functioning autism. Journal of Multilingual Communication Disorders, 3(3), 203–210. Pak-Hin, A. K., & Law, S.-P. (2004). A Cantonese linguistic communication measure for evaluating aphasic narrative production: Normative and preliminary aphasic data. Journal of Multilingual Communication Disorders, 2(2), 124–146.
Paradis, J., Crago, M., Genesee, F., & Rice, M. (2003). Bilingual children with specific language impairment: How do they compare with their monolingual peers? Journal of Speech, Language and Hearing Research, 46(1), 1–15. Perkins, M. (2002). An emergentist approach to clinical pragmatics. In F. Windsor, M. L. Kelly, & N. Hewlett (Eds.), Investigations in clinical phonetics and linguistics (pp. 1–14). Lawrence Erlbaum. Perroni, M. C. (1993). On the acquisition of narrative discourse: A study in Portuguese. Journal of Pragmatics, 20(6), 559–577. Pert, S., & Letts, C. (2006). Codeswitching in Mirpuri speaking Pakistani heritage preschool children: Bilingual language acquisition. International Journal of Bilingualism, 10(3), 349–374. Qi, C. H.-Q., Kaiser, A., Milan, S., & Hancock, T. (2006). Language performance of low-income African American and European American preschool children on the PPVT-III. Language, Speech and Hearing Services in Schools, 37(1), 5–16. Rice, M., Sell, M., & Hadley, P. (1991). Social interactions of speech and language impaired children. Journal of Speech and Hearing Research, 34(6), 1299–1307. Rizzi, L. (1980). A restructuring rule in Italian syntax. In S. J. Keyser (Ed.), Recent transformational studies in European languages, 113–158. MIT Press. Roberts, J. (2005). Acquisition of sociolinguistic variation. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 153–164). Blackwell. Rodrigo, M., Gonzalez, A., Vega, M., MunetonAyala, M., & Rodriguez, G. (2004). From gestural to verbal deixis, a longitudinal study with Spanish infants and toddlers. First Language, 24(1), 71–90. Saville-Troike, M. (1996). The ethnography of communication (2nd ed.). Blackwell. Schieffelin, B., & Ochs, E. (Eds.). (1986). Language socialization across cultures. Cambridge University Press. Schiffrin, D., Tannen, D., & Hamilton, H. E. (Eds.). (2001). The handbook of discourse analysis. Blackwell. Schrauf, R. W. (2000). Bilingual autobiographical memory. Culture and Psychology, 6(4), 387–417. Shaeffer, J. (2005). Pragmatic and grammatical properties of subjects in children with specific language impairment. In R. Okabe & K. Nielsen (Eds.), Papers in Psycholinguistics 2 (UCLA Working Papers in Linguistics, 13)
142 Zhu Hua and Li Wei (pp. 87–134). www.linguistics.ucla.edu/ faciliti/wpl/issues/wpl13/wpl13.htm. St-Pierre, M.-C., Ska, B., & Béland, R. (2005). Lack of coherence in the narrative discourse of patients with dementia of the Alzheimer’s type. Journal of Multilingual Communication Disorders, 3(3), 211–215. Taylor, O. L. (Ed.). (1986). Nature of communication disorders in culturally and linguistically diverse populations. College-Hill Press. Taylor, O. L., & Clarke, M. (1994). Culture and communication disorders: A theoretical framework. Seminars in Speech and Language, 15(2), 103–113. Thomas-Tate, S., Washington, J., Craig, H., & Packard, M. (2006). Performance of African American preschool and kindergarten students on the expressive vocabulary test. Language, Speech and Hearing Services in Schools, 37(2), 143–149. Tse, S. K., & Li, H. (2014). Tense and temporality: How young children express time in Cantonese. In H. Zhu & L. Jin (Eds.), Development of pragmatic and discourse skills in Chinese-speaking children, 35–56. John Benjamins. Volden, J. (2017). Autism spectrum disorder. In L. Cummings (Ed.), Research in clinical pragmatics (pp. 59–84). Springer. Washington, J. A., & Craig, H. K. (1992a). Articulation test performance of low income,
African-American preschoolers with communication impairments. Language, Speech and Hearing Services in Schools, 23(3), 203–207. Washington, J. A., & Craig, H. K. (1992b). Performance of low-income, African American preschool and kindergarten children on the Peabody Picture vocabulary testrevised. Language, Speech and Hearing Services in Schools, 23(3), 329–333. Washington, J. A., & Craig, H. K. (2004). A language screening protocol for use with young African American children in urban settings. American Journal of Speech-Language Pathology, 13(4), 329–340. Wolfram, W. (2005). African American English. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 87–100). Blackwell. Wong, W., & Ingram, D. (2003). Question acquisition by Cantonese speaking children. Journal of Multilingual Communication Disorders, 1(2), 148–157. Wulfeck, B., Bates, E., Juarez, L., Opie, M., Friederici, A., MacWhinney, B., & Zurif, E. (1989). Pragmatics in aphasia: Crosslinguistic evidence. Language and Speech, 32(4), 315–336. Zhu, H. & Li, W. (2005). Bi- and multilingual language acquisition. In M. J. Ball (Ed.), Clinical sociolinguistics (pp. 165–179). Blackwell.
11 Clinical Corpus Linguistics
DAVIDA FROMM AND BRIAN MACWHINNEY
11.1 Preamble
Major advances in computer power and new technologies in machine learning have made it possible to study large corpora of language with increased efficiency and reliability. Manual transcription, coding, and analyses of large datasets require time and training that exceed the capacity of most research programs. Even the collection of the datasets themselves can be challenging, as they need to be large enough and representative enough across a variety of domains (e.g., type and severity of a particular disorder, demographic diversity) to conduct robust, powerful studies. The TalkBank project seeks to address these issues and to take advantage of these new opportunities. TalkBank (https://talkbank.org) arose as an extension of the already existing child language acquisition data system called CHILDES, established by Catherine Snow and Brian MacWhinney in 1984. In 2000, we began the construction of a more general, distributed, web-based data archiving system for transcribed video and audio data on communicative interactions in multiple forms. Since 2000, TalkBank has expanded to include 15 language banks, 7 of which focus on clinical data: AphasiaBank, ASDBank, DementiaBank, FluencyBank, PsychosisBank, RHDBank, and TBIBank. Additionally, the CHILDES and PhonBank databases have data from clinical populations that augment their corpora for the study of normal language and phonological development. This chapter will describe these clinical corpora and their impact on the study of language in a range of fields. First, we review the TalkBank principles which have been essential to its widespread adoption for advancing the study of spoken language.
11.2 TalkBank Principles
The TalkBank system is grounded on six basic principles: open data-sharing, use of the CHAT transcription format, CHAT-compatible software, interoperability, responsivity to research group needs, and adoption of international standards. These principles will be briefly summarized here.
1. Maximally open data-sharing: The social sciences have been slower to adopt data-sharing practices than the physical sciences, mainly out of concerns regarding participant privacy.
This has limited the scientific advancement facilitated by large datasets. To address this, TalkBank requires that members (licensed clinicians and researchers) agree to abide by data-sharing ground rules and a code of ethics (https://talkbank.org/share). The system also provides options that preserve participant anonymity (e.g., de-identification, audio bleeping, password protection, controlled viewing).
2. CHAT transcription format: TalkBank uses a uniform transcription standard, called CHAT, that allows one to encode any of the features that play a role in spoken language. These features and codes are documented in the CHAT manual, which can be downloaded from https://talkbank.org/manuals/chat.pdf. Although the system is quite extensive, individual projects usually need only specific subsections of the full format. CHAT allows words and utterances to be linked to corresponding segments in the media files, thereby facilitating transcription, coding, and temporal analyses. Because it uses UTF-8 Unicode, all languages can be transcribed directly in CHAT files (see the illustrative fragment after this list).
3. CHAT-compatible software: CHAT files are compatible with the CLAN program (http://dali.talkbank.org/clan), which allows for automatic parsing of the transcript and many automatic analyses of language and discourse for syntax, morphology, phonology, lexicon, timing, and pragmatics.
4. Interoperability: To archive data coming from non-CHAT formats and to give users options for using other programs with CHAT data, CLAN includes 14 programs for translating between these formats and CHAT. These other formats include Anvil, CA, CONLL, DataVyu, ELAN, LAB, LENA, LIPP, Praat, RTF, SALT, SRT, Text, and EXMARaLDA.
5. Responsivity to research community needs: TalkBank seeks to be maximally responsive to the needs of individual researchers and their research communities. We attempt to implement all features suggested by users in terms of software features, data coverage, documentation, and user support. This is accomplished in multiple ways, such as through Google Groups mailing lists, YouTube screencast tutorials, construction of web pages for each corpus, index pages for databanks, manuals for CHAT and CLAN, article publications, conference presentations and workshops, the addition of new computational resources, and regular updating of programs and materials. The TalkBank Governing Board provides overall guidance for the project.
6. International standards: TalkBank adheres to international standards for database and language technology. In particular, the system adheres to (1) the FAIR standards (Wilkinson et al., 2016) for open access to data, which hold that data should be Findable, Accessible, Interoperable, and Reusable, and (2) the TRUST standards (Lin et al., 2020) for maintenance of reliable digital databases. The icons seen at the bottom of the TalkBank home page (CLARIN, Core Trust Seal, etc.) demonstrate our commitment to meeting state-of-the-art standards set by the international community of scientists interested in computational linguistics and corpora.
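To make principles 2 and 3 more concrete, the following hand-constructed fragment illustrates the general shape of a CHAT transcript; the participants, utterance, and codes are invented for illustration, and the CHAT manual remains the authoritative reference for the format:

@Begin
@Languages: eng
@Participants: PAR Participant, INV Investigator
*INV: tell me the story of Cinderella .
*PAR: the girl &-uh lose [: loses] [*] her shoe .
@End

Here &-uh marks a filled pause, and [: loses] [*] pairs an errored form with its intended target. A file of this kind can then be fed to CLAN; for instance, a command along the lines of freq +t*PAR story.cha (file name hypothetical) would produce a frequency count of the participant's words.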
11.3 Clinical Corpora
All seven clinical databanks are listed and accessible from the main TalkBank webpage at https://talkbank.org. Clicking on a clinical bank's hyperlink takes one to the webpage for that clinical corpus, as in Figure 11.1 for the TBIBank homepage.
Figure 11.1 TBIBank homepage © Davida Fromm & Brian MacWhinney 2023.
These corpus webpages are organized with section headings such as System, Database, Programs, Protocols, Teaching, and Manuals. In the Database section, there is an ‘Index to Corpora’ link that provides a directory of what is contained in the collection. Clicking on that link opens a page with a table describing features of the individual corpora contained in the database. From that table, clicking on the hyperlink for any individual corpus opens a page with information
about the contributors, publications, and the corpus. The corpus page also has links to view the Browsable Database (explained later) and to download the transcripts and media for that corpus. Most of the clinical corpora are password protected. However, access is given readily and quickly to verified professionals who send an email request with their contact information and affiliation to [email protected].
11.3.1 AphasiaBank
Aphasia results from damage to the language areas of the brain (usually the left hemisphere) and may impair expression, comprehension, reading, and writing. AphasiaBank (https://aphasia.talkbank.org) started in 2005 with a planning meeting of 20 experienced aphasia researchers who agreed on the need for a shared protocol, a shared database, and increased availability of computational tools for the aphasia research community (MacWhinney et al., 2011). With NIH funding beginning in 2007, the project team developed a standard discourse protocol and test battery and began collecting data from university clinics and aphasia centers around the United States and Canada. For English, there are currently 467 transcripts from persons with aphasia (PWAs) and 286 from age-matched controls. Croatian has 53 PWAs; French has 13 PWAs and 14 controls; German has 4 PWAs; Italian has 10 PWAs; Mandarin has 11 PWAs and 31 controls; Romanian has 1 PWA; and Spanish has 4 PWAs. In addition to these transcripts that use the standard one-hour discourse protocol, there are four collections of non-protocol data: 658 files with various types of discourse data, 377 files with script training samples before and after treatment, 207 recordings of aphasia group therapy sessions, and 99 recordings using the Famous People Protocol (Holland et al., 2019) for PWAs with severe aphasia. Non-protocol PWA data in other languages include 26 files for German, 53 for Greek, 64 for Hungarian, 8 for Italian, 15 for Mandarin, and 32 for Spanish. All the transcripts are in CHAT, and most have been linked to either audio or video at the utterance level and coded for discourse features. The standard discourse protocol includes personal narratives, picture descriptions, storytelling (Cinderella), and a procedural task. Detailed administration instructions and a script for the investigator were developed to ensure consistent implementation across sites.
The protocol is video-recorded and can be administered in person or via the internet. Demographic data are collected, and a variety of formal and informal tests are administered as well. These materials (e.g., protocol, instructions, script) are all accessible from the main webpage. The English-speaking protocol database includes participants from 37 different sites, providing good geographic diversity. The non-protocol aphasia discourse collection contains over 20 corpora contributed by researchers who collected language samples that were specific to their purposes. Some examples that illustrate the range of materials in this collection are: (1) the Fridriksson corpus, which contains WAB-R picture descriptions from 19 PWAs before and after speech entrainment treatment (Fridriksson et al., 2012); (2) the Pawleys corpus, which is a longitudinal set of 48 short (5–10 minute) conversations with a 68-year-old man with fluent, Wernicke's-type aphasia during the first 2–6 months following an ischemic stroke; (3) the SouthAL corpus (Smith & Clark, 2019), which contains transcripts and media files for 9 PWAs and 8 controls doing an oral reading test; and (4) the Olness corpus (Olness et al., 2002), which contains transcripts and audio files from 50 PWAs and 30 controls, half of whom are Caucasian and half African American, doing a wide variety of discourse tasks and an ethnographic semi-structured interview. Along with the large numbers of group treatment videos from six different sites, script training samples from two sites, and administrations of the Famous People Protocol from 13 sites, these materials greatly increase the ways in which this clinical database can impact clinical research and teaching opportunities. The overarching goal of AphasiaBank is the construction of methods for improving clinical management in aphasia. The database has facilitated that primarily through its extensive research and teaching resources. In terms of research, hundreds of publications, presentations, and theses have made use of the materials. (A full bibliography of relevant articles and conference presentations is available at the AphasiaBank webpage.)
Examples of the research projects include: creating clinician-friendly discourse assessment tools with norms (Dalton et al., 2020; Richardson & Dalton, 2020; Richardson et al., 2021); developing automated analysis programs for profiling language output specific to aphasia and creating benchmarks for comparison (Forbes et al., 2012); automating established grammatical analysis systems (Fromm, Katta et al., 2021; Fromm et al., 2020); examining spontaneous speech predictors of fluency measures (Clough & Gordon, 2020; Gordon & Clough, 2020); comparing measures for lexical diversity (Fergadiotis et al., 2013); evaluating psychometric properties of language outcome measures (Boyle, 2015; Kim & Wright, 2020) and automated error analysis (Smith et al., 2019); formalizing an auditory-perceptual rating scale for connected speech (Casilio et al., 2019); examining microlinguistic aspects of language across discourse genres (Stark, 2019); investigating how gesture conveys information that is essential to understanding communication (Sekine et al., 2013; van Nispen et al., 2017); comparing manual and automated analysis of connected speech (Hsu & Thompson, 2018); training, testing, and evaluating technologies for automatic speech recognition (ASR) systems (Le et al., 2018; Perez et al., 2020); and using machine learning approaches to enhance classifications of aphasia based on spontaneous speech output (Fromm, Greenhouse et al., 2021). Hundreds of university programs around the world use the AphasiaBank Grand Rounds guided tutorial to teach about aphasia and to expose students to a wider range of people with aphasia than they might otherwise encounter. The Grand Rounds pages include case histories for 16 individuals, 40 captioned video clips of their discourse and performance on different tasks (e.g., confrontation naming, repetition), and clinically oriented questions to stimulate thought and discussion. The cases were carefully curated from the larger collection in the database to make it easier for instructors to find illustrative examples of individuals with different types and severities of aphasia to augment their course material. Another resource, the Examples page, zeroes in on the
connected speech of PWAs with definitions and short video clips of common features such as paraphasias, anomia, agrammatism, and circumlocution. Finally, a Classroom Activities page includes suggestions for assignments that make use of the AphasiaBank corpus as well as others (e.g., RHDBank) for cross-disorder comparisons. With good resources for academic and clinical instructors, students will be better prepared to provide effective diagnostic and treatment services to patients. Currently, AphasiaBank has over 1,250 members from more than 55 countries making use of the educational, clinical, and research materials.
11.3.2 TBIBank
TBIBank is a repository for multimedia interactions for the study of communication in people with traumatic brain injury (TBI). TBI can result in cognitive-communication disorders that may affect all aspects of language (e.g., speaking, listening, reading, writing, pragmatics) as well as attention, reasoning, memory, and executive function. Discourse in TBI has been described as disorganized, inappropriate, tangential, unclear, redundant, and self-focused. Like AphasiaBank, TBIBank includes media files and transcripts from a standard discourse protocol and test battery. The discourse protocol has several tasks that overlap with the AphasiaBank protocol (e.g., the Cinderella story narrative), making it possible to conduct cross-disorder comparisons. This Togher-Protocol corpus is from Australia and has the added advantage of containing longitudinal discourse samples from 58 participants that allow for the study of recovery during the first two years post-onset. The corpus contains a total of 237 transcript and media files. Several other sizeable and valuable non-protocol corpora have been contributed to this databank, yielding a total of over 800 additional files. One corpus has 55 participants with closed head injuries and 52 controls doing a variety of discourse tasks, such as story retell, story generation, and informal conversation (Coelho et al., 2003). No other languages besides English are included in this databank yet. Currently, TBIBank has over 200 members from around the world. This clinical corpus has been used in several dozen published reports, conference presentations, and graduate theses. Stubbs et al. (2018) showed that the simple procedural discourse task in the standard discourse protocol was sensitive to qualitative changes between 3 and 6 months post-injury, with significant increases in use of relevant information (macrostructure). The task also distinguished the TBI group from a control group at both time points based on speech rate and two macrostructural categories (essential and optional steps). Power et al. (2019) also reported differences on macrostructural and superstructural measures but not microstructural measures from a single picture description task done by participants with TBI and controls. In a final example, Norman et al. (2022) recently selected adults with moderate to severe TBI from the TBIBank database for a study comparing their discourse-level language performance with that of three other groups: adults with mild TBI, healthy adults, and orthopedic controls. Studies like these require standard discourse protocols and large databases to help identify which measures are sensitive to group differences and, thereby, inform effective assessment and treatment planning for this clinical population. Like AphasiaBank, TBIBank has a Grand Rounds tutorial that includes case histories and 25 captioned video clips. The tutorial addresses the range of spoken cognitive-communication disorders that can result from TBI, discourse analyses to complement assessment, treatment approaches that target real-life discourse-level communication activities, comorbidities, and recovery. It is designed as an online learning module and begins with a prelearning quiz that allows for measurement of newly acquired knowledge and skills.
11.3.3 RHDBank
RHDBank (https://rhd.talkbank.org) was created for the study of communication in adults with right hemisphere disorder (RHD) resulting from damage to the right hemisphere (Minga et al., 2021). Symptoms of RHD include cognitive-communication deficits that impair pragmatic skills, resulting in difficulties producing and comprehending discourse. Deficits commonly seen in people with RHD include difficulty with topic maintenance, discourse coherence and cohesion, inference generation, turn-taking, question use, and the integration of contextual nuance. Individuals with RHD have typically been underserved clinically because their symptoms tend to be more subtle than those of individuals with aphasia resulting from left hemisphere stroke. However, the consequences of these deficits can negatively impact quality of life in many ways, such as successful return to work and social relationships with family and friends. A paucity of research and clinical resources also contributes to gaps in service to this population. As with the main AphasiaBank corpus, RHDBank contains corpora that use a standard discourse protocol, demographic data collection, and a set of assessment procedures. The materials were chosen to have some overlap with those in the other clinical banks to allow for cross-disorder comparisons. In addition to the tasks in the AphasiaBank protocol, the RHD discourse protocol includes a first-encounter conversation (Kennedy et al., 1994). This task provides an opportunity to assess behaviors such as turn-taking, question use, and pragmatics. The test battery includes assessments for visuospatial neglect and cognitive-linguistic functioning. To date, the protocol database has media and transcripts from 23 adults with RHD and 23 controls, but data collection is ongoing. A few non-protocol corpora are also available in this databank. The Hopkins corpus includes Cookie Theft picture descriptions from 42 participants who were seen acutely following right hemisphere strokes and then followed at various time intervals thereafter. A Spanish corpus contains discourse transcripts from 11 individuals with RHD. All materials are available from the webpage. Currently, RHDBank has over 150 members who requested access for research projects, clinical training, and educational applications. A Grand Rounds tutorial contains 13 video clips and material that highlight language production behaviors and cognitive-linguistic deficits associated with RHD. It includes clinically oriented discussion questions as well as evidence-based literature on treatment of cognitive-linguistic deficits. Research studies have reported findings such as differences in the types of questions used by participants with RHD compared with controls (Minga et al., 2020, 2022), the utility of the procedural discourse task for clinical evaluation of individuals with RHD (Cummings, 2019), and differences in macrostructural measures (e.g., main concepts and global coherence) across various tasks when compared with controls and PWAs (Johnson et al., 2019). A bibliography of publications and links to conference presentations are available at the webpage. The goal is to fill the knowledge gaps and provide more exposure and training in this area to increase the likelihood that clinicians will have better tools and experience to assess and treat this population.
11.3.4 DementiaBank
DementiaBank (https://dementia.talkbank.org) includes transcripts and media from individuals with various types of dementia as well as individuals with primary progressive aphasia (PPA). Dementia has many potential causes and presentations, but usually involves gradually worsening impairments in memory, communication, reasoning, and orientation. Language symptoms depend largely on the type and severity of dementia. Language production deficits generally include word-finding problems, empty speech, paraphasias, circumlocution, perseveration, and reduced output. A corpus currently being collected by Alyssa Lanzi at the University of Delaware is using a standard
discourse protocol to collect data from neurotypical individuals and individuals with mild cognitive impairment and dementia. Again, this protocol has some overlaps with the other clinical databank protocols as well as some unique tasks and assessments designed for this clinical population. Most of the corpora in this databank were contributed from other larger studies conducted years ago. One corpus in this repository, the Pitt Corpus (Becker et al., 1994), contains longitudinal data for four language tasks (Cookie Theft picture descriptions, a sentence construction task, word fluency tasks, and a story retell task) from hundreds of individuals with Alzheimer's disease (AD) and other types of dementia as well as elderly controls. Another large corpus, WLS (Herd et al., 2014), includes a subset of tasks from the Wisconsin Longitudinal Study, which is a long-term study of a random sample of over 10,000 high school graduates from 1957. The WLS corpus in the DementiaBank collection contains 1,369 audio and transcript files of Cookie Theft picture descriptions from the 2011 test sessions, with additional data on demographics and a variety of related test scores (e.g., letter and category word fluency). Most of these participants would be considered healthy controls. Other corpora in DementiaBank include conversations and other language tasks from individuals with AD. Corpora in German, Mandarin, Spanish, and Taiwanese have also been contributed. Currently, DementiaBank has over 600 members from all over the world. The large datasets in this clinical corpus have been of particular interest to researchers who are using a variety of ASR, machine learning, and language processing techniques to automatically identify AD from short narrative samples (de la Fuente Garcia et al., 2020). Several computational challenges have been hosted, for example at Interspeech conferences (e.g., the ADReSS Challenge), where teams from across the world test methods for detecting Alzheimer's disease and predicting cognitive decline based on spontaneous speech samples. Carefully curated sets of data are made available to the participating teams for comparison of results. The goal is to develop the most effective clinical applications of these techniques for early diagnosis and implementation of devices to promote patient health and safety. To date, the best classification accuracy (without transcripts or human intervention) is around 78%, and the best cognitive score prognosis accuracy is around 66% (Luz et al., 2021). Accuracy is improved if text files are used with the audio signals. This important line of work relies on large, high-quality datasets, which are unfortunately limited in supply. A bibliography of all articles that use DementiaBank corpora is available from a link at the webpage.
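As a concrete illustration of the general shape of such transcript-based classification pipelines, the following deliberately simplified sketch trains a linear classifier on word and bigram features; the transcripts, labels, and feature choices are invented for illustration and do not reproduce any ADReSS system:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data (invented): one concatenated transcript per speaker.
transcripts = [
    "uh the boy is uh on the the stool reaching for the the thing",
    "the boy is taking cookies from the jar while the sink overflows",
    "um the lady the lady is is drying a a plate I think",
    "the mother is drying dishes and the water is running over",
]
labels = [1, 0, 1, 0]  # 1 = dementia group, 0 = control (labels invented)

# Word and bigram frequencies feeding a linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(transcripts, labels)
print(model.predict(["the uh the boy um is is falling off the stool"]))

Real systems add acoustic, lexical-diversity, and fluency features and evaluate with cross-validation on held-out speakers rather than toy data.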
11.3.5 FluencyBank
FluencyBank (https://fluency.talkbank.org), organized by Nan Bernstein Ratner at the University of Maryland, is one of the newer databases, established in 2016 to address the need for a shared open-access database that could allow for a broader and deeper understanding of the development of fluency and disfluency (stuttering) in normal and atypical speech (Bernstein Ratner & MacWhinney, 2018). Stuttering is a significantly disabling, lifelong communication disorder with severe psychological, educational, social, economic, and vocational impacts (Gerlach et al., 2018). Fluency of speech production is involved in many stages and processes of language encoding (Ferreira, 2007). When disfluency exceeds listener expectations in frequency and/or quality, it is often perceived as stuttering (Tichenor & Yaruss, 2020). Disfluency is also involved in other expressive communication disorders. The FluencyBank database currently contains 13 corpora in English, and one each in Dutch, French, and German. The English samples include almost 500 audio or video files of children and adults who stutter; some of the corpora include control participants as well.
Teaching resources include a large selection of videos of adults and children who stutter as well as suggested classroom activities. To study disfluency patterns in detail, we have introduced into CHAT a series of special codes and Unicode symbols. These mark features such as stalls, pauses, filled pauses, prolongations, broken words, blocking, repeated segments, lengthened repeated segments, phonological fragments, and various types of word and phrase repetition. These features can then be analyzed with CLAN's FLUCALC program, which resembles the KIDEVAL program in many regards. One feature of FLUCALC is its ability to compute a disfluency index based on syllables, rather than words, as developed by the Illinois Stuttering Project (Yairi & Ambrose, 1999); a minimal sketch of this idea appears at the end of this section. The first phase of construction of FluencyBank focused on the emergence of speech fluency over childhood in both typically developing preschoolers and children who stutter (CWS). This work has focused on ways of understanding what features can predict recovery from early childhood stuttering as opposed to persistent stuttering. Various factors have been proposed as potential predictors of recovery from childhood onset stuttering. Among these are negative family history of stuttering, female sex, earlier age of stuttering onset, better language/phonological skills, and genetic and brain indices. Interestingly, the initial speech fluency profile does not predict outcomes. We are now exploring ways in which neuroanatomical measures (Chang et al., 2019) collected in combination with language samples can further illuminate this issue.
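The syllable-based normalization behind such an index can be sketched as follows; this is not FLUCALC's actual implementation: the syllable counter is a rough English heuristic, and the event count is assumed to come from prior CHAT coding of the transcript.

import re

def count_syllables(word):
    # Crude English heuristic: count groups of vowel letters, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def per_100_syllables(n_events, words):
    # Normalize a coded disfluency count by syllables rather than by words.
    total = sum(count_syllables(w) for w in words)
    return 100.0 * n_events / total

# Toy sample with 2 coded disfluency events (a repetition and a fragment).
sample = "the the boy is cl- climbing on the stool".split()
print(per_100_syllables(2, sample))  # disfluency events per 100 syllables

Normalizing by syllables rather than words matters because polysyllabic words provide more opportunities for disfluency, so word-based rates can understate or overstate severity depending on vocabulary.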
11.3.6 ASDBank
ASDBank (https://asd.talkbank.org) includes data on language development from children and adolescents with autism spectrum disorder (ASD). It uses methods and analyses similar to those used in the larger CHILDES database. This is one of the smaller collections, and unfortunately it does not include media files for over half of the corpora. The most complete corpus in this collection contains Dutch language productions from 46 children with ASD. These data are from a project that investigates asymmetries between production and comprehension in unimpaired children, in young and elderly adults, and in autistic and ADHD children and adolescents (Kuijper et al., 2015). We hope to expand this databank in future work.
11.3.7 CHILDES
The oldest and most widely used component of TalkBank is CHILDES (Child Language Data Exchange System, https://childes.talkbank.org), which began in 1984 and has now been used in over 8,000 publications. Although the bulk of work with CHILDES has focused on the analysis of normal language development, there are also 27 corpora from children with various developmental disorders, such as late talking, SLI, Down syndrome, and epilepsy. These are included in CHILDES, although similar data from children who stutter are in FluencyBank and data from children with ASD are in ASDBank.
11.3.8 PhonBank
The PhonBank project (https://phonbank.talkbank.org) is led by Yvan Rose at Memorial University, Newfoundland. PhonBank work has focused both on the creation of a database of phonologically transcribed productions from young children and on the development of a program, Phon, designed to analyze these data. In accord with the emphasis in TalkBank on interoperability, it is possible to open CHAT files directly in Phon, although the level of phonological coding and analysis in Phon is much deeper than that found in CHILDES
corpora. Phon provides a wide range of analysis options, including automatic syllabification, automatic model phonology insertion, full analysis using Praat, dozens of standard measures, pre-configured reports, and user-configurable report formats. The languages represented in PhonBank corpora include Arabic, Berber, Cree, Cantonese, Catalan, Dutch, English, German, Greek, Icelandic, Japanese, Mandarin, Norwegian, Polish, Portuguese, Quechua, Romanian, Spanish, Swedish, and Turkish. There are also phonologically transcribed corpora from bilingual children and second language learners. During the first phase of constructing PhonBank, the emphasis was on data from normally developing children. However, these data have now been supplemented with corpora from children with phonological disorders in English, French, Portuguese, and Spanish.
11.3.9 PsychosisBank
The most recent addition to TalkBank's clinical banks is PsychosisBank. This bank focuses on language in psychosis, in collaboration with the international Discourse in Psychosis consortium (https://discourseinpsychosis.org) led by Lena Palaniyappan at Western University. This group has formulated a standard spoken language elicitation protocol based on the AphasiaBank protocol with extensions to psychosis. Projects using this new protocol are now underway.
11.4 Other Tools
In addition to the various sets of Grand Rounds pages and Classroom Activities that have already been mentioned, TalkBank has several tools that can be used with all databanks for a variety of research and teaching applications.
11.4.1 Browsable Database
The Browsable Database provides direct playback of the transcripts and media in a databank without having to download anything. A directory in the upper left corner of the screen allows users to select the language, the corpus, and the file of interest. As the media file plays, yellow highlighting appears on the corresponding transcript line, as illustrated in Figure 11.2.
Figure 11.2 Browsable database © Davida Fromm.
This facility is particularly useful for scanning over a corpus and for providing easy access for student work. The Browsable Database facility also provides the platform for the Collaborative Commentary system, and it can be called up to display the lines that correspond to specific string matches in a TalkBankDB search.
11.4.2 Collaborative Commentary
Collaborative Commentary (CC) is a tool that works from the Browsable Database. It allows users to enter comments in relation to single utterances or a range of utterances, as illustrated in Figure 11.3.
Figure 11.3 Collaborative commentary example © Davida Fromm.
CC allows researchers, instructors, and clinicians to form commentary groups directed by a single manager but composed of multiple group members. Members can be co-workers, colleagues, or students. They can insert analytic comments or codes directly into the online transcript display, with each comment or code being tagged to a specific utterance. For example, clinical researchers can collectively evaluate behaviors and refine descriptions, research teams can measure and establish coding reliability, and students can learn to identify a variety of behaviors (e.g., paraphasias, circumlocutions, agrammatism). These comments
are stored in a separate but linked PostgreSQL database organized by group. Each group has access to comments from its own members, but not to those from other groups. The group manager controls the process of adding members and setting the group password. Within each group, it is possible to search for specific codes and to click on those to open the relevant segment using the Browsable Database. In this way, users can study each comment in detail and can create reports and statistics based on the comments and codes. Collaborative Commentary provides innovative methods for analyzing spoken language. Using aphasia data as an example, CC will allow researchers to sharpen their coding and interpretation of the details of the successes and difficulties that persons with aphasia face during conversational interaction. The interpretation of the scope and causes of these difficulties can directly inform assessment, classification, and treatment. Inevitably, there will be variance between analysts in the interpretation of patterns and their causes. To quantify and analyze this variation, systematic group-specified codes can be used to draw in additional cases and examples from the larger database. In this way, creators of competing or cooperating interpretations can create a portfolio of documentation for their positions. For instructors, this type of immediate access to samples of interactions with people with aphasia can greatly enhance their students' learning.
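To illustrate the kind of record such a system stores, here is a hypothetical sketch; the field names, path, and values are invented for illustration and do not reflect the actual Collaborative Commentary schema:

from dataclasses import dataclass

@dataclass
class Comment:
    group: str         # commentary group; comments are visible only within it
    transcript: str    # path of the CHAT file the comment is attached to
    utterances: tuple  # first and last utterance numbers the comment spans
    code: str          # group-specified code used for later searching
    text: str          # the free-text comment itself
    author: str        # group member who entered the comment

c = Comment(
    group="aphasia-seminar",
    transcript="aphasia/SomeCorpus/file01",  # hypothetical path
    utterances=(12, 12),
    code="paraphasia:semantic",
    text="target 'stool' produced as 'chair'",
    author="student03",
)
print(c.code)

Tagging each comment to a group and an utterance range is what makes group-restricted visibility and code-based search straightforward to implement.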
11.4.3 TalkBankDB
TalkBankDB (https://talkbank.org/DB) is a web-based PostgreSQL system that provides fuller and more direct access to, and quantitative analysis of, the entire TalkBank database. It allows large segments of the database to be downloaded in seconds. The manual for this tool can be accessed by clicking on the "manual" icon in the upper right, next to the Login button. TalkBankDB provides an intuitive online interface for researchers to explore TalkBank's media and transcripts, specify data to be extracted, and pass these data on to statistical programs for further analysis. It supports n-gram and CQL (Corpus Query Language) searches across all tiers in CHAT and allows for a variety of visualizations and analyses of data. Alternatively, users can download data sets directly from Python or R. With the entirety of TalkBank freely accessible from a simple web interface, resources that were previously found only by advanced users are now open to a broader community. Features such as utterance length, lexical variables, morphological content, or error production by demographics or aphasia type can easily be selected, output, plotted, and analyzed through the web interface. There is also a GitHub account where users can upload scripts and analyses. These various options allow TalkBankDB to provide a single point where users can explore, share their research, and see what others are doing in the TalkBank community.
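The general pattern for programmatic access is to send a query describing the slice of the database one wants and receive the matching records. The sketch below uses Python's requests library against a placeholder endpoint; the actual endpoint URL and parameter names are documented in the TalkBankDB manual, not here:

import requests

# Hypothetical query: real parameter names are defined by the TalkBankDB API.
query = {
    "corpus": "aphasia",
    "language": ["eng"],
    "task": ["Cinderella"],
}
resp = requests.post(
    "https://talkbank.org/DB/hypothetical-query-endpoint",  # placeholder URL
    json=query,
    timeout=30,
)
resp.raise_for_status()
records = resp.json()
print(len(records))  # number of matching utterance or transcript records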
11.5 Summary
The assessment and treatment of discourse is now receiving intense attention from a wide range of disciplines. Researchers who have joined as members come from departments of Biostatistics, Computer Science, Electrical Engineering, English, Geriatrics, Informatics, Linguistics, Medicine, Neurology, Psychology, and Speech and Hearing Sciences. The types of analyses that have been and can be applied to the TalkBank clinical corpora are equally broad. The need for high-quality, accessible shared databases to make this work possible cannot be overstated. Many fundamental questions continue to plague the assessment and treatment of discourse, grammar, and phonology in the clinical arena (Dietz & Boyle, 2018; Kurland & Stokes, 2018). For example, in the study of TBI, Snow and Douglas (2000) have laid out issues
regarding sampling (e.g., which genres, how many, how elicited, with whom), transcribing (e.g., cost-benefit considerations), measuring (e.g., microlinguistic vs. macrolinguistic analyses), and criteria for comparison. The combination of shared databases, standard protocols, consistent transcription formats, and automated analyses now provides concrete methods for addressing these issues. Technological and methodological advances and an increased acceptance of the importance of data-sharing are helping clinical researchers advance our understanding and better inform our approaches to clinical management.
ACKNOWLEDGMENTS
This work is supported by NIDCD grant R01-DC008524 for AphasiaBank, NIDCD grant DC015494 for FluencyBank, NICHD grant HD051698 for PhonBank, and NICHD grant R01-HD082736 for CHILDES. We also gratefully acknowledge the teachers, researchers, and clinicians who have contributed data to the shared database and the participants who have consented to participate and share their data. We are also deeply grateful to Audrey Holland and Margaret Forbes for their contributions to all these databases and their support over the years.
REFERENCES
Becker, J. T., Boller, F., Lopez, O. L., Saxton, J., & McGonigle, K. L. (1994). The natural history of Alzheimer's disease: Description of study cohort and accuracy of diagnosis. Archives of Neurology, 51(6), 585–594.
Bernstein Ratner, N., & MacWhinney, B. (2018). Fluency Bank: A new resource for fluency research and practice. Journal of Fluency Disorders, 56, 69–80. https://doi.org/10.1016/j.jfludis.2018.03.002
Boyle, M. (2015). Stability of word-retrieval errors with the AphasiaBank stimuli. American Journal of Speech-Language Pathology, 24(4), 953–960. https://doi.org/10.1044/2015_AJSLP-14-0152
Casilio, M., Rising, K., Beeson, P. M., Bunton, K., & Wilson, S. M. (2019). Auditory-perceptual rating of connected speech in aphasia. American Journal of Speech-Language Pathology, 28(2), 550–568. https://doi.org/10.1044/2018_AJSLP-18-0192
Chang, S.-E., Garnett, E. O., Etchell, A., & Chow, H. M. (2019). Functional and neuroanatomical bases of developmental stuttering: Current insights. The Neuroscientist, 25(6), 566–582. https://doi.org/10.1177/1073858418803594
Clough, S., & Gordon, J. K. (2020). Fluent or nonfluent? Part A. Underlying contributors to categorical classifications of fluency in aphasia. Aphasiology, 34(5), 515–539. https://doi.org/10.1080/02687038.2020.1727709
Coelho, C., Youse, K., Le, K., & Feinn, R. (2003). Narrative and conversational discourse of
adults with closed head injuries and non-brain-injured adults: A discriminant analysis. Aphasiology, 17(5), 499–510. https://doi.org/10.1080/02687030344000111
Cummings, L. (2019). On making a sandwich: Procedural discourse in adults with right-hemisphere damage. In A. Capone, M. Carapezza, & F. Lo Piparo (Eds.), Further advances in pragmatics and philosophy: Part 2 theories and applications (pp. 331–355). Springer.
Dalton, S. G., Kim, H., Richardson, J., & Wright, H. H. (2020). A compendium of core lexicon checklists. Seminars in Speech and Language, 41(1), 45–60. https://doi.org/10.1055/s-0039-3400972
de la Fuente Garcia, S., Ritchie, C. W., & Luz, S. (2020). Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer's disease: A systematic review. Journal of Alzheimer's Disease, 78(4), 1547–1574. https://doi.org/10.3233/JAD-200888
Dietz, A., & Boyle, M. (2018). Discourse measurement in aphasia: Consensus and caveats. Aphasiology, 32(4), 487–492. https://doi.org/10.1080/02687038.2017.1398803
Fergadiotis, G., Wright, H., & West, T. M. (2013). Measuring lexical diversity in narrative discourse of people with aphasia. American Journal of Speech-Language Pathology, 22(2), 397–408. https://doi.org/10.1044/1058-0360(2013/12-0083)
Ferreira, F. (2007). Prosody and performance in language production. Language and Cognitive Processes, 22(8), 1151–1177. https://doi.org/10.1080/01690960701461293
Forbes, M., Fromm, D., & MacWhinney, B. (2012). AphasiaBank: A resource for clinicians. Seminars in Speech and Language, 33(3), 217–222. https://doi.org/10.1055/s-0032-1320041
Fridriksson, J., Hubbard, I., Hudspeth, S. G., Holland, A., Bonilha, L., Fromm, D., & Rorden, C. (2012). Speech entrainment enables patients with Broca's aphasia to produce fluent speech. Brain, 135(Pt 12), 3815–3829. https://doi.org/10.1093/brain/aws301
Fromm, D., Greenhouse, J., Pudil, M., Shi, Y., & MacWhinney, B. (2021). Enhancing the classification of aphasia: A statistical analysis using connected speech. Aphasiology, 36(12), 5. https://doi.org/10.1080/02687038.2021.1975636
Fromm, D., Katta, F., Paccione, M., Hecht, F., Greenhouse, J. B., MacWhinney, B., & Schnur, T. T. (2021). A comparison of manual vs. automated Quantitative Production Analysis of connected speech. Journal of Speech, Language, and Hearing Research, 64(4), 1271–1282. https://doi.org/10.1044/2020_JSLHR-20-00561
Fromm, D., MacWhinney, B., & Thompson, C. K. (2020). Automation of the Northwestern Narrative Language Analysis system. Journal of Speech, Language, and Hearing Research, 63(6), 1835–1844. https://doi.org/10.1044/2020_JSLHR-19-00267
Gerlach, H., Totty, E., Subramanian, A., & Zebrowski, P. (2018). Stuttering and labor market outcomes in the United States. Journal of Speech, Language, and Hearing Research, 61(7), 1649–1663. https://doi.org/10.1044/2018_JSLHR-S-17-0353
Gordon, J. K., & Clough, S. (2020). How fluent? Part B. Underlying contributors to continuous measures of fluency in aphasia. Aphasiology, 34(5), 643–663. https://doi.org/10.1080/02687038.2020.1712586
Herd, P., Carr, D., & Roan, C. (2014). Cohort profile: Wisconsin Longitudinal Study (WLS). International Journal of Epidemiology, 43(1), 34–41. https://doi.org/10.1093/ije/dys194
Holland, A., Forbes, M., Fromm, D., & MacWhinney, B. (2019). Communicative strengths in severe aphasia: The Famous People Protocol and its value in planning treatment. American Journal of Speech-Language Pathology, 28(3), 1010–1018. https://doi.org/10.1044/2019_AJSLP-18-0283
Hsu, C.-J., & Thompson, C. K. (2018). Manual versus automated narrative analysis of agrammatic production patterns: The Northwestern Narrative Language Analysis and
Computerized Language Analysis. Journal of Speech, Language, and Hearing Research, 61(2), 373–385. https://doi.org/10.1044/2017_JSLHR-L-17-0185
Johnson, M., Randolph, E., Fromm, D., & MacWhinney, B. (2019). Comparisons of narrative discourse in Right Hemisphere Brain Damage (RHD), aphasia, and healthy adults. Poster presented at the American Speech-Language-Hearing Association convention, Orlando, FL.
Kennedy, M. R., Strand, E. A., Burton, W., & Peterson, C. (1994). Analysis of first-encounter conversations of right-hemisphere-damaged adults. Clinical Aphasiology, 22, 67–80.
Kim, H., & Wright, H. H. (2020). Concurrent validity and reliability of the core lexicon measure as a measure of word retrieval ability in aphasia narratives. American Journal of Speech-Language Pathology, 29(1), 101–110. https://doi.org/10.1044/2019_AJSLP-19-0063
Kuijper, S. J., Hartman, C. A., & Hendriks, P. (2015). Who is he? Children with ASD and ADHD take the listener into account in their production of ambiguous pronouns. PLoS ONE, 10(7), e0132408. https://doi.org/10.1371/journal.pone.0132408
Kurland, J., & Stokes, P. (2018). Let's talk real talk: An argument to include conversation in a D-COS for aphasia research with an acknowledgment of the challenges ahead. Aphasiology, 32(4), 475–478. https://doi.org/10.1080/02687038.2017.1398808
Le, D., Licata, K., & Provost, E. M. (2018). Automatic quantitative analysis of spontaneous aphasic speech. Speech Communication, 100, 1–12. https://doi.org/10.1016/j.specom.2018.04.001
Lin, D., Crabtree, J., Dillo, I., Downs, R. R., Edmunds, R., Giaretta, D., DeGiusti, M., L'Hours, H., Hugo, W., Jenkyns, R., Khodiyar, V., Martone, M. E., Mokrane, M., Navale, V., Petters, J., Sierman, B., Sokolova, D. V., Stockhause, M., & Westbrook, J. (2020). The TRUST Principles for digital repositories. Scientific Data, 7(1), 1–5. https://doi.org/10.1038/s41597-020-0486-7
Luz, S., Haider, F., de la Fuente, S., Fromm, D., & MacWhinney, B. (2021). Detecting cognitive decline using speech only: The ADReSSo Challenge. arXiv preprint arXiv:2104.09356.
MacWhinney, B., Fromm, D., Forbes, M., & Holland, A. (2011). AphasiaBank: Methods for studying discourse. Aphasiology, 25(11), 1286–1307. https://doi.org/10.1080/02687038.2011.589893
Minga, J., Fromm, D., Jacks, A., Stockbridge, M., Nelthropp, J., & MacWhinney, B. (2022). The
effects of right hemisphere brain damage on question-answering in conversation. Journal of Speech, Language, and Hearing Research, 65(2). https://doi.org/10.1044/2021_JSLHR-21-00309
Minga, J., Fromm, D., Williams-DeVane, C., & MacWhinney, B. (2020). Question use in adults with right-hemisphere brain damage. Journal of Speech, Language, and Hearing Research, 63(3), 738–748. https://doi.org/10.1044/2019_JSLHR-19-00063
Minga, J., Johnson, M., Blake, M. L., Fromm, D., & MacWhinney, B. (2021). Making sense of right hemisphere discourse using RHDBank. Topics in Language Disorders, 41(1), 99–122. https://doi.org/10.1097/tld.0000000000000244
Norman, R. S., Mueller, K. D., Huerta, P., Shah, M. N., Turkstra, L. S., & Power, E. (2022). Discourse performance in adults with mild traumatic brain injury, orthopedic injuries, and moderate to severe traumatic brain injury, and healthy controls. American Journal of Speech-Language Pathology, 31(1), 67–83. https://doi.org/10.1044/2021_AJSLP-20-00299
Olness, G. S., Ulatowska, H. K., Wertz, R. T., Thompson, J. L., & Auther, L. L. (2002). Discourse elicitation with pictorial stimuli in African Americans and Caucasians with and without aphasia. Aphasiology, 16(4–6), 623–633.
Perez, M., Aldeneh, Z., & Provost, E. M. (2020). Aphasic speech recognition using a mixture of speech intelligibility experts. arXiv preprint arXiv:2008.10788.
Power, E., Weir, S., Richardson, J., Fromm, D., Forbes, M., MacWhinney, B., & Togher, L. (2019). Patterns of narrative discourse in early recovery following severe traumatic brain injury. Brain Injury, 34(1), 98–109. https://doi.org/10.1080/02699052.2019.1682192
Richardson, J., & Dalton, S. G. H. (2020). Main concepts for two picture description tasks: An addition to Richardson and Dalton, 2016. Aphasiology, 34(1), 119–136. https://doi.org/10.1080/02687038.2018.1561417
Richardson, J., Grace Dalton, S., Greenslade, K., Jacks, A., Haley, K., & Adams, J. (2021). Main concept, sequencing, and story grammar (MSSG) analyses of Cinderella narratives in a large sample of persons with aphasia. Brain Sciences, 11(1), 110. https://doi.org/10.3390/brainsci11010110
Sekine, K., Rose, M. L., Foster, A. M., Attard, M. C., & Lanyon, L. E. (2013). Gesture production patterns in aphasic discourse: In-depth description and preliminary predictions. Aphasiology, 27(9), 1031–1049. https://doi.org/10.1080/02687038.2013.803017
Smith, K. G., & Clark, K. F. (2019). Error analysis of oral paragraph reading in individuals with aphasia. Aphasiology, 33(2), 234–252. https://doi.org/10.1080/02687038.2018.1545992
Smith, M., Cunningham, K. T., & Haley, K. L. (2019). Automating error frequency analysis via the phonemic edit distance ratio. Journal of Speech, Language, and Hearing Research, 62(6), 1719–1723. https://doi.org/10.1044/2019_JSLHR-S-18-0423
Snow, P. C., & Douglas, J. M. (2000). Subject review: Conceptual and methodological challenges in discourse assessment with TBI speakers: Towards an understanding. Brain Injury, 14(5), 397–415. https://doi.org/10.1080/026990500120510
Stark, B. (2019). A comparison of three discourse elicitation methods in aphasia and age-matched adults: Implications for language assessment and outcome. American Journal of Speech-Language Pathology, 28(3), 1067–1083. https://doi.org/10.1044/2019_AJSLP-18-0265
Stubbs, E., Togher, L., Kenny, B., Fromm, D., Forbes, M., MacWhinney, B., McDonald, S., Tate, R., Turkstra, L., & Power, E. (2018). Procedural discourse performance in adults with severe traumatic brain injury at 3 and 6 months post injury. Brain Injury, 32(2), 167–181. https://doi.org/10.1080/02699052.2017.1291989
Tichenor, S., & Yaruss, J. S. (2020). Repetitive negative thinking, temperament, and adverse impact in adults who stutter. American Journal of Speech-Language Pathology, 29(1), 201–215. https://doi.org/10.1044/2019_AJSLP-19-00077
van Nispen, K., van de Sandt-Koenderman, M., Sekine, K., Krahmer, E., & Rose, M. L. (2017). Part of the message comes in gesture: How people with aphasia convey information in different gesture types as compared with information in their speech. Aphasiology, 31(9), 1078–1103. https://doi.org/10.1080/02687038.2017.1301368
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J. W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., Gonzalez-Beltran, A., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 1–9. https://doi.org/10.1038/sdata.2016.18
Yairi, E., & Ambrose, N. G. (1999). Early childhood stuttering I: Persistency and recovery rates. Journal of Speech, Language, and Hearing Research, 42(5), 1097–1112.
Part 2: Syntax and Semantics
12 Generative Syntactic Theory and Language Disorders
MARTINA PENKE AND EVA WIMMER
12.1 Preamble
"Syntax is the study of the principles and processes by which sentences are constructed in particular languages" (Chomsky, 1957, p. 11). In individuals with acquired or developmental language disorders, the ability to construct or comprehend sentences by applying these syntactic principles and processes is often affected, resulting in a syntactic deficit. A syntactic deficit might be so severe as to prohibit the production of sentences altogether, limiting the individual to utterances that consist of single words only. In less severe cases, syntactic disorders result in the production of non-target-like, incorrect syntactic structures. In recent decades, syntactic disorders have been documented in a growing number of acquired or developmental language disorders. The literature is vast, and hence this chapter can only provide a broad overview of the field. We will start by describing the core symptoms of syntactic disorders that have been observed in language production and comprehension across different disorder syndromes. We will then sketch the different syntactic accounts that have been proposed to explain syntactic deficits, focusing on accounts formulated within the framework of generative syntax. Approaches that attribute deficits to problems in the build-up of syntactic structures or to difficulties with performing syntactic operations seem naturally well-suited to account for syntactic disorders. However, it has been suggested alternatively that problems in producing and comprehending sentences might arise from more general deficits that affect the processing of sentences without being "syntactic" in nature. We will briefly sketch these so-called processing-deficit accounts. An appendix to this chapter provides a short tabular overview directing the reader to some of the research that has aimed to characterize the nature of syntactic deficits in aphasia, developmental language disorder (DLD), Down syndrome, hearing impairment, and autism spectrum disorder.
12.2 Characteristics and Diagnosis of Syntactic Disorders

12.2.1 Symptoms in Language Production

Syntactic deficits affect the comprehension as well as the production of utterances. In language production, syntactic deficits result in utterances that are short, lack necessary elements and are limited in syntactic complexity (see Table 12.1 for exemplification). A hallmark of syntactic deficits in language production is that free functional elements such as determiners (Table 12.1a,c), auxiliaries (Table 12.1e), or complementizers introducing subordinate clauses (Table 12.1e,f) are often omitted. Problems also affect bound inflectional morphemes that express grammatical relations between sentence constituents, as is the case in subject-verb agreement inflection, or that mark the syntactic function of an element in a sentence (e.g. subject or object) via case markers. Depending on the morphological characteristics of a given language, such deficits will either lead to omissions of the inflectional marker (Table 12.1d) or to substitutions of the inflected form required by the grammatical context with another, incorrect and often unmarked, form, such as the infinitive or stem form of verbs (Table 12.1b,c) or the default nominative form of nouns (see also Chapter 15 this volume). Utterances in which a finite verb form has been replaced by a non-finite verb form are called root infinitives. The production of a non-finite verb form instead of a finite one might have consequences for verb placement, as is the case in German, where finite verbs appear in the second structural position in a sentence whereas non-finite verbs are placed clause-finally (Table 12.1b,c). Root infinitives, which are a characteristic sign of early stages in typical language acquisition, also characterize the language production of children with DLD, of individuals with Broca’s aphasia and of individuals with Down syndrome. Subordinate clauses and other types of complex sentences that deviate from the most common, canonical sentence structure of a language, such as wh-object questions, topicalized sentences or passives, are rarely produced spontaneously. In experiments eliciting such noncanonical sentence types, they are typically more affected than syntactic structures that do not deviate from the canonical word-order pattern, resulting in ungrammatical productions (Table 12.1f,g). The seriousness of these symptoms depends on the severity of the disorder. In the extreme, utterances are reduced to strings of single words that lack syntactic structure altogether. An example would be the utterance “jacket” as description of a scene where a person puts on a jacket (also Table 12.1a). In milder cases, the deficit might lead to less complex utterances and more incorrect syntactic structures compared to the performance of a group of neurologically healthy control participants. While some symptoms indicative of syntactic deficits might already become apparent in a conversation with affected individuals (e.g. omissions of free function words or inflectional markers, incorrect word order), other symptoms such as the reduction of syntactic complexity (e.g. lack of subordination or passives, preponderance of canonical word order) are less apparent and require explicit testing.
Whether or not spontaneous language production is indicative of a syntactic disorder, the nature of the deficit is best assessed by carefully designed experiments that allow for a systematic collection of relevant data on those syntactic constructions that are assumed to be affected in the specific disorder. Note that an analysis of spontaneous speech might not provide a realistic picture of an individual’s syntactic deficit, as forms or constructions whose production poses a problem for a language-impaired speaker are often avoided in spontaneous-speech production, and deficits only show up when the context provided in the experimental task requires their production (e.g. Eisenbeiss, 2010; Kolk & Heeschen, 1992; Penke, 2015).
Table 12.1 Syntactic error types in language production.

a. Example: “du Jacke an” (= you jacket on); target: “du ziehst die Jacke an” (= you put on the jacket)
   Error: missing article “die” (= the); missing finite verb
   Syndrome (age): Wernicke’s aphasia (79)

b. Example: “da die Jacke anziehn” (= there the jacket put on); target: “da ziehst du die Jacke an” (= there, you put on the jacket)
   Error: missing subject “du” (= you); incorrect verbal inflection: infinitive instead of 2nd sg marking; incorrect word order: V-final instead of V2 (root infinitive)
   Syndrome (age): Down’s syndrome (9;5)

c. Example: “ich Schuhe anziehn” (= I shoes put on); target: “ich ziehe die Schuhe an” (= I am putting on the shoes)
   Error: missing article “die” (= the); incorrect verbal inflection: infinitive instead of 1st sg marking; incorrect word order: V-final instead of V2 (root infinitive)
   Syndrome (age): SLI/DLD (4;6)

d. Example: “der koch” (= he cook); target: “der kocht” (= he is cooking)
   Error: incorrect verbal inflection: omission of 3rd sg marking
   Syndrome (age): Hearing impairment (4;6)

e. Example: “hier zehn Jahre, einigermaßen reden” (= here ten years, speak passably); target: “wenn ich zehn Jahre hier war, dann werde ich einigermaßen reden” (= when I have been here for ten years then I will speak passably)
   Error: missing complementizer “wenn” (= when); missing subject “ich” (= I); missing verb “war” (= have been) and auxiliary “werde” (= will)
   Syndrome (age): Broca’s aphasia (33)

f. Example: “hast du die karte _ conni hm die Katze füttert?” (= do you have the card _ Conni uh is feeding the cat?); target: “hast du die Karte, wo Conni die Katze füttert?” (= do you have the card where Conni is feeding the cat?)
   Error: missing complementizer “wo” (= where)
   Syndrome (age): Hearing impairment (4;10)

g. Example: “kitzelt den Jungen?” (= tickling the boy?); target: “Wer kitzelt den Jungen?” (= who is tickling the boy?)
   Error: missing wh-word
   Syndrome (age): Wernicke’s aphasia (77)

Note: missing/incorrect elements underlined.
12.2.2 Symptoms in Language Comprehension

In language comprehension, the identification of a syntactic deficit requires specific testing in order to ensure that comprehension of a sentence requires syntactic processing and cannot be based on the semantic content of the lexical words in the sentence alone. Classic designs for testing language comprehension capacities are the sentence-picture matching task and the picture-pointing task. In a sentence-picture matching task the individual has to match a given sentence such as the passive clause “The girl is tickled by the boy” to one of two pictures: one depicting the action described in the sentence (boy tickling girl), the other depicting the reverse action (girl tickling boy). In a picture-pointing task, the participant must point to one of several depicted entities in response to a question or command. For example, given a picture with a boy who is tickling a girl who is tickling a boy (see Figure 12.1), the participant might be asked “which boy is the girl tickling?” A typical error indicative of a syntactic deficit affecting the comprehension of such sentences is that the first noun phrase occurring in the sentence is interpreted as the Agent of the action, independently of the available morphosyntactic cues. While this will result in correct performance and seemingly unimpaired comprehension in sentences with canonical subject/Agent before object/Patient word order, such as active SVO sentences (“the girl tickled the boy”) or subject questions (“which boy is tickling the girl?”), impaired syntactic comprehension becomes obvious when the first noun phrase is the Theme/Patient of the action. Then, for example, an inability to parse the morphosyntactic cues in a passive clause such as “the girl is tickled by the boy” might lead to the interpretation “girl tickled boy”. Likewise, an object question such as “which boy is the girl tickling?” might be answered by pointing to the boy who tickles the girl in Figure 12.1. The better comprehension of subject/Agent-initial sentences as opposed to object/Patient-initial sentences is referred to as the subject-object asymmetry. It has been suggested that individuals who encounter problems in using the morphosyntactic cues provided by word order and free or bound functional elements to determine who is Agent and who is Patient/Theme in a given sentence resort to heuristic interpretations of the sentence.
Figure 12.1 Typical three-person scenario as tested in a picture-pointing task (Wimmer, 2010).
In languages such as English that display SVO word order, a heuristic interpretation might assume the first NP in a sentence to be the Agent of the action (Grodzinsky, 2000). Other strategies involve factors such as animacy (e.g. an animate entity is more likely to be the Agent of an event than an inanimate entity) or plausibility (e.g. dog biting man vs. man biting dog) to derive the most likely interpretation of a sentence. A syntactic deficit in language comprehension can only be identified with materials where the correct interpretation of the sentence has to be based on the parsing of syntactic structure and morphosyntactic cues. This is not always the case in standardized test materials used for diagnostic purposes. Thus, for instance, a number of items in the Test for the Reception of Grammar (TROG, Bishop, 1989), which is available in a number of languages, might be solved with the help of content keywords only, without the structure necessarily being parsed syntactically. Moreover, performance in comprehension tasks is also influenced by factors not directly related to an individual’s ability to parse the morphosyntactic information provided. Thus, choosing the matching picture out of a display of several pictures (for instance, four pictures in the TROG) taxes working memory resources, as the given sentence has to be kept in verbal short-term memory until it has been parsed, all pictures have been inspected and a decision as to the target picture has been made. In addition, choosing one picture out of a display involves executive functions. Thus, problems in sentence-picture matching tasks might not only result from syntactic deficits but might be due to limitations in other cognitive components such as working memory or executive functions (Penke & Wimmer, 2020). An intriguing finding with respect to the abovementioned symptoms of syntactic disorders in language production and/or comprehension (such as omissions of free functional elements, problems with bound inflectional morphology, reduction of syntactic complexity and problems in interpreting sentences with non-canonical argument order) is that they occur in a wide variety of developmental and acquired deficit syndromes. One challenge for research on syntactic deficits is to find out whether syndrome-specific disorder phenotypes can be identified within these vulnerable syntactic domains. As an example, consider deficits of subject-verb agreement inflection, which are typically observed both in German children with DLD and in children with a sensorineural hearing impairment. While German children with DLD display a broad deficit with respect to all verbal agreement affixes, the deficit in children with hearing impairment affects only those inflectional markers whose perception is affected by the hearing impairment (Chapter 15.4.6, Penke & Rothweiler, 2018). The identification of such syndrome-specific phenotypes not only allows for a more detailed characterization of the observed syntactic deficits, but is also important for identifying the underlying causes of the observed problems, thereby providing indications for effective therapeutic interventions specifically targeting these deficits.
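Before turning to theoretical accounts, the Agent-first strategy described above can be made concrete with a minimal sketch. The following Python toy (ours, not an implemented model from the literature; the item representations are deliberately simplified) shows why the strategy yields seemingly intact comprehension for canonical actives but systematic role reversals for passives:

```python
# Toy rendering of the Agent-first heuristic (Grodzinsky, 2000): the
# first NP of the sentence is taken to be the Agent, whatever the
# available morphosyntactic cues.
def agent_first(nps_in_order):
    return {"Agent": nps_in_order[0], "Patient": nps_in_order[1]}

# (sentence, NPs in order of occurrence, correct role assignment)
items = [
    ("the girl tickles the boy",       ["girl", "boy"],
     {"Agent": "girl", "Patient": "boy"}),
    ("the girl is tickled by the boy", ["girl", "boy"],
     {"Agent": "boy", "Patient": "girl"}),
]
for sentence, nps, target in items:
    verdict = "correct" if agent_first(nps) == target else "role reversal"
    print(f"{sentence}: {verdict}")
# the girl tickles the boy: correct
# the girl is tickled by the boy: role reversal
```

The heuristic thus produces exactly the subject-object asymmetry described above: performance looks unimpaired whenever the first NP happens to be the Agent.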
12.3 Accounts of Syntactic Deficits

Proposals accounting for syntactic deficits can be broadly divided into two opposing views: syntax-specific deficit accounts and processing accounts. Syntax-specific approaches assume a deficit that affects either (i) the build-up of syntactic phrase structure, (ii) the specification and interpretation of specific syntactic features or (iii) specific movement operations and structural dependencies (see Penke, 2015). In contrast, in processing-deficit accounts syntactic deficits do not result from an impairment of the syntactic system per se. Instead, limitations of processing capacities or resources are assumed to be the chief cause of the above-described symptoms in language production and comprehension. In the following, we will briefly sketch the gist of the different accounts, exemplifying how they deal with one particular sentence type that has been found to be affected by syntactic deficits in
language production as well as in language comprehension across a variety of language disorders, namely wh-questions (see Appendix 12.1 for an overview of syntactic deficit accounts in different clinical disorders).
12.3.1 Syntax-specific Deficit Accounts

Most syntactic deficit theories have been formulated within the framework of Generative Grammar, specifically within the theories of Government and Binding (GB, Chomsky, 1981) and the Minimalist Program (MP, Chomsky, 1995). In both frameworks, lexical categories (e.g. nouns, verbs) and functional categories (i.e. free-standing grammatical elements like determiners and complementizers as well as bound functional morphemes like verbal inflections) combine hierarchically to form phrases and sentences. A sentence consists of several structural shells. The verbal shell (called D(eep)-Structure in GB) encodes the argument structure associated with the described event, along with the lexical and grammatical properties of the arguments involved and their thematic relations (e.g. Agent, Patient/Theme) in the event. An inflectional shell takes care of the inflection of the sentence constituents, for example case inflection or subject-verb agreement inflection. Finally, the so-called CP shell (CP short for complementizer phrase) is involved in word order and the word order variation associated with different sentence types, such as subordinate clauses, sentences with topicalized objects or wh-questions. In GB, the inflectional shell and the CP shell build the so-called S(urface)-Structure, which encodes surface properties such as agreement, tense, and case inflection as well as word order. S-Structure is derived by movement of elements out of the underlying D-Structure into projections of functional heads such as I(NFL, for inflection) or C(OMP, for complementizer) that subserve inflection and word order variation. Consider as exemplification the syntactic tree associated with a short German wh-question such as “wen kitzelt der Junge?” (= who is the boy tickling?) (Figure 12.2). In the D-Structure, that is, the verbal phrase (VP), the verb describing the action (“kitzeln” (= tickle)) occupies the head of the VP, the Agent of the action (“the boy”) is situated in the specifier position (Spec) of the VP, and the Theme/Patient of the action is lexicalized by a wh-pronoun in the complement position of the verbal head. To derive S-Structure, the Agent moves to the specifier position of the inflection phrase IP, where it is marked as subject by receiving nominative
Figure 12.2 Syntactic tree of the wh-question “Wen kitzelt der Junge?” (= who is the boy tickling).
case inflection. The verb moves to the functional head I(NFL) to enter into an agreement relation with the person and number specifications of the subject, expressed by verbal agreement markers. Each moved constituent leaves behind a trace (t). The moved constituent and its trace are connected via a syntactic chain, indicated by an index. If a noun phrase moves out of the VP, its thematic role (Agent, Theme/Patient), which is assigned by the verb within the VP, is transmitted via this syntactic chain. German is a so-called verb-second language: in main clauses the finite verb occupies the second structural position, the C(OMP) position in the tree. Thus, the finite verb (i.e. the verb that agrees with the subject) has to move from the functional head I(NFL) to the functional head C(OMP). Verb-second word order then comes about by movement of another sentence constituent, here the wh-pronoun, to the specifier position of the CP (so-called wh-movement). Now, how can the structures and processes involved in deriving or parsing a sentence such as the wh-question in Figure 12.2 be affected in syntactic deficits?
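For readers who find tree diagrams hard to keep track of, the chain mechanism just described can also be spelled out in a short sketch. The following Python toy (our own encoding of the S-Structure in Figure 12.2 as nested tuples; the chain indices correspond to the indices in the tree, and nothing here is an established parsing algorithm) collects, for each index, the moved element together with its traces:

```python
# Toy encoding (ours) of the S-Structure in Figure 12.2 for
# "Wen kitzelt der Junge?". Terminals are (form, chain_index) pairs;
# "t" marks a trace left behind by movement.
TREE = ("CP",
        ("wen", "j"),                      # wh-object in Spec,CP
        ("C'",
         ("kitzelt", "v"),                 # finite verb moved to C(OMP)
         ("IP",
          ("der Junge", "i"),              # subject in Spec,IP (nominative)
          ("I'",
           ("t", "v"),                     # verb trace in I(NFL)
           ("VP",
            ("t", "i"),                    # subject trace: Agent role assigned here
            ("V'",
             ("t", "v"),                   # verb trace in its base position
             ("t", "j")))))))              # object trace: Theme role assigned here

def collect_chains(node, chains=None):
    """Group each moved element with its traces via the shared index."""
    if chains is None:
        chains = {}
    if len(node) == 2 and all(isinstance(part, str) for part in node):
        form, index = node
        chains.setdefault(index, []).append(form)
        return chains
    for child in node[1:]:                 # node[0] is the category label
        collect_chains(child, chains)
    return chains

print(collect_chains(TREE))
# {'j': ['wen', 't'], 'v': ['kitzelt', 't', 't'], 'i': ['der Junge', 't']}
```

The index j, for instance, links the moved wh-pronoun to the VP-internal object position in which its Theme role is assigned; it is precisely such links that the deficit accounts discussed below assume to be disrupted.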
12.3.1.1 Deficits in Structure Building

According to one group of syntactic-deficit approaches, it is the build-up of syntactic structures that is impaired in syntactic deficits. Most prominent among these approaches is the Tree-Pruning Hypothesis (Friedmann & Grodzinsky, 1997). According to this theory, the syntactic tree is pruned at the functional layers: the more severe the deficit, the fewer functional projections can be projected. The theory assumes that all functional categories below the respective cutting site are unaffected by the tree pruning, whereas all functional projections above it can no longer be projected (Figure 12.3). Although several functional categories can be affected by the pruning of the syntactic tree, the deficit minimally leads to a pruning of the CP-layer. For the German example presented in Figure 12.2, a pruning of the CP-layer would predict that the landing sites for the wh-pronoun and the finite verb within the CP can no longer be projected. In consequence, movement of the wh-phrase to the specifier position of CP and movement of the finite verb to the head of CP will no longer be possible. Note that the Tree-Pruning Hypothesis is a theory of syntactic deficits in language production. Affected individuals should no longer be able to produce wh-questions with a wh-element in the specifier position of the CP. Rather, the wh-pronoun should remain in situ in the VP. Also, movement of the finite verb from I(NFL) to C(OMP) should be impossible, resulting in incorrect productions such as *“Junge wen kitzelt?”. An inability to project the CP-layer would also affect the production of subordinate clauses that are
Figure 12.3 Pruned syntactic tree of the wh-question “Wen kitzelt der Junge?” (= who is the boy tickling).
introduced by a complementizer in C(OMP) (e.g. “I think, C[that IP[the boy is VP[tickling a girl]]]”), or the production of relative clauses and clauses with topicalized sentence constituents where a sentence element occupies the specifier position of the CP. The Tree-Pruning Hypothesis was originally proposed for agrammatic language production in individuals with Broca’s aphasia (Friedmann & Grodzinsky, 1997). However, research since then has identified the CP-layer as particularly vulnerable to syntactic deficits of diverse origin. Thus, deficits affecting the projection of the CP-layer have also been suggested to account for deficits observed in individuals with hearing loss (Szterman & Friedmann, 2014) or children with DLD (Hamann, 2006), and they might also account for the deficits observed in individuals with Down syndrome (Wimmer et al., 2020). Deficit theories such as the Tree-Pruning Hypothesis, which assume that functional layers of the syntactic tree can no longer be projected, can account for the observation that complex syntactic structures targeting the CP-layer, such as wh-questions or subordinate clauses, are often missing or incorrectly produced in the language production of affected individuals. However, experiments that have elicited the production of the critical sentence structures, such as wh-questions, have shown that affected individuals typically do not completely lack the ability to produce such sentences, but succeed in producing, for instance, correct wh-questions in at least some of the tested cases (aphasia: Neuhaus & Penke, 2008; DLD: Rothweiler et al., 2012; Down syndrome: Wimmer et al., 2020). Thus, the claim that the CP-layer can no longer be projected by the syntactic system of affected individuals is clearly too strong. This observation highlights a characteristic challenge for syntactic deficit theories: the challenge of reconciling the variable performance of affected individuals, who sometimes fail and sometimes succeed in producing critical structures, with the all-or-none flavor of the assumed deficit accounts, which claim specific syntactic structures or functions to be either retained or impaired (Penke, 2015). Moreover, tree-pruning hypotheses are challenged to accommodate developments in syntactic theory that affect the postulated types and numbers of functional categories as well as their hierarchical ordering within the syntactic tree. Thus, while in later developments of GB the I(NFL)-layer was split up into two independent functional projections, a Tense Phrase (TP) and an Agreement Phrase (AGRP), allowing for selective deficits of tense inflection that spare agreement inflection, in the Minimalist Program tense and agreement features are again subsumed under a single node (Chomsky, 2000). Also, the postulated ordering of functional projections (CP>TP>AGRP) is subject to cross-language variation. Thus, for German the ordering CP>AGRP>TP has been suggested, a difference that has consequences for the deficits expected under tree pruning (Penke, 2000).
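The all-or-none character of the pruning claim can be stated very compactly. A minimal sketch (ours; it reduces the tree to three ordered layers and ignores the TP/AGRP refinements just mentioned) encodes the prediction that every operation targeting a projection at or above the cutting site must fail:

```python
# Toy formulation (ours) of the Tree-Pruning Hypothesis: projections
# from the cutting site upward are unavailable, everything below survives.
LAYERS = ["VP", "IP", "CP"]                     # ordered bottom-up

def available(cutting_site):
    return LAYERS[:LAYERS.index(cutting_site)]  # strictly below the cut

def operation_possible(target_layer, cutting_site):
    return target_layer in available(cutting_site)

# With pruning at CP, wh-movement and verb-second are blocked,
# while subject-verb agreement in I(NFL) should remain intact:
for operation, target in [("wh-movement to Spec,CP", "CP"),
                          ("verb movement to C(OMP)", "CP"),
                          ("subject-verb agreement in I(NFL)", "IP")]:
    status = "possible" if operation_possible(target, "CP") else "blocked"
    print(f"{operation}: {status}")
```

The elicitation results cited above, in which affected individuals produce at least some correct wh-questions, are hard to reconcile with precisely this categorical switch.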
12.3.1.2 Deficits with the Feature Specification of Functional Heads

A second group of syntactic-deficit approaches assumes that the syntactic features hosted by functional categories are underspecified, whereas the build-up of syntactic structure, including the CP-layer, is intact. With the advent of the Minimalist Program (MP, Chomsky, 1995), syntactic features started to play a central role in the derivation of syntactic structure. In the MP, functional heads dominate bundles of syntactic features that have to be checked against the information provided by elements retrieved from the lexicon. Feature checking can only occur within the checking domain of the functional head hosting the respective syntactic feature and eliminates the features dominated by this functional head from the syntactic representation. The MP draws a distinction between different types of features. Strong features force overt movement, that is, feature checking occurs prior to articulation (so-called spell-out). Weak features are checked after spell-out; hence their movement into the checking domain is covert, that is, not visible on the surface. Another distinction is drawn between interpretable and uninterpretable features. Interpretable features can be interpreted at the logical form level, that is, they are relevant to the semantic
interpretation of an expression. Uninterpretable features, in contrast, cannot be interpreted at this level and need to be checked off before the logical form level, lest the derivation crash at this level. Syntactic deficit accounts have especially targeted the tense and agreement features of verbs to account for the frequent omission or substitution of verbal affixes and the production of root infinitives in individuals with language disorders. According to the Extended Optional Infinitive Hypothesis, for instance, children with DLD exhibit a protracted period in development in which they optionally leave the tense feature of the functional category I(NFL) underspecified (Rice & Wexler, 1996), leading to the production of root infinitives, that is, utterances that contain only a non-finite verb in base-generated VP-internal position (Table 12.1b,c). In contrast to the underspecified I(NFL) node, all other functional categories, features or syntactic operations are assumed to be intact. The production of root infinitives is not only a key symptom in the language production of children with DLD, but also characterizes the language production of individuals with Broca’s aphasia and individuals with Down syndrome. Consequently, research has targeted whether the underspecification of tense and/or agreement features can account for similar deficits in these syndromes. Whereas this approach has not proved fruitful in accounting for the deficits in individuals with Down syndrome, which encompass inflectional markers other than verbal tense and agreement markers (Eadie et al., 2002; O’Neill & Henry, 2002), related accounts have been proposed for Broca’s aphasia (Burchert et al., 2005; Wenzlaff & Clahsen, 2004). For instance, Burchert and colleagues have suggested that in agrammatic syntactic representations the tense and/or agreement features of the verb are underspecified, leading to omissions and substitutions of verbal affixes. While underspecification accounts capture the omission and substitution of inflectional markers as well as the production of root infinitives that are observed in many acquired and developmental language disorders, they also make predictions that are not always borne out by the data. Thus, in German, the underspecification of tense and/or agreement features should result in frequent productions of incorrect wh-questions with clause-final non-finite verbs (e.g. *“wen der Junge kitzeln?”). However, in a task eliciting the production of wh-questions from German individuals with Broca’s aphasia (Neuhaus & Penke, 2008), only five of the 378 elicited wh-questions were root infinitives introduced by a wh-element. Moreover, deficit accounts that see the basis of syntactic deficits in particular syntactic features are challenged by the cross-language variation that has been observed within particular disorder syndromes. Consider as an example the assumption that DLD is characterized by a deficit affecting the uninterpretable agreement features of verbs, resulting in impaired agreement between subject and verb in affected German children (Clahsen et al., 1997). In contrast, tense marking, and hence the interpretable tense feature, is assumed to be unaffected.
While this proposal captures observations on German children with DLD well, a number of studies have found tense inflection to be severely affected in English-speaking children with DLD (Rice & Wexler, 1996; van der Lely & Ullman, 2001), challenging the assumption of a selective deficit affecting only uninterpretable agreement features in DLD. Note that the classification of a feature as interpretable or uninterpretable holds across languages. Hence, similar deficits with tense or agreement inflection should be observed across languages, and such cross-language variation within a specific disorder syndrome has to be accounted for.
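How underspecification is meant to derive root infinitives can be illustrated with a similarly crude sketch (ours; the feature names and the spell-out rule are simplifications of the Extended Optional Infinitive idea as applied to German):

```python
# Toy spell-out rule (ours): a verb is realized as finite and moves to
# second position only if I(NFL)'s tense and agreement features are both
# specified; otherwise the default non-finite form surfaces clause-finally,
# i.e. a root infinitive (cf. Table 12.1b).
def spell_out(infl_features):
    if infl_features.get("tense") and infl_features.get("agreement"):
        return {"form": "ziehst ... an", "position": "V2"}
    return {"form": "anziehn", "position": "clause-final"}

print(spell_out({"tense": "present", "agreement": "2sg"}))
# {'form': 'ziehst ... an', 'position': 'V2'}   -> "du ziehst die Jacke an"
print(spell_out({"tense": None, "agreement": "2sg"}))   # underspecified tense
# {'form': 'anziehn', 'position': 'clause-final'}  -> "da die Jacke anziehn"
```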
12.3.1.3 Deficits with Syntactic Movement and Structural Dependencies

Sentences with a non-canonical ordering of constituents in which sentence constituents move from their base position to a higher landing site in the syntactic tree (as is the case in passive sentences, object clefts or wh-object questions) are often problematic in language disorders. It has been suggested that such difficulties might be due to deficits with specific syntactic movement operations and structural dependencies between sentence elements. For instance,
the Trace Deletion Hypothesis and its later refinements (Grodzinsky, 2000) assume that traces of moved noun phrases are deleted from syntactic representations. Consider again a wh-object question as in Figure 12.4a. As pointed out in Section 12.3.1, the wh-object phrase is moved out of its position inside the VP to the left sentence periphery, leaving behind a trace (t_j in Figures 12.2 and 12.4a). Also, the subject NP is moved from its VP-internal position. According to the theory, both these traces would be deleted. Consequently, neither NP can receive its thematic role, as the respective syntactic chains are disrupted (Figure 12.4a). Therefore, when the comprehension of such questions is tested, the individual cannot syntactically determine who is Agent and who is Theme/Patient of the event. Instead, it is assumed that the individual makes use of a heuristic default strategy whereby the Agent role is assigned to the first NP of the sentence (the wh-object phrase in Figure 12.4a). When hearing the sentence, an individual with a trace-deletion deficit will thus interpret a wh-object question as a subject question and will consequently point to the Agent of the action in a picture-pointing scenario (as described in Section 12.2.2). For wh-subject questions, however, the Trace Deletion Hypothesis predicts unimpaired comprehension, as the subject NP receives via the default strategy the very same Agent role that it would have received via syntactic assignment by the verb. The Trace Deletion Hypothesis has been invoked to capture the subject-object asymmetry observed in the comprehension of wh-questions and other sentence types (subject vs. object relative clauses) in language disorders such as Broca’s aphasia (Grodzinsky, 2000) or Wernicke’s aphasia (Grodzinsky & Finkel, 1998) as well as in children with DLD (Friedmann & Novogrodsky, 2004).
Figure 12.4 Trace Deletion Hypothesis (a) and Relativized Minimality approach (b, c) in individuals with language disorders and absent morphosyntactic features.
More recently, problems in comprehending sentences with non-canonical word order have been accounted for within the Relativized Minimality approach to locality in syntax (Rizzi, 2013). According to this framework, local syntactic chain formation is blocked in a structural configuration with the elements X – Z – Y when there is an intervener (Z) between the trace (Y) and the landing site of a moved element (X), and X and the intervener Z coincide with respect to their morphosyntactic features (e.g. +NP, +wh, gender, number) (Garraffa & Grillo, 2008). In a wh-object question (Figure 12.4b) the subject NP (“das Mädchen” (= the girl)) constitutes a potential intervener for the wh-object phrase (“welchen Jungen” (= which boy)), which moves over this NP to sentence-initial position. However, in this example the moved wh-phrase (+NP, +wh) and the intervening subject NP (+NP) differ with respect to their morphosyntactic features. Hence, no intervention effect occurs for neurotypical speakers, and the chain between the moved element and its trace can be constructed. In individuals with language disorders, the morphosyntactic features of the moved element X and the intervener Z might be absent from the syntactic representation, resulting in problems computing the chain between the trace (Y) and the landing site of the moved constituent in the presence of a potential intervener (Garraffa & Grillo, 2008). Due to the underspecified morphosyntactic features of the moved element X and the intervener Z, an intervention effect occurs and the formation of a chain between the moved element X and its trace Y is blocked (Figure 12.4b). In this case, the production or comprehension of an object question should fail. In contrast, in wh-subject questions (Figure 12.4c) there is no potential intervener, and hence the comprehension of these kinds of wh-movement configurations should be spared even when morphosyntactic features are lacking in the syntactic representation. Thus, like the Trace Deletion Hypothesis, the theory accounts for the subject–object asymmetry typically observed in language disorders. Although originally proposed to account for syntactic deficits in individuals with Broca’s aphasia (Garraffa & Grillo, 2008), the account has since been adopted for individuals with hearing impairment (Friedmann & Szterman, 2011), with DLD (Friedmann & Novogrodsky, 2011), or with disorders on the autism spectrum (Durrleman et al., 2016). While both theories account for the subject–object asymmetry in comprehension, both are challenged by the large variability of performance patterns that has been observed across individuals with the same deficit syndrome. For instance, individuals with Broca’s aphasia do not only show problems comprehending object questions; they might also have problems understanding subject questions. They might show a reverse pattern of better comprehension of object questions than subject questions, or their problems might show up only in a specific type of wh-question, namely which- but not who-questions, since the former require the additional processing step of linking the which-phrase to one of the potential referents given in the context (Thompson et al., 1999).
Our own studies on the comprehension of wh-questions in German-speaking individuals with Broca’s aphasia (Neuhaus & Penke, 2008), hearing impairment (Wimmer et al., 2017) or Down syndrome (Wimmer & Penke, 2020) confirm a huge variability of performance patterns between affected individuals with the same syndrome. Although there is agreement in the field that non-canonical sentences in which the object has moved over the subject are in general more prone to difficulties in comprehension and production than sentences with canonical subject-object order, capturing the observed individual variability of performance patterns (including divergent patterns) within a generalizing syntactic deficit account remains a challenge.
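The feature-matching logic of Relativized Minimality condenses into a few lines. In the sketch below (our own toy rendering, with feature sets reduced to +NP and +wh), a chain between the moved element X and its trace Y is blocked whenever an intervener Z carries every morphosyntactic feature represented on X:

```python
# Toy version (ours) of the Relativized Minimality configuration X - Z - Y:
# the chain X...Y fails if the intervener Z matches X on all of X's
# represented morphosyntactic features.
def chain_blocked(x_features, z_features):
    return x_features.issubset(z_features)

# Neurotypical representation of "welchen Jungen kitzelt das Mädchen?":
# the moved object is {+NP, +wh}, the intervening subject only {+NP}.
print(chain_blocked({"NP", "wh"}, {"NP"}))   # False: no intervention

# Impoverished representation (features absent from the syntactic
# representation): both elements reduce to {+NP}, so the subject now
# counts as an intervener and chain formation is blocked.
print(chain_blocked({"NP"}, {"NP"}))         # True: intervention effect
```

In a wh-subject question there is no element between the moved subject and its trace, so no intervention can arise regardless of how impoverished the feature specifications are; this is how the account derives the subject-object asymmetry.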
12.3.2 Processing-deficit Accounts

In contrast to accounts assuming that syntactic representations are lacking or deficient in individuals with language disorders, processing-deficit accounts claim that the symptoms observed in syntactic comprehension and production impairments are due to limitations that affect the processing of morphosyntactic information. For instance, there might be too
170 Martina Penke and Eva Wimmer little workspace or energy available to compute a given linguistic construction, that is, processing capacity is limited, the processing speed might be too slow to compute syntactic structures or structural relationships in a given temporal window, or the results of a syntactic computation might decay before the processing of the sentence can be terminated (Leonard et al., 2007). Such limitations may be domain-general, that is, they affect non-verbal cognitive operations or components that are also involved in language processing, thereby causing language deficits besides other cognitive deficits. Especially, limitations in working memory (including verbal short-term memory) have come into the focus of research on syntactic deficits (Adams et al., 2018) and have been claimed to be at the base of syntactic deficits of individuals with Broca’s (Caplan et al., 2007) or Wernicke’s aphasia (Kolk & Friederici, 1985), children with DLD (Leonard, 2014 for overview), Down syndrome (Laws & Gunn, 2004; Penke & Wimmer, 2020 for overview), or hearing impairment (Tuller & Delage, 2014; Penke & Wimmer, 2018 for overview). But how do memory limitations relate to syntactic impairments couched in the framework of generative syntax? Correct syntactic parsing of a wh-question (as in Figure 12.2) requires that the moved wh-phrase (and its features) has to be kept in memory until its trace (e.g. the object trace of the moved wh-phrase wen) is encountered in its base position in the VP and its thematic role can be identified and processed via establishing a syntactic chain. A person suffering from a limitation of verbal short-term memory might not be able to keep the verbal material active long enough in memory to complete these computations, affecting whmovement. The comprehension of wh-object questions might hence be more affected than the comprehension of wh-subject questions because the distance between moved element and trace is longer in the former compared to the latter question type (compare Figure 12.4a,b vs. 12.4c), resulting in higher processing costs. This allows for the prediction that comprehension of wh-questions might rather be dependent on the distance between moved element and gap than on a specific question type and the presence of a structural intervener. The distance between gap and moved element can be varied, for instance by an embedded sentence (“Wenj, sagst Du, kitzelt das Mädchen tj?” (= who, are you saying, is the girl tickling?)) or by other enriching material like adjectives (“Wenj kitzelt das langhaarige, freche Mädchen tj?” (= who is the long-haired, naughty girl tickling?)). If comprehension of wh-questions were dependent on the distance between moved element and gap, questions including a longer distance should be harder to comprehend than questions with a shorter distance. Indeed, such memory effects have, for instance, been observed for children with DLD (Deevy & Leonard, 2004). While these findings indicate an influence of memory components on sentence comprehension, they do, however, not rule out the influence of question type (whsubject or wh-object) on comprehension performance. There can be no doubt that working memory (including verbal short-term memory) is involved in sentence comprehension as the auditory signal is encountered serially over time and has to be kept in memory systems until it is parsed and the meaning of the sentence can be derived. 
However, whether impaired memory components are the chief cause of a particular language disorder or rather a comorbid symptom aggravating the deficit is a matter of lively debate. Crucially, research on the relation between deficits in memory components and syntactic deficits is impeded by methodological challenges. Thus, well-established measures of memory components make use of verbal material (e.g., the repetition of (non)words or numbers), making it difficult to measure memory capacities independently of language skills (Marshall, 2020). Also, tasks such as sentence-picture matching, which are typically used to investigate syntactic comprehension, are particularly taxing for memory components, as the individual has to keep the presented sentence in the verbal short-term store until the sentence has been parsed, the presented pictures have been visually analysed, sentence meaning and scene interpretations have been compared, and a decision as to the correct picture has been made. Thus, a close relation between sentence comprehension
deficits and deficits in memory components comes as no surprise if tested in this way. Future research is challenged to devise tasks that allow the contribution of memory components to be disentangled from problems rooted in the processing of particular syntactic features, structures, or operations. Processing limitations may also be more specific to the domain of language, and here specifically to morphosyntactic processing. Thus, it has been proposed, for instance, that the search for the correctly inflected form in the mental lexicon might be too costly, resulting in the activation of unmarked and easier-to-access forms such as stem forms or the infinitive (Lapointe, 1985). Garraffa and Grillo (2008) have suggested that limitations in processing capacity that lead to a slowed-down activation of morphosyntactic information and structure-building, or to a faster decay of the processed information, result in the underspecification of morphosyntactic features in the syntactic tree, thus causing intervention effects in sentences where the object precedes the subject (Section 12.3.1.3). Processing limitations are likely to force the syntactic system to make economic use of the available processing resources. This might affect structure-building, resulting in trees that lack higher functional projections because the available resources limit the number of operations that merge the material at hand into phrases (Hagiwara, 1995; Jakubowicz, 2011). Economy considerations might also result in the avoidance of movement operations, which are considered costly for the processing system (Bastiaanse & van Zonneveld, 2005; Jakubowicz, 2011). One advantage of the recourse to syntactic theory is that it permits establishing a complexity hierarchy of syntactic structures and operations associated with increasing processing demands, which allows for precise predictions as to which syntactic structures or operations should be more affected than others if processing capacities are limited. A general advantage of deficit accounts that locate the problems of language-impaired speakers in limited processing capacities is that they can deal with (i) the gradeability of language deficits with respect to the severity of the disorder, (ii) the within-group variation in language behavior between different individuals suffering from the same deficit syndrome, and (iii) the variable performance within a single individual due, for instance, to task demands or physiological factors affecting the individual’s performance during testing. With growing or decreasing processing costs and processing capacities, the ability to produce or parse syntactic structures ameliorates or deteriorates. Processing-deficit accounts are also able to account for performance that is neither completely impaired nor completely intact. All these observations are difficult to accommodate in deficit accounts which claim that syntactic representations are either intact or impaired. A challenge to processing-deficit accounts is the open issue of whether syntactic deficits might be specific to a particular deficit syndrome. Processing-deficit accounts see the costs of syntactic computations as lying at the basis of syntactic deficits.
These costs hold across deficit syndromes and, indeed, the observation that similar areas of difficulty have been observed across syndromes (see Section 12.2) has been taken as evidence that syntactic deficits are due to processing limitations that target critical (complex and costly) areas in the language system independent of the disorder syndrome. Whether this claim holds is, however, open to investigation.
12.4 Treatment of Syntactic Disorders

A central goal of research on language disorders is to provide a basis for effective therapeutic interventions that target the underlying deficits in the respective syndromes. Probably one of the best-evaluated treatment approaches targeting syntactic deficits is the Treatment of Underlying Forms (TUF), developed to treat problems with the production and comprehension of sentences with noncanonical word order in individuals with Broca’s aphasia (Thompson & Shapiro, 2005). The training involves improving knowledge about the verb
and its thematic role information, using simple declarative sentences as a starting point, and subsequently seeks to make the process of moving arguments transparent to the patient, usually with supporting visual material. Thus, a treatment of wh-questions such as “Wen kitzelt der Junge?” (Figure 12.2) starts with identifying the thematic roles of the arguments involved in the underlying declarative clause (“the boy tickles the girl”); then the wh-object pronoun (“wen” (= who[obj])) is introduced, followed by a replacement of the object noun phrase (“the girl”) with this pronoun (“the boy tickles who[obj]”) and by subsequent steps illustrating wh-movement, that is, moving the wh-pronoun to sentence-initial position (Stadie et al., 2008). A critical issue in evaluating treatment approaches concerns the extent to which generalization effects occur. Does the treatment generalize to untrained sentences of the same syntactic structure? Do treatment effects generalize to a different syntactic structure of the same movement type (e.g. object topicalization vs. wh-object question) or to a structure involving a different type of movement (e.g. wh-movement in wh-questions vs. NP-movement in passives)? Are there generalization effects across language modalities (comprehension vs. production)? While generalization effects to untrained material of the same syntactic type have been observed across modalities for TUF (Stadie et al., 2008; Thompson & Shapiro, 2005), an interesting finding of this approach is that treatment should target more complex structures first, since positive treatment effects on more complex structures (e.g. wh-object questions) extend to simpler structures (e.g. wh-subject questions), whereas the reverse does not hold. Building on this finding, which turns the clinical practice of training simpler structures first upside down, Friedmann et al. (2000) proposed a training concept targeting syntactic tree pruning that starts with training the CP-layer. In a single-case study of an individual with Broca’s aphasia, they showed that this training generalized to lower syntactic nodes (tense and agreement nodes) not targeted in the treatment. Intervention programs based on TUF or tree pruning have since been adapted to other disorders, such as hearing impairment (D’Ortenzio et al., 2020) or DLD (Levy & Friedmann, 2009). Recently, treatment studies have also indicated that a program training working memory not only enhanced working memory capacity but also improved syntactic skills in children with DLD (Delage et al., 2021) and children with autism spectrum disorder (Delage et al., 2022). However promising these results are, more research employing randomized controlled trials is required to adequately measure the efficacy of these treatment programs for syntactic impairments.
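The stepwise logic of TUF can be summarized schematically. The sketch below (ours; it abstracts away from verb movement, from English auxiliary inversion, and from the visual supports used in actual therapy sessions) walks through the three derivation steps described above:

```python
# Schematic rendering (ours) of the TUF derivation steps for a
# wh-object question, following the sequence described above.
steps = []
clause = ["the boy", "tickles", "the girl"]
steps.append(("underlying declarative", list(clause)))

clause[2] = "who[obj]"                # introduce the wh-object pronoun
steps.append(("object NP replaced by wh-pronoun", list(clause)))

clause = [clause[2]] + clause[:2]     # wh-movement to sentence-initial position
steps.append(("wh-movement applied", list(clause)))

for label, words in steps:
    print(f"{label}: {' '.join(words)}")
# underlying declarative: the boy tickles the girl
# object NP replaced by wh-pronoun: the boy tickles who[obj]
# wh-movement applied: who[obj] the boy tickles
```

Note that the sketch models only the movement of the wh-pronoun; in the German example of Figure 12.2 the finite verb additionally moves to second position.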
12.5 Conclusion

Having trained students and future language pathologists for some decades now, we know about the intimidating nature of syntactic tree structures, syntactic theories, and syntactic deficit approaches that draw on theoretical syntactic concepts. However, syntactic deficits are a key symptom of language disorders and occur in many developmental and acquired language disorders, including aphasia, DLD, Down syndrome, autism spectrum disorder, and hearing impairment. Thus, language therapists can be certain to encounter individuals with such deficits in clinical practice. Also, as exemplified by the utterances in Table 12.1, syntactic deficits are a considerable impediment to academic achievement, social relationships and participation in society. Hence, it is imperative that syntactic deficits be addressed in language therapy. Moreover, therapy targeting syntactic structures has been shown to be effective, both in adults (Thompson & Shapiro, 2005) and in children with developmental language disorders (Zwitserlood et al., 2015). Thus, we conclude this chapter by expressing our hope that we have provided a comprehensible overview of syntactic deficits, offering some assistance in further exploring this intriguing topic in language disorders.
APPENDIX

Appendix 12.1 Syntactic disorders discussed in different clinical language disorders and reference list.
REFERENCES

Adams, E. J., Nguyen, A. T., & Cowan, N. (2018). Theories of working memory. Language, Speech, and Hearing Services in Schools, 49(3), 340–355.
Bastiaanse, R., & van Zonneveld, R. (2005). Sentence production with verbs of alternating transitivity in agrammatic Broca’s aphasia. Journal of Neurolinguistics, 18(1), 57–66.
Bishop, D. V. M. (1989). TROG: Test of reception of grammar. University of Manchester.
Burchert, F., Swoboda-Moll, M., & Bleser, R. de (2005). Tense and agreement dissociations in German agrammatic speakers. Brain and Language, 94(2), 188–199.
Caplan, D., Waters, G., Dede, G., Michaud, J., & Reddy, A. (2007). A study of syntactic processing in aphasia (I). Brain and Language, 101(2), 103–150.
Chomsky, N. (1957). Syntactic structures. Mouton.
Chomsky, N. (1981). Lectures on government and binding. Foris.
Chomsky, N. (1995). The Minimalist Program. MIT Press.
Chomsky, N. (2000). Minimalist inquiries. In R. Martin, D. Michaels, & J. Uriagereka (Eds.), Step by step (pp. 89–155). MIT Press.
Clahsen, H., Bartke, S., & Göllner, S. (1997). Formal features in impaired grammars. Journal of Neurolinguistics, 10(2–3), 151–171.
D’Ortenzio, S., Montino, S., Martini, A., Trevisi, P., & Volpato, F. (2020). A syntactically based treatment of relative clauses. In V. Torrens (Ed.), Typical and impaired processing in morphosyntax (Vol. 64, pp. 177–207). John Benjamins.
D’Ortenzio, S., & Volpato, F. (2020). How do Italian-speaking children handle wh-questions? A comparison between children with hearing loss and children with normal hearing. Clinical Linguistics & Phonetics, 34(4), 407–429.
Deevy, P., & Leonard, L. B. (2004). The comprehension of wh-questions in children with Specific Language Impairment. Journal of Speech, Language, and Hearing Research, 47(4), 802–815.
Delage, H., Eigsti, I.-M., Stanford, E., & Durrleman, S. (2022). A preliminary examination of the impact of working memory training on syntax and processing speed in children with ASD. Journal of Autism and Developmental Disorders, 52(10), 4233–4251.
Delage, H., Stanford, E., & Durrleman, S. (2021). Working memory training enhances complex syntax in children with Developmental Language Disorder. Applied Psycholinguistics, 42(5), 1341–1375.
Durrleman, S., Marinis, T., & Franck, J. (2016). Syntactic complexity in the comprehension of wh-questions and relative clauses in typical language development and autism. Applied Psycholinguistics, 37(6), 1501–1527.
Eadie, P. A., Fey, M. E., Douglas, J. M., & Parsons, C. L. (2002). Profiles of grammatical morphology and sentence imitation in children with Specific Language Impairment and Down Syndrome. Journal of Speech, Language, and Hearing Research, 45(4), 720–732.
Eisenbeiss, S. (2010). Production methods in language acquisition research. In E. Blom & S. Unsworth (Eds.), Language learning & language teaching (Vol. 27, pp. 11–34). John Benjamins.
Friedmann, N. (2002). Question production in agrammatism: The tree pruning hypothesis. Brain and Language, 80(2), 160–187.
Friedmann, N., & Grodzinsky, Y. (1997). Tense and agreement in agrammatic production: Pruning the syntactic tree. Brain and Language, 56(3), 397–425.
Friedmann, N., & Novogrodsky, R. (2004). The acquisition of relative clause comprehension in Hebrew: A study of SLI and normal development. Journal of Child Language, 31(3), 661–681. https://doi.org/10.1017/s0305000904006269
Friedmann, N., & Novogrodsky, R. (2011). Which questions are most difficult to understand? Lingua, 121(3), 367–382.
Friedmann, N., & Szterman, R. (2011). The comprehension and production of wh-questions in deaf and hard-of-hearing children. Journal of Deaf Studies and Deaf Education, 16(2), 212–235.
Friedmann, N., Wenkert-Olenik, D., & Gil, M. (2000). From theory to practice: Treatment of agrammatic production in Hebrew based on the Tree Pruning Hypothesis. Journal of Neurolinguistics, 13(4), 250–254.
Fyndanis, V., Varlokosta, S., & Tsapkini, K. (2012). Agrammatic production: Interpretable features and selective impairment in verb inflection. Lingua, 122(10), 1134–1147.
Garraffa, M., & Grillo, N. (2008). Canonicity effects as grammatical phenomena. Journal of Neurolinguistics, 21(2), 177–197.
Grodzinsky, Y. (1995). A restrictive theory of agrammatic comprehension. Brain and Language, 50(1), 27–51.
Grodzinsky, Y. (2000). The neurology of syntax. The Behavioral and Brain Sciences, 23(1), 1–21.
Grodzinsky, Y., & Finkel, L. (1998). The neurology of empty categories. Journal of Cognitive Neuroscience, 10(2), 281–292.
Hagiwara, H. (1995). The breakdown of functional categories and the economy of derivation. Brain and Language, 50(1), 92–116.
Hamann, C. (2006). Speculations about early syntax. Catalan Journal of Linguistics, 5(1), 143–189.
Hamann, C., Penner, Z., & Lindner, K. (1998). German impaired grammar: The clause structure revisited. Language Acquisition, 7(2–4), 193–245.
Jakubowicz, C. (2011). Measuring derivational complexity. Lingua, 121(3), 339–351.
Kolk, H., & Friederici, A. D. (1985). Strategy and impairment in sentence understanding by Broca’s and Wernicke’s aphasics. Cortex, 21(1), 47–67.
Kolk, H., & Heeschen, C. (1992). Agrammatism, paragrammatism and the management of language. Language and Cognitive Processes, 7(2), 89–129.
Lapointe, S. G. (1985). A theory of verb form use in the speech of agrammatic aphasics. Brain and Language, 24(1), 100–155.
Laws, G., & Gunn, D. (2004). Phonological memory as a predictor of language comprehension in Down syndrome. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 45(2), 326–337.
Leonard, L. B. (2014). Children with specific language impairment and their contribution to the study of language development. Journal of Child Language, 41(Suppl 1), 38–47.
Leonard, L. B., Ellis Weismer, S., Miller, C. A., Francis, D. J., Tomblin, J. B., & Kail, R. V. (2007). Speed of processing, working memory, and language impairment in children. Journal of Speech, Language, and Hearing Research, 50(2), 408–428.
Levy, H., & Friedmann, N. (2009). Treatment of syntactic movement in syntactic SLI. First Language, 29(1), 15–49.
Marshall, C. (2020). Investigating the relationship between syntactic and short-term/working memory impairments in children with developmental disorders is not a straightforward endeavour. First Language, 40(4), 491–499.
Nanousi, V., Masterson, J., Druks, J., & Atkinson, M. (2006). Interpretable vs. uninterpretable features: Evidence from six Greek-speaking agrammatic patients. Journal of Neurolinguistics, 19(3), 209–238.
Neuhaus, E., & Penke, M. (2008). Production and comprehension of wh-questions in German Broca’s aphasia. Journal of Neurolinguistics, 21(2), 150–176.
O’Neill, M., & Henry, A. (2002). The grammatical morpheme difficulty in Down’s syndrome. Belfast Working Papers in Language and Linguistics, 64–73.
Penke, M. (2000). Unpruned trees in German Broca’s aphasia. Behavioral and Brain Sciences, 23(1), 46–47.
Penke, M. (2015). Syntax and language disorders. In T. Kiss & A. Alexiadou (Eds.), Syntax – Theory and analysis (pp. 1833–1874). De Gruyter Mouton.
Penke, M., & Rothweiler, M. (2018). Comparing Specific Language Impairment and hearing impairment. Language Acquisition, 25(1), 39–57.
Penke, M., & Wimmer, E. (2018). Deficits in comprehending wh-questions in children with hearing loss. Clinical Linguistics & Phonetics, 32(3), 267–284.
Penke, M., & Wimmer, E. (2020). Verbal short-term memory and sentence comprehension in German children and adolescents with Down syndrome. First Language, 40(4), 367–389.
Prévost, P., Tuller, L., Barthez, M. A., Malvy, J., & Bonnet-Brilhault, F. (2017). Production and comprehension of French wh-questions by children with autism spectrum disorder: A comparative study with specific language impairment. Applied Psycholinguistics, 38(5), 1095–1131.
Rice, M. L., & Wexler, K. (1996). Toward tense as a clinical marker of specific language impairment in English-speaking children. Journal of Speech and Hearing Research, 39(6), 1239–1257.
Ring, M., & Clahsen, H. (2005). Distinct patterns of language impairment in Down’s syndrome and Williams syndrome: The case of syntactic chains. Journal of Neurolinguistics, 18(6), 479–501.
Rizzi, L. (2013). Locality. Lingua, 130, 169–186.
Rothweiler, M., Chilla, S., & Clahsen, H. (2012). Subject–verb agreement in Specific Language Impairment. Bilingualism: Language and Cognition, 15(1), 39–57.
Ruigendijk, E., & Friedmann, N. (2017). A deficit in movement-derived sentences in German-speaking hearing-impaired children. Frontiers in Psychology, 8, Article 689, 1–22.
Stadie, N., Schröder, A., Postler, J., Lorenz, A., Swoboda-Moll, M., Burchert, F., & Bleser, R. de (2008). Unambiguous generalization effects after treatment of non-canonical sentence production in German agrammatism. Brain and Language, 104(3), 211–229.
Szterman, R., & Friedmann, N. (2014). Relative clause reading in hearing impairment. Frontiers in Psychology, 5, Article 1229, 1–16.
Thompson, C. K., & Shapiro, L. P. (2005). Treating agrammatic aphasia within a linguistic framework: Treatment of Underlying Forms. Aphasiology, 19(10–11), 1021–1036.
Thompson, C. K., Tait, M. E., Ballard, K. J., & Fix, S. C. (1999). Agrammatic aphasic subjects’ comprehension of subject and object extracted wh questions. Brain and Language, 67(3), 169–187.
Tsakiridou, M. (2006). The linguistic profile of Down’s syndrome subjects: Evidence from wh-movement construction. SOAS Working Papers in Linguistics, 14, 227–248.
Tuller, L., & Delage, H. (2014). Mild-to-moderate hearing loss and language impairment. Lingua, 139, 80–101.
van der Lely, H. K. J., Jones, M., & Marshall, C. R. (2011). Who did Buzz see someone? Grammaticality judgement of wh-questions in typically developing children and children with Grammatical-SLI. Lingua, 121(3), 408–422.
van der Lely, H. K., & Ullman, M. T. (2001). Past tense morphology in specifically language impaired and normally developing children. Language and Cognitive Processes, 16(2–3), 177–217.
Wenzlaff, M., & Clahsen, H. (2004). Tense and agreement in German agrammatism. Brain and Language, 89(1), 57–68. Wexler, K., Schütze, C. T., & Rice, M. (1998). Subject case in children with SLI and unaffected controls: Evidence for the Agr/Tns Omission Model. Language Acquisition, 7(2–4), 317–344. Wimmer, E. (2010). Die syntaktischen Fähigkeiten von Wernicke-Aphasikern – eine experimentelle Studie [Doctoral dissertation]. Heinrich-HeineUniversity Düsseldorf. Wimmer, E., & Penke, M. (2020). The comprehension of wh-questions and passives in German children and adolescents with Down syndrome. In V. Torrens (Ed.), Typical and impaired processing in morphosyntax (Vol. 64, pp. 279–301). John Benjamins. Wimmer, E., Rothweiler, M., & Penke, M. (2017). Acquisition of who-question comprehension in German children with hearing loss. Journal of Communication Disorders, 67, 35–48. Wimmer, E., Witecy, B., & Penke, M. (2020). Syntactic problems in German individuals with Down Syndrome. In P. Guijarro-Fuentes & C. Suárez-Gómez (Eds.), New trends in language acquisition within the generative perspective (pp. 141–163). Springer. Zwitserlood, R., Wijnen, F., van Weerdenburg, M., & Verhoeven, L. (2015). “MetaTaal”: Enhancing complex syntax in children with SLI. International Journal of Language & Communication Disorders, 50(3), 273–297.
13 Formulaic Sequences and Language Disorders
ALISON WRAY
13.1 Preamble
The term “formulaic sequence” refers to a range of types of multiword string that are, or at least seem to be, stored and accessed in a fully or partly prefabricated form (Wray, 2002). They encompass idioms and proverbs, pre-memorized texts including songs, rhymes and prayers, common phrases like see you later and well I never, filler phrases like you know, I mean and I guess, pro-forms like thingamajig and what d’ya call it, and any personal turns of phrase that individuals happen to use repeatedly. Some are completely fixed, for example, high time; if I were you, and most others relatively so, like pull someone’s leg; the first X of the Y. Many have a nonliteral meaning or connotation beyond the component words, for example, on paper; wet blanket. Some have a non-canonical form, so they are not the most obvious way to express the idea, for example, in any event; woe betide. This chapter explores how (some types of) formulaic sequences manifest in three language disorders: aphasia, Alzheimer’s disease and autism spectrum disorder (ASD), though many of the observations may apply to other disorders as well. It considers which areas of the brain have been associated with the production and comprehension of formulaic sequences. There is then a discussion of testing and measuring methods, which draws attention to the challenges of capturing an accurate picture of the amount of formulaicity in someone’s language. The final section asks why formulaic sequences might play the roles that they do in disordered language. There is no question that certain types of formulaic sequence characterize a variety of language disorders, whether through their disproportionate presence (e.g. in ASD with intellectual impairment, Alzheimer’s, aphasia) or relative absence (e.g. in ASD without intellectual impairment, Parkinson’s, right hemisphere damage). But the picture is complicated. First, much more is known about what is produced than what is understood, since it is more difficult to capture and measure the latter. Second, our attention is easily drawn to the formulaic sequences that a person can produce, particularly if they are not generating much novel language. But the presence of, often, quite a small set can eclipse the fact that the vast majority of the formulaic sequences in the general language repertoire are not being produced. Third, there is no consensus on how to identify formulaic sequences (Wray, 2009), so different linguistic material might be captured in different studies.
To understand why certain formulaic sequences are, or are not, being used, careful attention must be paid to what they are really for. In particular, insofar as they are a flexible resource for repairing problems in communication and social interaction (Wray, 2002, 2008, 2020, 2021), they are likely to respond dynamically to changes in a person’s capabilities (see final section, below). Furthermore, since people’s capabilities are a joint product of their preexisting behaviors, the nature of the disorder and their response to it, we would anticipate that the range and number of formulaic sequences used will be different even in people with similar language disorders. Indeed, although we obviously look for indications that specific areas of brain damage, or atypical development, are associated with the retention or loss of formulaic sequences, it is not always easy to distinguish pathological uses from neurotypical ones (Sidtis, 2022). Individual differences play a role at every level, from the precise nature and trajectory of someone’s damage and the underlying brain on which it acts, to personality and context-determined levels of resilience (Wray, 2020). Moreover, what a person is able to say once they have an acquired language disorder depends on what they previously knew and used. This information is typically not available at a granular level, leaving us to make assumptions about what someone’s active and passive repertoire is likely to have been. At a broad-brush level, such generalizations are unproblematic. For example, Gerstenberg and Hamilton’s (2022) finding that “narrative crystals” occur in the discourse of older, unimpaired speakers helps us better understand the role they play in Alzheimer’s discourse (e.g. Davis & Maclagan, 2013). Similarly, the recognized function of formulaic sequences in signaling social identity (Wray, 2002) and managing processing pressure (Wray, 2017) in unimpaired speakers gives us insight into both what their purpose might be in language impairment, and the potential impact on communication if they are not available for use. Nevertheless, individual variation makes it hard to gauge what has changed in response to an acquired disorder. These caveats are the necessary context for the accounts that follow. Understanding how formulaic sequences resist or succumb to developmental and acquired language disorders can help us not only map how the brain manages the components of language but also build better models of language as a complex phenomenon. It could also offer ideas for how to best support effective communication in those with language disorders (Wray, 2020, 2021).
13.2 Formulaic Sequences in Aphasia
Aphasia is typically associated with damage to the perisylvian areas in the left hemisphere of the brain. Within that, non-fluent aphasia is linked with damage to the inferior frontal gyrus and fluent aphasia with damage to the posterior section of the superior temporal gyrus. As outlined later, formulaic sequences are, themselves, associated with particular areas of the brain which are not necessarily damaged in people with aphasia. Their availability is most obvious in non-fluent aphasia, where they are striking instances of fluent expression. They are harder to detect in fluent aphasia, but certainly may also occur there (Wray, 2002). Accounts of islands of complex language in non-fluent aphasia date back several centuries (see reviews by Benton & Joynt, 1960; Van Lancker Sidtis, 2004). An individual only otherwise capable of yes and no might retain:
● deliberately memorized material such as prayers, chants, Bible verses, nursery rhymes, songs;
● lists such as the numbers to ten, the alphabet, the days of the week;
● common expressions including greetings and swearing;
● little phrases that typically open or punctuate sentences, such as I don’t know; now wait a minute.
Idiosyncratic expressions are also common (Critchley, 1970), including repeated nonsense strings (Code, 1982, 1994). The material within the items cannot be used creatively, only reproduced verbatim. Van Lancker Sidtis (2012, p. 68) noted that “it is dramatic to observe persons with aphasia produce these expressions with normal articulation and prosody in the context of severe deficits in production of propositional language.” It is reasonable to assume that the multiword strings retained in aphasia are retrieved whole from memory, so that their intact form is achieved without any temporary restoration of the impaired linguistic abilities. However, access to prefabricated phrases within the lexical store is not a sufficient explanation on its own: something is determining why this handful of internally complex lexical items is available when most single words and the majority of formulaic sequences are not. One possibility is that certain formulaic sequences are particularly memorable, whether because they are frequently used or have a format that supports recall, such as a tune. Kasdan and Kiran (2018) found that people with non-fluent aphasia were as good as unimpaired controls when completing phrases from well-known songs, but only when the phrases were sung, rather than spoken. Songs, prayers, and rhymes aside, function is the preferred explanation for why certain formulaic sequences are retained in aphasia – that is, the surviving ones play a role in supporting interaction and the needs of the speaker (e.g. greeting, thanking). A function-based account does not in itself predict multiword strings, just the retention of the ability to express key interactional messages. However, since complete messages tend to be more than one word long, it follows that there will be a disproportionate retention of multiword strings. Yet the functional role of an expression in aphasia may not always be the same as in typical usage, with pro-forms or idiosyncratic “fillers” often carrying a range of alternative meanings (see Wray, 2002, for a review). And sometimes there may appear to be no intentional semantic content in formulaic sequences at all – one reason why Hughlings Jackson (1874/1958) termed them “non-propositional” (but see Wray, 2002, for problems with this term). As outlined below, tests may underestimate linguistic ability in aphasia and not pick up on improvements in communication over time (Edwards & Knott, 1994). However, research directly examining conversational exchanges (McElduff & Drummond, 1991; Oelschlaeger & Damico, 1998) confirmed what carers frequently report – that formulaic sequences can be effectively employed to achieve significant communicative functions. Furthermore, one study (Stahl et al., 2020) has demonstrated that phonological features within a person’s retained formulaic language can be used as a facilitator for producing novel words.
13.3 Formulaic Sequences in Alzheimer’s Disease
Classic symptoms of language in Alzheimer’s disease (AD) include difficulties with reference, resulting in pronouns without antecedents, paraphrasis, repetition, and empty phrases (Davis & Bernstein, 2005; see also Orange, 2001). Bridges and Van Lancker Sidtis (2013) found in their data that formulaic sequences were significantly more common in people with AD than healthy controls. This, they suggest, is because formulaic sequences are generated in brain areas that are spared until the very late stages of the disease. As discussed above in relation to aphasia, there may also be social and functional reasons why formulaic sequences are found useful and effective when spontaneous language generation is difficult.
Notwithstanding the wide range of formulaic sequences potentially available, the ones typically observed in AD play roles in sustaining fluency (e.g. fillers like you know) (Davis & Maclagan, 2010) and substituting for inaccessible content words (e.g. pro-forms like those things and like that):

And uh, oh, he … he does … uh. He does a lot of going around and see that the stores are stocked with … have what they have to. And you know. And that, that sort of thing (Bridges & Van Lancker Sidtis, 2013, p. 806)

You see the thing is sometimes you get a bit when you do these things (Wray, 2010, p. 519)
Another characteristic of language in AD is the repeated use of favorite expressions, which may be personal to that individual or institutionalized. For example, nuns who “could barely articulate a sentence … managed to answer the priest with appropriate responses” (Snowdon, 2001, p. 22). Despite the ubiquity of formulaic sequences in their output, people with AD are not necessarily good at interpreting formulaic material produced by others. In tests, Kempler et al. (1988) found that people with AD did not reliably choose pictures that depicted the figurative meanings of expressions, though it must be noted that in unimpaired individuals, similar misconstruals may occur, if there is insufficient contextual information to apply pragmatics appropriately (Wray, 2020, 2021). This disparity between production and comprehension is not as mysterious as it might seem. In production, formulaic sequences are “language autopilot” (Hamilton, 2019, p. 202) and it might be difficult to manage input as automatically. Meanwhile, the production of formulaic responses can obscure the level of the speaker’s more general comprehension and engagement. In the extract below, from Davis and Bernstein (2005, p. 75), formulaic sequences appear to be maintaining the exchange, even at the expense of the truth. LW’s replies are plausible but, as subsequently revealed, untrue, since he had in fact got a bad cold.

BD: How have you been – feeling okay?
LW: Yeah. I’m improving right along.
BD: That’s great
LW: Sure is
BD: Of the people I come here to see, a lot of them have colds and you don’t – you look well
LW: It’s my iron will.
All in all, it would be mistaken to claim that formulaic sequences are universally spared in AD. Some are evidently available and readily used, but they are generally deployed in stereotypical ways that don’t exploit their full semantic potential. On the other hand, they achieve functional and social goals, in sustaining the underlying patterns of interaction, holding the turn and/or buying planning time (Davis & Bernstein, 2005).
13.4 Formulaic Sequences in Autism Spectrum Disorder (ASD)
Patterns of formulaic sequences in ASD cannot be understood without reference to two considerations. The first is where on the spectrum a person’s challenges lie. In their review of research into language in ASD, Luyster et al. (2022) mapped a continuum of patterns, from nongenerative (echolalia, self-repetition), through transitional (mitigated echolalia, formulaic/gestalt language), to generative (pedantic and idiosyncratic language).
Specifically, while formulaic sequences are often overused by verbal people with ASD and intellectual impairment, they can be underused in those without intellectual impairment (Van Lancker Sidtis, 2012). This observation probably reflects the communicative and/or processing capabilities, preferences and strategies that are available at different points on the spectrum. For instance, pre-memorized phrases, sentences and even entire schemas and scripts may compensate for inherent challenges with language and social interaction (Ullman & Pullman, 2015). Dobbinson et al. (2003, p. 305) speculated that “the social deficit in autism may be the root cause of … inflexibility [in formula usage].” Conversely, Prizant (1983) suggested that difficulties coping with generative language might account for the avoidance of social situations by a person with ASD. In ASD with intellectual impairment, formulaic language occurs in the context of a general stereotypy in behavior based around “routines and rituals always to be carried out in precisely the same way” (Paul, 2004, p. 117). Typically, an affected person will have a specific way of opening a conversation, and may have a routinized script for continuing it, covering the same topics in the same order and using the same words (Prizant, 1983). Because of the likelihood that linguistic behavior in ASD is a manifestation of a broader tendency to behave formulaically, formulaic language in ASD needs to be seen in a wider linguistic context. For instance, Dobbinson et al. (2003) identified voice quality and tone as formulaic markers of discourse functions in ASD. The second consideration is the type(s) of formulaic sequence that feature or fail to feature. In severe ASD, echolalia (the direct or modified repetition of phrases) is observed. Dobbinson et al. (2003, p. 305) referred to a “continuum of productivity-formulaicity rather than a repertoire in which items are either distinctly formulaic or available for productive usage.” At the extreme end of fixedness, echoes will feature “pronominal reversal”, that is, usually, use of the second person to refer to self. But less fixed, or “mitigated” echolalia can accommodate the pronoun change, so that the question Do you want a drink? elicits Do I want a drink? (Roberts, 1989). Echolalia may be a means of managing excessive cognitive demand in language processing (Rydell & Mirenda, 1991) or in navigating tasks: “Typical comments of those who interact with echolalic autistic children include ‘he tells himself what to do,’ ‘he learns language through repeating,’ and ‘echoing helps him to understand’” (Prizant & Duchan, 1981, p. 242). McEvoy et al. (1988) found that as language comprehension improved, echolalia decreased. Learning does not occur in all echolalic individuals, however, and Prizant and Duchan (1981) recommended different approaches according to whether echolalia is a permanent state or likely to be transitory. In ASD at the milder end of the spectrum, stock phrases are correctly but perhaps excessively used, with any subtle social functions associated with them not always fully achieved (Boucher & Anns, 2018). As for figurative expressions such as idioms and proverbs, a meta-analysis by Morsanyi and Stamenković (2021) found that participants with ASD generally performed less well than typically developing children. This could be because metaphorical meanings require theory of mind capabilities.
For a recent detailed overview of language in ASD, see Luyster et al. (2022).
13.5 Brain Areas Associated with Formulaic Sequences
As discussed above, different disorders are associated with different types of, and facility with, formulaic sequences, which could help establish which areas of the brain manage their production and comprehension. Formulaic sequences won’t be available if the brain area(s) responsible for processing them are not functioning. Meanwhile, if brain areas managing other aspects of language processing are damaged but those associated with formulaic language are not, that could explain not only why they are spared but why they are brought into greater use.
The right hemisphere and, more recently, the right basal ganglia have been identified as the most likely candidates for the management of formulaic sequences. Links between the right hemisphere and at least some types of formulaic sequence date back many centuries. Benton and Joynt (1960) cited a case from 1683 of a woman who, after suffering a stroke in the left hemisphere, had no spontaneous spoken language but could still recite prayers and Biblical verses, provided she followed a particular sequence. Hughlings Jackson (1874/1958) proposed that “non-propositional” expressions like thank God, retained after left hemisphere damage, must be stored holistically in the right hemisphere, complete with their intonation contour and pragmatic color, since they were not amenable to modification. However, it may not be that simple. Our current, more fine-grained understanding of how the brain works generally points to distributed processing across different areas. Furthermore, a now sizeable literature indicates that the right hemisphere’s contribution to language processing relates to relevance, inference, prosody, and pragmatics, including humor and the comprehension of idioms and metaphors (Blake, 2021). Since formulaic sequences often have an indirect, holistic meaning that relies on context for a correct interpretation, it’s easy to see how the right hemisphere could use them to support communication when the left hemisphere is damaged, and, conversely, how the comprehension and production of formulaic sequences might be undermined by right hemisphere damage. Another reason for caution is that while a simple, holistic storage arrangement, whether in the right hemisphere or not, might be argued for immutable word strings such as proverbs, fixed idioms, and memorized rhymes and prayers, it is less plausible for the many formulaic sequences that have to be modified in some way for appropriate use. That includes anything containing a finite verb, for example, I/you/she pull/are pulling/pulled his/your/my leg. The same applies to expressions that require the insertion of variable material, like [time] ago; not only … but also. In such cases, ensuring the expression is grammatically and semantically apposite would most likely involve other brain areas. The association of formulaicity with the subcortical right basal ganglia is a more recent discovery and is substantially based on the observation that formulaic sequence use is reduced or impaired in people with Parkinson’s disease (PD) (Lee & Van Lancker Sidtis, 2020). For example, participants with PD made more errors than controls when reciting well-known nursery rhymes, prayers and other memorized material (Bridges et al., 2013). This pattern contrasted with that for people with AD, where the basal ganglia are intact (Van Lancker Sidtis et al., 2015). The basal ganglia play a central role in facilitating and inhibiting motor movement, functions that are disturbed in PD. The link with formulaic sequences may relate to the basal ganglia’s role in procedural memory, that is, routine motor actions (Sidtis, 2022). An alternative possibility comes from Skipper et al. (2022). They offered evidence that repeated exposure to word strings shifts the processing out of “language” areas and into sensorimotor regions. Their findings led them to suggest that the apparent roles of the right hemisphere and right basal ganglia in formulaic language preservation might in fact be due to the preservation of the right sensorimotor region.
Kaltenböck (2020) distinguished the formulaic sequences performing discourse functions and linked to the right hemisphere (e.g. greetings, interjections) from those expressing content (e.g. a waste of time; straight away) or grammatical relations (e.g. be about to; each other), linked to the left. This separation was confirmed in his own data from people with left and right hemisphere damage. It also chimes with earlier proposals for a distributed lexicon (Wray, 2002) and a dual system for processing. Several such “dual systems” models have been proposed, including those of Bates et al. (2007), Kallens and Christiansen (2022), Sinclair (1991), Ullman (2001), Van Lancker Sidtis (2004) and Wray (1992). These conceptualizations envisage two distinct types of processing for linguistic material, though they vary in whether particular forms are permanently allocated to one type or the other, or whether all forms are subject to flexibility in how they are processed on a given occasion.
13.6 Issues with Testing and Measuring Methods
As noted earlier, identifying what is formulaic in disordered (or indeed any) language is not straightforward. The main reason is that formulaic language is ubiquitous and doesn’t always look very different from non-formulaic language. As a result, in regular interaction, where one cannot measure speed of production or brain activity, it can be difficult to distinguish what is and what isn’t formulaic. There are various potential solutions to this. One is tightening the definition of formulaic sequences to only those word strings that have particular easily identified characteristics, such as a non-canonical form (e.g. growed like Topsy), a constituent word that occurs nowhere else (e.g. get one’s dander up), a non-literal meaning (e.g. wear one’s heart on one’s sleeve), or sufficient usage to deserve a dictionary entry in their own right (e.g. the green-eyed monster). This solution can certainly improve the reliability of research designs, but of course it excludes the types of formulaic sequence that fall outside those specifications. Another approach is to use human judgement to decide what counts as formulaic (e.g. Bridges & Van Lancker Sidtis, 2013). This allows for more intuitive and contextualized analyses, but potentially suffers from the inconsistencies associated with such decision-making. At the other extreme is complete automation, such as looking for frequent word groups that match those in a reference corpus (e.g. Zimmerer et al., 2016). This method is good at finding material that is both formulaic and frequent but, depending on the nature of the algorithm, it might not easily identify formulaic sequences that are infrequent. While those approaches are used to examine speech or writing that has been previously collected, it is also possible to look at processing in real time. Two main methods are already much used for research into formulaic language, though relatively rarely in research on formulaicity in language disorders. In eye-tracking research (e.g. Carrol & Conklin, 2020), a participant’s eye movements are captured as they read stimuli. During reading, we do not attend to all the words (or parts of words) equally and so identifying where someone’s gaze stays (or returns to) can give clues about how the information is being processed. However, transferring this research method to the context of language disorders might not be straightforward. This is because two variables would simultaneously be of interest. In non-disordered processing, it is possible to focus purely on establishing what is and is not formulaic for the participant. However, in people with a language disorder, one is also interested in how they are processing formulaic language, since it might be different from how they processed it before. But to find this out, one would have to be certain about what is (or was) formulaic for them. Without pinning down one or the other, it could be difficult to interpret observations. A similar issue potentially applies in another method for observing processing – brain activity. In electroencephalographical (EEG) studies, receptors on the scalp are used to measure event-related potentials (ERPs), showing evidence of how and where the brain is engaged in a given test activity (see Siyanova-Chanturia & Van Lancker Sidtis, 2019 for an overview). ERP research has been used to measure the impact on language tasks of therapeutic interventions in aphasia (e.g.
Barbancho et al., 2015) and might be able to predict which individuals with Mild Cognitive Impairment will develop AD (e.g. Chapman et al., 2011; Taylor & Olichney, 2007). However, there seems to be no specific research yet that combines EEG measures with formulaic language in the context of language disorders. A further challenge arises in any clinical or laboratory context in which individuals are in some sense being “tested.” In research on AD, it has been noted that data from tests and data from real conversation are markedly different in kind (Bucks et al., 2000; Davis, 2005; Perkins et al., 1998) and similar observations have been made about aphasia (Edwards & Knott, 1994; Oelschlaeger & Damico, 1998). In fact, a nest of related hazards pervades formulaic language in clinical and non-clinical testing.
First, language demarcated for a testing purpose has its own pragmatic agenda: a proverb cited in a test does not carry the pragmatics of a proverb, but of a citation. Gathering information about naturalistic language in an unnatural situation relies on the testee’s ability to understand the pretence inherent in testing and to instate the researcher’s intended pragmatic script. For instance, providing the “correct” picture match for the idiom he paid an arm and a leg for it entails understanding that humorous worlds in which limb-bartering occurs are not relevant. Furthermore, participants must share the tester’s assumption that a non-literal interpretation is “better,” even though folk linguistic beliefs could classify the non-literal meaning, like slang, as less correct, and therefore less acceptable in a test. In short, the pragmatics of testing are complex and cannot safely be ignored, particularly when testing individuals who have a pragmatic impairment. Irrespective of any “suppression deficit,” deciding what should be suppressed is dependent on what you think is expected of you. Second, it should not be assumed that people with developmental pragmatic difficulties (e.g. ASD) will have acquired the holistic meaning of idioms, metaphors and proverbs, and if they have not, then they will not be able to access them in tests (Huber-Okrainec et al., 2005). Third, testing demands a focus on language that is rarely necessary or useful in normal communication. Actions and reactions that are usually effortless can become confusing and difficult when attended to, even perhaps because that attention prevents them from being achieved using the customary processing routes (Wray, 1992). Fourth, people who are self-conscious about their communication may find it especially difficult to perform well in tests and may have developed strategies that are not optimal for the intended measurements. A person with impaired grammatical ability, for instance, might find it preferable only to attend to recognizable lexical items, filtering out the rest of the detail. In real interaction, it could mean that the grammar-impaired person filters out most of Let’s get your shoes on, ’cos we’re going to the shops, to end up with *** shoes *** shops. They would then rely, perhaps quite successfully, on the meaning of those items, plus pragmatics, to extract a likely interpretation. But the same strategy, when faced with the test stimulus he paid an arm and a leg for it, will render *** arm *** leg ***. With such a minimal representation, which could underlie many different sentences, it would be safer to point at a picture featuring images of an arm and a leg, than one that does not. In fact, it can be argued that all of us, given only *** arm *** leg *** to work with, would not easily think of the idiom, because those lexical items are not salient within the meaning of the non-literal interpretation. It would be little different from giving someone ***sing and expecting them to come up with browsing, or ***pet*** and expecting them to think of competition. Thus, when a participant with a grammatical impairment selects the picture representing the literal meaning of an idiom, it is worth considering whether this necessarily means that the idiom has been interpreted literally, or perhaps only means that the form has been selectively attended to.
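Returning to the identification methods discussed at the start of this section, the corpus-frequency approach can be made concrete with a minimal Python sketch. This is not the pipeline of Zimmerer et al. (2016): the toy reference sentences, the trigram window and the frequency threshold are all invented for illustration.

```python
# Minimal sketch of frequency-based formulaic-sequence detection:
# flag word n-grams in a speech sample that occur at least `threshold`
# times in a reference corpus. Toy data; real pipelines use large
# reference corpora and more sophisticated statistics.
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-word sequences in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_reference(sentences, n=3):
    """Count n-gram frequencies over a reference corpus."""
    counts = Counter()
    for sentence in sentences:
        counts.update(ngrams(sentence.lower().split(), n))
    return counts

def flag_formulaic(sample, reference_counts, n=3, threshold=2):
    """Return the sample's n-grams that are frequent in the reference."""
    return [g for g in ngrams(sample.lower().split(), n)
            if reference_counts[g] >= threshold]

reference = build_reference([
    "and that sort of thing happens a lot",
    "it was that sort of thing again",
    "you know how it is",
    "well you know what I mean",
])
sample = "he does a lot of going around and that sort of thing you know"
print(flag_formulaic(sample, reference))  # ['that sort of', 'sort of thing']
```

As the text notes, a threshold-based method of this kind captures frequent strings (here the pro-form that sort of thing) but will miss formulaic sequences that are infrequent in the reference corpus, such as a speaker’s idiosyncratic turns of phrase.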
13.7 Towards a Deeper Understanding of Formulaic Sequences in Language Disorders
Making sense of why formulaic sequences are often so prominent in language disorders requires an understanding of what their purpose is in unimpaired language, and how different outset parameters (developmental disorders) or changes to previous capabilities (acquired disorders) affect what language is available and how that language needs to be used. In short, we need to ask what we use formulaic language for and how its use might change when something disrupts the customary processes of communication. Two key observations can be made in this regard.
The first relates to the role that formulaic sequences play in managing the general processing load in our language production. Wray (2017) proposed that they act as a regulatory mechanism, to sustain a relatively even level of processing effort. Most of us will have had the experience of trying to carry out a complex task while someone starts a conversation with us. We may well give prefabricated responses. To reply less automatically, we would have to break off from the other task because the combined processing demands are too high. If formulaic sequences play this role in reducing processing load in unimpaired speakers, it follows that in disorders impacting on processing speed or capacity, more formulaic language will be called upon. In other words, the greater the disruption to the equilibrium of language production, the more likely formulaic sequences are to step in and mitigate the impact (provided they are available). The second observation regards the relationship between formulaic sequences and the achievement of the core purpose of communication. According to Wray (2020), we communicate in order to get others to help us make desirable changes to our experiential world, when we cannot make those changes for ourselves. Formulaic sequences play a significant role in achieving this outcome. One reason is that they are familiar in form and conventionalized meaning, so that they are easy for the speaker to produce fluently and, if also in the hearer’s repertoire, for the hearer to recognize and link to a conventionalized purpose (Wray, 2002). For example, hearing mind your backs while in a crowded bar will trigger awareness that you need to avoid stepping backward while someone passes behind you, usually carrying something. Since the formula is familiar, you might respond so automatically that you barely pause in your conversation. Meanwhile, because formulaic sequences are often tied to particular speech communities, the speaker’s choices can signal familiarity and solidarity with, or distance from, the hearer, which might help achieve part of their aim in speaking (e.g. wanting the hearer not to perceive a request as presumptuous). In relation to both of the above observations, formulaic sequences are at the center of normal language, rather than at its periphery, and this helps make sense of their resilience in aphasia and AD, as well as their role in anchoring interaction and learning in ASD. Formulaic sequences operate at the interface of cognitive and social activity, making possible the production (and comprehension) of the language needed for managing interaction with others. For example, familiar expressions like greetings and platitudes (e.g. oh that’s nice) can help bridge the relational gap between speaker and hearer, by facilitating turn-taking and signaling attentiveness. The test here is to ask what the interaction of a person with AD or aphasia would be like if they did not have a few formulaic sequences to hand. Would it be more difficult to feel a communicative connection with them? It seems likely that it would. Another role of formulaic sequences in language disorders seems to be as a stand-in for what the speaker would really like to say but cannot. Even if the form is repetitive and the meaning inaccurate to the intention, an impaired speaker is able, at least, to draw from the mental lexicon something to say, with fewer processing demands than there would be for novel output.
Easily produced multiword expressions can be loaded with tone and intonation, acting as a carrier for emotive messages. Speaking of the French poet Baudelaire, who had a stroke shortly before his death, Starkie (1958, p. 512) said:

The only two words he could remember were “sacré nom” [sacred name (of God)] … With these two words, he who had loved and practised the art of conversation was obliged to express the whole gamut of his feelings and thoughts – joy, sorrow, anger and impatience.
Thus, one explanation for a person using only a small number of formulaic sequences is that they have lost the capability to access more. However, in less extreme cases, a person may use a limited range largely because their daily interactional circumstances have narrowed.
For example, if a person is rarely engaged in conversations about substantive topics, they could find they get little opportunity to use more than rather superficial greetings and comments. In such instances, their repertoire might be widened by using a more imaginative approach to interaction (e.g. Wray, 2021). If we adopt the position that formulaic sequences are important for keeping interaction going and for achieving more with less, then what effect will there be when, as with right hemisphere and/or basal ganglia damage, formulaic sequences are not easy to retrieve? To be deprived of formulaicity in language may be like knowing all the moves but no longer knowing the dance. Meanings would be made at the informational level, but without the nuance that derives from picking just the right expression for that particular hearer or occasion. As a result, while interactions would appear to be accurate, impaired speakers might often fail to annex the hearer’s agency in making desired changes to their experiential world. This problem, often referred to as a pragmatic deficit in right hemisphere damaged speakers, could therefore be closely tied to the difficulties of harnessing the formulaic language most effective for motivating the hearer to respond in the intended way. Even for those who conceptualize the purpose of communication differently and/or who see formulaic language in a different light, it is difficult to avoid recognizing that formulaic sequences are pervasive and central to human communication. This, in turn, requires that attention be paid to them, when trying to understand the patterns of language in a communication disorder. Research into the effects of their absence, and of their preservation, may have barely scratched the surface.
REFERENCES
Barbancho, M. A., Berthier, M. L., Navas-Sánchez, P., Dávila, G., Green-Heredia, C., García-Alberca, J. M., Ruiz-Cruces, R., López-González, M. V., Dawid-Milner, M. S., Pulvermüller, F., & Lara, J. P. (2015). Bilateral brain reorganization with memantine and constraint-induced aphasia therapy in chronic post-stroke aphasia: An ERP study. Brain and Language, 145–146, 1–10.
Bates, T. C., Castles, A., Luciano, M., Wright, M. J., Coltheart, M., & Martin, N. G. (2007). Genetic and environmental bases of reading and spelling: A unified genetic dual route model. Reading and Writing, 20(1–2), 147–171.
Benton, A. L., & Joynt, R. J. (1960). Early descriptions of aphasia. Archives of Neurology, 3(2), 205–222.
Blake, M. L. (2021). Communication deficits associated with right hemisphere brain damage. In J. S. Damico, N. Müller, & M. J. Ball (Eds.), The handbook of language and speech disorders (pp. 571–589). Wiley Blackwell.
Boucher, J., & Anns, S. (2018). Memory, learning and language in autism spectrum disorder. Autism and Developmental Language Impairments, 3, 1–13.
Bridges, K. A., & Van Lancker Sidtis, D. (2013). Formulaic language in Alzheimer’s disease. Aphasiology, 27(7), 799–810.
Bridges, K. A., Van Lancker Sidtis, D., & Sidtis, J. J. (2013). The role of subcortical structures in recited speech: Studies in Parkinson’s disease. Journal of Neurolinguistics, 26(6), 591–601.
Bucks, R. S., Singh, S., Cuerden, J. M., & Wilcock, G. K. (2000). Analysis of spontaneous, conversational speech in dementia of Alzheimer type: Evaluation of an objective technique for analysing lexical performance. Aphasiology, 14(1), 71–91.
Carrol, G., & Conklin, K. (2020). Is all formulaic language created equal? Unpacking the processing advantage for different types of formulaic sequences. Language and Speech, 63(1), 95–122.
Chapman, R. M., McCrary, J. W., Gardner, M. N., Sandoval, T. C., Guillily, M. D., Reilly, L. A., & DeGrush, E. (2011). Brain ERP components predict which individuals progress to Alzheimer’s disease and which do not. Neurobiology of Aging, 32(10), 1742–1755.
Code, C. (1982). Neurolinguistic analysis of recurrent utterance in aphasia. Cortex, 18(1), 141–152.
Code, C. (1994). Speech automatism production in aphasia. Journal of Neurolinguistics, 8(2), 135–148.
Critchley, M. (1970). Aphasiology and other aspects of language. Arnold.
Davis, B. (2005). Introduction: Some commonalities. In B. Davis (Ed.), Alzheimer talk, text and context: Enhancing communication (pp. xi–xxi). Palgrave Macmillan.
Davis, B. H., & Bernstein, C. (2005). Talking in the here and now: Reference and politeness in Alzheimer conversation. In B. H. Davis (Ed.), Alzheimer talk, text and context: Enhancing communication (pp. 60–86). Palgrave Macmillan.
Davis, B. H., & Maclagan, M. (2010). Formulaicity, pauses and fillers in Alzheimer’s discourse: Gluing relationships as impairment increases. In N. Amiridze, B. Davis, & M. Maclagan (Eds.), Fillers, pauses and placeholders (pp. 189–216). John Benjamins.
Davis, B. H., & Maclagan, M. (2013). “Aw, so how’s your day going?”: Ways that persons with dementia keep their conversational partner involved. In B. Davis & J. Guendouzi (Eds.), Pragmatics in dementia discourse (pp. 83–116). Cambridge Scholars Publishing.
Dobbinson, S., Perkins, M. R., & Boucher, J. (2003). The interactional significance of formulas in autistic language. Clinical Linguistics and Phonetics, 17(4), 299–307.
Edwards, S., & Knott, R. (1994). Assessing spontaneous language abilities of aphasic speakers. In D. Graddol & J. Swann (Eds.), Evaluating language (pp. 91–101). Multilingual Matters.
Gerstenberg, A., & Hamilton, H. E. (2022). Older adults’ conversations and the emergence of “narrative crystals”. Narrative Inquiry, 33(3), 27–60. https://doi.org/10.1075/ni.21075.ger
Hamilton, H. E. (2019). Language, dementia and meaning. Palgrave Macmillan.
Huber-Okrainec, J., Blaser, S. E., & Dennis, M. (2005). Idiom comprehension deficits in relation to corpus callosum agenesis and hypoplasia in children with spina bifida meningomyelocele. Brain and Language, 93(3), 349–368.
Hughlings Jackson, J. (1874/1958). On the nature of the duality of the brain. In J. Taylor (Ed.), Selected writings of John Hughlings Jackson (Vol. 2, pp. 129–145). Staples Press.
Kallens, P. C., & Christiansen, M. H. (2022). Models of language and multiword expressions. Frontiers in Artificial Intelligence, 5(781962), 1–14.
Kaltenböck, G. (2020). Formulaic language and discourse grammar. In A. Haselow & G. Kaltenböck (Eds.), Grammar and cognition: Dualistic models of language structure and language processing (pp. 233–265). John Benjamins.
Kasdan, A., & Kiran, S. (2018). Please don’t stop the music: Song completion in patients with aphasia. Journal of Communication Disorders, 75(1), 72–86.
Kempler, D., Van Lancker, D., & Read, S. (1988). Proverb and idiom comprehension in Alzheimer disease. Alzheimer Disease and Associated Disorders, 2(1), 38–49.
Lee, B., & Van Lancker Sidtis, D. (2020). Subcortical involvement in formulaic language: Studies on bilingual individuals with Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 63(12), 4029–4045.
Luyster, R. J., Zane, E., & Weil, L. W. (2022). Conventions for unconventional language: Revisiting a framework for spoken language features in autism. Autism and Developmental Language Impairments, 7, 1–9.
McElduff, K., & Drummond, S. S. (1991). Communicative functions of automatic speech in non-fluent aphasia. Aphasiology, 5(3), 265–278.
McEvoy, R. E., Loveland, K. A., & Landry, S. H. (1988). The functions of immediate echolalia in autistic children: A developmental perspective. Journal of Autism and Developmental Disorders, 18(4), 657–668.
Morsanyi, K., & Stamenković, D. (2021). Idiom and proverb processing in autism: A systematic review and meta-analysis. Journal of Cultural Cognitive Science, 5(3), 367–387.
Oelschlaeger, M., & Damico, J. S. (1998). Spontaneous verbal repetition: A social strategy in aphasic conversation. Aphasiology, 12(11), 971–988.
Orange, J. B. (2001). Family caregivers, communication, and Alzheimer’s disease. In M. L. Hummert & J. F. Nussbaum (Eds.), Aging, communication and health (pp. 225–248). Lawrence Erlbaum Associates.
Paul, R. (2004). Autism. In R. D. Kent (Ed.), The MIT encyclopedia of communication disorders (pp. 115–119). MIT Press.
Perkins, L., Whitworth, A., & Lesser, R. (1998). Conversing in dementia: A conversation analytic approach. Journal of Neurolinguistics, 11(1–2), 33–53.
Prizant, B. M. (1983). Language acquisition and communicative behavior in autism: Toward an understanding of the “whole” of it. Journal of Speech and Hearing Disorders, 48(3), 296–307.
Prizant, B. M., & Duchan, J. F. (1981). The functions of immediate echolalia in autistic children. Journal of Speech and Hearing Disorders, 46(3), 241–249.
Roberts, J. M. A. (1989). Echolalia and comprehension in autistic children. Journal of Autism and Developmental Disorders, 19(2), 271–281.
Rydell, P. J., & Mirenda, P. (1991). The effects of two levels of linguistic constraint on echolalia and generative language production in children with autism. Journal of Autism and Developmental Disorders, 21(2), 131–157.
Sidtis, D. (2022). Foundations of familiar language. Wiley Blackwell.
Sinclair, J. M. (1991). Corpus, concordance, collocation. Oxford University Press.
Siyanova-Chanturia, A., & Van Lancker Sidtis, D. (2019). What online processing tells us about formulaic language. In A. Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic language: A second language acquisition perspective (pp. 38–61). Routledge.
Skipper, J., Aliko, S., Brown, S., Jo, Y. J., Lo, S., Molimpakis, E., & Lametti, D. R. (2022). Reorganization of the neurobiology of language after sentence overlearning. Cerebral Cortex, 32(11), 2447–2468.
Snowdon, D. A. (2001). Aging with Grace. Fourth Estate.
Stahl, B., Gawron, B., Regenbrecht, F., Flöel, A., & Kotz, S. A. (2020). Formulaic language resources may help overcome difficulties in speech-motor planning after stroke. PLOS ONE, 16(6), e0233608.
Starkie, E. (1958). Baudelaire. New Directions.
Taylor, J. R., & Olichney, J. (2007). From amnesia to dementia: ERP studies of memory and language. Clinical EEG and Neuroscience, 38(1), 8–17.
Ullman, M. T. (2001). A neurocognitive perspective on language: The declarative/procedural model. Nature Reviews Neuroscience, 2(10), 717–726.
Ullman, M. T., & Pullman, M. Y. (2015). A compensatory role for declarative memory in neurodevelopmental disorders. Neuroscience and Biobehavioral Reviews, 51, 205–222.
Van Lancker Sidtis, D. (2004). When novel sentences spoken or heard for the first time in the history of the universe are not enough: Toward a dual-process model of language. International Journal of Language and Communication Disorders, 39(1), 1–44.
Van Lancker Sidtis, D. (2012). Formulaic language and language disorders. Annual Review of Applied Linguistics, 32, 62–80.
Van Lancker Sidtis, D., Choi, J., Alken, A., & Sidtis, J. J. (2015). Formulaic language in Parkinson’s disease and Alzheimer’s disease: Complementary effects of subcortical and cortical dysfunction. Journal of Speech, Language, and Hearing Research, 58(5), 1493–1507.
Wray, A. (1992). The focusing hypothesis: The theory of left hemisphere lateralised language re-examined. John Benjamins.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge University Press.
Wray, A. (2008). Formulaic language: Pushing the boundaries. Oxford University Press.
Wray, A. (2009). Identifying formulaic language: Persistent challenges and new opportunities. In R. Corrigan, E. Moravcsik, H. Ouali, & K. Wheatley (Eds.), Formulaic language: Vol 1: Structure, distribution and historical change (pp. 27–51). John Benjamins.
Wray, A. (2010). “We’ve had a wonderful, wonderful thing”: Formulaic interaction when an expert has dementia. Dementia, 9(4), 517–534.
Wray, A. (2017). Formulaic sequences as a regulatory mechanism for cognitive perturbations during the achievement of social goals. Topics in Cognitive Science, 9(3), 569–587.
Wray, A. (2020). The dynamics of dementia communication. Oxford University Press.
Wray, A. (2021). Why dementia makes communication difficult: A guide to better outcomes. Jessica Kingsley.
Zimmerer, V. C., Wibrow, M., & Varley, R. A. (2016). Formulaic language in people with probable Alzheimer’s disease: A frequency-based approach. Journal of Alzheimer’s Disease, 53(3), 1145–1160.
14 Syntactic Processing in Developmental and Acquired Language Disorders
THEODOROS MARINIS
14.1 Preamble
Research in developmental and acquired language disorders has traditionally investigated the language strengths and weaknesses of individuals, with the aim of identifying different profiles and patterns of performance and developing theories that can explain the nature of these disorders and inform the design of appropriate treatments. In Developmental Language Disorder (DLD), an important debate has been about whether language impairment is caused by incomplete linguistic knowledge/representation or by processing limitations. Similarly, in acquired language disorders a debate has revolved around whether the physiological cause of the disorder, for example, a lesion, has affected the language system directly or the processing mechanisms that enable language use. The present chapter addresses this issue by reviewing literature on syntactic processing in developmental and acquired language disorders, with a focus on Developmental Language Disorder (DLD) (previously called Specific Language Impairment, SLI) and aphasia. Within the last 10 years, the number of studies using online research methodologies has increased exponentially. As a result, there is more evidence about how individuals process sentences in real-time that can address the question of whether the language impairment is due to an impairment in linguistic representation or an impairment in processing mechanisms.
14.2 What is Syntactic Processing?
Healthy adult individuals who have a full-fledged language system can listen to and effortlessly comprehend what other people say. Similarly, trained readers can easily understand the sentences they read. This ease of comprehension conceals the number of processes involved, as well as the cognitive demands required when we comprehend sentences in real-time. For example, when we listen to a sentence such as The zebra was kissed
by the camel, we have to decode the sounds, segment and recognize words from the speech stream (the/zebra/was/kissed/…), assign syntactic categories to words (the = determiner, zebra = noun), combine words together into constituents (the zebra = Noun Phrase), assign thematic roles (e.g., the zebra = patient/theme, the camel = agent), and interpret the sentence. In sentences such as Which donkey is the zebra carrying?, which donkey has to be interpreted as the object of the verb carrying. Therefore, a listener has to keep the wh-phrase which donkey in working memory until they encounter the verb carrying, at which point they can interpret it as its object. These examples illustrate that to be able to interpret a sentence we have to rapidly process and integrate different types of information (lexical/semantic, structural, discourse/pragmatic, etc.), and we have to store and retrieve information from working memory and build up the grammatical structure of the sentence. Research in syntactic processing or parsing investigates the mental processes involved when we comprehend sentences in real-time and the way different types of information are utilized to build up the grammatical structure of the sentence and, thus, lead to sentence interpretation.
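The sequence of operations just listed can be caricatured in code. The following sketch illustrates only the logical steps (category assignment, constituent building, thematic-role assignment); it is in no sense a model of human parsing, and the mini-lexicon, the category labels and the passive-detection rule are all invented for this example.

```python
# Toy walk-through of the comprehension steps described above: assign
# syntactic categories, build noun-phrase constituents, then assign
# thematic roles, with passive morphology reversing the agent-first
# default. Mini-lexicon and rules are invented for illustration only.
LEXICON = {
    "the": "DET", "zebra": "N", "camel": "N",
    "was": "AUX", "kissed": "V_PART", "kissing": "V_ING", "by": "P",
}

def comprehend(sentence):
    tagged = [(w, LEXICON[w]) for w in sentence.lower().split()]

    # Constituent building: combine DET + N into noun phrases.
    nps, i = [], 0
    while i < len(tagged):
        if tagged[i][1] == "DET" and i + 1 < len(tagged) and tagged[i + 1][1] == "N":
            nps.append(f"{tagged[i][0]} {tagged[i + 1][0]}")
            i += 2
        else:
            i += 1

    # Thematic-role assignment: "was" + participle + "by" signals a
    # passive, so the first NP is the patient rather than the agent.
    cats = [c for _, c in tagged]
    if "AUX" in cats and "V_PART" in cats and "P" in cats:
        return {"agent": nps[1], "patient": nps[0]}
    return {"agent": nps[0], "patient": nps[1]}

print(comprehend("The zebra was kissed by the camel"))
# {'agent': 'the camel', 'patient': 'the zebra'}
print(comprehend("The zebra was kissing the camel"))
# {'agent': 'the zebra', 'patient': 'the camel'}
```

Note how the passive cue (was + participle + by) forces a reversal of the default agent-first mapping: this is the same reanalysis step that, in the experiments discussed in the next section, surfaces as a slow-down at the disambiguating phrase.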
14.3 Syntactic Processing in Neurotypical Children
There is ample evidence that healthy adults and skilled readers are able to utilize and rapidly integrate different types of information when they read or listen to sentences in real-time (e.g., Frazier, 1999; Gibson & Pearlmutter, 1998). Recently, the amount of research into sentence processing in children has also grown. Findings have shown that, by the age of four years, neurotypical children can make use of structural/syntactic information similarly to adults. A study by Tyler and Marslen-Wilson (1981) was the first to show that 5-, 7-, and 10-year-old children show the same processing pattern as adults when they monitor sentences to detect a word in three conditions: normal prose, semantically anomalous, and syntactically anomalous sentences. Contemori and Marinis (2014) investigated syntactic dependencies in children by focusing on how 6-to-8-year-old children comprehend relative clauses and passives, using a self-paced listening task with picture verification. In this task children first see pictures on a computer screen that create a mental representation of an event. This leads to expectations about the sentence that follows. After having looked at the pictures, children listen to a sentence word-by-word or phrase-by-phrase by pressing the space bar or another button on the keyboard. The computer tracks how fast they press the button, which provides a step-by-step measure of how quickly they process each part of the sentence in real-time, that is, before the sentence is complete. This sheds light on predictive processing, that is, the children’s ability to use incoming information in order to interpret the sentence as it unfolds. To investigate how children process passives, Contemori and Marinis (2014) presented sentences such as (1) and (2) below.
(1) Subject relative active
This is/the bear/that is painting/the elephant/in the woods/on Sunday
(2) Subject relative passive
This is/the bear/that is being painted/by the elephant/in the woods/on Sunday
Of the two pictures presented on the computer screen, one showed a figure carrying out an action on another (e.g., a bear painting an elephant), while the other showed the same figures with the roles reversed (e.g., an elephant painting a bear). Only one of the two pictures
matched the interpretation of the sentence. For (1) the picture matching the sentence showed a bear painting an elephant, whereas for (2) the picture matching the sentence showed the reversed thematic roles (an elephant painting a bear). This task allows us to get an insight into how listeners process morphosyntactic information in the following way. When participants encounter the first NP of the sentence (the bear), they are expected to assign to it the thematic role of the agent (Ferreira, 2003). The phrase that is painting is in line with this expectation. This is not the case for the phrase that is being painted. Upon encountering the verb being painted they have to revise the prediction that the first NP is the agent: the thematic role of the first NP has to be changed from agent to patient. Thus, processing of passives requires processing of morphosyntactic cues of the verb and reanalysis of thematic roles. Processing of passives is, therefore, more complex than processing of actives. As a result, it is predicted that processing of passives incurs an extra processing cost that can be reflected in a slow-down in pressing the space bar after participants hear the phrase that provides disambiguation cues for passives (being painted). This prediction was borne out in the Contemori and Marinis (2014) study for both adults and children, indicating that 6-to-8-year-old children process passives in a similar way to adults. However, children were processing sentences at a slower rate than adults and were not as successful as adults in selecting the correct picture that goes with the sentence. This indicates that although they are sensitive to morphosyntactic cues as the sentence unfolds, they may sometimes have difficulties revising their initial interpretation. This could reflect the fact that their executive functioning is still developing and is therefore not yet at adult levels. It is in line with previous studies showing differences in the processing pattern of children compared to adults that were related to lower working memory in children (e.g., Booth et al., 2000; Roberts, Marinis, Felser, & Clahsen, 2007), and it is very important because language-impaired children and adults seem to have limitations in their working memory capacity. The processing of passives was also investigated in a study by Marinis and Saddy (2013) using a similar design, but with only one picture that either matched or did not match the sentence. Similar to the Contemori and Marinis (2014) study, the expectation was that children would slow down when they encountered a verb signaling that they had to reanalyze the sentence. However, because this study presented only one picture for each sentence, the picture either matched or did not match the sentence – the active or the passive. As a result, the design enabled investigation of the children’s ability to reanalyze both active and passive sentences. The results showed that neurotypical children slowed down when they encountered a verb that did not match the initial interpretation of the sentence in both actives and passives. In examples (3) and (4) below, a mismatching picture for (3) was a camel kissing a zebra and a mismatching picture for (4) was a zebra kissing a camel.
(3) Active
I think/that/the zebra/was kissing/the camel/at the zoo/last Monday
(4) Passive
I think/that/the zebra/was kissed/by the camel/at the zoo/last Monday
A slow-down after was kissing/was kissed indicated that children were able to process the verbal morphology and use it as a cue to reanalyze the thematic roles of the sentence. In summary, these studies provide evidence that, overall, neurotypical children are able to use morphosyntactic cues and process sentences similarly to adults. However, they often process sentences at a slower speed than adults and may have difficulties revising their initial interpretation due to age-related limitations in executive functioning.
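The trial logic of such a self-paced listening task can be sketched in a few lines of code. The Python fragment below is a minimal illustration only, not the software used in the studies cited; the callables play_segment and wait_for_button are hypothetical stand-ins for the audio and response routines that an experiment package would supply.

```python
import time

def run_self_paced_trial(segments, play_segment, wait_for_button):
    """Present one sentence segment per button press and log the
    listening time for each segment. A minimal sketch: the two
    callables are assumed to be provided by the experiment software."""
    reaction_times = []
    for segment in segments:
        play_segment(segment)              # e.g., "that is being painted"
        start = time.perf_counter()
        wait_for_button()                  # participant presses to continue
        reaction_times.append(time.perf_counter() - start)
    return list(zip(segments, reaction_times))

# The passive condition from (2); a slow-down is expected at the
# disambiguating segment "that is being painted".
segments = ["This is", "the bear", "that is being painted",
            "by the elephant", "in the woods", "on Sunday"]
```

Comparing the logged times at the disambiguating segment across the active and passive conditions is what yields the slow-down effect described above.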
14.4 Sentence Processing in Children with Developmental Language Disorder
Developmental Language Disorder (DLD) is a language disorder that occurs in the absence of a known biomedical aetiology and may affect one or more of the following domains: phonology, morphology, syntax, word finding and lexical semantics, pragmatics, discourse, verbal learning and memory (Bishop et al., 2017). Children with DLD are a heterogeneous group. Apart from the domains mentioned above, research has shown that children with DLD may also have deficits in some non-linguistic abilities, such as symbolic play (Johnston, 1994) and motor skills (Hill, 2001). Moreover, a large body of evidence has revealed that children with DLD show deficits in phonological memory (Gathercole & Baddeley, 1990) and process both linguistic and non-linguistic information at a slower rate than neurotypical children (Miller et al., 2001). However, there is a debate regarding the nature and cause of DLD, with some theories arguing that it is caused by a deficit in grammar (van der Lely, 2005) and others that it is caused by general processing capacity limitations (Joanisse & Seidenberg, 1998). Most studies addressing the language abilities of children with DLD have used off-line comprehension, production, and grammaticality judgment tasks. For example, the most widely used tasks for investigating sentence comprehension are picture selection and picture verification tasks. In a picture selection task, children typically see a set of two to four pictures, listen to a sentence and, after the end of the sentence, have to select the picture that matches it. In the picture verification task, children see only one picture. They then listen to one sentence, and they have to say whether the sentence matches the picture. In both tasks, children must listen and build up the grammatical structure of the sentence, store it in memory, observe pictures, and then make a decision. If the child has to select one picture out of two, four or even more pictures, a level of complexity is added to the task: it also requires good observation skills and the ability to spot differences between pictures. In addition, as the number of pictures increases, so does the processing capacity required from the child in order to decide which picture matches the sentence. Thus, these tasks involve not only sentence comprehension but also memory and observation skills, and they place attentional demands and variable processing-capacity demands on the child depending on the number of pictures. Given that it is impossible to separate these factors, these tasks cannot genuinely disentangle whether DLD results from a grammatical impairment or from processing capacity limitations. In contrast, on-line sentence processing tasks are able to address this debate because they are implicit: they tap how children process sentences as they unfold, and therefore they rely less on memory. Sentence processing research has revealed that, overall, children with DLD process sentences at a slower rate than neurotypical children. Some studies have found that their pattern of processing is similar to that of neurotypical children, whereas others have found qualitative differences in the processing pattern of children with DLD compared to neurotypical children. Studies using word-monitoring tasks have revealed that children with DLD process sentences at a slower rate than neurotypical children but do not differ in their processing pattern from neurotypical children.
Montgomery et al. (1990) showed that 7-to-12-year-old children with DLD are able to make use of syntactic, semantic, and real-world information when they process sentences in real time. Montgomery (2000) showed that sentence processing is facilitated by the accumulation of sentential information in 7-to-10-year-old children with DLD in a similar way to age- and language-matched controls. Furthermore, Montgomery (2002) demonstrated that 6-to-10-year-old children with DLD process sentences with a high proportion of stop consonants and sentences with a high proportion of non-stop consonants in the same way as age- and language-matched controls.
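In a word-monitoring task, the dependent measure is simpler still: the latency between the onset of a pre-specified target word in the speech stream and the participant's button press. A schematic sketch of this measurement follows; the helpers play_audio and wait_for_press are hypothetical, with wait_for_press assumed to return a timestamp on the same clock as time.perf_counter().

```python
import time

def word_monitoring_latency(sentence_audio, target_onset_s,
                            play_audio, wait_for_press):
    """Return the monitoring latency in seconds: the time from the
    target word's onset in the audio to the participant's button press.
    target_onset_s comes from the stimulus annotation; the two callables
    are assumed helpers supplied by the experiment software."""
    playback_start = time.perf_counter()
    play_audio(sentence_audio)        # assumed non-blocking playback
    press_time = wait_for_press()     # perf_counter timestamp of the press
    return press_time - (playback_start + target_onset_s)
```

Longer latencies across a group, with the same relative pattern over conditions, correspond to the slower-but-qualitatively-similar processing profile reported in these studies.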
In contrast to these studies, Montgomery and Leonard (1998) and Marinis and van der Lely (2007) found qualitative differences between children with DLD and neurotypical children. Montgomery and Leonard (1998) investigated how 6-to-10-year-old children with DLD process verbs with low perceptual saliency morphemes (third-person singular -s and past tense -ed) as opposed to verbs with a high perceptual saliency morpheme (present-progressive -ing) in a word-monitoring task. Children had to detect words (cookies, breakfast) following an inflected verb (eats), as in (5), or an uninflected verb (eat) that is ungrammatical in the given context, as in (6) below.
(5) Jerry can't wait to get home from school. Every day he races home and eats cookies before dinner.
(6) Becky loves Saturday mornings. She always gets up early and eat breakfast before she watches cartoons.
Processing of an ungrammatical sentence was predicted to slow down the children's reaction times if they were able to process the morphosyntactic cues encoded in the verb inflection and build the syntactic structure. Neurotypical children slowed down in ungrammatical sentences with missing morphemes of both high and low perceptual saliency (present-progressive -ing vs. third-person singular -s and past tense -ed). In contrast, children with DLD slowed down only in ungrammatical sentences with a missing morpheme of high perceptual saliency (present-progressive -ing). This qualitatively different pattern was taken as evidence in favor of the Surface Account, according to which children with DLD have greater difficulty processing low perceptual saliency morphemes. Qualitatively different processing in children with DLD compared to neurotypical children was also found in a study by Marinis and van der Lely (2007), who investigated how 10-to-17-year-old children with DLD processed wh-questions using a cross-modal picture priming experiment. For example,
(7) Balloo gives a long carrot to the rabbit_i. Who_i did Balloo give the long carrot to t_i at the farm?
Sentences such as (7) involve a dislocated wh-word (who) that has to be interpreted at the position it has moved from, as indicated by the t_i (trace/gap). In this task, children saw a picture while listening to the question and had to press a button to indicate an animacy decision. The picture was either the antecedent of who, that is, a picture of a rabbit, or an unrelated picture. The picture was presented at the position of the trace/gap (after the preposition to), after the verb (give), or at an unrelated position (control position). Neurotypical children showed shorter reaction times for the picture of the rabbit compared to the unrelated picture at the position of the trace/gap, but not at the position of the verb or the control position. This finding provided evidence that neurotypical children established a syntactic dependency between the dislocated wh-word (who) and the trace and tried to interpret it at this position. In contrast, the children with DLD showed shorter reaction times for the picture of the rabbit compared to the unrelated picture presented at the position of the verb. This suggested that they tried to interpret the wh-word at the verb give, which is in line with evidence from adults, who seem to reactivate all possible arguments when they process a verb (Nicol, 1996).
When they subsequently encountered the preposition to, they should have revised this hypothesis and interpreted the wh-word at the position of the gap/trace. Two possible explanations can account for the fact that children with DLD did not show an effect at the gap. The first relates to processing capacity limitations: children with DLD may lack the processing capacity to revise their initial hypothesis.
A second explanation relates to slower processing and lexical retrieval. The word-monitoring studies mentioned above have shown that children with DLD had longer reaction times, which could be linked to problems with lexical retrieval. Given that cross-modal priming involves lexical retrieval, slower lexical retrieval could have caused the lack of priming at the trace/gap rather than an inability to construct syntactic dependencies. Evidence for slower processing in children with DLD comes from two further studies, by Marinis and Saddy (2013) and Chondrogianni et al. (2015). The study by Marinis and Saddy (2013) investigated the processing of passive compared to active sentences and was mentioned earlier in the section on neurotypical children. The children with DLD were able to process the verbal morphology and use it as a cue to start reanalyzing their initial interpretation of the sentence. However, at the end of the sentence they still had longer reaction times in the sentences with a mismatch between sentence and picture. This applied to both passive and active conditions and suggested that, although they are able to process morphosyntactic cues, children with DLD have difficulties reanalyzing their initial interpretation of sentences. Chondrogianni et al. (2015) investigated the processing of definite articles and object clitics in Greek-speaking children with DLD in a self-paced listening study. Children listened to sentences phrase-by-phrase, as shown in (8) to (10). At the end of the presentation of each sentence, children had to press a button and answer a comprehension question. In half of the sentences, the definite article or the accusative clitic shown in brackets below was omitted although it is required in the specific context.
(8) Definite article–subject position
… Argha/to apoghevma/(to) delfini/kinighise/ta psaria.
… Late/in the afternoon/(the) dolphin/chased/the fish.
(9) Definite article–object position
… To kanguro/klotsise/(ti) bala/sto ghipedho/chtes to apoghevma.
… The kangaroo/kicked/(the) ball/on the pitch/yesterday afternoon.
(10) Direct object clitic pronoun
… To elafi/tromakse poli/otan/to liontari/(to) dhagose/sti zugla/pano stus vrachus.
… The deer/got very scared/when/the lion/(it) bit/in the jungle/on the rocks.
Neurotypical children showed longer reaction times when the article or accusative clitic was omitted, indicating that they were able to process the ungrammaticality of the sentence. The children with DLD showed an overall slower speed of processing compared to neurotypical children of the same age, and a similar pattern of processing to neurotypical children regarding definite articles. However, they did not show longer reaction times in sentences with omitted accusative clitics. This suggested that they did not process the ungrammaticality resulting from the omission of accusative clitics. Accusative clitics are verbal arguments and are more complex than definite articles; the difficulty in processing accusative clitics could relate to their complexity. In summary, children with DLD seem to have a slower speed of processing than neurotypical children of the same age, but the overwhelming majority of studies show that their pattern of processing does not differ from that of neurotypical children. This implies that they are capable of processing and integrating different types of information (syntactic, semantic, world-knowledge).
When there is a discrepancy between children with DLD and neurotypical children, this seems to reflect difficulties with lexical retrieval (Marinis & van der Lely, 2007), a difficulty in revising an initial interpretation (Marinis & Saddy, 2013), or a difficulty in processing syntactically complex structures (Chondrogianni et al., 2015), all of which could be the result of slower speed of processing rather than a deficit in the grammatical system.
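To make the logic of these group comparisons concrete, the sketch below computes the mean slow-down at a critical segment relative to the other segments of the same trial. It is an illustrative skeleton with invented numbers, not the statistical analysis used in the studies above, which typically rely on ANOVAs or mixed-effects models over the raw reaction times.

```python
from statistics import mean

def critical_slowdown(trials, critical_segment):
    """Mean RT difference (ms) between the critical segment and the
    average of the remaining segments, across trials. Each trial is a
    dict mapping segment text to a reaction time in milliseconds."""
    diffs = []
    for trial in trials:
        others = [rt for seg, rt in trial.items() if seg != critical_segment]
        diffs.append(trial[critical_segment] - mean(others))
    return mean(diffs)

# Hypothetical data: a mismatch trial should show a slow-down at the verb.
dld_trials = [{"the zebra": 620, "was kissed": 910, "by the camel": 650}]
control_trials = [{"the zebra": 480, "was kissed": 700, "by the camel": 500}]

print(critical_slowdown(dld_trials, "was kissed"))      # 275.0
print(critical_slowdown(control_trials, "was kissed"))  # 210.0
```

In this toy example both groups show a slow-down at the verb, while the DLD group is slower overall, mirroring the typical finding of a preserved processing pattern at a reduced speed.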
14.5 Sentence Processing in Acquired Disorders
In contrast to developmental language disorders, acquired language disorders result from a biomedical etiology, for example a stroke or a neurodegenerative disease, after the language system has been fully established. This section focuses on sentence processing in agrammatic aphasia. As in research on DLD, the great majority of research in agrammatic aphasia has used off-line tasks to assess sentence processing. These have shown that individuals with agrammatic aphasia perform above chance in the comprehension of canonical sentences, such as actives and subject clefts. In contrast, they perform at chance level in sentences with non-canonical word order, such as object wh-questions, object relative clauses, object clefts, and passives. This has led to the formulation of several theories about the nature of the impairment: some argue that the impairment is at the structural level, that is, that traces are deleted from the syntactic representation (e.g., Grodzinsky, 2000), while others argue that it is caused by slower speed of processing (e.g., Haarman & Kolk, 1994), by processing capacity limitations (e.g., Burkhardt et al., 2008; Murray, 2018), or by difficulties in predictive processing and in revising thematic roles (e.g., Dickey et al., 2007; Dickey & Thompson, 2009; Meyer et al., 2012). However, off-line methods are affected by memory and attentional demands, which contaminate the participants' performance on the language tasks. Therefore, based only on off-line data, it is not possible to decide between the two types of hypotheses. Recently, several studies have used on-line methodologies to examine how individuals with aphasia process sentences in real time. In the rest of this section, I will first review these studies and then address the implications of their results for theories of aphasia and the nature of the impairment. Early studies by Tyler (1985) and Shankweiler et al. (1989) showed that although individuals with agrammatic aphasia are not able to use syntactic information to determine the meaning of sentences off-line, they are capable of using syntactic information on-line. Similar results were obtained in a study by Blumstein et al. (1998), who used auditory priming tasks with several types of sentences involving movement (subject and object relative clauses, simple and embedded wh-questions) and showed reactivation of the antecedent at the trace in agrammatic patients. This contrasts with a series of studies by Swinney, Zurif, and colleagues, who addressed the comprehension of subject and object relative clauses, as in (11) and (12), using the cross-modal priming paradigm (e.g., Balogh et al., 1998; Swinney et al., 1996; Zurif et al., 1993).
(11) The gymnast loved the professor_i from the northwestern city who_i t_i complained about the bad coffee.
(12) The priest enjoyed the drink_i that the caterer was serving t_i to the guests.
The cross-modal priming experiments revealed that individuals with agrammatic aphasia did not show reactivation at the trace in either subject or object relative clauses. This was interpreted as a processing difficulty due to either abnormally slow linking of antecedents and traces or a failure to link the two. Non-grammatical strategies, such as the agent-first strategy (Caplan & Futter, 1986), were argued to compensate for their inability to establish dependency relations. This was linked to evidence showing slower than normal lexical activation (Prather et al., 1991).
An alternative interpretation of the results from the studies using cross-modal priming tasks is that the lack of reactivation at the trace is due to the high processing demands of the task itself, which involves both the visual and the auditory modality. This could explain the discrepancy between the study by Blumstein et al. (1998) and the studies by Swinney, Zurif, and colleagues.
A further study looking at syntactic processing in people with aphasia was conducted by Caplan and Waters (2003) using a self-paced listening task with sentences of different syntactic complexity: cleft object sentences (14) and center-embedded subject-object relative clauses (16), which are more complex than cleft subject sentences (13) and right-branching object-subject relative clauses (15).
(13) It was the food that nourished the child.
(14) It was the woman that the toy amazed.
(15) The father read the book that terrified the child.
(16) The man that the fire injured called the doctor.
People with aphasia and control participants listened to the sentences in a phrase-by-phrase fashion by pushing a button. At the end of each sentence, they had to judge its plausibility. In the plausibility judgment, both people with aphasia and control participants took more time to judge more complex sentences than less complex ones. Control participants were equally accurate in judging the plausibility of all sentence types, but the participants with aphasia were less accurate in judging more complex than less complex sentences. Overall, participants with aphasia had slower reaction times than control participants, and their pattern of processing differed as a function of their level of comprehension. Good comprehenders performed similarly to control participants; in center-embedded subject-object relative clauses they showed longer reaction times at the end of clauses and at points of syntactic complexity compared to right-branching object-subject relative clauses. These effects were not found in poor comprehenders, suggesting that they did not assign the syntactic structure of center-embedded subject-object relative clauses on-line. Poor comprehenders also showed a different pattern of processing compared to good comprehenders in cleft sentences: poor comprehenders' RTs on the verb were longer in sentences that were incorrectly judged to be implausible than in those that were correctly judged, an effect not attested in good comprehenders. This indicates that when poor comprehenders made errors, they spent more time trying to build up the structure of the sentence and allocated additional time to the most demanding phrase of the sentence. More recently, several eye-tracking studies have been conducted to address how individuals with agrammatic aphasia process non-canonical sentences in real time using the visual-world paradigm (e.g., Dickey et al., 2007; Dickey & Thompson, 2009; Mack & Thompson, 2017; Meyer et al., 2012). In the visual-world paradigm, participants listen to sentences while they are looking at pictures and, sometimes, at the end of the aural presentation of the sentence they have to select the picture that best represents its meaning. This type of multi-modal task requires fewer cognitive resources than the cross-modal priming task and has better ecological validity than the self-paced listening task because in the visual-world paradigm sentences are presented unsegmented, with natural prosody. Additionally, interpreting visual information (i.e., looking at pictures) while listening to sentences is very close to what we do in everyday life.
Dickey et al. (2007) investigated non-canonical sentences involving wh-movement (object wh-questions, object clefts) compared to control yes/no questions. They found that although individuals with agrammatic aphasia were less accurate in the comprehension of object wh-questions and object clefts than in yes/no questions, their eye-movement pattern at the position of the gap did not differ from that of control participants, which was interpreted as evidence that they are able to create filler-gap dependencies. However, differences were observed towards the end of the sentence. This was taken as evidence that individuals with aphasia have weak representations that make them vulnerable to competition from other representations. This was also confirmed by Choy and Thompson (2005) in a study that investigated the processing of pronouns and reflexives.
The comprehension of non-canonical sentences using eye-tracking was also investigated by Dickey and Thompson (2009), in a study that tested object relative clauses and passives, as in (17) and (18).
(17) Point to who the bride was tickling t in the mall.
(18) Point to who was tickled t by the bride in the mall.
The results from the object relatives were similar to the results for the object wh-questions in the study by Dickey et al. (2007). Participants with aphasia looked at the picture of the theme after hearing the verb, which provided evidence for a filler-gap dependency, but at the end of the sentence they were also looking at the competing picture. The results from the sentences with passives did not show an effect of gap-filling for either group. However, although passives are non-canonical sentences, they involve NP-movement rather than wh-movement, so the lack of gap-filling in both groups could relate to the type of structure. Processing of passives was also addressed by Meyer et al. (2012), with simpler passive sentences than in Dickey and Thompson (2009), as shown in (19), and corresponding actives, as in (20).
(19) The man was shaved by the boy
(20) The man was shaving the boy
Meyer et al. (2012) showed a discrepancy between individuals with aphasia and control participants in the comprehension of both passive and active sentences. Healthy controls showed an agent-first bias in both actives and passives, and in passives they shifted their gaze to the theme after encountering the verb. In contrast, individuals with aphasia did not show an agent-first bias in either sentence type. In active sentences they fixated the correct picture only at the end of the sentence, reflecting slowed processing, and in passives their fixations were at chance. These findings were interpreted as indicating slowed lexical access and/or lexical integration deficits. Summarizing, studies on syntactic processing in aphasia using on-line methods have provided invaluable insight into the nature of the processing system of individuals with aphasia. Performance at chance in off-line sentence comprehension tasks does not necessarily coincide with an inability to make use of morphosyntactic cues and assign syntactic structure in real time. Poor performance in off-line tasks and discrepancies between individuals with aphasia and healthy controls in on-line tasks could result from a variety of causes. Although individuals with aphasia seem to be sensitive to morphosyntactic information in canonical sentences, slower speed of processing, limitations in processing capacity, or difficulties in predictive processing and in revising thematic roles may result in difficulties processing non-canonical sentences, especially passives, in real time. Further research using designs that can tease apart effects of speed of processing from limitations in processing capacity, predictive processing, and thematic role revision is necessary to characterize the nature of the processing system in individuals with aphasia.
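The core dependent measure in these visual-world studies is the proportion of looks to a given picture within a time window of interest, for example following the verb. The sketch below shows how such a proportion can be computed from time-stamped fixation samples; it is a simplified illustration with invented sample data, and real eye-tracking pipelines add calibration, binning, and statistical modeling on top of it.

```python
def target_look_proportion(samples, window_start, window_end, target="theme"):
    """Proportion of fixation samples inside [window_start, window_end)
    (ms from sentence onset) that fall on the target picture. Each sample
    is a (timestamp_ms, region) pair, where region names the picture
    currently fixated ('agent', 'theme', ...) or is None (no fixation)."""
    in_window = [region for t, region in samples
                 if window_start <= t < window_end and region is not None]
    if not in_window:
        return 0.0
    return in_window.count(target) / len(in_window)

# Hypothetical samples around the verb region of a passive sentence:
samples = [(980, "agent"), (1000, "agent"), (1020, "theme"),
           (1040, "theme"), (1060, "theme"), (1080, None)]
print(target_look_proportion(samples, 1000, 1100))  # 0.75
```

A rising proportion of looks to the theme after the verb of a passive is the gaze signature of the thematic-role revision discussed above.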
14.6 Conclusion
Research on syntactic processing in DLD and aphasia has increased exponentially within the last 10 years, and with it the evidence base on how individuals process sentences in real time. The studies reviewed in this chapter do not provide evidence that children with DLD or adults with aphasia have a deficit in their grammatical representation per se. Both groups seem to be able to process and integrate different types of information in real time in canonical sentences but seem to have slower speed of processing and difficulties with
non-canonical sentences with increased syntactic complexity. These difficulties could be the result of slower speed of processing, processing capacity limitations, or difficulties in predictive processing. Future research needs to adopt designs that combine predictive sentence processing experiments with sensitive and reliable tasks that measure speed of processing and processing capacity. This is important in order to tease apart the source of difficulty in children with DLD and adults with aphasia and to understand the nature of their processing system. Such work will make an important contribution to theories that seek to explain the nature of these disorders and can inform the design and implementation of future interventions.
REFERENCES
Balogh, J., Zurif, E., Prather, P., Swinney, D., & Finkel, L. (1998). Gap-filling and end-of-sentence effects in real-time language processing: Implications for modeling sentence comprehension in aphasia. Brain & Language, 61(2), 169–182.
Bishop, D. V., Snowling, M. J., Thompson, P. A., Greenhalgh, T., & CATALISE-2 Consortium. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology & Psychiatry, 58(10), 1068–1080.
Blumstein, S., Byma, G., Kurowski, K., Hourihan, J., Brown, T., & Hutchinson, A. (1998). On-line processing of filler-gap constructions in aphasia. Brain & Language, 61(2), 149–168.
Booth, J. R., MacWhinney, B., & Harasaki, Y. (2000). Developmental differences in visual and auditory processing of complex sentences. Child Development, 71(4), 981–1003.
Burkhardt, P., Avrutin, S., Piñango, M. M., & Ruigendijk, E. (2008). Slower-than-normal syntactic processing in agrammatic Broca's aphasia: Evidence from Dutch. Journal of Neurolinguistics, 21(2), 120–137.
Caplan, D., & Futter, C. (1986). Assignment of thematic roles by an agrammatic aphasic patient. Brain & Language, 27(1), 117–135.
Caplan, D., & Waters, G. (2003). On-line syntactic processing in aphasia: Studies with auditory moving window presentation. Brain & Language, 84(2), 222–249.
Chondrogianni, V., Marinis, T., Edwards, S., & Blom, E. (2015). Production and on-line comprehension of definite articles and clitic pronouns by Greek sequential bilingual children and monolingual children with Specific Language Impairment. Applied Psycholinguistics, 36(5), 1155–1191.
Choy, J. J., & Thompson, C. K. (2005). Online comprehension of anaphor and pronoun constructions in Broca's aphasia: Evidence from eyetracking. Brain and Language, 95(1), 119–120.
Contemori, C., & Marinis, T. (2014). The impact of number mismatch and passives on the real-time processing of relative clauses. Journal of Child Language, 41(3), 658–689.
Dickey, M. W., Choy, J. J., & Thompson, C. K. (2007). Real-time comprehension of wh-movement in aphasia: Evidence from eyetracking while listening. Brain and Language, 100(1), 1–22.
Dickey, M. W., & Thompson, C. K. (2009). Automatic processing of wh- and NP-movement in agrammatic aphasia: Evidence from eyetracking. Journal of Neurolinguistics, 22(6), 563–583.
Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47, 164–203.
Frazier, L. (1999). On sentence interpretation. Kluwer.
Gathercole, S., & Baddeley, A. (1990). Phonological memory deficits in language disordered children: Is there a causal connection? Journal of Memory and Language, 29(3), 336–360.
Gibson, E., & Pearlmutter, N. (1998). Constraints on sentence comprehension. Trends in Cognitive Science, 2(7), 262–268.
Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences, 23(1), 1–71.
Haarman, J., & Kolk, H. (1994). On-line sensitivity to subject-verb agreement violations in Broca's aphasics: The role of syntactic complexity and time. Brain & Language, 46(4), 493–516.
Hill, E. L. (2001). Non-specific nature of specific language impairment: A review of the literature with regard to concomitant motor impairments. International Journal of Language and Communication Disorders, 36(2), 149–171.
Joanisse, M., & Seidenberg, M. (1998). Specific language impairment: A deficit in grammar or processing? Trends in Cognitive Sciences, 2(7), 240–247.
Johnston, J. R. (1994). Cognitive abilities of children with language impairment. In R. Watkins & M. Rice (Eds.), Specific language impairments in children: Current directions in research and intervention (Vol. 40 (2), pp. 137–149). Paul H. Brookes.
Mack, J. E., & Thompson, C. K. (2017). Recovery of online sentence processing in aphasia: Eye movement changes resulting from treatment of underlying forms. Journal of Speech, Language, and Hearing Research, 60(5), 1299–1315.
Marinis, T., & Saddy, D. (2013). Parsing the passive: Comparing children with Specific Language Impairment to sequential bilingual children. Language Acquisition, 20(2), 155–179.
Marinis, T., & van der Lely, H. (2007). Processing of wh-questions in children with G-SLI and typically developing children. International Journal of Language and Communication Disorders, 42(5), 557–582.
Meyer, A. M., Mack, J. E., & Thompson, C. K. (2012). Tracking passive sentence comprehension in agrammatic aphasia. Journal of Neurolinguistics, 25(1), 31–43.
Miller, C., Kail, R., Leonard, L., & Tomblin, B. (2001). Speed of processing in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 44(2), 416–433.
Montgomery, J. (2000). Relation of working memory to off-line and real-time sentence processing in children with specific language impairment. Applied Psycholinguistics, 21(1), 117–148.
Montgomery, J. (2002). Examining the nature of lexical processing in children with Specific Language Impairment: Temporal processing or processing capacity deficit. Applied Psycholinguistics, 23(3), 447–470.
Montgomery, J., & Leonard, L. (1998). Real-time inflectional processing by children with specific language impairment: Effects of phonetic substance. Journal of Speech, Language and Hearing Research, 41(6), 1432–1443.
Montgomery, J., Scudder, R., & Moore, C. (1990). Language-impaired children's real-time comprehension of spoken language. Applied Psycholinguistics, 11(3), 273–290.
Murray, L. L. (2018). Sentence processing in aphasia: An examination of material-specific and general cognitive factors. Journal of Neurolinguistics, 48, 26–46.
Nicol, J. L. (1996). Syntactic priming. Language and Cognitive Processes, 11(6), 675–679.
Prather, P., Shapiro, L., Zurif, E., & Swinney, D. (1991). Real-time examination of lexical processing in aphasia. Journal of Psycholinguistic Research, 20(3), 271–281.
Roberts, L., Marinis, T., Felser, C., & Clahsen, H. (2007). Antecedent priming at gap positions in children's sentence processing. Journal of Psycholinguistic Research, 36, 175–188.
Shankweiler, D., Crain, S., Gorrell, P., & Tuller, B. (1989). Reception of language in Broca's aphasia. Language and Cognitive Processes, 4(1), 1–33.
Swinney, D., Zurif, E., Prather, P., & Love, T. (1996). Neurological distribution of processing operations underlying language comprehension. Journal of Cognitive Neuroscience, 8(2), 174–184.
Tyler, L. (1985). Real-time comprehension processes in agrammatism: A case study. Brain & Language, 26(2), 259–275.
Tyler, L. K., & Marslen-Wilson, W. D. (1981). Children's processing of spoken language. Journal of Verbal Learning and Verbal Behavior, 20, 400–416.
van der Lely, H. K. J. (2005). Domain-specific cognitive systems: Insight from grammatical specific language impairment. Trends in Cognitive Sciences, 9(2), 53–59.
Zurif, E., Swinney, D. A., Prather, P. A., Solomon, J. A., & Bushell, C. (1993). An on-line analysis of syntactic processing in Broca's and Wernicke's aphasia. Brain and Language, 45(3), 448–464.
15 Inflectional Morphology and Language Disorders
MARTINA PENKE
15.1 Preamble
Morphology is concerned with the structure of words. Traditionally, morphological operations are divided into word formation, that is, the creation of new words with new meanings, and inflection, the process by which grammatical information is realized on a word. While all morphological operations can be affected in language disorders, research has focused on inflectional deficits. One reason for this is that inflectional morphology is widespread in the languages of the world and, when present in a language, is likely to surface in every utterance. Deficits with inflectional morphology are therefore easy to identify by listening to the spontaneous or probed speech production of affected speakers. Also, inflection is situated at the interface of morphology, syntax, and phonology. While inflection creates grammatical word forms and thus is part of morphology, the grammatical information added typically exerts effects on other constituents in a phrase or sentence and hence is operative in syntax. Moreover, the choice between different inflectional allomorphs might be phonologically determined, and the inflected word form must adhere to the phonological constraints operative in a given language. In addition, the production and perception of inflectional morphemes are dependent on articulatory and perceptual abilities. These interrelations make inflectional morphology vulnerable to language deficits that can target any of these components: perception, articulation, phonology, morphology, or syntax. Not surprisingly, then, deficits with inflectional morphology are a hallmark of language disorders and have been observed in practically every acquired or developmental language disorder that has come under the scrutiny of clinical linguists. To summarize the wealth of research that has been conducted on inflectional deficits across different languages and different syndromes would go far beyond the limits of this chapter. Rather, its aim is to highlight factors that affect the occurrence and shape of inflectional deficits across disorder syndromes and to sketch the gist of proposals that have been advocated to account for inflectional deficits.
15.2 Inflection – A Short Linguistic Background
Inflection creates word forms of a lexeme (or word) that express grammatical information. An example is the word form "dogs," which expresses the grammatical information plural and indicates that more than one dog is around. Three types of grammatical information are expressed by inflection. First, inflection can express information on grammatical
dimensions, such as the dimension NUMBER, realized by the ending -s in the word form "dogs." Second, inflection marks grammatical relations between words in a sentence or phrase. An example is subject-verb agreement inflection, where the verb is marked for person and number features that have to agree with the person and number features expressed by the subject of the sentence. Thus, in the sentence "the dog barks," the third-person singular subject "the dog" requires a third-person singular marking -s on the verb. Third, inflection marks the grammatical function of an element in a grammatical construction. An example is case inflection, which marks phrases as subject/agent or object/patient of an action. Thus, case inflection in Russian marks the dog ("sobak-") as the subject and agent of a biting action in "sobaka kusayet malchika" (= the dog bites the boy), whereas it is the object/patient suffering a biting action in the sentence "sobaku kusayet malchik" (= the boy bites the dog). Grammatical dimensions such as PERSON, NUMBER, TENSE, ASPECT, or MOOD are central for the conceptual representation of events or entities, as they mark the number of participants, their function in the context of the utterance (speaker or addressee), and the anchoring of the event with respect to a specific time and world (actual, non-actual, desired). These dimensions are therefore expressed by inflection in many languages. Nevertheless, languages can display huge differences with respect to which grammatical information is expressed by inflection. While English, Greek, and Russian, for instance, have inflectional markers indicating an ongoing or recurring action (e.g. English -ing), in German the dimension ASPECT is not marked by inflectional verbal affixes. Cross-language variation also holds regarding the feature specifications that are distinguished within a grammatical dimension. For example, while German distinguishes four different cases, Finnish marks 15. English and German differentiate singular from plural nouns, and Arabic additionally marks when precisely two exemplars of an entity are referred to, the so-called dual marking. Inflectional systems can be fusional or agglutinative. An agglutinative system makes use of inflectional affixes that each express one unit of grammatical information. Turkish case inflection is a prime example. In Turkish, the dative case is marked by the affix -a and the plural by the affix -lar. In the dative plural form of the stem "yil" (= year), both affixes are added to produce the form "yil-lar-a". The German equivalent "Jahren" (stem "Jahr"), in contrast, expresses information on dative case and plural number in one marker, the ending -en. This type of inflection, where one marker expresses more than one grammatical dimension, is called fusional. Another important distinction refers to whether inflection is realized by affixes that can be segmented from the word stem, as in the English verb form "reads," where the third-person singular marking -s can easily be identified as an appendix to the word stem "read." This type of inflection is called concatenative. Non-concatenative inflection, in contrast, is realized by modifications of the word stem itself. The key example of this type of inflection can be found in Semitic languages such as Arabic. Here a word's base consists of a consonantal skeleton such as "k-t-b" (= write). Vowels are added to this skeleton to produce inflected forms such as "katab" (= he writes).
In contrast to concatenative inflectional morphology, no string of stem plus segmentable affix results. For an exhaustive exposition of inflectional morphology, covering the immense diversity across languages, see Bickel and Nichols (2007). Crucially, the variance that characterizes inflectional systems across languages necessarily determines the appearance and shape of inflectional deficits in a given language and inflectional system.
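The contrast between agglutinative and fusional marking can be made concrete with a toy example. In the sketch below, a Turkish-style dative plural is assembled by concatenating one affix per unit of grammatical information, whereas the German-style fused form has to be listed as a whole; the miniature lexicons are purely illustrative and ignore real-language detail such as Turkish vowel harmony.

```python
# Agglutinative (Turkish-style): one segmentable affix per feature.
TURKISH_AFFIXES = {"plural": "lar", "dative": "a"}

def turkish_dative_plural(stem):
    # yil -> yil-lar-a: each affix carries exactly one unit of information.
    return stem + TURKISH_AFFIXES["plural"] + TURKISH_AFFIXES["dative"]

# Fusional (German-style): one ending expresses case and number together,
# so the form is listed per noun rather than assembled feature by feature.
GERMAN_DATIVE_PLURAL = {"Jahr": "Jahren"}

print(turkish_dative_plural("yil"))   # yillara
print(GERMAN_DATIVE_PLURAL["Jahr"])   # Jahren
```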
15.3 Inflectional Deficits in Language Production
Although inflectional deficits will also affect the parsing of the grammatical information supplied by inflectional morphology, and thus language comprehension, most research has focused on language production. A reason for this is that inflectional deficits are relatively easy to identify in spontaneous or probed speech production. In the latter, the experimenter
creates a grammatical context which requires the participant to produce a particular inflected form. An example is the famous wug-test (Berko, 1958), where children are first presented with a picture showing a single, non-existent entity introduced by a noun ("This is a wug."). Then a picture showing two of these entities is presented, and the child is coaxed to continue a sentence started by the experimenter with a plural noun form ("Now there is another one. Now there are two ____?"). Tasks such as the wug-test serve to elicit a particular grammatical form from the participant. In comparison to analyses of spontaneous speech, elicitation tasks have a quantitative and a qualitative advantage. Qualitatively, they allow for manipulating and controlling factors that influence the production of inflected word forms, such as their frequency or phonological shape. They also reduce task demands that are, for instance, induced by the search for words, because the word to be inflected is typically provided. In spontaneous speech, the tested individual is free to use or avoid particular inflected forms, for instance by relying on overlearned set phrases. This may lead to false estimates of inflectional abilities. Quantitatively, an elicitation task provides more contexts for producing the inflectional marker under scrutiny and thus also allows for testing forms that only rarely occur in spontaneous speech. In contrast to highly controlled laboratory experiments, elicitation tasks have the advantage that they can be performed without sophisticated technical equipment, exert less stress and fewer constraints on the tested individual and, if constructed well, have a higher ecological validity, as they are closer to a naturalistic conversational setting. These advantages render elicitation tasks a suitable method for investigating inflectional deficits in language-impaired children and adults. Consequently, such tasks are often part of standardized diagnostic tests of language impairment. Errors in producing an inflected form required by the grammatical context are typically classified into omission errors, where the required inflectional ending is missing from the word, and substitution errors, where the required inflectional marker or inflected form is replaced by another inflectional marker/inflected form that does not conform to the given grammatical context or that violates word-specific constraints on the inflectional marker/word form. The noun phrase *"two dog" exemplifies an omission of the plural ending -s. The examples *"you walks" and *"she goed" illustrate different types of substitution errors. In the first example, the grammatical context requires the use of the form "walk," which is replaced by the inflected form "walks," incorrect in the given grammatical context. In the example *"she goed," an inflectional marker (-ed) is used to produce a past-tense form although the stem go has a stored past-tense form ("went") which prohibits the application of -ed. Another case is exemplified by German noun plural inflection, which makes use of four different inflectional markers (allomorphs): the endings -(e)n, -e, -s, and -er. Which noun takes which plural marker is critically determined by the noun's phonological shape and grammatical gender.
The German equivalent of the plural form "dogs" is "Hunde," where the plural marker -e is added to the stem "Hund." Errors where one of the other plural markers is added to the stem (*"Hunden," *"Hunds," *"Hunder") would constitute substitution errors. They violate the selection restrictions of German plural markers, although all of these forms are phonologically licit word forms in German. As these examples illustrate, substitution errors can result in word forms that exist in a given language (e.g. "walk"), but they can also result in non-existent but phonologically licit inflected forms (e.g. *"goed," *"Hunder"). Omission and substitution errors do not only appear in language-impaired speakers; they can also be observed in speakers without language impairment, either in slips of the tongue or during the process of acquiring an inflectional system. In an elicitation task on German noun plurals, a typically developing five-year-old German child will inflect only some words taking the -e-plural correctly while incorrectly marking others (e.g. *"Hunden"), indicating that at this age typically developing German children have not yet mastered the system of noun-plural marking. Hence, identification of an inflectional deficit requires us to compare the language behavior observed in affected individuals to the behavior of a control
group of typically developing, neurologically healthy individuals. Decisions on what constitutes a suitable control group are vital, as the choice of the control group determines whether the language behavior of the experimental group is evaluated as different from or comparable to that of the control group. The choice of a control group also depends on the aim of the research. Thus, a control group of children matched for chronological age can highlight delayed language development in a group of language-impaired children. A control group matched for mean length of utterance (MLU) can reveal whether performance with the inflectional phenomenon under scrutiny is adequate for the general language development achieved, as measured by MLU. Besides omission and substitution errors, a third indicator of inflectional deficits is the avoidance of the inflected word form altogether. An avoidance strategy can be very effective. In an analysis of spontaneous-speech data obtained from German individuals with Broca's aphasia, the production of irregular inflected past-participle forms was nearly error-free (Penke, 1998). Direct testing in an elicitation task, however, revealed that the tested individuals with Broca's aphasia produced incorrectly inflected participle forms for about one third of the tested irregular verbs (Penke & Westermann, 2006), indicating a problem in accessing infrequent irregular participles in the mental lexicon. This deficit is not apparent in spontaneous speech, where the production of unavailable inflected forms can be avoided. Note that the avoidance of particular inflected forms is not a conscious strategy applied by language-impaired speakers but is most likely the result of the language system not being able to supply the form requested by the grammatical context. While avoidance often goes unnoticed in spontaneous-speech production, the underlying deficit becomes apparent in elicitation tasks, where the grammatical context is given and avoidance is not a viable option.
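The error classification described in this section can be expressed as a simple decision procedure over transcribed responses. The sketch below assumes that the bare stem and the expected inflected form are known for each test item; it is a schematic scorer for illustration, not a clinical instrument, and it ignores complications such as phonological variants of a response.

```python
def score_response(stem, expected, response):
    """Classify one elicited response as 'correct', 'omission'
    (bare stem produced), or 'substitution' (some other form produced)."""
    if response == expected:
        return "correct"
    if response == stem:            # e.g., *"two dog" for "two dogs"
        return "omission"
    return "substitution"           # e.g., *"Hunden" for "Hunde"

items = [("Hund", "Hunde", "Hunde"),     # correct
         ("Hund", "Hunde", "Hunden"),    # substitution
         ("dog", "dogs", "dog")]         # omission
for stem, expected, response in items:
    print(response, "->", score_response(stem, expected, response))
```

Note that avoidance, the third indicator discussed above, cannot be scored this way at all: it only becomes visible when the elicitation context forces a response.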
15.4 Factors Influencing Inflectional Deficits
A number of intra- and extra-linguistic factors shape the occurrence and the appearance of inflectional deficits in language-impaired speakers. Whether and how these factors affect inflectional deficits is critically dependent on the particular properties of the inflectional systems in a given language.
15.4.1 Typology and Complexity of Inflectional Systems
In a seminal work, Grodzinsky (1984) pointed out that omissions of inflectional markers in the language production of language-impaired speakers will only occur where licensed by the grammar of the language, that is, if the remaining word stem is a possible word in the respective language. Thus, the omission of the plural marker -s in *"2 book" results in a possible word, "book," in English. Corresponding omissions of inflectional elements in languages such as Italian or Arabic would, in contrast, result in stems which cannot surface as possible words in these languages (Italian "libr" instead of "libri," Arabic "k-t-b" instead of "kitab") and, hence, cannot be observed in language-impaired speakers of these languages. The finding that omissions of inflectional markers only occur where licensed by the grammar of a language constitutes an important generalization about inflectional deficits in language disorders. Cross-language research suggests that the importance of inflectional morphology in a language system affects whether language-impaired speakers will tend to omit or try to supply inflectional markers. While synthetic languages make use of inflectional morphology to express grammatical information, relations, and functions, analytic languages solve the same tasks without inflection, by employing free-standing function words or word order. Thus, in English, an analytic language with a largely reduced inflectional
component, inflectional markers tend to be omitted by language-impaired speakers. In contrast, in synthetic languages, where inflectional systems are more elaborate and express more syntactic information (as in Finnish, German, Italian, or Hebrew), omission rates are markedly lower than in English (e.g. Bates et al., 1987; Dromi et al., 1999). A core characteristic of English-speaking children with DLD, for instance, is the frequent omission of the third-person singular marker -s, the only overt subject-verb agreement marker in English. In an elicitation task, tested preschool children omitted the marker -s in about 79% of the obligatory contexts (Leonard et al., 1997). In contrast, the omission rate of German-speaking preschool children with DLD in an elicitation task on subject-verb agreement (four distinct inflectional markers) was only 27% (Penke & Rothweiler, 2018). It has been suggested that the sparseness of English inflectional systems and the prevalence of words not marked with overt inflectional affixes might constitute a detrimental factor in acquiring English inflectional systems, leading to the preponderance of omission errors in developmental language impairments (Leonard & Kueser, 2019). In acquired language disorders, the quantitative difference in omission errors observed in less versus more inflecting languages might reflect that individuals who face a limitation of their language abilities tend to neglect inflectional markers if these are of minor importance in the language, whereas they will strive to provide them if inflection is central for grammatical parsing (e.g. Bates et al., 1987).
15.4.2 Inflectional Dimensions
Inflectional processes are typically restricted to words of a certain grammatical category. Tense inflection appears on verbs, comparative inflection is restricted to adjectives, and case inflection occurs on nominal elements. Moreover, inflection encodes information on a number of different morphosyntactic dimensions, such as TENSE, ASPECT, NUMBER, or CASE. Typically, inflectional deficits do not affect all inflectional systems of a language in parallel but selectively affect only some of them. Inflectional deficits selective for a specific grammatical class of words have been observed in individuals with aphasia (e.g. Laiacona & Caramazza, 2004). In elicitation tasks where speakers had to produce inflected forms for verbs and nouns that were homophonous (e.g. "This is a guide; these are ___." "This person guides; these people ___."), these individuals showed a dissociation in their capability to produce correctly inflected noun and verb forms: whereas some of the tested individuals displayed significantly more problems in producing inflected verb forms than homophonous noun forms, others showed the opposite pattern. Inflectional deficits can also be selective within an inflectional domain such as verbal inflection. Research on agrammatic Broca's aphasia, for instance, has provided evidence for inflectional deficits that affect verbal tense but not agreement inflection (e.g. Bastiaanse et al., 2011; Wenzlaff & Clahsen, 2005). Even within inflectional dimensions such as TENSE or ASPECT, selective deficits have been reported. Thus, reference to the present tense has been found to be better preserved than reference to the past in English- and Turkish-speaking individuals with Broca's aphasia (Bastiaanse et al., 2011). Likewise, perfective aspect (describing an event as completed) has been reported to be more impaired than imperfective aspect, which describes an event as ongoing (Koukoulioti & Bastiaanse, 2020). Inflectional affixes can be selectively affected in language disorders even when homophonous, such as plural -s, possessive -s, and third-person singular -s in English. Thus, Goodglass and Berko (1960) found that aphasic speakers experienced more problems in providing forms inflected with the possessive -s (error rate 56%) than third-person singular forms (error rate 43%) and plural forms on -s (error rate 21%). Similar findings of a more impaired third-person singular -s compared to plural and possessive -s have also been reported for English children/adolescents with Down syndrome (Eadie et al., 2002) and for children with DLD (Leonard et al., 1997).
The sketched findings suggest that inflected forms or inflectional affixes that belong to different inflectional systems are independent of each other. A number of different approaches have been advocated to account for such selective deficits. They might occur because different classes of grammatical words (noun or verb) or different morphological processes (such as agreement, tense, or plural inflection) are subserved by different brain areas selectively affected by brain damage (Laiacona & Caramazza, 2004). Hypotheses based in linguistic theory have suggested that selective impairments of different inflectional dimensions are due to distinct computational processes or linguistic structures that can be selectively affected. It has, for instance, been suggested that selective deficits of tense, aspect, or agreement inflection are due to the different hierarchical positions of the functional categories responsible for these inflectional processes in the syntactic tree (i.e. T, AGR, or ASP). A pruning of the syntactic tree, hence, affects only the inflectional processes subserved by the pruned functional nodes (Friedmann & Grodzinsky, 1997; see also Chapter 12). Alternatively, some inflectional systems might require the integration of morphosyntactic information and situational information, for instance regarding event and speech time. The required integration might be too taxing for the limited processing resources of individuals with a neurological impairment, leading to a selective deficit in these inflectional systems (Bastiaanse et al., 2011; Kok et al., 2007; Wenzlaff & Clahsen, 2005). In addition, factors related to the perceivability of specific inflectional markers, their articulatory complexity, their frequency, their regularity, morphological complexity, or task demands (Faroqi-Shah & Friedman, 2015; Kok et al., 2007) might also come into play when accounting for selective inflectional deficits. Whatever their basis, selective deficits certainly pose a fascinating challenge and have inspired much research during the last 20 years (see Faroqi-Shah & Friedman, 2015, and Koukoulioti & Bastiaanse, 2020, for recent overviews of selective deficits within verbal inflection in individuals with aphasia).
15.4.3 Regularity
Inflectional systems may exhibit a distinction between regular and irregular inflected forms. English past-tense inflection constitutes a prime example. Regular forms are inflected with -ed ("laughed"), whereas irregular ones like "went" are idiosyncratic and largely unpredictable. According to an influential proposal, regular inflected forms are built by the application of a mental symbolic operation that adds an affix to a stem when the inflected form is needed. In contrast, irregular inflected forms are stored in the mental lexicon (Pinker, 1999). A central tenet of such a dualistic approach to inflection is that the representations and mechanisms involved in the production and comprehension of regular and irregular inflected forms are fundamentally different, subserved by different brain systems and, hence, selectively affected by different types of language disorders. According to the Procedural/Declarative Model (Ullman et al., 2005), regular inflection is assumed to be dependent on the procedural system that underlies the learning and execution of motor and cognitive skills (including grammatical abilities) and involves left frontal brain regions and the basal ganglia. Hence, language disorders associated with structural or functional frontal brain lesions should selectively affect regular inflection. Irregular inflected forms, in contrast, are said to be stored in the declarative memory system, situated in left temporo-parietal brain regions, and should be affected by damage to these brain regions. The issue of whether language disorders can be found that selectively affect only regular or only irregular inflection has led to intensive research and debate in the field. Selective deficits of regular inflection have been reported for English-speaking individuals with Broca's aphasia, Parkinson's disease, DLD, or Down syndrome (Eadie et al., 2002; Ullman et al., 2005; van der Lely & Ullman, 2001; Walenski, 2015). In elicitation tasks testing the production of regular and irregular inflected past-tense forms (e.g. "Every day, I wash my car. Just like
every day, yesterday I ____ ."), speakers with such language disorders typically display more problems in providing correctly inflected regular forms compared to irregular forms. Moreover, affected speakers will not use the regular affix to produce past-tense forms for nonce verbs, as unimpaired speakers will typically do (e.g. "ploamphed"). Selective deficits of irregular inflection have conversely been reported for children with Williams syndrome, individuals with fluent anomic aphasia, herpes simplex encephalitis, or degenerative brain disease (see Walenski, 2015 for overview). In these cases, speakers display significantly more problems in providing irregular inflected forms compared to regular inflected forms. Moreover, as regular inflection is unaffected in these speakers, they will often over-apply regular inflectional markers to irregular inflected stems (e.g. *"goed"). While the dualistic view of inflection has inspired much research on language disorders, it has not gone unchallenged. One important strand of criticism has targeted the question of whether the findings of selective deficits of regular or irregular inflection are indeed valid or whether they are artifacts of the experimental design or of properties of the inflectional system. Thus, it has been proposed that English regular inflected past-tense forms are of greater phonological complexity (Joanisse & Seidenberg, 1999) since they often display complex word-final consonant clusters (e.g. "danced" /dɑːnst/), whereas irregular forms do not (e.g. "sang" /sæŋ/). A selective deficit with regular inflection might then simply result from articulatory problems that more strongly affect the production of the phonologically complex, regular inflected forms than the production of irregular forms. Accordingly, single-mechanism models of inflection have been proposed that assume that all inflected forms – regular and irregular – are stored and processed in a single associative network structure (Joanisse & Seidenberg, 1999). In this so-called triangle model, selective deficits of regular inflection arise from phonological/articulatory impairments that affect the production of the phonologically more complex regular inflected forms. Selective deficits of irregular inflection are due to semantic problems that particularly affect the semantic connections linking the different verb forms of irregularly inflecting verbs. While the triangle model has provided an alternative explanation for the selective deficits of regular and irregular inflected forms in speakers of English, findings on other languages are not easily accommodated in the model. German past-participle inflection closely resembles English past-tense inflection and distinguishes regular ("getanzt" = danced) and irregular ("gesungen" = sung) inflected forms. According to the triangle model, the phonological impairments that are a characteristic sign of Broca's aphasia should lead to selective deficits of regular inflection in English as well as in German. However, despite the fact that regular inflected forms in English and German are of similar phonological complexity (both add a /t/ to the verb stem, resulting in a word-final consonant cluster /nst/ in the example), only English-speaking individuals with Broca's aphasia display a selective deficit with regular inflected forms; in German-speaking affected individuals, regular participle inflection is intact (Penke & Westermann, 2006).
Moreover, data on German noun plurals indicate that regular and irregular inflected forms can be selectively affected although they involve the same inflectional marker and result in inflected forms of similar phonological complexity. Thus, in German individuals with Broca's aphasia, the /n/-plural, which is regular on feminine nouns ending in the vowel schwa (e.g. "Biene-Bienen" = bee-bees), might be retained while irregular /n/-plurals on masculine and neuter nouns (e.g. "Bär-Bären" = bear-bears) are impaired (Penke & Wimmer, 2017). The debate between dualistic and single-mechanism approaches to inflection that underlies the search for selective deficits of regular and irregular inflected forms is ongoing. In the wake of this debate, findings on other languages and other disorder syndromes will further our understanding of the representation and processing of inflected forms (see, e.g., Penke & Wimmer, 2017, or Walenski, 2015, for more detailed overviews of selective deficits of regular and irregular inflection).
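As a purely illustrative aside, the dualistic logic sketched in this section can be captured in a few lines of code. The following Python toy (the three-item irregular lexicon and the two "intactness" switches are invented simplifications, not part of any published model) shows how selectively disabling one route reproduces the two error patterns reported above – omission of the regular marker on the one hand, over-regularization on the other:

```python
# Toy sketch of the dual-mechanism ("words and rules") view of English
# past-tense inflection. The irregular lexicon and the intactness switches
# are invented simplifications for illustration only.

IRREGULAR_PAST = {"go": "went", "sing": "sang", "take": "took"}  # stored whole forms

def past_tense(stem, lexicon_intact=True, rule_intact=True):
    """Dual-route assumption: lexical lookup for irregulars,
    a default symbolic -ed rule for everything else."""
    if lexicon_intact and stem in IRREGULAR_PAST:
        return IRREGULAR_PAST[stem]            # declarative route: stored form
    if rule_intact:
        return stem + "ed"                     # procedural route: affixation
    return stem                                # no route available: marker omitted

# Unimpaired speakers apply the rule productively, even to nonce verbs:
print(past_tense("ploamph"))                   # -> "ploamphed"
# A selective deficit of the rule spares stored irregulars ...
print(past_tense("sing", rule_intact=False))   # -> "sang"
# ... but leads to omissions with regular and nonce verbs:
print(past_tense("laugh", rule_intact=False))  # -> "laugh"
# A selective deficit of the lexicon yields over-regularizations:
print(past_tense("go", lexicon_intact=False))  # -> "goed"
```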
15.4.4 Frequency
Frequency is considered a major factor in determining how error-prone inflected word forms are in language disorders. As a rule of thumb, infrequent inflected forms are more prone to error than frequent ones in speakers with and without inflectional deficits. However, several types of frequency effects must be distinguished (Ambridge et al., 2015). Effects of token frequency (i.e. the number of times one hears a particular word form such as "sang") are indicative of processes of lexical storage and access. Memory traces get stronger with each exposure, making frequently occurring forms easier to acquire and more accessible than infrequent ones. A frequency effect, however, will only affect inflected forms or components of inflected forms that are stored in the mental lexicon, such as irregular inflected forms that are stored as fully inflected whole word forms. In production experiments, error rates for irregular inflected forms are typically related to the frequency of the inflected form: the less frequent the irregular form, the higher the error rate observed for language-impaired speakers (Penke & Wimmer, 2017). This effect of token frequency is explained by difficulties in storing or accessing infrequent irregular inflected forms in the mental lexicon. Type frequency, in contrast, is related to the productivity of a specific inflectional pattern. The more inflected forms display a particular inflectional marker, such as the past-tense -ed, the higher its productivity and, thus, the likelihood that this pattern will be applied in building new inflected forms (Ambridge et al., 2015; Bybee, 1995; Harmon et al., 2021). Type frequency has, for instance, been held responsible for the greater error-proneness of the syllabic allomorphs [əd] and [əz] of the past-tense and third-person singular markers (e.g. "wanted, crosses") as opposed to their segmental variants [d/t] and [z/s] (e.g. "loved, sings") in English-speaking children with DLD, since the former are less frequent than the latter (Tomas et al., 2017). Similar effects have been observed in individuals with acquired language impairments. A series of experiments on inflectional deficits in German individuals with Broca's aphasia revealed a close correspondence between the number of words an inflectional affix is used with and the aphasic speakers' error rates: the more words a regular affix occurs with, the lower the error rate obtained (Penke, 2006). While frequency is considered a relevant factor in accounting for inflectional deficits in developmental as well as in acquired language impairments, the reasons for its impact differ. Children with developmental language impairments such as DLD might require more exposure to a specific inflected form to store it in the mental lexicon and more exposure to a specific inflectional pattern to determine its productivity (Harmon et al., 2021). In contrast, speakers with acquired language disorders might suffer from difficulties in accessing infrequent, stored inflected word forms or inflectional markers in the mental lexicon (Penke, 2006; Penke & Wimmer, 2017).
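To make the token/type distinction concrete, here is a minimal sketch (the toy corpus and the word-to-pattern mapping are invented for illustration): token frequency counts occurrences of a particular word form, while type frequency counts how many distinct words share an inflectional pattern:

```python
# Minimal sketch of the token/type frequency distinction.
# The toy corpus and the word-to-pattern mapping are invented.
from collections import Counter

corpus = ["sang", "sang", "sang", "laughed", "danced", "hugged", "went"]

# Token frequency: how often a particular word FORM occurs.
token_freq = Counter(corpus)
print(token_freq["sang"])       # 3 -> strong memory trace, easy to access

# Type frequency: with how many different words a pattern occurs.
pattern_of = {"laughed": "-ed", "danced": "-ed", "hugged": "-ed",
              "sang": "ablaut", "went": "suppletion"}
type_freq = Counter(pattern_of[word] for word in set(corpus))
print(type_freq["-ed"])         # 3 verb types -> productive pattern
print(type_freq["ablaut"])      # 1 verb type  -> unproductive pattern
```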
15.4.5 Morphosyntactic Specifications and Markedness
Besides frequency, morphological theories and concepts such as markedness have been applied to elucidate why some forms are more error-prone than others and why some types of errors occur more often than others. Inflectional affixes can be organized in inflectional paradigms that are structured along morphosyntactic dimensions such as PERSON or NUMBER. In morphological theory, feature specifications within these dimensions are typically represented in terms of binary features with marked (positive) and unmarked (negative) values, such as the feature [±plural]. Whether forms are marked or unmarked with respect to a specific feature is determined on the basis of typological, morphological, syntactic, or conceptual arguments and might vary between languages. Plural forms (e.g. "books"), for instance, are generally
considered to be marked in comparison to singular forms ("book"), since plural forms are often expressed by a morphological element (e.g. -s), whereas singular forms are not. Moreover, overt singular markers are very rare in the languages of the world. Error analyses suggest that, for instance, in Broca's aphasia and DLD, errors within an inflectional system do not result in random exchanges of one inflected form of the paradigm for another. Instead, errors are typically "near misses" (Leonard & Kueser, 2019) that remain within one morphosyntactic dimension and, for instance, only affect the number specification without affecting person specifications (or vice versa). Moreover, inflectional errors tend to replace forms with a marked feature specification by forms with an unmarked feature specification within the same dimension of the paradigm (Janssen & Penke, 2002). Thus, a plural form with the feature specification [+plural] is typically replaced by a singular form with the specification [-plural], whereas the reverse error is rare. The tendency to replace marked forms by unmarked ones might also account for the frequently made observation that language-impaired speakers substitute inflected finite verb forms, marked for PERSON, NUMBER, or TENSE, by nonfinite forms (infinitives or participles) unmarked for these morphosyntactic properties, or replace marked case-inflected forms with the citation form, typically the unmarked nominative. Moreover, inflectional errors are affected by morphophonological markedness (Demuth & Tomas, 2016). Inflectional affixes are often consonants. What if a consonantal inflectional ending is affixed to a stem that ends in the very same consonant? Adding the German past-participle ending /t/ to a stem such as "hust" (= cough), which already ends in /t/, would result in a sequence of two adjacent identical phones, "hustt." To avoid this, in German (as in English) an epenthetic vowel is inserted between the two identical segments ("gehust- + -t => gehustet" = coughed). Another option is chosen in Dutch, where only one of the two identical /t/-segments is realized ("gehoest- + -t => gehoest"). Data from language acquisition indicate that the German solution is more marked than the Dutch solution, taking longer to acquire (Grijzenhout & Penke, 2005). An analysis of inflection errors produced by German individuals with Broca's aphasia also revealed that the marked German solution was prone to error (Grijzenhout & Penke, 2005). While affected individuals correctly inflected 92% of regular verbs with stem-final segments other than /t/ ("gelach- + -t => gelacht" = laughed), they typically omitted the participle affix -t when the verb stem already ended in /t/, resulting in forms such as *"gehust" instead of "gehustet." Similar observations have been made for English-speaking children with DLD, who display significantly lower accuracy rates for the syllabic allomorphs [əd] and [əz] of the past-tense and third-person singular markers (e.g. "wanted, crosses") compared to their segmental variants [d/t] and [z/s] (e.g. "loved, sings") (e.g. Tomas et al., 2017). These data suggest that morphophonologically marked forms are more error-prone and that language-impaired speakers opt for unmarked solutions.
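The notions of "near miss" and markedness-driven substitution can be illustrated with a toy feature representation of a paradigm fragment (the forms are from German "lachen" = to laugh; the feature encoding is a deliberate simplification):

```python
# Toy feature representation of part of a verbal agreement paradigm,
# illustrating "near-miss" errors and markedness-driven substitutions.
PARADIGM = {
    "lache":  {"person": 1, "plural": False},
    "lachst": {"person": 2, "plural": False},
    "lacht":  {"person": 3, "plural": False},
    "lachen": {"person": 1, "plural": True},
}

def feature_distance(form_a, form_b):
    """Number of morphosyntactic dimensions on which two cells differ."""
    a, b = PARADIGM[form_a], PARADIGM[form_b]
    return sum(a[feat] != b[feat] for feat in a)

# A near miss stays within one dimension of the paradigm:
print(feature_distance("lachen", "lache"))   # 1 (only NUMBER differs)
print(feature_distance("lachen", "lachst"))  # 2 (PERSON and NUMBER differ)

# Markedness predicts the direction of errors: [+plural] -> [-plural].
def likely_substitution(target):
    """Replace the marked [+plural] value by the unmarked one."""
    cell = dict(PARADIGM[target], plural=False)
    return next(f for f, v in PARADIGM.items() if v == cell)

print(likely_substitution("lachen"))         # -> "lache", not the reverse
```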
15.4.6 Phonological Complexity
The phonological complexity of inflected forms is another important factor that has been found to influence the error-proneness of inflected forms. English-speaking children with DLD achieve significantly lower accuracy scores in producing past-tense forms if the verbal stem ends in two consonants (e.g. "danced" /dɑːnst/) than when it ends in one consonant (e.g. "hugged" /hʌgd/). Accuracy scores are highest for verbs ending in vowels (e.g. "paid" /peɪd/) (Marshall & van der Lely, 2007). These findings indicate that phonologically more complex forms are more error-prone than phonologically simpler forms. Inflectional affixes typically display phonological characteristics that make them difficult to perceive and produce. Affixes are shorter than other lexemes; they are typically unstressed, can be expressed by a single consonant, and frequently appear at the right edge of words, where the sound-intensity level is lower than at the word's beginning. Consequently,
perceptual problems can affect the acquisition of inflected forms and articulatory problems can affect their production. Consider sensorineural hearing loss in children. Sensorineural hearing deficits particularly affect the perception of high-pitched speech sounds such as /s/ and /t/, while low-pitched nasal sounds are typically better perceived. Not surprisingly, then, English-speaking children with hearing loss have been found to omit third-person singular and past-tense markers at higher rates than control children with unimpaired hearing (e.g. Norbury et al., 2001). German also uses the consonants /s/ and /t/ in verbal agreement inflection. In addition, it uses the nasal /n/ as a verbal agreement marker. This allows for testing whether the problems children with hearing loss display in producing inflected forms are due to a deficit in acquiring inflectional morphology per se or whether they are due to perceptual limitations affecting these children's language intake. Indeed, the deficits of German children with hearing loss seem closely tied to the acoustic/perceptual properties of the inflectional markers expressing verbal agreement. While agreement markers expressed by /s/ and /t/ were often omitted, performance with the verbal agreement marker -n was unimpaired in comparison to a control group of age-matched preschool children without hearing loss. Moreover, omissions of /s/ and /t/ were modulated by syllable complexity: these agreement markers were omitted more often when the verbal stem ended in an obstruent as opposed to cases where the stem ended in a vowel (Penke et al., 2016). A follow-up study conducted when the children were seven years old revealed that one of the 11 tested children with moderate hearing loss still displayed a deficit with respect to verbal agreement inflection, suggesting that limitations in perceiving inflectional endings have the potential to cause a persistent deficit with inflectional morphology when not adequately treated in speech and language intervention (Rothweiler & Penke, 2017). Articulatory problems often accompany developmental or acquired language impairments and can affect an individual's ability to articulate phonologically complex inflected forms, making it necessary to distinguish an inflectional deficit from an articulatory deficit. A case in point is Down syndrome (DS). Individuals with DS often display differences in the structure and functioning of articulators, such as the tongue and palate, as well as muscle hypotonia, that affect speech production. Among the most frequent phonological processes observed in children/adolescents with DS are the reduction of consonant clusters and the deletion of word-final consonants. As inflectional markers are often realized by consonants affixed at the right edge of a word's stem, they should be particularly susceptible to these phonological processes. Hence, it has been suggested that deficits in producing inflectional morphemes in individuals with DS might simply be due to problems in articulating these inflected forms, only mimicking an inflectional deficit (Christodoulou & Wexler, 2016). Disentangling the influence of perceptual or articulatory deficits from inflectional deficits proper necessitates specific testing of inflected forms and simplex words of comparable phonological complexity. Compare the German noun "Haut" (= skin) with the inflected verb form "hau- + -t => haut" (= hits[3rd.sg.]). Both display an obstruent /t/ in syllable coda position.
However, in the noun it is a stem-final consonant, whereas in the verb form it is the third-person singular marker -t. Perceptual or articulatory problems should affect the perception or production of both words alike. An inflectional deficit, in contrast, should only affect inflected forms, sparing the perception/production of stem-final consonants. Adopting this procedure revealed that deficits in children with hearing loss are primarily perceptual (Penke et al., 2016), whereas deficits with inflectional morphology in children/adolescents with DS cannot be reduced to articulatory problems (Penke, 2018).
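The diagnostic logic of this matched-pair procedure is simple enough to state schematically. The following sketch (the items beyond the "Haut"/"haut" pair and the accuracy interface are hypothetical) encodes the two contrasting predictions:

```python
# Sketch of the matched-pair diagnostic: compare a simplex word with a
# stem-final /t/ to an inflected form whose final /t/ is an agreement
# marker. The scoring interface is hypothetical.
matched_pairs = [
    ("Haut", "haut"),   # noun "skin" vs. verb "hits" (hau- + -t), both end in /t/
]

def interpret(simplex_correct, inflected_correct):
    """Classify the deficit pattern from accuracy on the two item types."""
    if not simplex_correct and not inflected_correct:
        return "peripheral (perceptual/articulatory) deficit: both words affected"
    if simplex_correct and not inflected_correct:
        return "inflectional deficit proper: only the affixal /t/ is affected"
    return "no selective problem with final /t/"

print(interpret(simplex_correct=True, inflected_correct=False))
```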
15.4.7 Task Demands and Language-External Factors
Besides the factors discussed so far, factors external to the grammatical system of a language also exert some influence on inflectional deficits and affect the performance of language-impaired individuals. Such factors relate to the testing situation (e.g. familiarity with the
investigator, formality of the testing situation), the cognitive capacities of the individual during testing (e.g. fatigue, ability to concentrate), or to how demanding a task is for the language-impaired individual. Typically, tasks which minimize processing load, such as cloze tasks where the participant only has to produce an inflected form while the sentential context is given, lead to better performance than less constrained tasks or tasks which require several processes besides inflection, such as lexical selection or word ordering (Kok et al., 2007). Which tasks are suitable is a matter of methodological debate. To control for language-external influences, inflectional deficits are best explored using different methodologies and tasks, and testing should be repeated at different times.
15.5 Accounting for Inflectional Deficits
Inflection has close connections to syntax, phonology, and the mental lexicon. Hence, inflectional deficits have been attributed to deficits in syntactic, phonological, or morphological components of the language faculty, or to deficits in lexicon organization and lexical access. In addition, perceptual or articulatory problems have been discussed as causes for impairments affecting the acquisition, production, and comprehension of inflected forms. Although the field is constantly evolving, the underlying causes of inflectional deficits are still debated for each and every disorder syndrome. A taxonomy of the different deficit accounts discussed in the field can be made along the following lines: Deficit theories that locate the impairment in peripheral abilities of perceiving or articulating inflected forms (§15.4.3, §15.4.6) can be distinguished from theories that assume central deficits affecting the mental operations involved in acquiring, computing, or accessing inflected forms. Within the latter type of account, theories that posit language-specific impairments can be distinguished from theories that hold limitations of cognitive operations not specific to language responsible for inflectional deficits. Limitations in working memory might, for instance, affect a child's ability to detect inflectional patterns in the input (Leonard & Kueser, 2019). A task might be too demanding for the reduced processing capacities of a language-impaired individual, leading to adaptive behavior resulting in omissions of inflected forms (Kolk & Heeschen, 1992). A slowing down of processing abilities might impair an individual's ability to compute grammatical structures or operations within a given temporal window, affecting the ability to detect regularities (Yoder et al., 2006) or to provide the necessary inflectional marker (Kok et al., 2007). Theories that propose language-specific deficits, in contrast, claim that specific grammatical structures, relations, operations, or features are affected in inflectional deficits. Syntactic deficit accounts, for instance, propose that the functional categories relevant for the realization of inflectional markers can no longer be projected, resulting in pruned syntactic trees. Other syntactic accounts suggest that the morphosyntactic information that is hosted in specific functional nodes is left unspecified. Morphological or lexical deficit accounts assume impairments affecting the computation of the inflected form required by the grammatical context or problems in accessing the appropriately marked form in the mental lexicon. Such deficits can lead to substitution errors that are near misses or to the selection of a competing more frequent or less-marked form (§15.4.4, §15.4.5). Phonological deficit accounts locate the basis of inflectional deficits in the build-up or computation of complex phonological structures, such as complex syllable structures. Whatever the deficit suggested, all theories are challenged to accommodate the variability of symptoms observed for a given syndrome across languages, as well as to integrate the various factors that shape inflectional deficits, such as the regularity, frequency, markedness, and phonological complexity of inflected forms. Thus, while syntactic-deficit theories that posit pruned syntactic trees account for deficits that affect only specific word classes (e.g. verbs) or morphosyntactic features (e.g.
tense but not agreement), they also need to address how the claimed deficit can selectively affect only regular or only irregular
inflection, or how it can account for the influence of frequency or markedness on inflectional errors. In contrast, the Declarative/Procedural Model provides a well-elaborated account of selective deficits of regular or irregular inflection but is challenged to accommodate the influence of markedness on inflectional errors as well as the selectivity of inflectional deficits, affecting, for example, only verbal morphology, only tense morphology, or only past but not present tense (§15.4.2). Naturally, the type of deficit assumed to underlie an inflectional disorder directly affects the therapeutic intervention deemed necessary to treat it. Difficulties considered to be rooted in articulatory problems will call for a different therapeutic intervention than an inflectional deficit assumed to be syntactic or morphological in nature. While training articulatory abilities is likely to resolve the inflectional problems in the former case, in the latter case inflectional deficits are likely to persist. Given the ongoing controversies over how to capture inflectional deficits and the relevance deficit theories have for therapeutic intervention – should we despair? No! Science progresses. Research over the decades has steadily expanded our knowledge of inflectional deficits. In particular, the cross-language investigation of different deficit syndromes has sharpened our ideas about what causes inflectional deficits, forcing revisions of proposals originally posited to account for inflectional deficits in English-speaking individuals (§15.4.1, §15.4.3). Cross-language comparisons of inflectional deficits also enable us to determine which deficits are characteristic of a given language disorder across languages and which are not, enhancing our understanding of what is going wrong in a particular language disorder. Whereas, for example, a deficit with irregular inflection seems to be a characteristic sign of Williams syndrome across languages (Walenski, 2015), the deficit with regular inflection observed in English-speaking individuals with Broca's aphasia or DLD is not observed across languages (§15.4.3). Detailed investigations of which aspects of inflection are more or less error-prone in specific disorders have highlighted the relevance of intra- and extra-linguistic factors in shaping the appearance of inflectional deficits and have taught us to scrutinize such factors in research. And finally, the comparison of inflectional deficits across disorder syndromes in a specific language serves to uncover disorder-specific differences that will contribute to our understanding of the impairments that cause a particular deficit syndrome (e.g. Penke & Rothweiler, 2018). Thus, research that adopts a differentiating look at inflectional deficits, that takes observations across languages and across syndromes into account, and that scrutinizes intra- and extra-linguistic factors is certain to further our understanding of inflectional deficits in the future.
15.6 Conclusion
The aim of this chapter was to give an overview of how inflectional morphology can be impaired in language disorders. The cross-language investigation of language disorders has provided important insights into how different inflectional systems shape the manifestation of inflectional deficits associated with specific disorder syndromes. Whether a language disorder will result in omission or substitution errors, and how many and which types of errors are likely to occur, are crucially dependent on language-specific characteristics of inflectional systems. In the endeavor to understand inflectional deficits, morphological theory has played, and will continue to play, an important role. It points out areas of potential problems for language-impaired speakers, such as complex inflectional paradigms or marked forms, and it helps in designing experiments and in accounting for the observed behavior. Theoretically guided investigations of inflectional deficits will not only further our understanding of language disorders, they might also be profitable for theoretical linguistics as they allow for testing
competing theoretical conceptions (§15.4.3). Although this chapter could only provide a rough sketch of the research conducted in the field, I hope that it serves its purpose in providing a starting point for further explorations into the fascinating topic of inflectional deficits in language disorders.
REFERENCES

Ambridge, B., Kidd, E., Rowland, C. F., & Theakston, A. L. (2015). The ubiquity of frequency effects in first language acquisition. Journal of Child Language, 42(2), 239–273.
Bastiaanse, R., Bamyaci, E., Hsu, C. J., Lee, J., Duman, T. Y., & Thompson, C. K. (2011). Time reference in agrammatic aphasia. Journal of Neurolinguistics, 24(6), 652–673.
Bates, E., Friederici, A., & Wulfeck, B. (1987). Grammatical morphology in aphasia. Cortex, 23(4), 545–574.
Berko, J. (1958). The child's learning of English morphology. Word, 14(2–3), 150–177.
Bickel, B., & Nichols, J. (2007). Inflectional morphology. In T. Shopen (Ed.), Language typology and syntactic description (pp. 169–240). CUP.
Bybee, J. (1995). Regular morphology and the lexicon. Language and Cognitive Processes, 10(5), 425–455.
Christodoulou, C., & Wexler, K. (2016). The morphosyntactic development of case in Down syndrome. Lingua, 184, 25–52.
Demuth, K., & Tomas, E. (2016). Understanding the contributions of prosodic phonology to morphological development. First Language, 36(3), 265–278.
Dromi, E., Leonard, L. B., Adam, G., & Zadunaisky-Ehrlich, S. (1999). Verb agreement morphology in Hebrew-speaking children with specific language impairment. Journal of Speech, Language, and Hearing Research, 42(6), 1414–1431.
Eadie, P. A., Fey, M. E., Douglas, J. M., & Parsons, C. L. (2002). Profiles of grammatical morphology and sentence imitation in children with specific language impairment and Down syndrome. Journal of Speech, Language, and Hearing Research, 45(4), 720–732.
Faroqi-Shah, Y., & Friedman, L. (2015). Production of verb tense in agrammatic aphasia. Behavioural Neurology, 2015, 1–15.
Friedmann, N. A., & Grodzinsky, Y. (1997). Tense and agreement in agrammatic production. Brain and Language, 56(3), 397–425.
Goodglass, H., & Berko, J. (1960). Agrammatism and inflectional morphology in English. Journal of Speech and Hearing Research, 3(3), 257–267.
Grijzenhout, J., & Penke, M. (2005). On the interaction of phonology and morphology in language acquisition and German and Dutch Broca's aphasia. In G. Booij & J. van Marle (Eds.), Yearbook of morphology 2005 (pp. 49–81). Springer.
Grodzinsky, Y. (1984). The syntactic characterization of agrammatism. Cognition, 16(2), 99–120.
Harmon, Z., Barak, L., Shafto, P., Edwards, J., & Feldman, N. H. (2021). Making heads or tails of it: A competition–compensation account of morphological deficits in language impairment. Proceedings of the Annual Meeting of the Cognitive Science Society, 43(43), 1872–1878.
Janssen, U., & Penke, M. (2002). How are inflectional affixes organized in the mental lexicon? Brain and Language, 81(1–3), 180–191.
Joanisse, M. F., & Seidenberg, M. S. (1999). Impairments in verb morphology after brain injury: A connectionist model. Proceedings of the National Academy of Sciences, 96(13), 7592–7597.
Kok, P., van Doorn, A., & Kolk, H. (2007). Inflection and computational load in agrammatic speech. Brain and Language, 102(3), 273–283.
Kolk, H., & Heeschen, C. (1992). Agrammatism, paragrammatism and the management of language. Language and Cognitive Processes, 7(2), 89–129.
Koukoulioti, V., & Bastiaanse, R. (2020). Time reference in aphasia. Journal of Neurolinguistics, 53, 100872.
Laiacona, M., & Caramazza, A. (2004). The noun/verb dissociation in language production. Cognitive Neuropsychology, 21(2–4), 103–123.
Leonard, L. B., Eyer, J. A., Bedore, L. M., & Grela, B. G. (1997). Three accounts of the grammatical morpheme difficulties of English-speaking children with specific language impairment. Journal of Speech, Language, and Hearing Research, 40(4), 741–753.
Leonard, L. B., & Kueser, J. B. (2019). Five overarching factors central to grammatical learning and treatment in children with developmental language disorder. International Journal of Language & Communication Disorders, 54(3), 347–361.
Marshall, C. R., & van der Lely, H. K. (2007). The impact of phonological complexity on past tense inflection in children with Grammatical-SLI. Advances in Speech Language Pathology, 9(3), 191–203.
Norbury, C. F., Bishop, D. V., & Briscoe, J. (2001). Production of English finite verb morphology. Journal of Speech, Language, and Hearing Research, 44(1), 165–178.
Penke, M. (1998). Die Grammatik des Agrammatismus. De Gruyter.
Penke, M. (2006). Flexion im mentalen Lexikon. Max Niemeyer.
Penke, M. (2018). Verbal agreement inflection in German children with Down syndrome. Journal of Speech, Language, and Hearing Research, 61(9), 2217–2234.
Penke, M., & Rothweiler, M. (2018). Comparing specific language impairment and hearing impairment. Language Acquisition, 25(1), 39–57.
Penke, M., & Westermann, G. (2006). Broca's area and inflectional morphology. Cortex, 42(4), 563–576.
Penke, M., & Wimmer, E. (2017). Regular and irregular inflectional morphology in acquired language disorders – The case of German. In L. Escobar, V. Torrens, & T. Parodi (Eds.), Language processing and disorders (pp. 314–344). Cambridge Scholars Publishing.
Penke, M., Wimmer, E., Hennies, J., Hess, M., & Rothweiler, M. (2016). Inflectional morphology in German hearing-impaired children. Logopedics Phoniatrics Vocology, 41(1), 9–26.
Pinker, S. (1999). Words and rules: The ingredients of language. Basic Books.
Rothweiler, M., & Penke, M. (2017). Subjekt-Verb-Kongruenz bei schwerhörigen Kindern. Logos interdisziplinär, 25(1), 15–24.
Tomas, E., Demuth, K., & Petocz, P. (2017). The role of frequency in learning morphophonological alternations. Journal of Speech, Language, and Hearing Research, 60(5), 1316–1329.
Ullman, M. T., Pancheva, R., Love, T., Yee, E., Swinney, D., & Hickok, G. (2005). Neural correlates of lexicon and grammar. Brain and Language, 93(2), 185–238.
van der Lely, H. K., & Ullman, M. T. (2001). Past-tense morphology in specifically language impaired and normally developing children. Language and Cognitive Processes, 16(2–3), 177–217.
Walenski, M. (2015). Disorders. In M. Baerman (Ed.), The Oxford handbook of inflection (pp. 375–402). OUP.
Wenzlaff, M., & Clahsen, H. (2005). Finiteness and verb-second in German agrammatism. Brain and Language, 92(1), 33–44.
Yoder, P. J., Camarata, S., Camarata, M., & Williams, S. M. (2006). Association between differentiated processing of syllables and comprehension of grammatical morphology in children with Down syndrome. American Journal on Mental Retardation, 111(2), 138–152.
16 Normal and Impaired Semantic Processing of Words

MARILYNE JOYAL, MAXIMILIANO A. WILSON, AND YVES JOANETTE

16.1 Preamble
The goal of this chapter is to provide a state-of-the-art review of how acquired brain lesions can interfere with the normal semantic processing of the meaning of words. After reminding readers of some basic concepts in word semantics, we will present an account of word semantic impairments in post-stroke aphasia, as well as in right-hemisphere lesions, traumatic brain injury, and neurodegenerative diseases causing dementia.
16.2 The Two Facets of Word Semantics
To understand how the brain processes word meanings or semantics, it is important to distinguish two aspects of semantics: semantic memory and semantic control. Semantic memory is a type of long-term memory that refers to the knowledge acquired throughout the lifespan. It includes knowledge about words and other symbols as well as their meanings and referents. It also includes knowledge about the links that unite symbols and concepts and about the rules that govern their use (Quillian, 1966; Tulving, 1972). Semantic memory therefore refers to the memory for facts and general knowledge, which is organized as in a mental dictionary. This knowledge can be selected, retrieved and manipulated in a timely fashion for specific purposes and in specific contexts through semantic control processes (Jackson, 2021; Jefferies, 2013; Lambon Ralph et al., 2017). Semantic control comes into play in performing language tasks, such as understanding word meanings and producing words to refer to specific concepts. It is well known that semantic memory is preserved and even increases throughout the life course (Park et al., 2002; Salthouse, 2019; Toepper, 2017; Verhaegen & Poncelet, 2013). Semantic control processes, on the other hand, decline with age (Hoffman, 2018, 2019; Krieger-Redwood et al., 2019). For instance, Hoffman (2018) observed that older adults were more accurate than young adults in tasks assessing semantic memory (e.g., a synonymy task with low-frequency
words and a lexical decision task), but less accurate than young adults in a feature association task in which they had to ignore irrelevant semantic associations. The inhibition of such associations is underpinned by semantic control. The involvement of control processes varies depending on the task and stimuli. Control processes are particularly involved in tasks that require the explicit use of semantic knowledge with high selection and inhibition demands, such as picture-naming tasks with semantically related distractors (Jackson, 2021; Noonan et al., 2013). Conversely, the involvement of control processes can be minimized in implicit tasks such as semantic priming paradigms for lexical decision. For this task, a robust and convergent effect that has been repeatedly demonstrated in neurologically intact participants is that lexical decisions are made faster and more accurately on targets (e.g., "doctor") that are primed by a preceding word that is related in meaning (e.g., "nurse") than on words that are preceded by an unrelated word (e.g., "cat"; for a review, see Neely, 1991). Semantic priming is thought to reflect an automatic spread of activation in semantic memory (Collins & Loftus, 1975), especially when the task uses a low proportion of related targets, a short stimulus-onset asynchrony (SOA), and instructions to participants that are devoid of any allusion to related pairs in the stimulus set. The combination of several semantic tasks, both explicit and implicit, can therefore provide evidence regarding the integrity of semantic memory representations and control processes. In clinical populations, semantic deficits can be predominantly related to impaired semantic control processes, impaired semantic memory representations, or both (e.g., Gagnon et al., 1994; Noonan et al., 2010; Rogers & Friedman, 2008). Patients who show semantic deficits normally have lesions in the left temporal, parietal, and frontal regions (Chapman et al., 2020; Jefferies & Lambon Ralph, 2006), in accordance with neuroanatomical models of semantic processing that indicate that semantics is supported by multiple brain regions and networks, which are more lateralized to the left hemisphere (e.g., Binder et al., 2009; Xu et al., 2017). Different semantic components are supported by different brain networks (e.g., semantic control is thought to be supported by a left frontoparietal network, while the default mode network hosts semantic representations that are closely related to multimodal experiences; Xu et al., 2017). The main objective of this chapter is to describe the nature of the semantic deficits found in the most relevant clinical populations with acquired language disorders, that is, in individuals with post-stroke aphasia, right-hemisphere damage, traumatic brain injury, and dementia.
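As an illustration of the spreading-activation account of priming mentioned above, consider the following toy sketch (the network, the association strengths, and the linear response-time rule are invented; real models in the tradition of Collins and Loftus are far richer):

```python
# Toy spreading-activation sketch of semantic priming. The network,
# the association strengths, and the RT rule are invented for illustration.
network = {                         # association strengths between concepts
    "nurse":  {"doctor": 0.8, "hospital": 0.7},
    "doctor": {"nurse": 0.8, "patient": 0.6},
    "cat":    {"dog": 0.7, "mouse": 0.6},
}

BASE_RT = 600                       # hypothetical baseline lexical decision RT (ms)

def recognition_time(target, prime=None):
    """Pre-activation spreading from the prime speeds up recognition."""
    boost = network.get(prime, {}).get(target, 0.0)
    return BASE_RT - 100 * boost    # stronger link -> faster response

related = recognition_time("doctor", prime="nurse")    # 520 ms
unrelated = recognition_time("doctor", prime="cat")    # 600 ms
print(f"priming effect: {unrelated - related:.0f} ms") # 80 ms facilitation
```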
16.3 Word Semantic Impairments in Post-Stroke Aphasia
Word semantic impairments constitute one of the most frequent signs of aphasia, an acquired impairment of language following a brain lesion, and are present in most types of aphasia. In some types of aphasia, the semantic impairment also affects other levels of linguistic representation, such as sentences or discourse. Semantic deficits have been observed in different types of classic aphasia syndromes due to ischemic or hemorrhagic left-hemisphere stroke. One type of aphasic syndrome associated with semantic deficits is Wernicke's aphasia, which results from lesions to left posterior temporoparietal regions (Ogar et al., 2011; Robson et al., 2012). Individuals with Wernicke's aphasia show a considerable impairment of comprehension, partly due to semantic deficits. This comprehension impairment occurs at the single-word, phrase, and sentence levels (Robson et al., 2017). With regard to language production, individuals with Wernicke's aphasia may speak fluently, but their output is difficult to understand since they produce numerous paraphasias, add unnecessary words, and produce neologisms. Studies in which participants were required to perform verbal and nonverbal explicit semantic tasks have
provided evidence of severe disruptions of semantic processing in patients with Wernicke's aphasia. For instance, their performance in semantic association, word-picture matching, semantic fluency, and picture-naming tasks is impaired (Grober et al., 1980; Ogar et al., 2011; Robson et al., 2012; Whitehouse et al., 1978). Naming errors are mostly phonemic and neologistic, with occasional semantic paraphasias (Ogar et al., 2011; Robson et al., 2012). Although semantic deficits contribute to the comprehension impairment of individuals with Wernicke's aphasia, these patients also display auditory verbal agnosia (Bernal & Ardila, 2016). Thus, their performance may be lower in semantic tasks performed in the auditory modality (Robson et al., 2012, 2017). A number of word-priming studies in patients with aphasia (e.g., Hagoort, 1993; Milberg et al., 2003; Prather et al., 1997) have shown that, despite significantly longer response latencies, participants with Wernicke's aphasia consistently produced the same pattern of results as neurologically healthy participants (Salles et al., 2012); that is, participants in both groups needed less time to recognize a target as a word when it was preceded by an associatively related word (Milberg & Blumstein, 1981; Nakano & Blumstein, 2004; Prather et al., 1997). Because of the evidence of semantic facilitation in Wernicke's aphasia, it has been suggested that these patients' semantic impairments are not due to a loss of stored linguistic representations but rather to an inability to use or manipulate semantic information. This claim is supported by more recent findings. A positive association has been found between semantic and executive function skills in participants with Wernicke's aphasia (Robson et al., 2012). Further support for this assumption comes from a study using a word-picture matching task. Participants had to determine whether a spoken word and a picture referred to the same concept. Participants with Wernicke's aphasia performed less accurately on semantically near-incongruent trials (e.g., a picture of a bath presented with the word "sink") than in congruent trials in which the picture and word referred to the same concept (Robson et al., 2017). Near-incongruent trials require greater semantic control to distinguish between related concepts that share semantic features or belong to the same semantic category. Together, these findings suggest that patients with Wernicke's aphasia have semantic control impairments rather than a loss of knowledge. In contrast to patients with Wernicke's aphasia, the performance of patients with Broca's aphasia is similar to that of neurologically healthy participants in semantic tasks, apart from a word-finding difficulty (Grober et al., 1980; Whitehouse et al., 1978). This has led to the claim that semantic memory is largely unaffected in Broca's aphasia (Grober et al., 1980). Individuals with Broca's aphasia have damage to the frontal lobe of the brain. They frequently speak in short, meaningful phrases that are produced with great effort; Broca's aphasia is therefore described as a non-fluent aphasia. The results of priming studies conducted in patients with Broca's aphasia are contradictory (Salles et al., 2012). In the majority of priming studies, semantic priming effects were found in patients, especially when prime–target pairs were highly associated (Blumstein et al., 1982; Hagoort, 1993).
In contrast, when the semantic relationship between the prime and the target was more subtle, or when the stimuli were presented as triplets (i.e., participants made a lexical decision on the third word of a series), no priming effects were observed in these patients (Milberg & Blumstein, 1981; Milberg et al., 1987). Milberg and colleagues concluded that participants with Broca’s aphasia have an impairment affecting their automatic access to semantic representations of words. However, the fact that patients with Broca’s aphasia can make semantic judgments indicates that, although the activation level of lexical entries may be reduced, the lexical entries are accessed and the organization of the semantic network appears to be intact. Consequently, these patients are able to use strategies in an offline task to judge the semantic relationship between prime–target pairs. The above-mentioned studies indicated that patients with Broca’s aphasia might have an impairment affecting their automatic routines for accessing semantic information. However,
it is important to mention that word-priming studies do not tap solely into the automatic processing of semantic information, including word meanings. Although the reliance on semantic control processes is reduced in semantic priming paradigms, semantic control is thought to be involved when SOAs longer than 300 ms are used in a task (Neely, 1991; Salles et al., 2012). In order to dissociate the automatic and controlled aspects of semantic processing, some researchers have included both short and long SOAs (Hagoort, 1993; Prather et al., 1997; Python et al., 2018; Silkes et al., 2020). For instance, Prather et al. (1997) examined the time course of semantic activation with a list priming paradigm in participants with aphasia, including a group of patients with Broca's aphasia. In their study, temporal delays between successive words were manipulated, ranging from 300 to 2,100 ms. Unlike neurologically intact participants, who primed at relatively short interstimulus intervals (ISIs) of as little as 500 ms, the participants with Broca's aphasia showed reliable priming only at a long ISI of 1,500 ms. That is, the patients with Broca's aphasia could access semantic information implicitly if allowed sufficient time to do so. This result may help explain their disrupted comprehension of normal-speed conversational speech. In contrast, participants with Wernicke's aphasia showed priming effects at all intervals, from 300 to 1,100 ms (Prather et al., 1997). In another study, Silkes et al. (2020) also reported delayed semantic priming effects in patients with aphasia exhibiting agrammatism (a feature of Broca's aphasia) using a lexical decision task. However, these same participants showed no semantic priming in a semantic decision task at different SOAs (500, 1,000, and 1,500 ms). The explicit access to semantic representations that was required in this task may have interfered with the automatic spreading of activation in semantic memory (Silkes et al., 2020). Overall, although there is no obvious consensus among priming studies conducted in individuals with Broca's aphasia, there is some evidence in support of slowed or impaired implicit semantic processing in this population, which could be offset by later compensatory controlled processes, along with preserved semantic memory representations. Multimodal semantic deficits in explicit tasks have also been described in patients with other aphasia syndromes (e.g., transcortical sensory, global, conduction, anomic). These studies support the idea that semantic impairment is associated with control mechanisms in aphasia, similarly to studies conducted in patients with Wernicke's aphasia. Thus, studies have reported performance variations as a function of semantic control demands, such as strong effects of distractor interference and poorer performance when processing distant semantic associations (Chapman et al., 2020; Noonan et al., 2010). When processing the meaning of ambiguous words, individuals with aphasia who present with multimodal semantic impairments have also shown greater cueing effects than healthy participants. For instance, phonological cues can help them to access less common word meanings, which place a high demand on semantic control processes to select the less frequent meaning and inhibit the dominant one (Noonan et al., 2010).
Similarly, significant improvements in oral picture naming have been observed in participants with aphasia who exhibit multimodal semantic deficits when phonological cues are provided (Jefferies et al., 2007; Jefferies & Lambon Ralph, 2006; Noonan et al., 2010), which indicates that naming errors are at least partly due to access deficits. However, it is important to note that researchers have also reported cross-task consistency (i.e., consistent performance in different semantic tasks), as well as correlations in performance of tasks with different semantic control demands (Chapman et al., 2020). These findings indicate that semantic memory deficits, and not just semantic control deficits, may contribute to the comprehension impairments of individuals with aphasia. The nature of these impairments may also differ among patients, depending on the location of the brain lesions. Semantic deficits can arise due to damage to one or more of the different brain regions and networks involved in semantic processing, which may explain the interindividual heterogeneity in the origin and nature of deficits. Furthermore, even though individuals with aphasia typically show damage to perisylvian regions due to a left middle
cerebral artery stroke, damage may extend to brain regions involved in semantic memory representations, such as the left anterior temporal lobe (ATL; Walker et al., 2011). In a study of 64 patients with different aphasia types, Walker et al. (2011) found that semantic errors in oral picture naming were most strongly associated with damage to the left ATL. Therefore, although semantic impairments in post-stroke aphasia appear to be mostly driven by deficits in semantic control mechanisms, semantic memory representations may also be affected. In summary, numerous studies have provided further insights into semantic memory and semantic control processes as they may contribute to the semantic deficits of individuals with aphasia, including Wernicke's and Broca's aphasia. On the one hand, because of the evidence of semantic facilitation in Wernicke's aphasia, it has been suggested that at least some aspects of the semantic memory representations of word meanings are preserved in this kind of aphasia. These patients' language comprehension deficits seem to reflect an inability to overtly access, use, or manipulate semantic information (i.e., semantic control) rather than a loss of the underlying semantic representations of words. This seems to apply more generally to people with aphasia who present with multimodal semantic deficits. On the other hand, patients with Broca's aphasia sometimes show a deficit affecting implicit access to semantic memory representations. The comprehension deficits found in individuals with aphasia appear to be related to the assessment method used to test them. Indeed, how semantic information is used in tasks requiring explicit semantic judgments might differ from access to semantic information under implicit task conditions, which do not focus the participants' attention on the semantics of the words presented visually. In addition, the sites of lesions in the left hemisphere may have a specific impact on the nature of the semantic processes that are most affected in individuals with aphasia.
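One way to picture the "slowed activation" reading of the Prather et al. (1997) time-course findings discussed above is as a difference in how quickly facilitation builds up with the interstimulus interval. The following toy simulation (the exponential rise, its time constants, and the 80 ms ceiling are invented for illustration) reproduces the qualitative pattern – priming at short intervals for controls, but only at long intervals under slowed activation:

```python
# Toy simulation of priming build-up over the interstimulus interval (ISI).
# The exponential rise, time constants, and 80 ms ceiling are invented.
import math

def priming_effect(isi_ms, rise_time_ms):
    """Facilitation (ms) as activation approaches a ceiling of 80 ms."""
    return 80 * (1 - math.exp(-isi_ms / rise_time_ms))

for isi in (300, 500, 1500):
    control = priming_effect(isi, rise_time_ms=200)    # fast build-up
    slowed = priming_effect(isi, rise_time_ms=1000)    # slowed build-up
    print(f"ISI {isi:>4} ms: control {control:4.0f} ms, slowed {slowed:4.0f} ms")

# Control facilitation is near ceiling already at short ISIs; under
# slowed activation, substantial facilitation emerges only at 1,500 ms.
```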
16.4 Word Semantic Impairments in Right-hemisphere Damage and Traumatic Brain Injury
Individuals with right-hemisphere damage (RHD) have been described as mainly having difficulties with higher-order communication abilities, such as abstract semantics or metaphors, figurative language, and joke comprehension (Gainotti, 2016; Joanette et al., 2013; Wilson et al., 2017). However, individuals with RHD may also present with single-word semantic deficits. People with RHD are reported to have impaired retrieval or use of semantic information. Such impairments affect the semantic processing of words, particularly words that are infrequent, abstract, or non-imageable (Beeman & Chiarello, 1998; Joanette et al., 1990; Tompkins, 1990). Beyond the question of the semantic specificity of word-level impairments in patients with RHD, another research stream has attempted to determine whether these impairments affect the more or less conscious access to semantic knowledge or disrupt the automatic activation of this knowledge. In studies by Gagnon et al. (1990, 1994), the objective was to determine whether the right hemisphere's contribution was related to the automatic activation of the semantic organization of lexical items or to the strategic use of semantic knowledge. Both participants with RHD and neurologically intact participants were given three tasks with varying activation requirements: two lexical decision tasks with semantic priming – one with a short SOA and the other with a long one – and a semantic judgment task. The results showed that individuals with RHD were impaired on the semantic judgment task; however, they showed normal priming effects. These findings are congruent with other studies that reported normal semantic priming effects (automatic and controlled) in participants with RHD (Tompkins, 1990). In the systematic review conducted by Muller and de Salles (2013) on
semantic priming in individuals with RHD, the majority of the studies that used ambiguous words observed impaired semantic priming effects. In those studies, ambiguous words were words with more than one meaning (polysemous words) or words with an ambiguous interpretation for which the right meaning needed to be derived from the sentential context. Conversely, most of the studies that used unambiguous words found preserved semantic priming effects. The only two studies that did not find semantic priming effects could be said to have elicited only controlled or strategic processes: they used SOAs of over 300 ms, their instructions informed participants of the existence of word pairs (primes and targets), or they used more semantically related pairs than control word pairs. The authors concluded that individuals with RHD have difficulties using contextual information to disambiguate ambiguous words (see Tompkins, 2012 for other possible explanations). In addition to semantic priming paradigms, numerous studies have shown that individuals with RHD perform worse than matched controls on verbal fluency tasks in which the production criterion is semantic (e.g., animals) but not when it is orthographic (e.g., words starting with the letter L or B) (Goulet et al., 1997; Joanette & Goulet, 1986). Moreover, such impairments appear to stem from problems affecting the use of recall strategies, that is, semantic control. Joanette et al. (1988) compared people with RHD and neurologically intact participants on a word-fluency task using a semantic criterion. An analysis of responses over a two-minute production period showed no significant difference between groups in the first 30 seconds of recall, but significant differences emerged subsequently. This suggests that, in the first period, participants recalled highly automatic, closely associated items. Once these were exhausted, participants needed to guide their recall by making use of retrieval strategies. In line with earlier findings (Cardebat et al., 1990), Zimmermann et al. (2014) suggested that verbal fluency tasks that are longer than the classic one-minute ones may be more sensitive in detecting difficulties in individuals with RHD. Collectively, these findings implicate the right hemisphere in the exhaustive retrieval of semantic category members, particularly those that are not highly accessible. Le Blanc and Joanette (1996) reported that people with RHD had a specific tendency to produce less prototypical words in an unconstrained verbal fluency task. In addition, studies of patients with RHD have revealed difficulties in maintaining or imparting coherence, as well as a deficit in their ability to access and/or report more distantly related category members. Word semantic impairments have also been described in patients after traumatic brain injury (TBI). McWilliams and Schmitter-Edgecombe (2008) reported that participants with moderate-to-severe TBI produced less informative descriptions than healthy controls in an object description task. The authors interpreted these results as indicating a deficit in semantic access (i.e., semantic control) with preserved semantic knowledge after TBI. These difficulties may stem from attentional or executive difficulties in individuals who have experienced TBI (McWilliams & Schmitter-Edgecombe, 2008; Wang et al., 2022; Whelan et al., 2007).
In summary, a number of word semantic impairments in people with RHD and TBI have been described in the literature. The performance of these individuals does not appear to be clearly associated with deficits in automatic processing; rather, it suggests impaired access to explicit semantic information. Indeed, semantic difficulties in individuals with RHD and TBI seem to become prominent when attentional or conscious access to semantic processing is needed. Overall, this could be interpreted as evidence of semantic control or general executive difficulties.
16.5 Word Semantic Impairments in Dementia
Semantic deficits are observed in different types of dementias, including the neurodegenerative syndromes that are part of the frontotemporal dementia (FTD) spectrum. One particular FTD subtype is characterized by pervasive semantic deficits, namely the semantic variant of
primary progressive aphasia (svPPA; previously known as semantic dementia). A progressive and bilateral atrophy of the anterior temporal lobes (ATLs) occurs in svPPA, although damage is typically asymmetrical and mainly to the left ATL. Its clinical features include impaired single-word comprehension, anomia, and surface dyslexia and dysgraphia (Gorno-Tempini et al., 2011). These are mostly due to a deterioration of semantic memory representations, which occurs more or less equally for all types of concepts regardless of input and output modalities (Patterson et al., 2007). The performance of individuals with svPPA on semantic tests varies depending on several factors, such as disease severity or stage of progression; concept familiarity (knowledge of less familiar concepts, words, and objects deteriorates first); item typicality (performance is poorer for items that are less typical of a semantic category; for instance, "duck" is a typical exemplar of the category "birds" whereas "ostrich" is much less typical); and the specificity of the semantic knowledge required to perform a task (finer semantic features deteriorate first, and thus a patient might recognize that a duck is an animal, but not be able to say that it lays eggs; Patterson et al., 2006, 2007). Because of their semantic memory impairment, individuals with svPPA not only fail at explicit semantic tasks (Auclair-Ouellet et al., 2020; Jefferies & Lambon Ralph, 2006; Libon et al., 2013; Ogar et al., 2011; Zannino et al., 2021) but also exhibit impaired semantic priming effects (Calabria et al., 2009; Catricala et al., 2021; Merck et al., 2014; Nakamura et al., 2000; Rogers & Friedman, 2008; Verfaellie & Giovanello, 2006). Compared to healthy participants, semantic priming effects in these individuals are often reduced or absent even for high-frequency words (Calabria et al., 2009; Nakamura et al., 2000; Rogers & Friedman, 2008), and priming diminishes with disease progression (Verfaellie & Giovanello, 2006). A lack of semantic facilitation in this clinical population has been observed for different types of concepts, including social concepts and famous faces and names (Calabria et al., 2009; Catricala et al., 2021), as well as for different semantic relationships, including attributes (e.g., the word "fabric" primed by the word "couch"), category coordinates (e.g., "apple" primed by "cherry"), and superordinate categories (e.g., "wood" primed by "walnut"; Rogers & Friedman, 2008). These findings are indicative of a severe degradation of semantic memory representations. Another type of dementia that presents with progressive semantic impairment is Alzheimer's disease (AD). In this type of dementia, medial temporal structures are implicated in the early stages of the disease. The neocortex is involved next, with the posterior association cortex altered more than frontal association regions. Both left and right cerebral hemispheres are usually affected in parallel and to comparable degrees. Even in the mild stage of the disease, individuals with AD show deficits in explicit semantic tasks, including semantic knowledge questionnaires, as well as picture naming, association, categorization, semantic fluency, and odd-one-out tasks (Cervera-Crespo et al., 2019; Corbett et al., 2012; Libon et al., 2013; Passafiume et al., 2012; Salehi et al., 2017; Simoes Loureiro & Lefebvre, 2016b; Westfall & Lee, 2021; Zannino et al., 2021).
Although their performance is less impaired overall than in patients with svPPA (Corbett et al., 2015; Zannino et al., 2021), their semantic impairment is multimodal in nature, manifesting itself in both verbal and nonverbal tasks and in both receptive and expressive modalities (Arroyo-Anllo et al., 2011; Corbett et al., 2012; Luotonen et al., 2021). In AD, the semantic impairment is thought to be related to impaired semantic control processes, especially in the early stages, but also to a progressive degradation of semantic memory representations as the disease progresses. Many studies support the dual origin of the semantic impairment in AD and its development over time. The alteration of semantic control is supported by studies that found greater difficulties in tasks with higher demands for semantic control (Arroyo-Anllo et al., 2011; Corbett et al., 2012; Zannino et al., 2021). For instance, Zannino et al. (2021) found that patients with AD were more impaired on category fluency than on a free association task that relied more on automatic processes. Performance on semantic tasks has been found to be positively associated with measures of executive functioning in AD groups (Corbett et al., 2012, 2015), an association that is stronger in AD than in svPPA
(Corbett et al., 2015; Reilly et al., 2011). Studies that compared groups of patients at different stages of the disease also indicated a gradual deterioration of semantic knowledge. For instance, Mardh et al. (2013) assessed semantic knowledge in individuals with AD over three consecutive years using a word-picture matching task and a semantic attribute judgment task. In the latter task, participants had to select from a list of semantic features those associated with the target words. Their comprehension of those target words had been assessed previously in the matching task. The study results indicated a decline in performance on both tasks over time. The number of attributes correctly associated with a target word was higher for words whose meaning was still understood in the matching task. Together, these findings support the claim that a progressive loss of semantic information occurs in AD, leading to a comprehension deficit at the single-word level. The loss of semantic knowledge has further been supported by the work of Cervera-Crespo et al. (2019), although their results also highlighted a deficit in semantic control processes. In their cross-sectional study, patients with mild and moderate AD performed the Hayling task (Burgess & Shallice, 1997), in which they had to complete sentences with the last word missing, using either a semantically related word (automatic section) or an unrelated word (inhibition section). While the performance of patients with mild AD differed from that of the control group only in the inhibition section (semantic control), the performance of patients with moderate AD was lower than the controls' in both the automatic (semantic memory) and inhibition sections of the Hayling test.

Several studies have investigated semantic priming effects in patients with dementia of the Alzheimer's type. Some studies showed a lower-than-normal priming effect (Ober & Shenaut, 1988; Silveri et al., 1996); some reported an equivalent semantic priming effect in patients with AD and neurologically intact participants (e.g., Calabria et al., 2009; Perri et al., 2003); and others showed a hyper-priming phenomenon in patients with AD (i.e., an increased semantic priming effect; e.g., Alathari et al., 2004; Duong et al., 2006; Giffard et al., 2009). Although at first glance these results seem to conflict, when considered as a whole they support the progressive deterioration of semantic memory representations (for a review, see Giffard et al., 2005). In hierarchical models of semantic memory, general and specific semantic information is stored at different levels of the conceptual structure. In AD, specific information represented at lower hierarchical levels can be disrupted even if general information represented at a higher, superordinate level remains intact, especially in the first stages of the disease. Therefore, semantic priming effects are affected differently depending on the level of the semantic structure involved. For instance, Rogers and Friedman (2008) observed that patients with AD showed no semantic facilitation for attributes (e.g., "duck"–"beak"), reduced facilitation for coordinates (e.g., "duck"–"penguin") and preserved priming for superordinate relationships (e.g., "duck"–"animal"). This is consistent with a progressive deterioration of semantic memory representations from lower to higher levels of the hierarchy.
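To make the hierarchical account concrete, the following toy sketch in Python (illustrative only, not part of the original chapter; the level names, numerical parameters, and linear decay rule are all assumptions) shows how bottom-up degradation of a three-level semantic hierarchy would reproduce the gradient reported by Rogers and Friedman (2008), with attribute knowledge lost first and superordinate knowledge last.

```python
# Toy model: specific semantic information degrades before general
# information, as assumed by hierarchical accounts of semantic memory.
LEVELS = ["attributes", "coordinates", "superordinate"]  # specific -> general

def surviving_knowledge(level: int, degradation: float) -> float:
    """Proportion of knowledge left at a hierarchy level for a given
    overall degradation (0 = healthy, 1 = fully degraded). Level i is
    assumed intact until degradation exceeds i/3, then decays linearly."""
    onset = level / 3.0
    return max(0.0, min(1.0, 1.0 - (degradation - onset) * 3.0))

for stage, degradation in [("mild", 0.3), ("moderate", 0.6), ("severe", 0.9)]:
    profile = {name: round(surviving_knowledge(i, degradation), 2)
               for i, name in enumerate(LEVELS)}
    print(f"{stage:9s}{profile}")

# mild     {'attributes': 0.1, 'coordinates': 1.0, 'superordinate': 1.0}
# moderate {'attributes': 0.0, 'coordinates': 0.2, 'superordinate': 1.0}
# severe   {'attributes': 0.0, 'coordinates': 0.0, 'superordinate': 0.3}
```

If priming strength tracks the surviving knowledge at the level of the prime–target relationship, this toy model predicts absent attribute priming, reduced coordinate priming, and preserved superordinate priming early in the disease, as observed.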
As for the hyper-priming effect, it has mostly been observed for coordinate relationships and early in the course of the disease (Giffard et al., 2001, 2002; Simoes Loureiro & Lefebvre, 2016a). This effect is thought to be due to a deficit affecting the storage of specific attribute information. Such a deficit makes coordinate concepts harder to distinguish: they still share the same preserved superordinate category, while the specific attributes that would tell them apart are lost (a toy illustration of this mechanism is given at the end of this section). Later on, the deterioration of semantic information at higher hierarchical levels results in the disappearance of the hyper-priming effect (Giffard et al., 2005).

In summary, several studies have shown that semantic impairments are a core feature of svPPA and a major feature of dementia of the Alzheimer's type. These deficits are observed in explicit and implicit semantic tasks, and they progress over the course of the disease, following a similar pattern of decline in both types of dementia, with attributes deteriorating first. However, while semantic memory representations deteriorate early in
individuals with svPPA, patients with AD at the onset of the disease are generally characterized by spared knowledge but impaired semantic control processes.
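The hyper-priming mechanism described above can also be illustrated with a minimal sketch (again an illustrative assumption, not a method from the chapter): if priming strength is approximated by feature overlap, deleting the specific attributes that distinguish two coordinate concepts increases their overlap, and hence the predicted priming, until shared superordinate features degrade as well.

```python
def jaccard(a: set, b: set) -> float:
    """Feature overlap, used here as a crude proxy for priming strength."""
    return len(a & b) / len(a | b)

# Hypothetical feature sets for two coordinate concepts.
apple  = {"fruit", "grows on trees", "red", "crunchy"}
cherry = {"fruit", "grows on trees", "small", "has a stone"}
print(f"healthy overlap:  {jaccard(apple, cherry):.2f}")   # 0.33

# Early AD: specific, distinguishing attributes are lost first,
# while shared superordinate-level features survive.
lost = {"red", "crunchy", "small", "has a stone"}
print(f"early-AD overlap: {jaccard(apple - lost, cherry - lost):.2f}")  # 1.00
# Overlap (and thus predicted coordinate priming) rises: hyper-priming.
# Once superordinate features also degrade, the effect disappears
# (Giffard et al., 2005).
```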
16.6 Conclusion

The purpose of this chapter was to review the nature of the semantic deficits most often described in individuals with post-stroke aphasia, right-hemisphere damage, traumatic brain injury, and dementia. Semantic memory and semantic control are two key concepts that help elucidate the nature of the semantic issues affecting these clinical populations. Semantic memory is a long-term store of knowledge; semantic control allows for the selection, retrieval, and manipulation of this knowledge. Since these two abilities are sustained by different brain networks, their impairment after brain damage – focal or degenerative – affects language processing differently. In post-stroke aphasia, both semantic memory and semantic control have been observed to be affected. In individuals with RHD and TBI, it is primarily semantic control (or more general executive abilities such as attention) that seems to underlie the semantic deficits. In individuals with AD, impaired semantic control with preserved semantic memory seems to be at play at the beginning of the disease, whereas semantic memory impairment is the core characteristic of individuals with svPPA. These profiles evolve as the disease progresses.
REFERENCES

Alathari, L., Trinh Ngo, C., & Dopkins, S. (2004). Loss of distinctive features and a broader pattern of priming in Alzheimer's disease. Neuropsychology, 18(4), 603–612. https://doi.org/10.1037/0894-4105.18.4.603
Arroyo-Anllo, E. M., Bellouard, S., Ingrand, P., & Gil, R. (2011). Effects of automatic/controlled access processes on semantic memory in Alzheimer's disease. Journal of Alzheimer's Disease, 25(3), 525–533. https://doi.org/10.3233/JAD-2011-110083
Auclair-Ouellet, N., Fossard, M., Macoir, J., & Laforce, R., Jr. (2020). The nonverbal processing of actions is an area of relative strength in the semantic variant of primary progressive aphasia. Journal of Speech, Language, and Hearing Research, 63(2), 569–584. https://doi.org/10.1044/2019_JSLHR-19-00271
Beeman, M. J., & Chiarello, C. (1998). Right hemisphere language comprehension: Perspectives from cognitive neuroscience. Erlbaum.
Bernal, B., & Ardila, A. (2016). From hearing sounds to recognizing phonemes: Primary auditory cortex is a truly perceptual language area. AIMS Neuroscience, 3(4), 454–473. https://doi.org/10.3934/Neuroscience.2016.4.454
Binder, J. R., Desai, R. H., Graves, W. W., & Conant, L. L. (2009). Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cerebral Cortex, 19(12), 2767–2796. https://doi.org/10.1093/cercor/bhp055
Blumstein, S. E., Milberg, W., & Shrier, R. (1982). Semantic processing in aphasia: Evidence from an auditory lexical decision task. Brain and Language, 17(2), 301–315. https://doi.org/10.1016/0093-934x(82)90023-2
Burgess, P. W., & Shallice, T. (1997). The Hayling and Brixton tests. Thames Valley Test Company.
Calabria, M., Miniussi, C., Bisiacchi, P. S., Zanetti, O., & Cotelli, M. (2009). Face-name repetition priming in semantic dementia: A case report. Brain and Cognition, 70(2), 231–237. https://doi.org/10.1016/j.bandc.2009.02.005
Cardebat, D., Doyon, B., Puel, M., Goulet, P., & Joanette, Y. (1990). Formal and semantic lexical evocation in normal subjects. Performance and dynamics of production as a function of sex, age and educational level (Evocation lexicale formelle et semantique chez des sujets normaux. Performances et dynamiques de production en fonction du sexe, de l'age et du niveau d'etude). Acta Neurologica Belgica, 90(4), 207–217. https://www.ncbi.nlm.nih.gov/pubmed/2124031
Catricala, E., Conca, F., Borsa, V. M., Cotelli, M., Manenti, R., Gobbi, E., Binetti, G., Cotta Ramusino, M., Perini, G., Costa, A., Rusconi, M. L., & Cappa, S. F. (2021). Different types of abstract concepts: Evidence from two neurodegenerative patients. Neurocase, 27(3), 270–280. https://doi.org/10.1080/13554794.2021.1931345
Cervera-Crespo, T., Gonzalez-Alvarez, J., & Rosell-Clari, V. (2019). Semantic inhibition and dementia severity in Alzheimer's disease. Psicothema, 31(3), 305–310. https://doi.org/10.7334/psicothema2019.40
Chapman, C. A., Hasan, O., Schulz, P. E., & Martin, R. C. (2020). Evaluating the distinction between semantic knowledge and semantic access: Evidence from semantic dementia and comprehension-impaired stroke aphasia. Psychonomic Bulletin and Review, 27(4), 607–639. https://doi.org/10.3758/s13423-019-01706-6
Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82(6), 407–428.
Corbett, F., Jefferies, E., Burns, A., & Lambon Ralph, M. A. (2012). Unpicking the semantic impairment in Alzheimer's disease: Qualitative changes with disease severity. Behavioural Neurology, 25(1), 23–34. https://doi.org/10.3233/BEN-2012-0346
Corbett, F., Jefferies, E., Burns, A., & Lambon Ralph, M. A. (2015). Deregulated semantic cognition contributes to object-use deficits in Alzheimer's disease: A comparison with semantic aphasia and semantic dementia. Journal of Neuropsychology, 9(2), 219–241. https://doi.org/10.1111/jnp.12047
Duong, A., Whitehead, V., Hanratty, K., & Chertkow, H. (2006). The nature of lexicosemantic processing deficits in mild cognitive impairment. Neuropsychologia, 44(10), 1928–1935. https://doi.org/10.1016/j.neuropsychologia.2006.01.034
Gagnon, J., Goulet, P., & Joanette, Y. (1990). Utilisation automatique et contrôlée du savoir lexico-sémantique chez les cérébrolésés droits [Automatic and controlled use of lexical-semantic knowledge in right-brain-damaged individuals]. Langages, 96(1), 95–111.
Gagnon, J., Goulet, P., & Joanette, Y. (1994). Activation of the lexical-semantic system in right-brain-damaged right-handers. In D. Hillert (Ed.), Linguistics and cognitive neuroscience. Linguistische Berichte. VS Verlag für Sozialwissenschaften. https://doi.org/10.1007/978-3-322-91649-5_3
Gainotti, G. (2016). Lower- and higher-level models of right hemisphere language: A selective survey. Functional Neurology, 31(2), 67–73. https://doi.org/10.11138/fneur/2016.31.2.067
Giffard, B., Desgranges, B., & Eustache, F. (2005). Semantic memory disorders in Alzheimer's disease: Clues from semantic priming effects. Current Alzheimer Research, 2(4), 425–434. https://www.ncbi.nlm.nih.gov/pubmed/16248848
Giffard, B., Desgranges, B., Nore-Mary, F., Lalevee, C., Beaunieux, H., de la Sayette, V., Pasquier, F., & Eustache, F. (2002). The dynamic time course of semantic memory impairment in Alzheimer's disease: Clues from hyperpriming and hypopriming effects. Brain, 125(Pt 9), 2044–2057. https://www.ncbi.nlm.nih.gov/pubmed/12183350
Giffard, B., Desgranges, B., Nore-Mary, F., Lalevee, C., de la Sayette, V., Pasquier, F., & Eustache, F. (2001). The nature of semantic memory deficits in Alzheimer's disease: New insights from hyperpriming effects. Brain, 124(Pt 8), 1522–1532. https://doi.org/10.1093/brain/124.8.1522
Giffard, B., Laisney, M., Eustache, F., & Desgranges, B. (2009). Can the emotional connotation of concepts modulate the lexico-semantic deficits in Alzheimer's disease? Neuropsychologia, 47(1), 258–267. https://doi.org/10.1016/j.neuropsychologia.2008.07.013
Gorno-Tempini, M. L., Hillis, A. E., Weintraub, S., Kertesz, A., Mendez, M., Cappa, S. F., Ogar, J. M., Rohrer, J. D., Black, S., Boeve, B. F., Manes, F., Dronkers, N. F., Vandenberghe, R., Rascovsky, K., Patterson, K., Miller, B. L., Knopman, D. S., Hodges, J. R., Mesulam, M. M., & Grossman, M. (2011). Classification of primary progressive aphasia and its variants. Neurology, 76, 1006–1014.
Goulet, P., Joanette, Y., Sabourin, L., & Giroux, F. (1997). Word fluency after a right-hemisphere lesion. Neuropsychologia, 35(12), 1565–1570. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=9460726
Grober, E., Perecman, E., Kellar, L., & Brown, J. (1980). Lexical knowledge in anterior and posterior aphasics. Brain and Language, 10(2), 318–330. https://doi.org/10.1016/0093-934x(80)90059-0
Hagoort, P. (1993). Impairments of lexical-semantic processing in aphasia: Evidence from the processing of lexical ambiguities. Brain and Language, 45(2), 189–232. https://doi.org/10.1006/brln.1993.1043
Hoffman, P. (2018). An individual differences approach to semantic cognition: Divergent effects of age on representation, retrieval and selection. Scientific Reports, 8(1), 8145. https://doi.org/10.1038/s41598-018-26569-0
Hoffman, P. (2019). Divergent effects of healthy ageing on semantic knowledge and control: Evidence from novel comparisons with semantically impaired patients. Journal of Neuropsychology, 13(3), 462–484. https://doi.org/10.1111/jnp.12159
Jackson, R. L. (2021). The neural correlates of semantic control revisited. NeuroImage, 224(1), Article 117444. https://doi.org/10.1016/j.neuroimage.2020.117444
Jefferies, E. (2013). The neural basis of semantic cognition: Converging evidence from neuropsychology, neuroimaging and TMS. Cortex, 49(3), 611–625. https://doi.org/10.1016/j.cortex.2012.10.008
Jefferies, E., Baker, S. S., Doran, M., & Lambon Ralph, M. A. (2007). Refractory effects in stroke aphasia: A consequence of poor semantic control. Neuropsychologia, 45(5), 1065–1079. https://doi.org/10.1016/j.neuropsychologia.2006.09.009
Jefferies, E., & Lambon Ralph, M. A. (2006). Semantic impairment in stroke aphasia versus semantic dementia: A case-series comparison. Brain, 129(Pt 8), 2132–2147. https://doi.org/10.1093/brain/awl153
Joanette, Y., Ferré, P., & Wilson, M. A. (2013). Right-hemisphere damage and communication. In L. Cummings (Ed.), The Cambridge handbook of communication disorders (pp. 247–265). Cambridge University Press.
Joanette, Y., & Goulet, P. (1986). Criterion-specific reduction of verbal fluency in right brain-damaged right-handers. Neuropsychologia, 24(6), 875–879. https://doi.org/10.1016/0028-3932(86)90087-4
Joanette, Y., Goulet, P., & Hannequin, D. (1990). Right hemisphere and verbal communication. Springer Verlag.
Joanette, Y., Goulet, P., & Le Dorze, G. (1988). Impaired word naming in right-brain-damaged right-handers: Error types and time-course analyses. Brain and Language, 34(1), 54–64. https://doi.org/10.1016/0093-934x(88)90124-1
Krieger-Redwood, K., Wang, H. T., Poerio, G., Martinon, L. M., Riby, L. M., Smallwood, J., & Jefferies, E. (2019). Reduced semantic control in older adults is linked to intrinsic DMN connectivity. Neuropsychologia, 132(1), Article 107133. https://doi.org/10.1016/j.neuropsychologia.2019.107133
Lambon Ralph, M. A., Jefferies, E., Patterson, K., & Rogers, T. T. (2017). The neural and computational bases of semantic cognition. Nature Reviews Neuroscience, 18(1), 42–55. https://doi.org/10.1038/nrn.2016.150
Le Blanc, B., & Joanette, Y. (1996). Unconstrained oral naming in left- and right-hemisphere-damaged patients: An analysis of naturalistic semantic strategies. Brain and Language, 55(1), 42–45.
Libon, D. J., Rascovsky, K., Powers, J., Irwin, D. J., Boller, A., Weinberg, D., McMillan, C. T., & Grossman, M. (2013). Comparative semantic profiles in semantic dementia and Alzheimer's disease. Brain, 136(Pt 8), 2497–2509. https://doi.org/10.1093/brain/awt165
Luotonen, I., Karrasch, M., Korpilahti, P., & Renvall, K. (2021). Factor structure and clinical applicability of new semantic tasks in Alzheimer's disease and aphasia. Applied Neuropsychology: Adult, 1–12. https://doi.org/10.1080/23279095.2021.1986511
Mardh, S., Nagga, K., & Samuelsson, S. (2013). A longitudinal study of semantic memory impairment in patients with Alzheimer's disease. Cortex, 49(2), 528–533. https://doi.org/10.1016/j.cortex.2012.02.004
McWilliams, J., & Schmitter-Edgecombe, M. (2008). Semantic memory organization during the early stage of recovery from traumatic brain injury. Brain Injury, 22(3), 243–253. https://doi.org/10.1080/02699050801935252
Merck, C., Jonin, P. Y., Laisney, M., Vichard, H., & Belliard, S. (2014). When the zebra loses its stripes but is still in the savannah: Results from a semantic priming paradigm in semantic dementia. Neuropsychologia, 53, 221–232. https://doi.org/10.1016/j.neuropsychologia.2013.11.024
Milberg, W., Blumstein, S., Giovanello, K. S., & Misiurski, C. (2003). Summation priming in aphasia: Evidence for alterations in semantic integration and activation. Brain and Cognition, 51(1), 31–47. https://doi.org/10.1016/s0278-2626(02)00500-6
Milberg, W., & Blumstein, S. E. (1981). Lexical decision and aphasia: Evidence for semantic processing. Brain and Language, 14(2), 371–385. https://doi.org/10.1016/0093-934x(81)90086-9
Milberg, W., Blumstein, S. E., & Dworetzky, B. (1987). Processing of lexical ambiguities in aphasia. Brain and Language, 31(1), 138–150. https://doi.org/10.1016/0093-934x(87)90065-4
Muller, J. L., & de Salles, J. F. (2013). Studies on semantic priming effects in right hemisphere stroke: A systematic review. Dementia & Neuropsychologia, 7(2), 155–163. https://doi.org/10.1590/S1980-57642013DN70200004
Nakamura, H., Nakanishi, M., Hamanaka, T., Nakaaki, S., & Yoshida, S. (2000). Semantic priming in patients with Alzheimer and semantic dementia. Cortex, 36(2), 151–162. https://doi.org/10.1016/s0010-9452(08)70521-5
Nakano, H., & Blumstein, S. E. (2004). Deficits in thematic integration processes in Broca's and Wernicke's aphasia. Brain and Language, 88(1), 96–107. https://doi.org/10.1016/s0093-934x(03)00280-3
Neely, J. H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theory. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 264–336). Lawrence Erlbaum Associates.
Noonan, K. A., Jefferies, E., Corbett, F., & Lambon Ralph, M. A. (2010). Elucidating the nature of deregulated semantic cognition in semantic aphasia: Evidence for the roles of prefrontal and temporo-parietal cortices. Journal of Cognitive Neuroscience, 22(7), 1597–1613. https://doi.org/10.1162/jocn.2009.21289
Noonan, K. A., Jefferies, E., Visser, M., & Lambon Ralph, M. A. (2013). Going beyond inferior prefrontal involvement in semantic control: Evidence for the additional contribution of dorsal angular gyrus and posterior middle temporal cortex. Journal of Cognitive Neuroscience, 25(11), 1824–1850. https://doi.org/10.1162/jocn_a_00442
Ober, B. A., & Shenaut, G. K. (1988). Lexical decision and priming in Alzheimer's disease. Neuropsychologia, 26(2), 273–286. https://doi.org/10.1016/0028-3932(88)90080-2
Ogar, J. M., Baldo, J. V., Wilson, S. M., Brambati, S. M., Miller, B. L., Dronkers, N. F., & Gorno-Tempini, M. L. (2011). Semantic dementia and persisting Wernicke's aphasia: Linguistic and anatomical profiles. Brain and Language, 117(1), 28–33. https://doi.org/10.1016/j.bandl.2010.11.004
Park, D. C., Lautenschlager, G., Hedden, T., Davidson, N. S., Smith, A. D., & Smith, P. K. (2002). Models of visuospatial and verbal memory across the adult life span. Psychology and Aging, 17(2), 299–320. https://www.ncbi.nlm.nih.gov/pubmed/12061414
Passafiume, D., De Federicis, L. S., Carbone, G., & Di Giacomo, D. (2012). Loss of semantic associative categories in patients with Alzheimer's disease. Applied Neuropsychology: Adult, 19(4), 305–311. https://doi.org/10.1080/09084282.2012.670160
Patterson, K., Lambon Ralph, M. A., Jefferies, E., Woollams, A., Jones, R., Hodges, J. R., & Rogers, T. T. (2006). "Presemantic" cognition in semantic dementia: Six deficits in search of an explanation. Journal of Cognitive Neuroscience, 18(2), 169–183.
Patterson, K., Nestor, P. J., & Rogers, T. T. (2007). Where do you know what you know? The representation of semantic knowledge in the human brain. Nature Reviews Neuroscience, 8(12), 976–987.
Perri, R., Carlesimo, G. A., Zannino, G. D., Mauri, M., Muolo, B., Pettenati, C., & Caltagirone, C. (2003). Intentional and automatic measures of specific-category effect in the semantic impairment of patients with Alzheimer's disease. Neuropsychologia, 41(11), 1509–1522. https://doi.org/10.1016/s0028-3932(03)00075-7
Prather, P. A., Zurif, E., Love, T., & Brownell, H. (1997). Speed of lexical activation in nonfluent Broca's aphasia and fluent Wernicke's aphasia. Brain and Language, 59(3), 391–411. https://doi.org/10.1006/brln.1997.1751
Python, G., Glize, B., & Laganaro, M. (2018). The involvement of left inferior frontal and middle temporal cortices in word production unveiled by greater facilitation effects following brain damage. Neuropsychologia, 121(1), 122–134. https://doi.org/10.1016/j.neuropsychologia.2018.10.026
Quillian, R. (1966). Semantic memory. Air Force Cambridge Research Laboratories, Office of Aerospace Research, United States Air Force.
Reilly, J., Peelle, J. E., Antonucci, S. M., & Grossman, M. (2011). Anomia as a marker of distinct semantic memory impairments in Alzheimer's disease and semantic dementia. Neuropsychology, 25(4), 413–426. https://doi.org/10.1037/a0022738
Robson, H., Pilkington, E., Evans, L., DeLuca, V., & Keidel, J. L. (2017). Phonological and semantic processing during comprehension in Wernicke's aphasia: An N400 and phonological mapping negativity study. Neuropsychologia, 100(1), 144–154. https://doi.org/10.1016/j.neuropsychologia.2017.04.012
Robson, H., Sage, K., & Lambon Ralph, M. A. (2012). Wernicke's aphasia reflects a combination of acoustic-phonological and semantic control deficits: A case-series comparison of Wernicke's aphasia, semantic dementia and semantic aphasia. Neuropsychologia, 50(2), 266–275. https://doi.org/10.1016/j.neuropsychologia.2011.11.021
Rogers, S. L., & Friedman, R. B. (2008). The underlying mechanisms of semantic memory loss in Alzheimer's disease and semantic dementia. Neuropsychologia, 46(1), 12–21. https://doi.org/10.1016/j.neuropsychologia.2007.08.010
Salehi, M., Reisi, M., & Ghasisin, L. (2017). Lexical retrieval or semantic knowledge? Which one causes naming errors in patients with mild and moderate Alzheimer's disease? Dementia and Geriatric Cognitive Disorders Extra, 7(3), 419–429. https://doi.org/10.1159/000484137
Salles, J. F. D., Holderbaum, C. S., Parente, M. A. M. P., Mansur, L. L., & Ansaldo, A. I. (2012). Lexical-semantic processing in the semantic priming paradigm in aphasic patients. Archives of Neuropsychiatry, 70(9), 718–726.
Salthouse, T. A. (2019). Trajectories of normal cognitive aging. Psychology and Aging, 34(1), 17–24. https://doi.org/10.1037/pag0000288
Silkes, J. P., Baker, C., & Love, T. (2020). The time course of priming in aphasia: An exploration of learning along a continuum of linguistic processing demands. Topics in Language Disorders, 40(1), 54–80. https://doi.org/10.1097/TLD.0000000000000205
Silveri, M. C., Monteleone, D., Burani, C., & Tabossi, P. (1996). Automatic semantic facilitation in Alzheimer's disease. Journal of Clinical and Experimental Neuropsychology, 18(3), 371–382. https://doi.org/10.1080/01688639608408994
Simoes Loureiro, I., & Lefebvre, L. (2016a). Distinct progression of the deterioration of thematic and taxonomic links in natural and manufactured objects in Alzheimer's disease. Neuropsychologia, 91, 426–434. https://doi.org/10.1016/j.neuropsychologia.2016.09.002
Simoes Loureiro, I., & Lefebvre, L. (2016b). Retrogenesis of semantic knowledge: Comparative approach of acquisition and deterioration of concepts in semantic memory. Neuropsychology, 30(7), 853–859. https://doi.org/10.1037/neu0000272
Toepper, M. (2017). Dissociating normal aging from Alzheimer's disease: A view from cognitive neuroscience. Journal of Alzheimer's Disease, 57(2), 331–352. https://doi.org/10.3233/JAD-161099
Tompkins, C. A. (1990). Knowledge and strategies for processing lexical metaphor after right or left hemisphere brain damage. Journal of Speech and Hearing Research, 33(2), 307–316. https://doi.org/10.1044/jshr.3302.307
Tompkins, C. A. (2012). Rehabilitation for cognitive-communication disorders in right hemisphere brain damage. Archives of Physical Medicine and Rehabilitation, 93(1 Suppl), S61–S69. https://doi.org/10.1016/j.apmr.2011.10.015
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). Academic Press.
Verfaellie, M., & Giovanello, K. S. (2006). Conceptual priming in semantic dementia: A window into the cognitive and neural basis of conceptual implicit memory. Cognitive Neuropsychology, 23(4), 606–620. https://doi.org/10.1080/02643290500346099
Verhaegen, C., & Poncelet, M. (2013). Changes in naming and semantic abilities with aging from 50 to 90 years. Journal of the International Neuropsychological Society, 19(2), 119–126. https://doi.org/10.1017/S1355617712001178
Walker, G. M., Schwartz, M. F., Kimberg, D. Y., Faseyitan, O., Brecher, A., Dell, G. S., & Coslett, H. B. (2011). Support for anterior temporal involvement in semantic error production in aphasia: New evidence from VLSM. Brain and Language, 117(3), 110–122. https://doi.org/10.1016/j.bandl.2010.09.008
Wang, Y., Zhou, Y., Zhang, X., Wang, K., Chen, X., & Cheng, H. (2022). Orienting network impairment of attention in patients with mild traumatic brain injury. Behavioural Brain Research, 437(1), 114133. https://doi.org/10.1016/j.bbr.2022.114133
Westfall, H. A., & Lee, M. D. (2021). A model-based analysis of the impairment of semantic memory. Psychonomic Bulletin and Review, 28(5), 1484–1494. https://doi.org/10.3758/s13423-020-01875-9
Whelan, B. M., Murdoch, B. E., & Bellamy, N. (2007). Delineating communication impairments associated with mild traumatic brain injury: A case report. Journal of Head Trauma Rehabilitation, 22(3), 192–197. https://doi.org/10.1097/01.HTR.0000271120.04405.db
Whitehouse, P., Caramazza, A., & Zurif, E. (1978). Naming in aphasia: Interacting effects of form and function. Brain and Language, 6(1), 63–74. https://doi.org/10.1016/0093-934x(78)90044-5
Wilson, M. A., Ska, B., & Joanette, Y. (2017). Discourse and social cognition disorders affecting communication abilities. In A. M. Raymer & L. J. Gonzalez Rothi (Eds.), The Oxford handbook of aphasia and language disorders (pp. 263–276). Oxford University Press.
Xu, Y., He, Y., & Bi, Y. (2017). A tri-network model of human semantic processing. Frontiers in Psychology, 8(1), Article 1538. https://doi.org/10.3389/fpsyg.2017.01538
Zannino, G. D., Perri, R., Marra, C., Caruso, G., Baroncini, M., Caltagirone, C., & Carlesimo, G. A. (2021). The free association task: Proposal of a clinical tool for detecting differential profiles of semantic impairment in semantic dementia and Alzheimer's disease. Medicina, 57(11). https://doi.org/10.3390/medicina57111171
Zimmermann, N., Branco, L., Ska, B., Gasparetto, E. L., Joanette, Y., & Fonseca, R. (2014). Verbal fluency in right brain damage: Dissociations among production criteria and duration. Applied Neuropsychology: Adult, 21(4), 260–268. https://doi.org/10.1080/09084282.2013.802693
17 Neural Correlates of Neurotypical and Pathological Language Processing

SONJA A. KOTZ, STEFAN FRISCH, AND ANGELA D. FRIEDERICI

17.1 The Classical Models and Beyond

In 1874, the German neuroanatomist Eduard Hitzig presented his ideas on language and the brain to the Berlin Anthropological Society (cf. Hagner, 2000). He interpreted aphasia as the loss of "motor images of words," very similar to neuronal representations of other types of motor activity in humans and non-human animals. Hitzig was sharply criticized by Heymann Steinthal, a linguist who had analyzed most of the aphasiological evidence available at that time. Steinthal was convinced that the leading view of language in the second half of the nineteenth century completely underestimated the complexity of language as a psychological function. He concluded that language had to be conceived as a complex psychological mechanism beyond the current view of the leading neurologists and neuroanatomists. Although Steinthal discussed his ideas with many important scientists of the time, the leading theoretical views on aphasia and language prevailed. These views had begun to gain influence after the scientific descriptions of motor aphasia by Paul Broca and of sensory aphasia by Carl Wernicke. Wernicke (1874) incorporated both findings into a model of a motor speech center in the inferior frontal cortex and a sensory speech center in the superior temporal cortex, the two being connected by a massive fiber bundle (the arcuate fasciculus). Lichtheim (1885) added a "concept center" to this model and arrived at his famous "house model" of language, which supposedly made it possible for all types of aphasic syndromes to be explained. Although the so-called Wernicke–Lichtheim model of language has been very influential as a heuristic for both research and therapy, it faces several problems (see also Hickok & Poeppel, 2004, 2015): the idea of a few aphasic syndromes is not sufficient to explain the variety of aphasic phenomena, nor is their association with different anatomical areas as clear as the classical model suggests. Furthermore, the model is (psycho)linguistically strongly underspecified. Today, Steinthal's claim that the complex structure of language is inherently tied to
a differentiated brain network has gained much influence. Many neural models of language are inextricably bound to (psycho)linguistic theories. Furthermore, new techniques for measuring brain activity in vivo give us an idea of how complex the neural basis of language is and how the different language functions are supported by a distributed network of cortical as well as subcortical areas. These methods also show us how language is processed in time. When the lexical entry of a word is retrieved, many different types of linguistic information (phonological, syntactic, and semantic) need to be integrated into a sentence representation. This happens very fast, even though the process engages multiple interactions between information types. Thus, a model of language not only has to describe anatomically and functionally distinct language-related areas in the brain. It must also explain when these different areas come into play and interact with each other so that language is produced and understood under the time-critical conditions of communication.
17.2 Language Processing in Real Time

Numerous studies have described time as a critical parameter of aphasic language processing. For example, Friederici and Kilborn (1989) reported that people with Broca's aphasia showed longer lexical decision times for target words in sentence contexts than in isolation. Further, reaction times were longer in people with aphasia than in neurotypical controls when there was no pause between a context and a target word. As grammatical knowledge of the sentence was preserved, the results suggested that sentence processing under strong time restrictions was impaired. A study by Burkhardt et al. (2003) reported that people with Broca's aphasia showed a priming effect in a cross-modal lexical decision task at the original position of a moved argument such as "the cheese" in (1), as did controls (the concept of movement will be explained in more detail below). However, participants showed a priming effect for a word related to cheese in (1) (such as cheddar) compared to an unrelated word (such as album) only when this target word was presented with a considerable delay (650 ms) relative to its original position (i.e., at trace position "t").

(1) The kid loved the cheesei whichi the new microwave melted ti yesterday afternoon.

In contrast, control participants already showed a comparable effect 100 ms after the target position. These results, as well as those from Friederici and Kilborn (1989), highlight the importance of the temporal dynamics of pathological language processing. They emphasize some of the limitations of representational accounts that assume loss of grammatical knowledge for sentences such as (1) (e.g., Grodzinsky, 2000). In addition to reaction-time experiments, time-sensitive neurophysiological measures have received increasing attention in research on pathological language processing. Event-related brain potentials (ERPs), which display electrophysiological correlates of cognitive processes with a very high time resolution (millisecond by millisecond), have attracted a lot of attention over the last decades. ERPs are obtained by averaging epochs of spontaneous EEG activity that are time-locked to the onset of a target event (e.g., syntactically or semantically mismatching words in a sentence). The averaging procedure results in a wavelike pattern consisting of typical peaks that are positive or negative compared to a control condition (e.g., syntactically and/or semantically legal words in a sentence). These peaks are termed components. They are defined not only by their polarity (positive or negative), but also by the time delay after the onset of a target event (latency) and the area over the skull where they are maximal (topography). For example, the N400 component is a negative (hence "N") deflection that occurs approximately 400 milliseconds after the onset of a target event. Although ERP components
are defined by their topography on the surface of the skull, direct inferences about their brain sources are not possible, as EEG activity is oriented orthogonally to the sulcated cortex surface and not to the skull surface. Therefore, for each ERP pattern that is recorded from the surface of the skull, there is an infinite number of possible brain sources (generators). There are several ways to determine the neural basis of a specific ERP component and therefore of the specific step in language processing it represents. One possibility is to test participants with circumscribed brain lesions and to find out whether they show the component in question or not. Another possibility is to test similar experimental manipulations with neuroimaging methods, which offer high spatial resolution. These methods trace changes in cerebral blood flow, either via a radioactive substance in positron emission tomography (PET) or via changes in the magnetic field in functional magnetic resonance imaging (fMRI). However, these methods are limited by the fact that the physiological mechanism they depend upon (i.e., cerebral blood flow) changes relatively slowly in comparison to electrophysiological activity. Thus, there are obvious trade-offs between spatial and temporal resolution across the different methods. ERPs, on the one hand, and fMRI/PET, on the other, can thus be seen as complementary methods that play important roles in formulating a neurocognitive model of language processing. In the following, we present ERP and fMRI evidence on syntactic and semantic processing at the sentence level and integrate them into a model. Of note, this is still an area of active and ongoing research. Accordingly, sentence-processing models are very much "in flux." Due to space limitations, we will not discuss results on early processes of speech segmentation (see Hickok & Poeppel, 2004, 2015) or on phonological processing (see Friederici & Alter, 2004).
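To make the averaging logic described above concrete, here is a minimal simulation sketch (the sampling rate, trial count, noise level, and N400-like waveform are hypothetical values, not data from any study cited here): epochs time-locked to the target event are averaged, so random background EEG cancels out while the event-related component remains.

```python
import numpy as np

rng = np.random.default_rng(0)
srate = 500                                # sampling rate in Hz (assumed)
times = np.arange(-0.2, 0.8, 1 / srate)    # epoch from -200 ms to 800 ms

# Simulated single trials: noise plus an N400-like negativity
# (a negative deflection peaking around 400 ms after word onset).
n400 = -2.0 * np.exp(-((times - 0.4) ** 2) / (2 * 0.05 ** 2))   # microvolts
trials = np.array([n400 + rng.normal(0, 5, times.size) for _ in range(80)])

erp = trials.mean(axis=0)   # time-locked averaging yields the ERP

peak_latency = times[np.argmin(erp)]
print(f"most negative deflection at about {peak_latency * 1000:.0f} ms")
```

In real recordings the same averaging is done per condition and electrode; component polarity, latency, and topography are then read off the averaged waveforms.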
17.3 Semantic Integration

Kutas and Hillyard (1980) were the first to report that semantically anomalous sentences such as (2) lead to a specific ERP response.

(2) He spread the warm bread with socks.

Compared to correct sentences, Kutas and Hillyard found a negative ERP deflection occurring approximately 400 ms after the presentation of the word socks, which renders (2) semantically inappropriate. Since then, the N400 has been the focus of numerous studies. There was some debate about what exactly the N400 reflects, but there is good evidence that it can best be characterized as a marker of semantic integration (Hagoort, 2003, 2008). The N400 has also been related to lexical access and semantic memory retrieval (Brouwer et al., 2012; Kutas & Federmeier, 2000), to thematic mismatches in argument-structure violations (Frisch, Hahne et al., 2004; Osterhout et al., 1994), and to hierarchical thematic interpretation problems (Bornkessel-Schlesewsky & Schlesewsky, 2006, 2016; Frisch & Schlesewsky, 2001, 2005). To determine the brain areas within the language network that support different language processes, several studies with people with aphasia (PWA) have been carried out. These studies broadly fall into three categories: (i) direct comparisons of controls and people with aphasia, (ii) level of aphasia severity (i.e., functional deficits), and (iii) lesion location(s) (i.e., structural deficits) (Meechan et al., 2021). For example, one of the first N400 aphasia studies used a passive listening paradigm (Swaab et al., 1997) and aligns with categories (i) and (ii). The authors reported that the N400 latency was delayed in people with aphasia with low comprehension scores (measured on an independent test) compared to participants with aphasia with high comprehension scores, participants with right-hemisphere damage, and control participants.
Neither the exact lesion site within the left hemisphere nor the specific aphasic syndrome (Broca's versus Wernicke's aphasia) was a defining factor, but the level of aphasia severity was. The authors concluded that PWA with low comprehension abilities were delayed in lexical integration. Further evidence along these two categories has been reported for N400 sentence processing deficits in aphasia (comparison to controls: Connolly et al., 1999; Hagoort et al., 2003a, 2003b; Khachatryan et al., 2017; Revonsuo & Laine, 1996; ter Keurs et al., 1999; Wassenaar & Hagoort, 2005, 2007; severity: Chang et al., 2016; Kawohl et al., 2010; Sheppard et al., 2017; Swaab et al., 1998). While comprehension performance was the critical criterion in some of the above-described experiments, there are also studies that sub-grouped participants based on their lesion location(s). Friederici et al. (1998) reported no N400 effect for semantically anomalous sentences such as (3) in a person with aphasia with a left temporo-parietal lesion.

(3) Der Honig wurde ermordet. The honey was murdered.

A further study by Friederici, von Cramon et al. (1999) showed that the N400 for sentences such as (3) was preserved in patients with left inferior-frontal lesions as well as in patients with subcortical lesions of the left basal ganglia. This was taken to show that the respective structures do not play a crucial role in processes of semantic integration. fMRI studies of sentence-level semantic processing have reported activation in an inferior fronto-temporal network when participants are confronted with semantically anomalous sentences. For example, early studies such as that of Ni et al. (2000) presented semantically anomalous sentences such as (4) to participants.

(4) Trees can eat.

Sentences such as (4) led to more activation in the posterior superior temporal gyrus (STG), the middle temporal gyrus (MTG), the angular gyrus, and the inferior frontal gyrus (IFG). Increased activations in the MTG (BA21), the angular gyrus and the inferior frontal region (BA46/BA9), but also the medial temporal cortex, for semantically incongruent sentences were also reported by Newman et al. (2001). Kuperberg et al. (2000) tested sentences with semantic violations in the strict sense, that is, selectional restriction violations, such as (5).

(5) The young man drank the guitar.

The authors found enhanced activation differences in the (right) STG as well as the (right) MTG. In comparison to both syntactic and selectional restriction violations, pragmatically anomalous sentences such as (6) led to increased activation differences in the left STG.

(6) The young man buried the guitar.

Friederici et al. (2003) found enhanced activity for semantic anomalies in comparison to a baseline condition in the middle to posterior STG and the insular cortex bilaterally, but no activation in the IFG. Rüschemeyer et al. (2005) contrasted semantically anomalous with correct sentences and found the lateral prefrontal cortex (BA44/45) and an area including the posterior MTG and the superior temporal sulcus (STS) to be specifically active.
Both (posterior) superior and middle temporal areas seem to be involved in semantic integration processes, but so does an inferior frontal area (namely BA47) anterior to Broca's area, which has been associated with syntactic processing. As has been suggested by Dapretto and Bookheimer (1999), these two regions may serve different aspects of language processing. In a study testing identical sentences in a syntactic and a semantic task, the authors found the anterior portion of the left IFG (mainly BA47) more active in the semantic task than in the syntactic task. On the other hand, activation differences were stronger in the posterior portion of the left IFG (mainly BA44) in the syntactic task than in the semantic task. These results suggest that the IFG may respond as a function of the strategic aspects of the task employed. More recent studies of semantic integration have used a two-word phrase paradigm focusing on phrase-level rather than sentence-level integration processes (Bemis & Pylkkänen, 2011, 2013; Schell et al., 2017). An fMRI study in healthy individuals showed that, against baseline, the processing of syntactic phrases (this ship) activated BA44, whereas semantic phrases (blue ship) activated the anterior IFG and the left angular gyrus (AG) (Schell et al., 2017). Focusing on semantic processes, it was found that semantically anomalous phrases led to activation in the anterior IFG, the anterior temporal lobe and the AG (Graessner, Zaccarella, & Hartwigsen, 2021). In a lesion-behavior mapping study using the two-word phrase paradigm, PWA were confronted with adjective–noun phrases that were meaningful (anxious horse), anomalous (anxious salad), or had the noun replaced by a pseudoword (anxious gufel), as well as a control condition (horse) (Graessner, Zaccarella, Friederici et al., 2021). This study revealed that reduced accuracy for anomalous phrases was associated with lesions in the left anterior IFG, whereas increased reaction times for anomalous phrases correlated with lesions in the anterior-to-mid temporal lobe. These results suggest that the anterior IFG supports executive control for decisions on within-phrase semantic integration, whereas the anterior-to-mid temporal lobe supports semantic processing of the constituents of a phrase. In sum, studies on the processing of semantic information converge in the finding that processes of semantic integration take place around 400 ms after a critical stimulus and that these processes are subserved by a cortical network including the MTG, middle and posterior portions of the STG, the anterior temporal lobe and the anterior IFG, whereby the involvement of the latter is presumably tied to strategic aspects of processing (Friederici, 2011; Lau et al., 2008) (see Figure 17.1 for the location of these brain regions).
17.4 Syntactic Processes: Word Category Information, Morphosyntactic Information and Syntactic Integration

Apart from semantic information, syntactic information (word category, argument structure) is marked in a word's lexical entry. In addition to argument-structure-related morphosyntactic information (case), inflections provide relevant information for the syntactic relation of lexical elements in a sentence. During on-line sentence processing, this information must be linked with the syntactic restrictions provided by the sentence structure. As ERP studies have shown, syntactic processes take place in three different time windows: ERP effects have been observed first in a very early time window between 150 and 200 ms, the early left anterior negativity (ELAN); second in a mid-latency time window between 300 and 500 ms, the left anterior negativity (LAN); and third in a late time window around 600 ms, a late positivity, the P600. In the early phase (around 150 ms), words must be integrated into the ongoing sentence structure based on their syntactic category. If a word does not have the correct word category, it cannot be integrated into the sentence structure. In (7), for example, the integration of a verb (such as gegessen) is impossible after a preposition (such as im) in German.
(7) Der Honig wurde im gegessen. the honey was in-the eaten

(8) Der Honig wurde gegessen. the honey was eaten

Verbs such as gegessen in (7) create a word-category violation. In comparison to a correct sentence such as (8) (without the preposition), verbs in such a syntactic context elicit an early negative ERP deflection peaking at around 150 ms. The component has its topographical maximum over (left-)anterior electrode sites and is therefore termed the early left anterior negativity (ELAN). In the model of Friederici (2002), word-category integration temporally and functionally precedes the integration of all other types of information (syntactic and semantic) associated with a word. This seems to be warranted since the ELAN occurs irrespective of simultaneous violations based on other types of syntactic or semantic information, for example verb-argument structure (Frisch, Hahne et al., 2004) or selectional restrictions (Friederici, Steinhauer et al., 1999; Hahne & Friederici, 2002). In addition, the ELAN is independent of non-linguistic factors such as the predictability of a word-category violation (Hahne & Friederici, 2002). By contrast, the electrophysiological correlates of other types of violation (such as a verb-argument structure or a semantic violation) are not found if the sentence contains an additional word-category violation (Friederici, Steinhauer et al., 1999; Frisch, Hahne et al., 2004; Hahne & Friederici, 2002). This finding is independent of whether the word category of the violating word is available before or after its semantic properties (Friederici et al., 2004). An early negativity in response to a word-category violation has been found not only in German, but also in English (Neville et al., 1991), Dutch (Hagoort, Wassenaar, & Brown, 2003a), Japanese (Kubota et al., 2003) and Chinese (Ye et al., 2006). For an alternative account of the ELAN/LAN, see Hasting and Kotz (2008) and Steinhauer and Drury (2012). The ERP effect in response to morphosyntactic violations in the mid-latency time window is the so-called left anterior negativity (LAN; 300–500 ms). LAN effects have been observed for number-agreement violations in different languages such as English (Osterhout & Mobley, 1995) and Dutch (Gunter et al., 1997). They have also been observed for violations of gender (Gunter et al., 2000) and case (Coulson et al., 1998; Friederici & Frisch, 2000). The LAN can therefore be characterized as reflecting unsuccessful integration of morphosyntactic information. There is little systematic evidence on the neuronal basis of morphosyntactic integration processes, but the STG and possibly the IFG seem to play an important role (Ni et al., 2000; Raettig et al., 2005). Sentences that contain a word-category or a morphosyntactic violation elicit, in addition to the ELAN or LAN component, a positive deflection peaking around 600 ms, the so-called P600 (Friederici, Steinhauer et al., 1999; Hahne & Friederici, 2001). It was first reported by Osterhout and Holcomb (1992) for words that create a syntactic violation in a sentence. It has also been termed the syntactic positive shift (Coulson et al., 1998; Hagoort et al., 1999). The P600 is not specific to one kind of violation but occurs with most syntactic violations.
Among others, these include violations of agreement (Gunter et al., 1997, 2000; Osterhout & Mobley, 1995), case (Coulson et al., 1998; Friederici & Frisch, 2000; Frisch & Schlesewsky, 2001, 2005) and verb-argument structure (Friederici & Frisch, 2000; Frisch, Hahne et al., 2004; Osterhout et al., 1994). Apart from outright violations, the P600 is also sensitive to processing differences between sentences which are all syntactically legal. Osterhout and Holcomb (1992) first reported P600 effects in locally ambiguous sentences such as (9).

(9) The broker persuaded to sell the stock was ….

Up to the preposition "to," sentence (9) can be parsed as a main clause structure consisting of a subject and a verb. The preposition requires that this preferred (as structurally simplest)
reading is given up in favor of a more complex reduced relative clause (the broker who had been persuaded to …). The finding that the revision of a preferred reading of a locally ambiguous sentence induces a P600 has been replicated many times (Frisch, beim Graben et al., 2004; Frisch et al., 2002; Kotz et al., 2008; Mecklinger et al., 1995). A P600 has also been found for differences in syntactic integration difficulty between different non-ambiguous sentences (Kaan et al., 2000) as well as for local ambiguities compared to unambiguous structures (Frisch et al., 2002). In addition, the P600 is sensitive to sentence-level complexity; that is, its amplitude increases as the number of noun phrases that must be kept active in a sentence increases (beim Graben et al., 2008; Kaan & Swaab, 2003). However, alternative accounts of the P600 exist. Kim and Osterhout (2005) reported a "semantic" P600 in grammatically correct but semantically anomalous sentences which violate the expected thematic structure ("The hearty meal was devouring the kids"). This latter effect has led to significant discussion about the nature of the P600 (e.g., Bornkessel-Schlesewsky & Schlesewsky, 2008; Kuperberg, 2007). With respect to the different experimental contexts in which late positivities have been found, the P600 can be seen as a marker of enhanced syntactic processing cost, due to repair, revision/reanalysis, integration costs, temporary ambiguity, or an anomaly. In contrast to the ELAN, the P600 amplitude decreases with increasing probability of the syntactic violation (Coulson et al., 1998; Gunter et al., 1997; Hahne & Friederici, 2001) and can be modulated by additional (non-syntactic) violations (Gunter et al., 1997). These findings emphasize the role of the P600 as reflecting stages of controlled evaluative processing. Can these different types of syntactic processes (reflected by the ELAN, LAN, and P600) be located in the brain? An answer to this question provides a good example of how evidence from different methodological approaches must be integrated to form a more complete picture of the dynamic character of language processing in the brain. In an fMRI study, Friederici et al. (2003) tested word-category violations such as (7), which elicit an ELAN–P600 pattern in the ERP. Compared to a correct condition, these violations activated superior temporal (the anterior and posterior part of the left STG), inferior frontal (the left deep frontal operculum) as well as subcortical areas (the putamen of the left basal ganglia). However, the problem with fMRI is its relatively low temporal resolution, which does not allow successive subprocesses to be distinguished. Three approaches are taken to solve this problem. One approach is to conduct ERP studies with participants with circumscribed lesions. A second approach uses magnetoencephalography (MEG), which has the same temporal resolution as the EEG but allows the location of activity-related dipoles in the brain to be calculated. The third approach is to systematically delete non-relevant information from the sentence, for example, deleting semantic information by replacing content words with pseudowords, leaving only syntactic information, as in jabberwocky sentences. Causal evidence from ERP patient studies might provide an informative answer to the question of the underlying brain basis of early and late ERP effects. In a study by Friederici et al.
(1998), a participant with a temporo-parietal lesion did not show an N400 effect for semantic violations but did show both an ELAN and a P600 in response to word-category violations. Furthermore, the authors found no ELAN, but a P600, for the same type of violation (as well as an N400 for a semantic violation) in a second participant with a left inferior frontal lesion (see also Friederici, von Cramon et al., 1999, for a similar result with a larger sample of participants with left-frontal lesions). The same pattern (a P600, but no ELAN) was found in a study with participants who had a lesion of the anterior temporal lobe (Kotz, von Cramon et al., 2003). Further evidence that a word-category mismatch activates a network of inferior frontal (deep frontal operculum) and anterior temporal areas comes from a study using MEG, which traces changes in the magnetic fields of neuron assemblies as they depolarize. In an MEG study with healthy participants, Friederici et al. (2000) conducted a dipole source localization and found that the early negativity was best explained by two generators, one in the anterior part of the STG (planum polare) and a second one in the inferior frontal cortex. These MEG results are complemented by a study that included a larger sample of participants with left-frontal lesions and intact temporal cortex (Jakuszeit et al., 2013; Regel et al., 2017).
Word-category violations in these participants led to a reduced early negativity, confirming that at least a second dipole, in addition to one in the frontal cortex, contributes to the early negativity in response to word-category violations. Thus, early responses to morphosyntactic violations seem to rely on the contribution of left frontal brain structures, while the processing of word category seems to depend on deeper frontal sources (deep frontal operculum) and at least partially also on anterior temporal brain structures. These regions do not seem to be crucial for late, controlled syntactic processes, as P600 effects were found in patients with lesions in the anterior temporal or inferior frontal area. However, a P600 for a word-category violation (or other syntactic violation) was reduced or absent in participants with lesions in the left basal ganglia (Friederici, von Cramon et al., 1999; Frisch et al., 2003; Kotz, Frisch et al., 2003). Obviously, syntactic processes are not exclusively hosted by cortical areas; subcortical structures also play an important role (see also Ullman, 2004). For an account of aphasia severity in relation to P600 effects, see Wassenaar and colleagues (2004). Taken together, all these results suggest that word-category integration (as reflected by the ELAN) takes place early and is supported by a network of inferior frontal and anterior temporal areas. By contrast, late controlled processes of syntactic repair (as reflected by the P600) might be regulated in the basal ganglia and the posterior STG. The brain basis of syntactic processes has, moreover, been investigated in several fMRI studies (Friederici, 2011; Grodzinsky et al., 2021). Broca's area and the posterior temporal cortex have been identified as the main loci of syntactic processing in healthy individuals. In a systematic pair of studies, lexical-semantic information and derivational morphology carrying semantic information were deleted from sentences (Goucha & Friederici, 2015). Normal sentences activated a large network including Broca's area (BA45, BA44) and the anterior temporal lobe as well as the posterior temporal cortex. Deleting content words and replacing them with pseudowords led to activation in BA45 and BA44 as well as the posterior temporal gyrus. When derivational morphology (such as "-hood" from "brotherhood") was deleted in addition to the content words, only BA44 was active. This result indicates that BA44 is the core area for processing syntactic information and that the posterior temporal cortex may support the integration of syntactic and semantic information. Investigating the brain basis of local phrase-structure building, an fMRI study with healthy individuals used a two-word phrase paradigm (Zaccarella & Friederici, 2015). A two-word phrase consisting of a determiner and a pseudoword (the pish) was compared to a minimal list of the same words (pish, the). The activation found was located in a confined area within BA44, in its most ventral part. This supports the notion that BA44 is the core area of syntactic phrase-structure building.
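Because the components reviewed in this section are defined by polarity and latency, a common quantification step is to compare mean amplitudes between conditions within a priori time windows. The sketch below is a hedged illustration of that logic, not analysis code from any study cited here; the ELAN and LAN/N400 windows follow the text, and the 500–800 ms P600 window is a typical assumption.

```python
import numpy as np

# Component time windows in seconds (ELAN, LAN/N400 per the text;
# the P600 window is an assumed, commonly used range).
WINDOWS = {"ELAN": (0.15, 0.20), "LAN/N400": (0.30, 0.50),
           "P600": (0.50, 0.80)}

def mean_amplitude(erp, times, window):
    """Mean voltage of an averaged ERP within a time window."""
    lo, hi = window
    mask = (times >= lo) & (times <= hi)
    return float(erp[mask].mean())

# Tiny demo: a flat "correct" ERP versus a "violation" ERP carrying
# a positivity that peaks at 600 ms (hypothetical amplitudes).
times = np.arange(-0.2, 0.8, 0.002)
erp_correct = np.zeros(times.size)
erp_violation = 3.0 * np.exp(-((times - 0.6) ** 2) / (2 * 0.08 ** 2))

for name, window in WINDOWS.items():
    effect = (mean_amplitude(erp_violation, times, window)
              - mean_amplitude(erp_correct, times, window))
    print(f"{name}: {effect:+.2f} microvolts")
# Only the P600 window shows a large positive condition difference,
# which is how a late positivity would be identified and reported.
```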
17.5 Violations and Beyond

In the preceding section we have demonstrated that language processing in the brain takes place in successive phases. These phases are supported by different parts of a large cortico-subcortical network, which is summarized in Figure 17.1. In the first phase (at around 150 ms), the syntactic category of a word is integrated into the sentence context. If this fails, an ELAN is elicited. The neuronal basis for this process seems to be a perisylvian network of (anterior) STG and IFG (deep frontal operculum). Although most of the fMRI activity is found in left-hemisphere regions, right-hemisphere homologues are often coactivated. In a second phase (approximately between 300 and 500 ms), the integration of lexical-semantic/thematic information (reflected in an N400) as well as morphosyntactic information (reflected in a LAN) takes place. Semantic integration is supported by the (posterior) STG and MTG as well as the IFG, whereas the STG and the IFG also play a role in the integration of morphosyntactic information.
Figure 17.1 Left-hand side: View of the left hemisphere with the cortical gyri (IFG, STG and MTG) that are most relevant for language processing. The respective Brodmann areas (BA) are numbered (see note 1). BA39 is the angular gyrus (AG). Right-hand side: The neurocognitive model is an adapted version of Friederici (2017) showing the successive phases of syntactic and semantic processing (associated with the different language-related ERP components) and the brain regions that support them. Note that early processes of speech segmentation and phonological processing are not depicted here, since they are not discussed in the present chapter. For further explanation see text.
A third phase follows in which a general (largely syntactic) evaluation of the sentence takes place. It seems to be supported by a cortico-subcortical network including (posterior) STG and the basal ganglia. The studies we have presented here are largely based on the processing of violations. Especially with respect to syntax, however, there is another type of experimental manipulation which has attracted increasing interest, namely, the processing of sentences with noncanonical word orders. Syntactic theories assume that each language has a basic (“canonical”) order of core constituents (i.e., verb and arguments). Sentences that do not follow this order are not necessarily ungrammatical but are associated with enhanced processing costs. In English, for example, the canonical order is subject–verb–object, as in (10a).

(10a) The girl called [the boy]i who ti sold the ice cream.
(10b) [The boy]i who the girl called ti sold the ice cream.

In (10a), the NP “the boy” is the object of the main clause, where it follows the subject (“the girl”) and the verb (“called”). At the same time, it is the subject of the relative clause, where it precedes the verb and the object. In (10b), by contrast, “the boy” is the object of the relative clause but precedes the subject and the verb. Therefore, (10b) is an object relative clause, whereas (10a) is a subject relative clause. Some syntactic theories assume that “the boy” has been moved from its original object position in (10b) (and from the subject position in 10a) to derive the “surface” structure object–subject–verb. This is indicated by the “t” (for “trace”) in both (10a) and (10b) and by the “i” that coindexes trace and moved constituent. Sentences in which the constituent order deviates from the canonical one have played an important role in research involving PWA. It was shown that participants with Broca’s aphasia not only have characteristic impairments in language production (non-fluent, “telegram-style” output) but also experience severe comprehension problems with non-canonical sentences, at least if it is not clear on grounds of plausibility alone which constituent is the subject and which the object (Caramazza & Zurif, 1976). It has been proposed that people with Broca’s aphasia lack knowledge of the original position of the moved NP (“the boy”) and therefore cannot reconstruct the movement (cf. Grodzinsky, 2000). As Broca’s aphasia is a syndrome often
associated with the left IFG (BA44 and BA45, Broca’s area), it was proposed that this cortical area plays a key role in processing the dependencies between moved sentence constituents on the “surface” of the sentence and their original positions in the underlying canonical order. Accordingly, imaging research has been undertaken to find out which areas are activated when participants are confronted with a non-canonical sentence structure. It has been argued that the processing difficulty for non-canonical sentences in Broca’s aphasia does not result from an inability to reconstruct a movement but is due to the inability of these individuals to meet the higher working-memory demands, as the moved constituent must be rehearsed until it can be assigned to its original position (Just et al., 1996). Grewe et al. (2005) addressed this question in German, a language with more flexibility in word order than English. The authors found bilateral IFG (pars opercularis/BA44) activation for scrambled sentences compared to sentences with a canonical order. This result cannot be explained in terms of higher working-memory costs, calling the working-memory account into question. Another approach assumes that IFG activation increases with the number of transformations that must be computed to derive a non-canonical surface order (Ben-Shahar et al., 2003). A study in German investigated the processing of canonical versus non-canonical sentences in which either one or two noun phrases were moved, while keeping sentence length constant (Friederici et al., 2006b). Activation in Broca’s area increased parametrically with the number of moved elements. A recent meta-analysis across seventeen studies in different languages with 316 participants revealed that the syntactic operation Move consistently activated left Broca’s region and, to a somewhat lesser extent, posterior temporal regions (Grodzinsky et al., 2021). In another study, which directly contrasted grammatical scrambled sentences with ungrammatical ones, ungrammaticality activated not Broca’s area but the deep frontal operculum (Fiebach et al., 2004). By contrast, word-order variations led to IFG activation but did not alter activity in the frontal operculum. The frontal operculum is a phylogenetically older territory than Broca’s area, which suggests a profound functional differentiation between these two brain regions (see Friederici et al., 2006a). To conclude, the variety of empirical results in the field of neuronal language processing is large. This is due not only to the inherent differences between the methods employed (electrophysiological versus brain imaging), but also to the variety of linguistic manipulations, experimental designs, tasks, languages, and so on. Nevertheless, we have shown that the picture becomes much more coherent if we analyze similar questions from different perspectives, that is, with different methods which cover both the temporal and the spatial parameters of language processing in the brain.
NOTE

1 The best-known and most widely used parcellation of the human cortex based on its cytoarchitecture goes back to the German neuroanatomist Korbinian Brodmann (1868–1918). The resulting cortical areas are therefore termed “Brodmann areas” (“BA”).
REFERENCES

beim Graben, P., Gerth, S., & Vasishth, S. (2008). Towards dynamical system models of language-related brain potentials. Cognitive Neurodynamics, 2(3), 229–255. https://doi.org/10.1007/s11571-008-9041-5
Bemis, D. K., & Pylkkänen, L. (2011). Simple composition: A magnetoencephalography investigation into the comprehension of minimal linguistic phrases. Journal of Neuroscience, 31(8), 2801–2814. https://doi.org/10.1523/JNEUROSCI.5003-10.2011
Bemis, D. K., & Pylkkänen, L. (2013). Flexible composition: MEG evidence for the deployment of basic combinatorial linguistic mechanisms in response to task demands. PLoS One, 8(9), e73949. https://doi.org/10.1371/journal.pone.0073949
Ben-Shahar, M., Hendler, T., Kahn, I., Ben-Bashat, D., & Grodzinsky, Y. (2003). The neural reality of syntactic transformations: Evidence from functional magnetic resonance imaging. Psychological Science, 14, 433–440. https://doi.org/10.1111/1467-9280.01459
Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2006). The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages. Psychological Review, 113(4), 787–821. https://doi.org/10.1037/0033-295X.113.4.787
Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008). An alternative perspective on “semantic P600” effects in language comprehension. Brain Research Reviews, 59(1), 55–73. https://doi.org/10.1016/j.brainresrev.2008.05.003
Bornkessel-Schlesewsky, I., & Schumacher, P. B. (2016). Towards a neurobiology of information structure. In C. Féry & S. Ishihara (Eds.), The Oxford handbook of information structure (pp. 581–598). Oxford University Press.
Brouwer, H., Fitz, H., & Hoeks, J. (2012). Getting real about Semantic Illusions: Rethinking the functional role of the P600 in language comprehension. Brain Research, 1446, 127–143. https://doi.org/10.1016/j.brainres.2012.01.055
Burkhardt, P., Piñango, M. M., & Wong, K. (2003). The role of the anterior left hemisphere in real-time sentence comprehension: Evidence from split intransitivity. Brain and Language, 86(1), 9–22. https://doi.org/10.1016/S0093-934X(02)00526-6
Caramazza, A., & Zurif, E. B. (1976). Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language, 3(4), 572–582. https://doi.org/10.1016/0093-934X(76)90048-1
Chang, C.-T., Lee, C.-Y., Chou, C.-J., Fuh, J.-L., & Wu, H.-C. (2016). Predictability effect on N400 reflects the severity of reading comprehension deficits in aphasia. Neuropsychologia, 81, 117–128. https://doi.org/10.1016/j.neuropsychologia.2015.12.002
Connolly, J. F., Mate-Kole, C. C., & Joyce, B. M. (1999). Global aphasia: An innovative assessment approach. Archives of Physical Medicine and Rehabilitation, 80(10), 1309–1315. https://doi.org/10.1016/S0003-9993(99)90035-7
Coulson, S., King, J. W., & Kutas, M. (1998). Expect the unexpected: Event-related brain response to morphosyntactic violations. Language and Cognitive Processes, 13(1), 21–58. https://doi.org/10.1080/016909698386582
Dapretto, M., & Bookheimer, S. Y. (1999). Form and content: Dissociating syntax and semantics in sentence comprehension. Neuron, 24(2), 427–432. https://doi.org/10.1016/S0896-6273(00)80855-7
Fiebach, C. J., Schlesewsky, M., Bornkessel, I., & Friederici, A. D. (2004). Distinct neural correlates of legal and illegal word order variations in German: How can fMRI inform cognitive models of sentence processing? In M. Carreiras & C. E. Clifton Jr. (Eds.), The on-line study of sentence comprehension: Eyetracking, ERPs and beyond (pp. 357–370). Psychology Press. https://doi.org/10.4324/9780203509050
Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6(2), 78–84. https://doi.org/10.1016/S1364-6613(00)01839-8
Friederici, A. D. (2011). The brain basis of language processing: From structure to function. Physiological Reviews, 91, 1357–1392. https://doi.org/10.1152/physrev.00006.2011
Friederici, A. D. (2017). Language in our brain: The origins of a uniquely human capacity. MIT Press.
Friederici, A. D., & Alter, K. (2004). Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89(2), 267–276. https://doi.org/10.1016/S0093-934X(03)00351-1
Friederici, A. D., Bahlmann, J., Heim, S., Schubotz, R. I., & Anwander, A. (2006a). The brain differentiates human and non-human grammars: Functional localization and structural connectivity. Proceedings of the National Academy of Sciences of the United States of America, 103(7), 2458–2463. https://doi.org/10.1073/pnas.0509389103
Friederici, A. D., Fiebach, C. J., Schlesewsky, M., Bornkessel, I., & von Cramon, D. Y. (2006b). Processing linguistic complexity and grammaticality in the left frontal cortex. Cerebral Cortex, 16(12), 1709–1717. https://doi.org/10.1093/cercor/bhj106
Friederici, A. D., & Frisch, S. (2000). Verb argument structure processing: The role of verb-specific and argument-specific information. Journal of Memory and Language, 43(3), 476–507. https://doi.org/10.1006/jmla.2000.2709
Friederici, A. D., Gunter, T. C., Hahne, A., & Mauth, K. (2004). The relative timing of syntactic and semantic processes in sentence comprehension. NeuroReport, 15(1), 165–169. https://doi.org/10.1097/00001756-200401190-00032
Friederici, A. D., Hahne, A., & von Cramon, D. Y. (1998). First-pass versus second-pass parsing processes in a Wernicke’s and a Broca’s aphasic: Electrophysiological evidence for a double dissociation. Brain and Language, 62(3), 311–341. https://doi.org/10.1006/brln.1997.1906
Friederici, A. D., & Kilborn, K. (1989). Temporal constraints on language processing: Syntactic priming in Broca’s aphasia. Journal of Cognitive Neuroscience, 1(3), 262–272. https://doi.org/10.1162/jocn.1989.1.3.262
Friederici, A. D., Rüschemeyer, S.-A., Hahne, A., & Fiebach, C. J. (2003). The role of left inferior frontal and superior temporal cortex in sentence comprehension: Localizing syntactic and semantic processes. Cerebral Cortex, 13(2), 170–177. https://doi.org/10.1093/cercor/13.2.170
Friederici, A. D., Steinhauer, K., & Frisch, S. (1999). Lexical integration: Sequential effects of syntactic and semantic information. Memory and Cognition, 27(3), 438–453. https://doi.org/10.3758/BF03211539
Friederici, A. D., von Cramon, D. Y., & Kotz, S. A. (1999). Language related brain potentials in patients with cortical and subcortical left hemisphere lesions. Brain, 122(6), 1033–1047. https://doi.org/10.1093/brain/122.6.1033
Friederici, A. D., Wang, Y., Herrmann, C. S., Maess, B., & Oertel, U. (2000). Localization of early syntactic processes in frontal and temporal cortical areas: A magnetoencephalographic study. Human Brain Mapping, 11(1), 1–11. https://doi.org/10.1002/1097-0193(200009)11:1%3C1::AID-HBM10%3E3.0.CO;2-B
Frisch, S., beim Graben, P., & Schlesewsky, M. (2004). Parallelizing grammatical functions: P600 and P345 reflect different cost of reanalysis. International Journal of Bifurcation and Chaos, 14(2), 531–549. https://doi.org/10.1142/S0218127404009533
Frisch, S., Hahne, A., & Friederici, A. D. (2004). Word category and verb-argument structure information in the dynamics of parsing. Cognition, 91(3), 191–219. https://doi.org/10.1016/j.cognition.2003.09.009
Frisch, S., Kotz, S. A., von Cramon, D. Y., & Friederici, A. D. (2003). Why the P600 is not just a P300: The role of the basal ganglia. Clinical Neurophysiology, 114(2), 336–340. https://doi.org/10.1016/S1388-2457(02)00366-8
Frisch, S., & Schlesewsky, M. (2001). The N400 reflects problems of thematic hierarchizing. NeuroReport, 12(15), 3391–3394. https://doi.org/10.1097/00001756-200110290-00048
Frisch, S., & Schlesewsky, M. (2005). The resolution of case conflicts: A neurophysiological perspective. Cognitive Brain Research, 25(2), 484–498. https://doi.org/10.1016/j.cogbrainres.2005.07.010
Frisch, S., Schlesewsky, M., Saddy, D., & Alpermann, A. (2002). The P600 as an indicator of syntactic ambiguity. Cognition, 85(3), B83–B92. https://doi.org/10.1016/S0010-0277(02)00126-9
Goucha, T. B., & Friederici, A. D. (2015). The language skeleton after dissecting meaning: A functional segregation within Broca’s area. NeuroImage, 114(6), 294–302. https://doi.org/10.1016/j.neuroimage.2015.04.011
Graessner, A., Zaccarella, E., Friederici, A. D., Obrig, H., & Hartwigsen, G. (2021). Dissociable contributions of frontal and temporal brain regions to basic semantic composition. Brain Communications, 3(2), fcab090. https://doi.org/10.1093/braincomms/fcab090
Graessner, A., Zaccarella, E., & Hartwigsen, G. (2021). Differential contributions of left-hemispheric language regions to basic semantic composition. Brain Structure & Function, 226(2), 501–518. https://doi.org/10.1007/s00429-020-02196-2
Grewe, T., Bornkessel, I., Zysset, S., Wiese, R., von Cramon, D. Y., & Schlesewsky, M. (2005). The emergence of the unmarked: A new perspective on the language-specific function of Broca’s area. Human Brain Mapping, 26(3), 178–190. https://doi.org/10.1002/hbm.20154
Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca’s area. Behavioral and Brain Sciences, 23(1), 1–71. https://doi.org/10.1017/S0140525X00002399
Grodzinsky, Y., Pieperhoff, P., & Thompson, C. (2021). Stable brain loci for the processing of complex syntax: A review of the current neuroimaging evidence. Cortex, 142, 252–271. https://doi.org/10.1016/j.cortex.2021.06.003
Gunter, T. C., Schriefers, H., & Friederici, A. D. (2000). Syntactic gender and semantic expectancy: ERPs reveal early autonomy and late interaction. Journal of Cognitive Neuroscience, 12(4), 556–568. https://doi.org/10.1162/089892900562336
Gunter, T. C., Stowe, L. A., & Mulder, G. (1997). When syntax meets semantics. Psychophysiology, 34(6), 660–676. https://doi.org/10.1111/j.1469-8986.1997.tb02142.x
Hagner, M. (2000). Homo cerebralis. Insel.
Hagoort, P. (2003). Interplay between syntax and semantics during sentence comprehension: ERP effects of combining syntactic and semantic violations. Journal of Cognitive Neuroscience, 15(6), 883–899. https://doi.org/10.1162/089892903322370807
Hagoort, P. (2008). The fractionation of spoken language understanding by measuring electrical and magnetic brain signals. Philosophical Transactions of the Royal Society B – Biological Sciences, 363(1493), 1055–1069. https://doi.org/10.1098/rstb.2007.2159
Hagoort, P., Brown, C. M., & Osterhout, L. (1999). The neurocognition of syntactic processing. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 273–316). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198507932.003.0009
Hagoort, P., Wassenaar, M., & Brown, C. M. (2003a). Syntax-related ERP effects in Dutch. Cognitive Brain Research, 16(1), 38–50. https://doi.org/10.1016/S0926-6410(02)00208-2
Hagoort, P., Wassenaar, M., & Brown, C. M. (2003b). Real-time semantic compensation in patients with agrammatic comprehension: Electrophysiological evidence for multiple-route plasticity. Proceedings of the National Academy of Sciences of the United States of America, 100(7), 4340–4345. https://doi.org/10.1073/pnas.0230613100
Hahne, A., & Friederici, A. D. (2001). Processing a second language: Late learners’ comprehension strategies as revealed by event-related brain potentials. Bilingualism: Language and Cognition, 4(2), 123–141. https://doi.org/10.1017/S1366728901000232
Hahne, A., & Friederici, A. D. (2002). Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research, 13(3), 339–356. https://doi.org/10.1016/S0926-6410(01)00127-6
Hasting, A., & Kotz, S. A. (2008). Speeding up syntax: On the relative timing and automaticity of local phrase structure and morphosyntactic processing as reflected in event-related brain potentials. Journal of Cognitive Neuroscience, 20(7), 1207–1219. https://doi.org/10.1162/jocn.2008.20083
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92(1–2), 67–99. https://doi.org/10.1016/j.cognition.2003.10.011
Hickok, G., & Poeppel, D. (2015). Neural basis of speech perception. Handbook of Clinical Neurology, 129, 149–160. https://doi.org/10.1016/B978-0-444-62630-1.00008-1
Jakuszeit, M., Kotz, S. A., & Hasting, A. S. (2013). Generating predictions: Lesion evidence on the role of left inferior frontal cortex in rapid syntactic analysis. Cortex, 49(10), 2861–2874. https://doi.org/10.1016/j.cortex.2013.05.014
Just, M., Carpenter, P., Keller, T., Eddy, W., & Thulborn, K. (1996). Brain activation modulated by sentence comprehension. Science, 274(5284), 114–116. https://doi.org/10.1126/science.274.5284.114
Kaan, E., Harris, A., Gibson, E., & Holcomb, P. J. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15(2), 159–201. https://doi.org/10.1080/016909600386084
Kaan, E., & Swaab, T. (2003). Repair, revision, and complexity in syntactic analysis: An electrophysiological differentiation. Journal of Cognitive Neuroscience, 15(1), 98–110. https://doi.org/10.1162/089892903321107855
Kawohl, W., Bunse, S., Willmes, K., Hoffrogge, A., Buchner, H., & Huber, W. (2010). Semantic event-related potential components reflect severity of comprehension deficits in aphasia. Neurorehabilitation & Neural Repair, 24(3), 282–289. https://doi.org/10.1177/1545968309348311
Khachatryan, E., De Letter, M., Vanhoof, G., Goeleven, A., & Van Hulle, M. M. (2017). Sentence context prevails over word association in aphasia patients with spared comprehension: Evidence from N400 event-related potential. Frontiers in Human Neuroscience, 10, 684. https://doi.org/10.3389/fnhum.2016.00684
Kim, A., & Osterhout, L. (2005). The independence of combinatory semantic processing: Evidence from event-related potentials. Journal of Memory and Language, 52(2), 205–225. https://doi.org/10.1016/j.jml.2004.10.002
Kotz, S. A., Frisch, S., von Cramon, D. Y., & Friederici, A. D. (2003). Syntactic language processing: ERP lesion data on the role of the basal ganglia. Journal of the International Neuropsychological Society, 9(7), 1053–1060. https://doi.org/10.1017/S1355617703970093
Kotz, S. A., Holcomb, P. J., & Osterhout, L. (2008). ERPs reveal comparable syntactic sentence processing in native and non-native readers of English. Acta Psychologica, 128(3), 514–527. https://doi.org/10.1016/j.actpsy.2007.10.003
Kotz, S. A., von Cramon, D. Y., & Friederici, A. D. (2003). Differentiation of syntactic processes in the left and right anterior temporal lobe: Event-related brain potential evidence from lesion patients. Brain and Language, 87(1), 135–136. https://doi.org/10.1016/S0093-934X(03)00236-0
Kubota, M., Ferrari, P., & Roberts, T. P. L. (2003). Magnetoencephalography detection of early syntactic processing in humans: Comparison between L1 speakers and L2 learners of English. Neuroscience Letters, 353(2), 107–110. https://doi.org/10.1016/j.neulet.2003.09.019
Kuperberg, G. R. (2007). Neural mechanisms of language comprehension: Challenges to syntax. Brain Research, 1146, 23–49. https://doi.org/10.1016/j.brainres.2006.12.063
Kuperberg, G. R., McGuire, P. K., Bullmore, E. T., Brammer, M. J., Rabe-Hesketh, S., Wright, I. C., Lythgoe, D. J., Williams, S. C. R., & David, A. S. (2000). Common and distinct neural substrates for pragmatic, semantic, and syntactic processing of spoken sentences: An fMRI study. Journal of Cognitive Neuroscience, 12(2), 321–341. https://doi.org/10.1162/089892900562138
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4(12), 463–470. https://doi.org/10.1016/S1364-6613(00)01560-6
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207(4427), 203–205. https://doi.org/10.1126/science.7350657
Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9(12), 920–933. https://doi.org/10.1038/nrn2532
Lichtheim, L. (1885). On aphasia. Brain, 7(4), 433–484. https://doi.org/10.1093/brain/7.4.433
Meechan, R. J. H., McCann, C. M., & Purdy, S. C. (2021). The electrophysiology of aphasia: A scoping review. Clinical Neurophysiology, 132(12), 3025–3034. https://doi.org/10.1016/j.clinph.2021.08.023
Mecklinger, A., Schriefers, H., Steinhauer, K., & Friederici, A. D. (1995). Processing relative clauses varying on syntactic and semantic dimensions: An analysis with event-related potentials. Memory and Cognition, 23(4), 477–494. https://doi.org/10.3758/BF03197249
Neville, H., Nicol, J., Barss, A., Forster, K., & Garrett, M. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3(2), 151–165. https://doi.org/10.1162/jocn.1991.3.2.151
Newman, A. J., Pancheva, R., Ozawa, K., Neville, H. J., & Ullman, M. T. (2001). An event-related fMRI study of syntactic and semantic violations. Journal of Psycholinguistic Research, 30(3), 339–364. https://doi.org/10.1023/A:1010499119393
Ni, W., Constable, T., Mencl, W. E., Pugh, K. R., Fulbright, R. K., Shaywitz, S. E., Shaywitz, B. A., Gore, J. C., & Shankweiler, D. (2000). An event-related neuroimaging study: Distinguishing form and content in sentence processing. Journal of Cognitive Neuroscience, 12(1), 120–133. https://doi.org/10.1162/08989290051137648
Osterhout, L., Holcomb, P., & Swinney, D. (1994). Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of Experimental Psychology: Learning, Memory and Cognition, 20(4), 786–803. https://psycnet.apa.org/doi/10.1037/0278-7393.20.4.786
Osterhout, L., & Holcomb, P. J. (1992). Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language, 31(6), 785–806. https://doi.org/10.1016/0749-596X(92)90039-Z
Osterhout, L., & Mobley, L. A. (1995). Event-related brain potentials elicited by failure to agree. Journal of Memory and Language, 34(6), 739–773. https://doi.org/10.1006/jmla.1995.1033
Raettig, T., Kotz, S. A., Frisch, S., & Friederici, A. D. (2005). Neural correlates of verb-argument structure and semantic processing: An efMRI study. Journal of Cognitive Neuroscience, Supplement, 77.
Regel, S., Kotz, S. A., Henseler, I., & Friederici, A. D. (2017). Left inferior frontal gyrus mediates morphosyntax: ERP evidence from verb processing in left-hemisphere damaged patients. Cortex, 86, 156–171. https://doi.org/10.1016/j.cortex.2016.11.007
Revonsuo, A., & Laine, M. (1996). Semantic processing without conscious understanding in a global aphasic: Evidence from auditory event-related brain potentials. Cortex, 32(1), 29–48. https://doi.org/10.1016/S0010-9452(96)80015-3
Rüschemeyer, S.-A., Fiebach, C. J., Kempe, V., & Friederici, A. D. (2005). Processing lexical semantic and syntactic information in first and second language: FMRI evidence from German and Russian. Human Brain Mapping, 25(2), 266–286. https://doi.org/10.1002/hbm.20098
Schell, M., Zaccarella, E., & Friederici, A. D. (2017). Differential cortical contribution of syntax and semantics: An fMRI study on two-word phrasal processing. Cortex, 96, 105–120. https://doi.org/10.1016/j.cortex.2017.09.002
Sheppard, S. M., Love, T., Midgley, K. J., Holcomb, P. J., & Shapiro, L. P. (2017). Electrophysiology of prosodic and lexical-semantic processing during sentence comprehension in aphasia. Neuropsychologia, 107, 9–24. https://doi.org/10.1016/j.neuropsychologia.2017.10.023
Steinhauer, K., & Drury, J. E. (2012). On the early left-anterior negativity (ELAN) in syntax studies. Brain and Language, 120(2), 135–162. https://doi.org/10.1016/j.bandl.2011.07.001
Swaab, T. Y., Brown, C. M., & Hagoort, P. (1997). Spoken sentence comprehension in aphasia: Event-related potential evidence for a lexical integration deficit. Journal of Cognitive Neuroscience, 9(1), 39–66. https://doi.org/10.1162/jocn.1997.9.1.39
Swaab, T. Y., Brown, C. M., & Hagoort, P. (1998). Understanding ambiguous words in sentence contexts: Electrophysiological evidence for delayed contextual selection in Broca’s aphasia. Neuropsychologia, 36(8), 737–761. https://doi.org/10.1016/S0028-3932(97)00174-7
ter Keurs, M., Brown, C. M., Hagoort, P., & Stegeman, D. F. (1999). Electrophysiological manifestations of open- and closed-class words in patients with Broca’s aphasia with agrammatic comprehension: An event-related brain potential study. Brain, 122(5), 839–854. https://doi.org/10.1093/brain/122.5.839
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92(1–2), 231–270. https://doi.org/10.1016/j.cognition.2003.10.008
Wassenaar, M., Brown, C. M., & Hagoort, P. (2004). ERP effects of subject-verb agreement violations in patients with Broca’s aphasia. Journal of Cognitive Neuroscience, 16(4), 553–576. https://doi.org/10.1162/089892904323057290
Wassenaar, M., & Hagoort, P. (2005). Word-category violations in patients with Broca’s aphasia: An ERP study. Brain and Language, 92(2), 117–137. https://doi.org/10.1016/j.bandl.2004.05.011
Wassenaar, M., & Hagoort, P. (2007). Thematic role assignment in patients with Broca’s aphasia: Sentence-picture matching electrified. Neuropsychologia, 45(4), 716–740. https://doi.org/10.1016/j.neuropsychologia.2006.08.016
Wernicke, C. (1874). Der aphasische Symptomencomplex: Eine psychologische Studie auf anatomischer Basis [The aphasic symptom complex: A psychological study from an anatomical basis]. M. Cohn und Weigert.
Ye, Z., Lou, Y. J., Friederici, A. D., & Zhou, X. L. (2006). Semantic and syntactic processing in Chinese sentence comprehension: Evidence from event-related potentials. Brain Research, 1071(1), 186–196. https://doi.org/10.1016/j.brainres.2005.11.085
Zaccarella, E., & Friederici, A. D. (2015). Merge in the human brain: A sub-region based functional investigation in the left pars opercularis. Frontiers in Psychology, 6, 1818. https://doi.org/10.3389/fpsyg.2015.01818
18 Developmental Language Disorder in a Bilingual Context

JAN DE JONG

18.1 Introduction

Developmental language disorder (DLD)1 is an impairment in the learning of language that cannot be explained by another condition causing the language difficulties. The disorder can occur in monolingual and bilingual2 children alike. Within bilingualism, a distinction must be made: children can be bilingual from birth (typically when parental input was in two languages from the start: simultaneous bilingualism) or acquire a second language after the first (successive or sequential bilingualism). In bilingual children the verification of a DLD diagnosis is complicated by the fact that a delay in the second language can also be caused by limited exposure to that language; in typically developing (TD) children such a delay will usually be transient. This confusing situation is particularly relevant for successive-bilingual children, who are exposed to the second language (L2) later. To detect DLD in bilinguals, a measurement of their language skills in the first language (L1), where input was available from the outset, may therefore be warranted. In diagnostic practice it is often not easy to assess the L1 of a child, since the professional responsible for the testing or observation often lacks sufficient knowledge of the child’s first language. For this and other reasons, misdiagnosis often happens in bilingual children. Paradis, Genesee and Crago (2021) described two types of potential misidentification of bilingual children. On the one hand, poor insight into what is normal in second language learning can sometimes lead to a referral for language intervention when there is no need for it (“overidentification”): a temporary delay is interpreted as a disorder. On the other hand, the problems of a language-impaired child can be overlooked (“underidentification”), perhaps because the child’s slow development is seen as the natural consequence of the time it takes to master a second language, leading to a “wait and see” approach. It will be clear that this diagnostic problem is most pertinent to successive-bilingual children, where assessment of the L1 may serve as a litmus test for the identification of impairment. In simultaneous-bilingual children, the assumption is that input in both languages is sufficient to ensure that the child is not at risk of delay due to lack of input; signs of impairment will be found in either language.
In this chapter, we will first describe some patterns of bilingual DLD that are found in the research literature, where groups differing in impairment condition (that is, with or without DLD) and/or language status (monolingual vs. bilingual) have been compared in order to understand what bilingual DLD entails (a more elaborate discussion of these comparisons can be found in reviews by Bedore and Peña (2008) and Kohnert (2010)). These comparisons will be used to come to an understanding of some basic aspects of bilingual DLD. Subsequently, we will exemplify ways in which the assessment of bilingual DLD can be (or has been) improved, thanks to research and new diagnostic instruments. The earlier version of this chapter (de Jong, 2008) focused on the state of knowledge about bilingual DLD. At that point in time the difficulties involved in the diagnosis of DLD in bilinguals had been identified, but solutions less so. Since then, innovative proposals for assessment have dominated the literature. What was presented earlier as a (diagnostic) problem has thus more recently led to fruitful attempts at a solution. This chapter is an attempt to reap the benefits of that progress.
18.1.1 The Nature of Bilingual DLD – Relevant Comparisons

In choosing comparisons that involve (bilingual) DLD, researchers have often asked what counts as a “fair comparison” for identifying its nature. This mirrors a question that diagnosticians also face in practice. When a speech-language therapist encounters a bilingual child with suspected DLD, the question may be to which children in the therapist’s caseload it should be compared in order to establish whether there is reason for concern. To answer these questions, it is useful to briefly review the relevant group comparisons in the research literature.
18.1.1.1 Monolingual Children with DLD and Typically Developing L2 Learners

Similarities between monolingual DLD and typical L2 acquisition drew attention early on. They were the topic of a discussion initiated by Paradis (2010) in a special issue of the journal Applied Psycholinguistics. Two early studies highlighting similarities between the two groups illustrate this. Paradis and Crago (2000) found similarities between monolingual French-speaking children with DLD and TD English-speaking L2 learners of French. The similarity showed up in shared difficulties with the marking of finiteness, tense and subject-verb agreement as well as in the production of object clitics – all problems known to be markers of DLD in French. Paradis (2005) extended this comparison to include a diverse group of TD L2 learners of English whose first languages were minority languages in Canada; these languages were also typologically different. Paradis found that this linguistically diverse group also exhibited difficulties with the grammatical morphology of English (their L2). Their error patterns and accuracy rates resembled those of monolingual English-speaking children with DLD. Such comparisons suggest that there are areas in the target language that are a challenge both for monolingual children with DLD and for typically developing L2 learners. Results such as these present a significant diagnostic dilemma. After all, if the same “symptom” is identified in children with DLD as well as in TD second language learners, how does one correctly diagnose bilingual children with DLD based on their second language skills? The clear implication of these studies is that the monolingual symptoms of DLD in the target language cannot be taken as valid markers for bilingual DLD. In L2 learners,
these characteristics may be a temporary part of typical development. This is not a trivial conclusion in a situation where the identification of DLD in bilingual children (due to a lack of diagnostic tools for the L1) is often based on their L2 performance. As for the “fairness” of the comparison: it is inherently flawed, since the groups differ in DLD as well as in language status, so one cannot establish to what extent bilingualism (lack of input) or a language impairment contributes to the child’s poor performance. To complement – and correct – the picture of apparent similarity presented above, two observations are important. One is that the similarity found in grammatical morphology – which has long been the focus of research – does not extend to other aspects of language, like syntactic complexity, where children with DLD consistently fall behind. Another important consideration is that the similarity is superficial, since the causes are different. Whereas the cause of the morphological difficulties in TD bilinguals is primarily lack of input, in DLD the cause is the language disorder itself. This difference can be made visible by research into the processing of grammatical morphology. For example, Chondrogianni and Marinis (2012) found that TD bilinguals, unlike children with DLD, showed sensitivity to ungrammaticality (absence of morphemes) in an online processing task, while in (offline) production of the same morphemes they made similar errors.
18.1.1.2 Successive Bilingual Children with DLD and Monolingual Children with DLD

This comparison has been used to test a specific hypothesis: if bilingualism is an additional burden for language-impaired children, there will be a difference (in terms of severity of the disorder) between monolingual and bilingual children with DLD. This hypothesis originates in the assumption that a child with DLD may face a processing “overload” by having to deal with input in two languages. Another rationale for the comparison could be that group differences reveal specific clinical markers of bilingual DLD in the L2. The search for such markers has proven elusive, however, partly because comparisons typically focus on the presence or absence of markers already found in monolingual DLD, thus missing out on exclusive markers of bilingual DLD, if any. There is a way in which this comparison is also made in everyday practice: for speech-language therapists dealing with bilingual children with language impairment, what may draw their clinical attention is the way in which the L2 of these children (which usually is also the language the therapists speak themselves) resembles that of monolingual children with DLD for whom the same language is their L1. To return to the hypothesis about “double” processing mentioned above, research so far shows no incremental disadvantage. Occasionally an apparent disadvantage was found: Verhoeven et al. (2011) concluded that bilingual children with DLD with Dutch as L2 were “additionally disadvantaged,” given the difference between monolingual and bilingual children with DLD. For such results, however, there is an alternative explanation: the bilingual children’s lack of input. By definition, bilingual children with DLD show the effects of impairment as well as of less input, creating a difference between monolingual and bilingual children with DLD that is essentially predictable (Leonard, 2014; Paradis, 2010). Paradis suggested that disproportionate impairment in bilingual DLD should be measured by effect size. Following that rationale, her re-interpretation of the Dutch study showed the DLD not to be aggravated by bilingualism: the effect was additive, not cumulative. This is an important finding, also relevant for practice: it provides a strong argument against discouraging parents from using the L1 at home (because two languages would be “too much”) – a recommendation often favored in the past.
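The effect-size logic can be made concrete with a small numerical sketch. In the Python example below, every group mean, standard deviation, and sample size is invented purely for illustration (it does not reproduce the Verhoeven et al. data): bilingualism and DLD each lower scores, but the standardized DLD effect is about the same in monolinguals and bilinguals, which is the additive (not cumulative) pattern described above.

```python
import numpy as np

rng = np.random.default_rng(1)

def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

# Invented test scores: bilingualism costs ~5 points (an input effect),
# DLD costs ~10 points, and the two effects simply add up.
mono_td  = rng.normal(100, 10, 30)
mono_dld = rng.normal(90, 10, 30)
bi_td    = rng.normal(95, 10, 30)
bi_dld   = rng.normal(85, 10, 30)

# Comparable effect sizes across language-status groups indicate that
# DLD is not aggravated by bilingualism, even though bilingual children
# with DLD have the lowest raw scores of all four groups.
print("DLD effect size, monolinguals:", round(cohens_d(mono_td, mono_dld), 2))
print("DLD effect size, bilinguals:  ", round(cohens_d(bi_td, bi_dld), 2))
```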
18.1.1.3 Simultaneous Bilingual Children with DLD and Monolingual Children with DLD

Paradis et al. (2003) also compared monolinguals and bilinguals. However, their participants were French-English simultaneous-bilingual children with DLD, who were compared to French children without DLD and to monolingual children with DLD in either language. (Of course, in simultaneous bilingualism there is no real distinction between what is the L1 and what is the L2.) As in the study on successive bilinguals referred to above, Paradis et al. found similarities between the two groups. Problems with tense marking appeared in all language-impaired children, both the bilingual group and the monolingual groups. Non-tense morphemes fared better in both groups. The authors concluded that, given the absence of disproportionate problems among the bilinguals, this suggests that children with DLD can learn two languages, even while they show effects of DLD in both. Bilingualism is thus not a “risk factor” or an additional burden. This echoes what was said earlier about successive-bilingual children, where a cumulative effect was disproved. If such an effect existed, bilingual children with DLD would have to differ both from their TD counterparts and from monolingual children with DLD. Note that whereas in the case of successive bilinguals an important factor – lack of input – had to be “extracted” from the comparison, this is not the case for simultaneous bilinguals: their input is not assumed to be lacking.
18.1.1.4 Bilingual Children with DLD and Typically Developing Bilinguals

In one of the first studies that used this comparison, Jacobson and Schwartz (2005) focused on English past tense morphology in two groups of sequential bilingual Spanish–English speakers (age 7;0–9;0). They found differences in the profiles of language-impaired (LI) and typically developing children. (The study did not refer to the then commonly used term SLI (see note 1) “because standardized test scores necessary to meet the criterion for classification of SLI were not available”; Jacobson & Schwartz, 2005, p. 314.) Not only did the LI children produce more errors, as was expected, but there was also a qualitative difference between the two groups. Children with LI performed better on irregular verbs, the typically developing group on regular verbs (a later study by Jacobson and Yu (2018) found greater difficulties with irregular verbs). The typically developing children also produced more productive error types (overregularizations), while the LI children produced more omissions. The authors proposed that an error analysis based on these findings may be helpful in the diagnosis of language impairment in bilingual children. The pattern they found – omissions in the language of children with DLD next to “creative errors” in TD children – has also been found in other studies and for different languages (Blom et al., 2022). The study referred to above addressed grammatical morphology. Again, this is just one aspect of DLD; less complex syntax will also be found when comparing these groups. Over the last few years this comparison has often been carried out with newly developed language tests (to be discussed later). The aim in those studies was to establish whether the instruments could distinguish bilinguals with and without DLD.
18.1.1.5 Multiple Comparisons

The ideal way of teasing apart the respective influences of DLD and bilingualism is to make comparisons in which they can be isolated. This requires the inclusion of children with and without DLD and of monolinguals as well as bilinguals, thus multiplying the comparisons included. The value of this approach was shown by Blom et al. (2013), who found in a multiple comparison in Dutch (including bilingual and monolingual children, with and without DLD) that TD children outperformed children with DLD groupwise on subject-verb agreement, while bilingualism was not decisive. Even though the results raised doubts about inflection as a diagnostic marker for individual children, the study illustrates the rationale for separating the respective roles of impairment and bilingualism.
18.1.1.6 What Do These Approaches Contribute to Our Understanding of Bilingual DLD?

The comparisons described above allow for some conclusions about the nature of bilingual DLD. First of all, a similarity between the error patterns of TD bilinguals and monolingual children with DLD was found in multiple studies. It concerns, however, a limited set of grammatical markers. While the errors may look alike, they are caused by different underlying factors – poor language ability in DLD and lack of input in bilinguals. Processing studies can reveal the differences behind superficial similarities. The assumption of a cumulative effect of bilingualism and DLD has been disproven. Both factors have their impact, by definition, but bilingualism does not aggravate the DLD condition. By and large the symptoms of DLD in each of the two languages are similar to those in monolinguals, although (as we shall see) there is occasional transfer from the L1 to the L2, and language dominance plays a significant role. Finally, by selecting research groups that differ in impairment and language status, the relative contributions of bilingualism and DLD can be separated. This is important in order to verify which characteristics in bilinguals can exclusively mark language impairment and thus prevent overidentification. Before turning to assessment, the obvious must be stated: DLD in a bilingual child is by definition found in both languages. After all, DLD is a child-internal condition, not dependent on environment. At the same time, due to cross-linguistic differences, the symptoms may not be the same. Linguistic symptoms, after all, are determined not only by the DLD condition itself, but also by the typology (structure) of the languages involved. A consequence of the definition of bilingual DLD is that the diagnosis cannot be based on the L2 alone. In successive-bilingual children, a measurement of the language abilities in the L1 constitutes a litmus test for the diagnosis. The reality, meanwhile, is that diagnosticians who are not familiar with the child’s L1 (and this is often the case) must often resort to instruments developed for monolingual speakers of the L2 and interpret the results while taking into account that the child is bilingual. For that reason, we will discuss assessment for both languages.
18.2 Bilingual Assessment

18.2.1 Assessing the First Language

As mentioned before, assessment of the L1 is an important step in identifying successive-bilingual children with DLD, in particular when the L1 is dominant. The key question here is how to unlock information on the L1 in the absence of native-speaker knowledge of the language. One important precondition for understanding a child’s L1 performance is a basic knowledge of how typology impacts DLD. The influence of typology on symptoms of DLD is understood much better now, thanks to a range of cross-linguistic studies of DLD. A brief sketch of such studies is helpful here.
18.2.1.1 Typological Differences

Comparative studies mostly include monolingual children. These studies are relevant because a bilingual child’s two languages may be quite different in structure, and any cross-linguistic differences found in monolingual groups may show up within a single bilingual child. While the profiles of the two languages used by bilingual children with DLD may superficially be similar (in that each of their languages is delayed and shows less complexity than in children without DLD), the structure of each language influences which elements of that language might be easy to learn and which might be challenging. There is a rich tradition of research into cross-linguistic differences in the symptoms of monolingual DLD. Leonard (2014) gives a broad overview in his seminal book. His conclusion from the cross-linguistic research so far is that grammatical morphology is impaired in every language studied, but the nature of the impairment varies with the typological characteristics of the language. For instance, when a language has a rich (and uniform) morphology, fewer errors are found than in a language with sparse morphology, including in children with DLD. In languages with a rich morphology, substitutions are found more often in the output of children with DLD; in languages with a sparse morphology, omissions predominate. Salience also plays a significant role in explaining the vulnerability of morphemes: non-salient morphemes (phonologically weak, non-syllabic) are more vulnerable; in languages where the same feature is marked by a salient morpheme, children with DLD make fewer errors. Another difference lies in the domains affected: in some languages nominal morphology is more affected by DLD than verbal morphology, and vice versa – depending on the importance of either. This can be exemplified by studies of bilingual children with Turkish as their L1 and Dutch as their L2: in Turkish, the symptoms are more prominent in nominal morphology, whereas in Dutch verbal morphology is more affected (Blom et al., 2022). The differences mentioned above can all be found in the two languages of a bilingual child with DLD. In the words of Paradis et al. (2021, p. 302): “Bilingualism does not change the language-specific profiles” in a child with DLD. This means that, in each of the two languages, symptoms resemble those of monolinguals. There is, however, a way in which the languages are not separate. Apart from the fact that elements of the L1 are occasionally transferred to the L2 (leading to interference errors), typological proximity can also assist the child, leading to positive transfer. For example, when subject-verb agreement is a feature of the L1 as well as the L2, the child’s familiarity with it in the L1 may support its acquisition in the L2. As Paradis et al. (2021, p. 159) observed, negative transfer (in the form of errors) is more visible than positive transfer, thus masking the scaffolding potential of the latter. Of course, transfer is most relevant for its impact on the second language and thus has to be kept in mind when assessing the L2 (the topic of the next section).
18.2.1.2 Language Assessment: Language Sampling

As argued above, in the first language a lack of input will not influence the child’s language skills, and thus DLD can be identified more reliably. There are, however, several factors that complicate L1 assessment. The first is that (monolingual) test materials are not available for all languages. Thordardottir (2015) makes a distinction between languages for which no formal test is available and communities where there is no diagnostic tradition in the first place. In the absence of tests, she proposes, use can be made of case history (including parental information) and a sample of spontaneous language (and non-word repetition, to be discussed later). Language sampling raises a new problem: quite often the diagnostician does not speak the L1, and testing or analysis of spontaneous speech in the L1 can only be done
with the help of an interpreter. In certain situations, a diagnostician may be able to carry out the assessment in both languages. An obvious example is the testing of L1 Spanish in the US, where many speech therapists are themselves bilingual and where bilingual assessment tools for these two languages are widely available. However, this is the exception. A recommendation like Thordardottir’s to collect a language sample in the L1 does come with some caveats. The clinician’s lack of knowledge of the L1 must be compensated for. It is important that the clinician has some information on the typology of the language. As mentioned above, we have knowledge of the symptoms of DLD in a number of languages (Leonard, 2014). Literature on how languages differ can also be helpful (consider, for example, resources like Ethnologue (https://www.ethnologue.com)). Taking into consideration the role of cross-linguistic differences mentioned earlier, this can lead to an initial hypothesis about possible symptoms (in particular when symptoms of DLD are known for a typologically related language). Julien (2019) proposed a procedure for gathering and analyzing spontaneous language that can serve as an example. Her guideline involves the following steps: (1) record 25 utterances in the L1; (2) ask the interpreter to make a transcription, while clarifying that the transcription should not correct errors; (3) request a literal translation of each utterance; (4) discuss with the interpreter whether each utterance is correct and, if not, what the correct form would be; (5) identify the nature of any error (in morpho-syntax, lexicon, pragmatics, or phonology). This strategy requires a well-informed interpreter who is aware of the aims of the language analysis. It is worth mentioning that the analysis of spontaneous language can sometimes be facilitated by dedicated methods like LARSP, which allows the clinician to draw up a grammatical profile of the child’s language production. Importantly, in three recent volumes (the first of which is Ball et al., 2012) LARSP analyses and profiles for a range of languages have been made available, allowing for the analysis of many children’s L1s. Another method, SALT (Miller et al., 2015), is also available for Spanish and Turkish. Again, these methods require collaboration with a native speaker of the language.
18.2.1.3 Language Change and Loss

A factor that should be considered when assessing the L1 is the potential difference between a language as spoken in the country of origin and as a heritage language. After speakers of a language migrate to another country, their (first) language may change, partly under the influence of the language of the host country. When their children learn the same language (now a heritage language) from their parents, the L1 input they encounter may differ from that of children in the country of origin. An example can be taken from Turkish as spoken in the Netherlands: grammatical contexts in which accusative marking of the direct object is obligatory in Turkey have become optional contexts in Turkish in the Netherlands. This “deflection” – in the form of a loss of the accusative – is now part of the input of Turkish children in the Netherlands. Consequently, the absence of an accusative marker cannot be interpreted as an error (or as a marker of DLD) (Blom et al., 2022). The implication is that if a monolingual test includes items for a grammatical feature that is not present in a bilingual context, those items are invalid. The heritage language can also differ in use. Language dominance may change in that the L2 becomes dominant – an important factor in L2 learners. In addition, in children (or adults) the L1 may be lost under the influence of attrition. When a child (partly) loses its L1, a delay in the L1 will be the sum of attrition and possible DLD, making it difficult to identify
the contribution of each. Very little research is available on the significance of attrition in DLD, but there are anecdotal indications that attrition may affect children with DLD disproportionately. Restrepo and Kruth (2000) described a case that exemplifies this; the authors hypothesize that in children with DLD, who need more exemplars of a language form in order to learn it, the effect of attrition may compound the difficulties in the L1.
18.2.2 Assessing the Second Language

In this section we start from the assumption that there are (monolingual) instruments for the assessment of the L2 – recognizing that this is not always the case. The overarching question then is to what extent available instruments can be used with bilingual children. We already mentioned that misdiagnosis of these children may be caused by the limitations of standardized language tests for the L2. This concerns the tests used but also their interpretation (Paradis et al., 2021, p. 325). Paradis et al. recommend the use of a broad set of language-general and language-specific tests, considering that no single test with monolingual norms can generate conclusive results.
18.2.2.1 Language Assessment and Test Interpretation

In order for a test to be adequate, it has to be sensitive and specific. Sensitivity refers to the test’s ability to correctly identify children who fall behind and are cause for concern; specificity is its ability to correctly identify children whose performance is typical. In use with bilingual groups in particular, the specificity of tests often turns out to be low. This means that a child may incorrectly be judged to be language-delayed because they do not meet the monolingual norms. The fact that the norms are monolingual is precisely the problem: the children are compared to children in the normative base who have had more input in the L2, putting the bilingual children at a disadvantage. A solution (of which there are few examples) is to develop bilingual norms – an approach with its own challenges due to the heterogeneity of the bilingual population (differences in L1s and in language history and dominance). For test publishers, the expense of creating a normative database may also be a discouraging factor. Alternatively, proposals have been made for “local norms,” based on a cumulative database of bilingual children (Bedore & Peña, 2008); this approach reflects a common experience among clinicians, whose diagnostic judgment develops by encountering multiple bilingual children. Kohnert et al. (2021, p. 175) explicitly relate this approach to clinical practice: “The clinician can compare the child to his or her classmates or siblings, or to local norms that are developed over time.” Thordardottir (2015) has proposed an alternative way to interpret monolingual test results for the L2, namely by adopting more lenient criteria for suspected DLD in bilingual children. Cut-off criteria here depend on language dominance. For example, a cut-off point of –2.25 to –2.50 standard deviations below the mean in two areas of language on an L2 test is suggested for children whose L1 is dominant. Another approach to assessment has been described by Kohnert (2010). Given that the L2 of bilingual children is to a large extent the product of experience, she proposes that measures that minimize the contribution of input are better tools. While tests (and spontaneous language in the L2) are product measures, a different road is to measure language processing – mentioned above as a way of distinguishing typical L2 acquisition from monolingual DLD. To further illustrate what makes “process” different from “product”: a task in which the items are nonsense words or low-frequency words is challenging not just for bilingual children but also for monolinguals. This way, both groups are on an equal footing in terms of processing demands. Non-word repetition is discussed in the next section.
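The two test properties and the lenient cut-off logic can be illustrated in a short Python sketch. Everything in it is hypothetical: the z-scores, the group sizes, the independent reference diagnosis, and the –1.25 SD “monolingual” cut-off (a conventional placeholder, not a value taken from this chapter); only the –2.25 SD criterion echoes Thordardottir’s suggestion above. The point it demonstrates is that applying a monolingual cut-off to bilingual children whose typical scores sit below monolingual norms depresses specificity, while a more lenient cut-off restores it.

```python
import numpy as np

def sensitivity_specificity(flagged, has_dld):
    """flagged and has_dld are boolean arrays: test decision vs. reference diagnosis."""
    sens = (flagged & has_dld).sum() / has_dld.sum()       # true positive rate
    spec = (~flagged & ~has_dld).sum() / (~has_dld).sum()  # true negative rate
    return sens, spec

rng = np.random.default_rng(2)
# Hypothetical z-scores on an L2 test for 200 bilingual children, 20 of whom
# have DLD according to an independent reference standard. TD bilinguals are
# assumed to score below the monolingual mean because of reduced L2 input.
has_dld = np.arange(200) < 20
z = np.where(has_dld,
             rng.normal(-2.6, 0.7, 200),   # children with DLD
             rng.normal(-0.8, 1.0, 200))   # TD bilingual children

for label, cutoff in [("monolingual cut-off", -1.25),
                      ("lenient bilingual cut-off", -2.25)]:
    sens, spec = sensitivity_specificity(z < cutoff, has_dld)
    print(f"{label} ({cutoff} SD): sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The trade-off is visible in the output: the stricter (more negative) cut-off sacrifices some sensitivity in exchange for far fewer TD bilingual children being wrongly flagged.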
18.2.2.2 Dynamic Assessment

Another way of diminishing the role of input/product is to focus on learning ability itself, as is done in dynamic assessment (DA). The key principle of DA is Vygotsky's zone of proximal development (Gutierrez-Clellen & Peña, 2001), the zone "between the level of performance the child can reach unassisted, and the level that can be attained when adult assistance is provided" (p. 212). The assumption with respect to bilingual children is that the assistance given, in the form of explicit instruction, will lead to more change in TD children (who are prone to benefit) than in children with DLD (who are restricted by their limited abilities). The dependent measure in DA is "modifiability" – the potential for change – after such assistance. Thus, importantly, the variable measured is not the present state of knowledge but the ability to learn. Recently, a systematic review (Hunt et al., 2022) and a meta-analysis (Orellana et al., 2019) concluded, with some reservations, that DA is a promising diagnostic tool when used with bilingual children.
18.2.2.3 Language Assessment: Language Sampling

Spontaneous language analysis does not usually come with norms. Instead, it allows the clinician to identify errors. The challenge in bilingual assessment is the interpretation of these errors. In monolingual children, errors can mark DLD. In bilingual children, errors can also mark a (typical) developmental stage – like the stage that is sometimes labeled "interlanguage." In addition, errors in the L2 can also originate in the L1: interference or transfer errors. Code-switching between the two languages may also be found (as in L1 samples). It should be stressed that code-switches are not errors; code-switching in children with DLD does not seem to differ from that in TD children (e.g., Kapantzoglou et al., 2021). As with language tests, the "fair" comparison here is to TD bilingual children.
18.2.3 Parallel Assessment of First and Second Language

The definition of bilingual DLD predicts that the impairment will affect both L1 and L2. For that reason, the development of parallel tasks has become popular: tasks that share basic principles and are similar in administration, but differ where the languages themselves differ. A clear example is the availability of tests that can be used for English and Spanish adopting an identical approach (cf. Peña et al., 2014). A similar consideration – that the tools for both languages should be essentially the same – has inspired the construction of a set of multilingual tools developed in the course of a European collaboration, together named the LITMUS tools (Language Impairment Testing in Multilingual Settings; Armon-Lotem et al., 2015).3 Some of the instruments mentioned here are part of that set. I will focus on how each of the tasks attains comparability across administration in different languages.

A first important source of knowledge of the child's languages is the use of parental questionnaires, which, after all, contain information supplied by those who are closest to the child. The child's language history can lead to hypotheses about possible language impairment, given that parental concern is a strong indicator of DLD and has been shown to correlate with findings from standardized tests (Ebert, 2017). In bilingual children, questionnaires also generate important information about the bilingualism itself, such as age of earliest exposure to the L2 and language dominance. Information on dominance is crucial for deciding which language to prioritize in diagnosis. Examples of parental questionnaires are the PABIQ (part of the LITMUS tools; Tuller, 2015) and the questionnaire on which it is modelled, the ALDeQ (Paradis et al., 2010). Paradis et al. (2021) recommend that the parental report be collected orally.
The questionnaire does not require the clinician to know the L1, but the assistance of an interpreter may be needed if the parent is not fluent in the L2 (Paradis et al., 2021, p. 329).

Sentence repetition has been shown to be highly sensitive in detecting DLD in monolinguals (Conti-Ramsden et al., 2001). The LITMUS sentence repetition task (Marinis & Armon-Lotem, 2015; for a more elaborate discussion, see Chapter 19 of this volume) allows for cross-linguistic comparison by using two common factors that contribute to the complexity of sentences – syntactic movement and embedding – to define syntactic complexity across languages (and thus make findings from both languages comparable); in addition, language-specific features are included that may be sensitive variables for the detection of DLD.

Non-word repetition (NWR) is one of the tools selected by Kohnert et al. (2021) that exemplify the assessment of process rather than product. After all, non-words are just as unfamiliar to monolingual children as they are to bilingual children. NWR – for monolinguals – is also listed as having high sensitivity and specificity by Conti-Ramsden et al. (2001). There are several versions of the NWR test, which differ in, for example, the inclusion or exclusion of consonant clusters. The LITMUS NWR test (Chiat, 2015) comes in two formats: a language-specific one that takes into account the phonological and phonotactic features of the target language, and a "quasi-universal" (QU) task that contains only phonemes and sequences that are common to most languages (and thus, for example, excludes consonant clusters). Recent meta-analyses of the diagnostic accuracy of NWR tasks (Ortiz, 2021; Schwob et al., 2021) find that the QU task has the strongest potential for assessment in bilinguals. The reason is that input still plays a role in the language-specific tasks: if the child's L1 phonology differs from that of the language tested, the child is at a disadvantage – for example, on a task including consonant clusters when the child's L1 lacks them.

Limited lexical ability is an early indicator of language difficulties. In the lexical domain, too, a diagnostic bias may lead to overidentification of DLD: bilingual children's L2 lexicons are, initially, smaller. This has led to parallel testing of the two languages. One instrument for measuring the lexicon is the MacArthur-Bates Communicative Development Inventories (Fenson et al., 2006), a parental questionnaire that has been adapted for many languages. The Cross-linguistic Lexical Task (LITMUS-CLT; Haman et al., 2015) is a vocabulary test (not a questionnaire). It uses pictures that have been selected to be valid cross-culturally and that allow for the cross-linguistic testing of production and comprehension of lexical items (nouns and verbs).

Elicitation of narratives has also been proposed as a measure that is less susceptible to the influence of input, particularly where macrostructure (story plot) is concerned. The LITMUS narrative instrument (MAIN; Gagarina et al., 2019) has four different picture sequences that can be used for the elicitation of narratives in both of the child's languages. The plots of the four narratives are very similar, making it possible to use two different picture sets and elicit essentially similar narratives across languages.
18.2.3.1 Diagnostic Accuracy

As mentioned earlier, monolingual tests lack the psychometric qualities to properly identify bilingual children with DLD. A crucial question is therefore whether newly created tests for bilinguals (including monolingual tests that are part of the parallel assessment of L1 and L2) manage to correct that misdiagnosis. Recently, some studies have focused on the diagnostic accuracy of bilingual assessment (cf. Dollaghan & Horner, 2011). The key question is whether, in the absence of bilingual norms, the tests are able to identify (in a group study) which children are TD and which show signs of DLD. The crucial requirement is that the test results are sensitive to DLD effects and "immune" to the effects of bilingualism. The future will bring us more studies on the diagnostic accuracy of such instruments.
Some studies on the LITMUS tools are promising in that respect (Boerma & Blom, 2021; Tuller et al., 2018). Other new instruments will, in various ways, attempt to fill the diagnostic gaps.
NOTES

1 The term DLD is used here throughout. Until recently, SLI (Specific Language Impairment) was the more frequently used term. For the topic of this chapter, differences between the two terms are not relevant. As a consequence, when referring to research the term DLD will be used here, even when the authors themselves used the term SLI.
2 In this chapter "bilingual" will be used as shorthand for children who are multilingual.
3 For a more detailed description of the tools, see https://www.bi-sli.org/litmus-tools
REFERENCES

Armon-Lotem, S., de Jong, J., & Meir, N. (Eds.). (2015). Assessing multilingual children: Disentangling bilingualism from language impairment. Multilingual Matters.
Ball, M. J., Crystal, D., & Fletcher, P. (2012). Assessing grammar: The languages of LARSP. Multilingual Matters.
Bedore, L., & Peña, E. (2008). Assessment of bilingual children for identification of language impairment: Current findings and implications for practice. International Journal of Bilingual Education and Bilingualism, 11(1), 1–29.
Blom, E., Boerma, T., Karaca, F., de Jong, J., & Küntay, A. C. (2022). Grammatical development in both languages of bilingual Turkish-Dutch children with and without developmental language disorder. Frontiers in Communication, 7, 1059427. https://doi.org/10.3389/fcomm.2022.1059427
Blom, E., de Jong, J., Orgassa, A., Baker, A., & Weerman, F. (2013). Verb inflection in monolingual Dutch and sequential bilingual Turkish-Dutch children with and without SLI. International Journal of Language & Communication Disorders, 48, 382–393.
Boerma, T., & Blom, E. (2021). Quasi-universal nonword repetition and narrative performance over time: A longitudinal study on 5- to 8-year-old children with diverse language skills. In K. Grohmann & S. Armon-Lotem (Eds.), LITMUS in action: Comparative studies across Europe (pp. 302–328). John Benjamins.
Chiat, S. (2015). Non-word repetition. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 125–150). Multilingual Matters.
Chondrogianni, V., & Marinis, T. (2012). Production and processing asymmetries in the acquisition of tense morphology by sequential bilingual children. Bilingualism: Language and Cognition, 15, 5–21.
Conti-Ramsden, G., Botting, N., & Faragher, B. (2001). Psycholinguistic markers for specific language impairment. Journal of Child Psychology and Psychiatry, 42, 741–748.
Dollaghan, C. A., & Horner, E. A. (2011). Bilingual language assessment: A meta-analysis of diagnostic accuracy. Journal of Speech, Language, and Hearing Research, 54, 1077–1088.
Ebert, K. D. (2017). Convergence between parent report and direct assessment of language and attention in culturally and linguistically diverse children. PLoS One, 12(7), e0180598.
Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, J. S., & Bates, E. (2006). MacArthur-Bates Communicative Development Inventories (2nd ed.). APA PsycTests.
Gagarina, N., Klop, D., Kunnari, S., Tantele, K., Välimaa, T., Bohnacker, U., & Walters, J. (2019). MAIN: Multilingual Assessment Instrument for Narratives – Revised. ZAS Papers in Linguistics, 63. ZAS.
Gutierrez-Clellen, V. F., & Peña, E. (2001). Dynamic assessment of diverse children: A tutorial. Language, Speech, and Hearing Services in Schools, 32, 212–224.
Haman, E., Łuniewska, M., & Pomiechowska, B. (2015). Designing Cross-linguistic Lexical Tasks (CLTs) for bilingual preschool children. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 196–224). Multilingual Matters.
Hunt, E., Nang, C., Meldrum, S., & Armstrong, E. (2022). Can dynamic assessment identify language disorder in multilingual children? Clinical applications from a systematic review. Language, Speech, and Hearing Services in Schools, 53, 598–625.
Jacobson, P. F., & Schwartz, R. G. (2005). English past tense use in bilingual children with language impairment. American Journal of Speech-Language Pathology, 14, 313–323.
Jacobson, P. F., & Yu, Y. H. (2018). Changes in English past tense use by bilingual school-age children with and without developmental language disorder. Journal of Speech, Language, and Hearing Research, 61, 2532–2546.
Jong, J. de (2008). Bilingualism and language impairment. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 261–274). Blackwell.
Julien, M. (2019). Taalstoornissen bij meertalige kinderen: Diagnose en behandeling ['Language disorders in multilingual children: Diagnosis and treatment']. Harcourt.
Kapantzoglou, M., Brown, J. E., Cycyk, L. M., & Fergadiotis, G. (2021). Code-switching and language proficiency in bilingual children with and without developmental language disorder. Journal of Speech, Language, and Hearing Research, 64, 1605–1620.
Kohnert, K. (2010). Bilingual children with primary language impairment: Issues, evidence and implications for clinical actions. Journal of Communication Disorders, 43, 456–473.
Kohnert, K., Danahy Ebert, K., & Thuy Pham, G. (2021). Language disorders in bilingual children and adults (3rd ed.). Plural Publishing.
Leonard, L. B. (2014). Children with specific language impairment (2nd ed.). MIT Press.
Marinis, T., & Armon-Lotem, S. (2015). Sentence repetition. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 95–124). Multilingual Matters.
Miller, J., Andriacchi, K., & Nockerts, A. (Eds.). (2015). Assessing language production using SALT software: A clinician's guide to language sample analysis (2nd ed.). SALT Software LLC.
Orellana, C. I., Wada, R., & Gillam, R. B. (2019). The use of dynamic assessment for the diagnosis of language disorders in bilingual children: A meta-analysis. American Journal of Speech-Language Pathology, 28, 1298–1317.
Ortiz, J. A. (2021). Using nonword repetition to identify language impairment in bilingual children: A meta-analysis of diagnostic accuracy. American Journal of Speech-Language Pathology, 30, 2275–2295.
Paradis, J. (2005). Grammatical morphology in children learning English as a second language: Implications of similarities with specific language impairment. Language, Speech, and Hearing Services in Schools, 36, 172–187.
Paradis, J. (2010). The interface between bilingual development and specific language impairment. Applied Psycholinguistics, 31, 227–252.
Paradis, J., & Crago, M. (2000). Tense and temporality: Similarities and differences between language-impaired and second-language children. Journal of Speech, Language, and Hearing Research, 43, 834–848.
Paradis, J., Crago, M., Genesee, F., & Rice, M. (2003). French-English bilingual children with SLI: How do they compare with their monolingual peers? Journal of Speech, Language, and Hearing Research, 46, 1–15.
Paradis, J., Emmerzael, K., & Sorenson Duncan, T. (2010). Assessment of English language learners: Using parent report on first language development. Journal of Communication Disorders, 43, 474–497.
Paradis, J., Genesee, F., & Crago, M. B. (2021). Dual language development and disorders: A handbook on bilingualism and second language learning (3rd ed.). Brookes.
Peña, E. D., Gutiérrez-Clellen, V. F., Iglesias, A., Goldstein, B., & Bedore, L. M. (2014). BESA: Bilingual English–Spanish Assessment. AR-Clinical Publications.
Restrepo, M. A., & Kruth, K. (2000). Grammatical characteristics of a Spanish-English bilingual child with specific language impairment. Communication Disorders Quarterly, 21, 66–76.
Schwob, S., Eddé, L., Jacquin, L., Leboulanger, M., Picard, M., Ramos Oliveira, P., & Skoruppa, K. (2021). Using nonword repetition to identify developmental language disorder in monolingual and bilingual children: A systematic review and meta-analysis. Journal of Speech, Language, and Hearing Research, 64, 3578–3593.
Thordardottir, E. (2015). Proposed diagnostic procedures and criteria for COST Action studies on bilingual SLI. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment. Multilingual Matters.
Tuller, L. (2015). Clinical use of parental questionnaires in multilingual contexts. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 301–330). Multilingual Matters.
Tuller, L., Hamann, C., Chilla, S., Ferré, S., Morin, E., Prevost, P., dos Santos, C., Ibrahim, L., & Zebib, R. (2018). Identifying language impairment in bilingual children in France and in Germany. International Journal of Language & Communication Disorders, 53, 888–904.
Verhoeven, L., Steenge, J., & van Balkom, H. (2011). Verb morphology as clinical marker of specific language impairment: Evidence from first and second language learners. Research in Developmental Disabilities, 32, 1186–1193.
19 Cross-Linguistic Perspectives on Morphosyntax in Child Language Disorders

STANISLAVA ANTONIJEVIC AND NATALIA MEIR

19.1 Introduction

This chapter discusses morphosyntactic abilities across different languages in children diagnosed with a range of developmental disorders. These include Developmental Language Disorder (DLD), previously also referred to as Specific Language Impairment (SLI); Autism Spectrum Disorder (ASD); hearing impairment; and Down Syndrome (DS) – conditions in which language and communication difficulties are the primary or a secondary disorder. We review studies published in the last decade (2010–2022) to complement the thorough review by Crago et al. (2008) in the previous edition of The Handbook of Clinical Linguistics, as well as the seminal work by Leonard (2014, 2022), which compared manifestations of language disorders across different languages. Recently, there has been an increase in research on typical and atypical language development in monolingual and multilingual children in languages beyond English. While we are aware of the importance of researching bi-/multilingual language acquisition (see Chapter 17 of the current volume), in the current chapter we focus on research comparing the performance of monolingual typically developing (TD) children and children with language disorders. Specifically, we include studies employing a Sentence Repetition (SRep) task. This task is highly effective in teasing apart typical and atypical language development (Conti-Ramsden et al., 2001) and has been shown to tap into children's knowledge of morphosyntax (Klem et al., 2015; Nag et al., 2018). Error patterns in SRep tasks are compared across typologically similar and different languages with the ultimate aim of shedding light on commonalities and differences in the manifestation of language disorder across languages. We hope that this will assist in synthesizing more nuanced theories of language acquisition, provide a better understanding of language disorder, and help build better diagnostic tools and effective treatment plans targeting language deficits.
19.1.1 Cross-syndrome Perspective on Morphosyntactic Difficulties

Language disorder (LD) is defined as a "persistent difficulty in the acquisition and use of language across modalities (i.e., spoken, written, sign language, or other) due to deficits in comprehension or production," including language abilities that are "substantially and quantifiably" below age expectations (DSM-5; American Psychiatric Association, 2013). Language deficits in children are observed in a variety of developmental disorders, surfacing as a primary or secondary disorder and/or as comorbid with other developmental disorders. Developmental Language Disorder (DLD), previously also referred to as Specific Language Impairment (SLI) (Bishop, 2017; Bishop et al., 2017), is a primary disorder of morphosyntactic development in the absence of documented neurological damage, hearing deficits, severe environmental deprivation, or intellectual developmental disorder (Leonard, 2014). Autism Spectrum Disorder (ASD) is diagnosed on the basis of two symptom clusters: (a) pervasive deficiencies in social communication and social interaction; and (b) restrictive and repetitive patterns of behavior, interests, or activities (DSM-5; American Psychiatric Association, 2013). Language disorder is not included in the current ASD diagnostic criteria. Yet linguistic profiles of children with ASD are heterogeneous, in that some children exhibit intact and others impaired morphosyntactic skills (Kjelgaard & Tager-Flusberg, 2001). Hearing impairment (HI) is diagnosed based on behavioral and objective measures of hearing loss. Children with HI are reported to have difficulties in language abilities compared to children with typical hearing (e.g., Boons et al., 2013; Friedmann & Szterman, 2011). Children with intellectual developmental disorders, including Down syndrome (DS), a genetic syndrome resulting from trisomy 21, are also reported to have difficulties in the domain of morphosyntax. Individuals with DS develop language, but at a reduced level compared to TD children (Andreou & Chartomatsidou, 2020). Cross-syndrome comparisons can provide further insights into the process of language acquisition and the underlying causes associated with language deficits (Bates, 2004; Rice, 2016).
19.1.2 Cross-linguistic Perspectives on Morphosyntactic Difficulties

Cross-linguistic research has indicated that symptoms of language disorder vary depending upon the language concerned. Children with language disorder tend to show broad language deficits, while the specific areas posing significant difficulties seem to be language-specific (for a detailed overview see Frizelle & Fletcher, 2017; Leonard, 2014, 2022). Research on morphosyntax suggests that each language poses a specific "problem space," which determines the sequences and types of errors produced during language acquisition (Bates, 2004; Leonard, 2022). The idea that languages define a specific acquisition space had previously been proposed by Slobin (1985). However, current theories of language acquisition propose different mechanisms and processes involved in conquering that space. For example, constructivists suggest that children need to "induce the form-function patterns of their target language from the input" (cf. Behrens, 2021, p. 960), while theories based upon discriminative learning propose that children learn to discriminate between the cues that frequently lead to specific events relative to those that do so rarely or not at all in a defined set of cues and events (MacWhinney, 2021; Ramscar, 2021). Both approaches suggest that the deficiency in language acquisition lies in the processing of linguistic input and in learning complex associations.
Alternatively, in the generativist approach, the language acquisition process involves detecting language-specific syntactic options in the linguistic input and adjusting the parameters of the processing system accordingly (e.g., Snyder, 2021). Within this approach, morphosyntactic difficulties arise from an incorrect adjustment of the processing system based upon the linguistic input (for a detailed overview of specific attempts to explain deficits in linguistic knowledge in DLD, see Leonard, 2014). The need for a cross-linguistic comparison of morphosyntactic difficulties was also recently highlighted by Christiansen et al. (2022), who stressed the importance of comparing languages across and within typological groups, with the aim of finding out how specific language characteristics influence language acquisition. We believe that cross-linguistic comparisons will, on the one hand, contribute to a better understanding of the acquisition of individual languages and, on the other, specify the areas of morphosyntax that have the potential to discriminate between TD children and those with language disorder and that should, therefore, be included in the language assessments used for diagnostic purposes.
19.1.3 Assessment of Morphosyntactic Difficulties: Why Sentence Repetition Tasks?

Morphosyntactic difficulties in children can be assessed using a variety of paradigms (e.g., spontaneous speech, narrative elicitation, elicited production, forced-choice comprehension). In the current chapter, we look at the evidence from Sentence Repetition (SRep) tasks (also known as Sentence Recall or Sentence Imitation). SRep tasks are widely used for the assessment of morphosyntax and are often part of standardized assessment tools (e.g., the Clinical Evaluation of Language Fundamentals, Fifth Edition (CELF-5; Wiig et al., 2013)). SRep tasks allow for the elicitation of complex morphosyntactic structures that are difficult to elicit in spontaneous speech. Several recent studies have confirmed that SRep tasks are valid assessments of morphosyntactic knowledge (e.g., Klem et al., 2015; Nag et al., 2018; Polišenská et al., 2015; Riches, 2012).

The LITMUS SRep tasks (Marinis & Armon-Lotem, 2015) were designed to be comparable across different languages and have been adapted to a number of typologically different languages (see https://www.litmus-srep.info). They include complex syntactic structures reported to be difficult for children with DLD across languages, for example, structures involving syntactic movement such as passive constructions, topicalization, object relative clauses, and object wh-questions. In addition, LITMUS SRep tasks include language-specific structures reported to be vulnerable in children with DLD in individual languages. The comparable design across different languages makes the LITMUS-SRep optimal for cross-linguistic comparisons.

A recent scoping review (Rujas et al., 2021) of 203 studies (in 33 languages) investigating the use of SRep in monolingual and bilingual children with typical and atypical language development showed that approximately half of the studies (51%) were conducted with English-speaking children. Studies on other languages included children speaking Spanish (11%), French (7%), Italian (5%), German (5%), and Hebrew (4%). The authors reported that most of the studies (68%; 139/203) used SRep tasks with children with atypical language development (SLI/DLD, HI, ASD, and learning difficulties). These tasks have been shown to be a valuable tool in discriminating between children with and without language disorder, providing good indices of sensitivity (i.e., the accuracy with which the test correctly classifies an individual with a disorder) and specificity (the accuracy with which the test correctly classifies individuals without a disorder) (e.g., Leclercq et al., 2014; Pham & Ebert, 2020; Vang Christensen, 2019).
There are different schemes for scoring SRep performance: verbatim scoring (assessing whether the entire sentence was repeated correctly or incorrectly, coded in a binary manner or on a 0–3 scale); sentence structure scoring (evaluating whether the target structure was repeated correctly); and grammaticality scoring (evaluating whether the repeated sentence was grammatical). Further scoring options focus on error analysis: omission/substitution scoring (counting the number of function and content words omitted or substituted) and a scheme counting the number of changes made in repetition (for more details see Marinis & Armon-Lotem, 2015). We believe that reporting the overall proportion of correctly repeated content words, function words, and inflections (following Seeff-Gabriel et al., 2010) might be optimal for cross-linguistic comparisons, because using proportion correct as the accuracy measure for the three categories could mitigate potential differences in morphological richness between languages, thereby enabling more meaningful comparisons.

The current chapter reviews morphosyntactic difficulties as manifested in SRep performance beyond the overall task score: we gather evidence on performance on different morphosyntactic structures within the task, aiming to establish the error patterns of children with language disorders relative to children with TD. Studies employing SRep tasks to investigate problems related to specific morphosyntactic phenomena are fewer than those using overall accuracy scores to assess the sensitivity and specificity of the assessment.

What can be gained from qualitative analyses of error patterns? Different error patterns can be indicative of the specific processes involved in the acquisition of morphosyntactic structures, as well as of the point children have reached in their language development (e.g., Frizelle & Fletcher, 2014; Nag et al., 2018). Comparing error patterns across languages and across syndromes (and also relative to age-matched and younger TD children) can be done with respect to the type and frequency of errors. For example, children with DLD have been shown to make a significantly higher percentage of morphosyntactic errors than TD children (irrespective of whether they were age-matched), but the types of errors seem to be similar. Studies conducted in English demonstrated that both TD children and children with DLD make the same types of errors, omitting tense (-ed to indicate past tense) and agreement (-s to indicate third person singular) morphemes in obligatory contexts. However, the frequency of correct use of those morphemes differs across groups: in three-year-old TD children it ranges from 50% to 80%, in five-year-old language-matched children with DLD from 30% to 60%, while in five-year-old TD children it is 90–95% (see the detailed discussion in Leonard, 2014). Analysing error patterns can be informative with respect to: (a) the trajectory of language acquisition across different language disorders and/or syndromes; and (b) the underlying causes of language deficits (Bates, 2004; Bishop, 2010; Leonard, 2014), as well as highlighting the different mechanisms that might be taxing language performance (e.g., Abbeduto et al., 2016; Durrleman & Delage, 2016; Penke & Rothweiler, 2018; Schaeffer, 2018).
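To make the proportion-correct scheme just described concrete, here is a minimal sketch in Python; it is our illustration only – the example sentence, its coding, and the data layout are hypothetical rather than drawn from any published scoring manual.

from collections import defaultdict

# Hypothetical coding of the target "The dog is chasing the cat" against a
# child's repetition; each element is (form, scoring category, repeated correctly?).
coded_response = [
    ("the",   "function",   False),  # determiner omitted
    ("dog",   "content",    True),
    ("is",    "function",   True),   # auxiliary retained
    ("chase", "content",    True),   # verb stem retained
    ("-ing",  "inflection", False),  # progressive marker omitted
    ("the",   "function",   True),
    ("cat",   "content",    True),
]

def proportion_correct(elements):
    """Proportion of correctly repeated elements per scoring category."""
    correct, total = defaultdict(int), defaultdict(int)
    for _form, category, is_correct in elements:
        total[category] += 1
        correct[category] += int(is_correct)
    return {cat: round(correct[cat] / total[cat], 2) for cat in total}

print(proportion_correct(coded_response))  # {'function': 0.67, 'content': 1.0, 'inflection': 0.0}

Because each category is scored as a proportion of its own opportunities, a morphologically rich language with many inflectional slots and a morphologically sparse one can be compared on the same 0–1 scale, which is the point made above.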
19.2 Cross-linguistic Evidence

19.2.1 Evidence from Germanic Languages

Studies in the English language dominate the literature; however, only a relatively small number of them report error analysis. Seeff-Gabriel et al. (2010), employing the SIT61 SRep task, showed that children with DLD made the highest proportion of errors on function words and inflections, and that the most frequent error type was omission, followed by substitution within the same word category. In addition, sentence complexity was found to contribute to the frequency of errors.
Similarly, Riches (2012) reported complex syntax (object relative clauses, object wh-questions, and passives) to be difficult for English-speaking school-age children with DLD. Specific error types were omissions and substitutions of by in passives, omission of the complementiser in relative clauses, and past tense omissions. In the same vein, the frequency of errors across syntactic structures in the English LITMUS-SRep task (Marinis & Armon-Lotem, 2015) confirmed complex structures (e.g., long and short passives, who/what wh-questions, object relative clauses, conditionals, and sentences with nouns taking complements) to be challenging for children with DLD. SVO sentences with auxiliaries and negation, object clefts, and subject clefts with passives were found to be challenging for both children with DLD and children with TD.

Frizelle and Fletcher (2014) used an SRep task to compare the performance of school-age children with DLD against age-matched TD children and TD children two years younger. Children with DLD showed significantly lower SRep accuracy than both TD groups, while their error patterns were similar to those produced by the younger TD children. Children with lower syntactic abilities transformed relative clauses of all types into simple sentences, and those with higher syntactic abilities tended to produce amalgamated sentences omitting the relativizer. A comparison of school-aged TD children and children with DLD across two dialectal variations, African American English and White Southern English, highlighted grammatical structures involving the functional category of tense as particularly challenging for children with DLD. The presence of negation was challenging only in complex structures involving all three functional categories of tense, negation, and complementizer (Oetting et al., 2016).

Riches et al. (2010) used an SRep task to compare error patterns in three groups: adolescents with DLD, adolescents with ASD and an accompanying language disorder, and TD adolescents. While the effect of sentence complexity was present for both the DLD and ASD groups, it was much more pronounced in the DLD group. Both groups produced passive sentences instead of object relatives while keeping the correct assignment of thematic roles. Transforming object relatives into subject relatives was more frequent in the DLD group, while incomplete sentences and null responses were produced equally often in both groups.

In Danish, an SRep task was used to examine four specific types of errors: (i) subject-verb order errors, (ii) verb inflection errors, (iii) determiner errors, and (iv) personal pronoun errors (Vang Christensen, 2019). The task included declaratives with a complex noun phrase or a complex verb; negation; verb second; wh-questions; passives; object and subject embedded and final relative clauses; and conditionals. The study included younger (aged 6–10) and older (aged 11–14) TD children and age-matched children with DLD. Children with DLD made significantly more errors overall than the TD children, and subject-verb order errors were rare in all groups. Determiner errors, involving a mismatch between determiner and noun gender and some omissions of an obligatory determiner, were present in all groups but were significantly more frequent in both groups of children with DLD. A similar pattern was observed for verb inflection errors (e.g., use of the infinitive instead of past tense).
Personal pronoun errors, such as the use of pronouns with incorrect case marking or the omission of personal pronouns, were also significantly more frequent in children with DLD.

In Norwegian, Williams (2019) used an SRep task to compare the errors of TD children and children with DLD (aged 5;9–12;11) matched for age and non-verbal IQ. The highest proportion of errors occurred in the longest and most complex sentences. Children with DLD differed from the TD children in their greater omission of function words, in particular in relative clauses, where the subordinate clause was frequently omitted from the relativizer onward. Other studies using SRep tasks in Norwegian reported only total scores rather than error patterns (Klem et al., 2015).
In German, the LITMUS SRep task was used to compare morphosyntactic abilities across mono- and bilingual TD children and children with DLD (aged 5;6–9;0) (Abed Ibrahim & Fekete, 2019). Monolinguals with DLD were overall less accurate than their TD peers; however, the study did not address specific error patterns. Witecy et al. (2020) used a German SRep task to compare the morphosyntactic abilities of children with DS and mental-age-matched TD children and found that children with DS produced more errors on function words than on content words. For content words, children with DS produced roughly similar percentages of omissions and substitutions to TD children (including both stems and morphological markings). For function words, the profiles were reversed: children with DS produced a higher percentage of omissions relative to substitutions, while TD children produced a higher percentage of substitutions and a low percentage of omissions. SRep performance was also used to evaluate morphosyntax in German-speaking children with HI aged 9;5 to 13 years (Ruigendijk & Friedmann, 2017). The HI group showed significantly poorer performance on object who- and object which-questions, center-embedded subject relatives, and VSO sentences in comparison to slightly younger TD children. Most errors were reported to relate to semantic/syntactic role assignment, resulting from poorer mastery of case morphology.
19.2.2 Evidence from Romance Languages

In Romance languages, clitic omissions (particularly of third person object clitics) have been reported to be a clinical marker of DLD in several languages, e.g., French, Italian, Spanish, Catalan, and Portuguese (for an overview see Prévost, 2015). Clitics are unstressed, need to be attached to a host, and usually appear in the position reserved for objects. Using SRep tasks, this finding has been confirmed for Catalan-speaking children with DLD aged 6 to 17 (Gavarró, 2017), who had problems with object clitics as well as with determiners. In addition, Catalan-speaking children with DLD were less accurate in producing passive sentences and object relatives. For French, Leclercq et al. (2014) showed that children with DLD aged 7–12 scored significantly lower than TD children on all measures of the LITMUS-SRep task: they were less accurate in verbatim production, repeated fewer words correctly, and made more syntactic and morphological errors. For French-speaking children with DLD, difficulties were reported with verb agreement morphology and with function words (pronouns, possessives, relatives, conjunctions, prepositions, determiners). In the same vein, Silleresi et al. (2018) examined the language skills of French-speaking children with ASD aged 7–10, with and without comorbid DLD, as compared to children with DLD and TD controls. The authors reported that the DLD groups had difficulties with structures involving embedding (relative and argument clauses), wh-movement, and plural verb agreement. In Portuguese, nine-year-old children with ASD and DLD were compared to age-matched and younger TD children (3–4 and 5–7 years old) on an SRep task composed of sentences of varying syntactic complexity (Martins et al., 2017). Children with DLD and ASD exhibited difficulties with the repetition of subject and object relative clauses. Children with DLD performed better on subject than on object relatives, and the most common error pattern in the DLD group was the transformation of subject relatives into simple SVO sentences and of object relatives into subject relatives. A different error pattern was observed in the ASD group: firstly, no subject-object asymmetry was found; secondly, in addition to transforming complex sentences into simple ones, children with ASD also tended to omit complementizers.
19.2.3 Evidence from Greek

In Cypriot Greek, Theodorou et al. (2017) showed quantitative and qualitative differences between DLD and TD groups aged 5–9 years on the Greek LITMUS-SRep task. Younger children with DLD had problems with object and subject relative clauses, embedded "that"-clauses, negative den-sentences, and subjunctive na-clauses.
Cross-Linguistic Perspectives on Morphosyntax in Child Language Disorders 265 “that”-clauses, negative den-sentences, and subjunctive na-clauses. Older children with DLD showed difficulties with object relative clauses, embedded oti “that”-clauses, negative den-sentences, and adjunct giati “because”-clauses. As per error analysis, children with DLD (younger and older) made significantly more omission, substitution, addition and word order errors as compared to their TD peers while omissions and substitutions were more frequent than other types of errors.
19.2.4 Evidence from Slavic Languages

Slavic languages are characterised by rich morphology and relatively free word order. For Russian, to the best of our knowledge, error-pattern analysis on the LITMUS-SRep task has only been conducted for school-age bilingual children with DLD with Russian as their heritage language. The results indicated that bilinguals with DLD produced sentence fragments, omitted coordinators, subordinators, and prepositions, and simplified wh-questions and relative clauses (Meir et al., 2017). A study by Smolík and Vávrů (2014) on the manifestations of DLD in Czech using an SRep task did not address complex syntax but rather focused on simple sentences, probing the use of verbs, clitics, and overt pronouns. Verbs and clitics were reported to be the most vulnerable categories in Czech for children with DLD aged 4;10 to 7;6. For verbs, the two most frequent error types were the replacement of the verb with a nonword and the use of an infinitive in the root position. As for clitics, verbal and reflexive clitics, as well as prepositions, were omitted at a high rate. Smolík and Matiasovitsová (2021), employing an SRep task in which each sentence had one noun or verb ending replaced by a coughing sound, showed that children with DLD aged 5;1–7;6 did not differ from vocabulary-matched TD controls in the use of verbal inflections, suggesting that morphology per se is not the locus of the deficit in Czech-speaking children with DLD. The authors concluded that the ability to coordinate multiple morphosyntactic relations in a sentence could be what challenges the DLD group.
19.2.5 Evidence from Semitic Languages

Semitic languages exhibit pervasive morphological complexity, such that nearly all words are morphologically complex, containing at least two templatic morphemes: a tri- or quadri-consonantal root, which encodes the semantic meaning, and a primarily vocalic pattern, which conveys grammatical information (e.g., part of speech, tense, number) as well as the prosodic structure of the word. Considerable research in Arabic and Hebrew has investigated morphosyntactic manifestations of language disorder using LITMUS-SRep tasks. For Maltese, it has been demonstrated that the SRep task may be valuable in diagnosing language disorder; however, error patterns were not reported (Grech, 2022).

In Arabic, complex syntactic structures were found to be challenging for preschool children with language disorders, for example, sentences with passives, clitic left dislocation, object wh-questions, subject and object relative clauses, and sentences with coordination, complements, subordination, and conditionals (DLD: Taha et al., 2021; ASD+LD: Abd Al Raziq et al., submitted; Al-Hassan & Marinis, 2021). It is interesting to note that Arabic-speaking children with DLD and ASD+LD showed problems not only with object relative clauses but also with subject relative clauses (Abd Al Raziq et al., submitted; Al-Hassan & Marinis, 2021; Shaalan, 2010). For Palestinian-Arabic-speaking children with DLD, Taha et al. (2021) reported that object clitic pronouns in sentences with clitic left dislocation, object wh-questions, and object relative pronouns were likely to be omitted, which resulted in a change of the target grammatical structure. Furthermore, omission errors were found to affect function words such as the coordinator w "and," the conditional iza "if," the subordinator ʕashan "because," the demonstrative ha:d "this," the relative pronoun illi "that," and wh-words such as mi:n "who" and ani/u "which" (Taha et al., 2021).
It should be noted that such omissions cannot be attributed solely to the phonological salience of these function words, as some of these functional elements are bi-syllabic and some contain long vowels. In addition to difficulties with complex syntax in Arabic, tense marking errors associated with agreement problems are attested in Palestinian-Arabic-speaking children with language impairment (Taha et al., 2021), yet in Gulf-Arabic-speaking children with ASD+LD problems with morphology have not been found to be as pronounced (Al-Hassan & Marinis, 2021).

In Hebrew, the same pattern emerges. Complex syntax has consistently been reported to be challenging for preschool and school-age children with DLD, HI, and ASD+LD. Object relatives and object wh-questions were found to be particularly challenging compared to subject relatives and subject questions (DLD: Sukenik & Friedmann, 2018; ASD: Meir & Novogrodsky, 2020; Sukenik & Friedmann, 2018; HI: Friedmann & Szterman, 2011). In all these clinical groups, children were reported to produce subject relatives and wh-questions instead of object relative clauses. Topicalization was reported to be challenging, with the common error of producing simple sentences instead. Functional elements such as prepositions and conjunctions are reported to be frequently omitted by Hebrew-speaking children with language impairment: for example, prepositions in oblique wh-questions (me- "from," le- "to," im "with") and conjunctions (ve- "and," im- "if") in coordinate and subordinate clauses (DLD: Meir et al., 2016; ASD: Meir & Novogrodsky, 2020). Interestingly, the problems described above with complex syntax and functional elements are observed in preschool children with HI, whereas toddlers with HI aged 1;4–3;4 showed morphosyntactic development on a par with their TD peers (Novogrodsky et al., 2018).
19.2.6 Evidence from Isolating Languages

Vietnamese is an analytic (and isolating) language with linguistic properties, such as tones and multifunctional grammatical particles, that do not occur in European languages. This difference allows for further potential insights into the manifestations of language disorder in children. Pham and Ebert (2020) administered an SRep task to a group of Vietnamese-speaking monolingual children aged 5;2–6;2. The SRep task in Vietnamese included both simple structures (simple sentences with transitive and intransitive verbs) and complex structures (e.g., passives, adverbial clauses, subject and object relatives). The study demonstrated good sensitivity (above 80%) and fair specificity (above 70%) for the Vietnamese SRep task, but it did not focus on the locus of difficulties.

Wang et al. (2022) investigated DLD manifestations in Mandarin-speaking preschool children aged 4;0 to 5;11 using an SRep task that included sentences with classifiers (noun-modifying morphemes that are semantically complex) and complex sentences with passives and aspect markers as core elements; these have previously been reported to be challenging for children acquiring Mandarin. The task yielded excellent sensitivity (100%) and good specificity (87.5%). Although a systematic error-pattern analysis was not carried out, the authors reported that substitutions of specific classifiers with the general classifier ge and omissions of aspect markers were quite frequent in children with DLD. In addition, children with DLD changed passive sentences to either simple sentences or BA-sentences in Mandarin.
19.2.7 Evidence from Sign Languages

Evidence of language disorder in sign languages is highly informative with respect to the manifestations of morphosyntactic difficulties, as such data can shed light on more general patterns of such difficulties in a modality where information is expressed via movement and configurational changes of the hands and face.
Cross-Linguistic Perspectives on Morphosyntax in Child Language Disorders 267 information is expressed via movement and configurational changes of the hands and face. Several studies used SRep tasks to document error patterns in sign language acquisition in children with TD and adults in different sign languages (e.g., Bogliotti et al., 2020; Haug et al., 2020; Rinaldi et al., 2018; Schönström & Hauser, 2022). Studies reporting data from children with atypical language development are fewer. Marshall et al. (2015) reported that similarly to children with DLD acquiring spoken languages, children with DLD acquiring British Sign Language aged 7–11 were less accurate than TD children on overall sentence meaning, sign order, facial expressions, and verb morphological structures. Children with DLD were reported to make more omission errors than substitution ones. Quinto-Pozos et al. (2017) described language skills of a deaf native signer of American Sign Language (ASL) with DLD over a period of 7.5 years (11;10–19;6). The authors reported that the child frequently omitted signs and had difficulties with the word order of some sentences.
19.3 Summary and Conclusions

In this chapter, we have reviewed evidence for morphosyntactic difficulties presented during SRep tasks by children with different developmental disorders (DLD, ASD, HI, and DS) across spoken and signed languages. Specifically, we investigated error patterns within and across several language types. Firstly, the accumulated evidence argues strongly for some commonalities across different disorders and different languages: children with developmental disorders uniformly show problems with complex syntax, regardless of language modality and regardless of language typology. Difficulties with relative clauses, direct object and oblique wh-questions, and passive constructions are observed in all languages reviewed in this chapter. However, complexity was not necessarily defined in the same way across the studies, and future research needs to make the notion of linguistic complexity more precise. Furthermore, commonalities in error patterns were observed: complex structures were changed into simpler structures, omissions were more common than substitutions, and errors on function words were more common than errors on content words. There are, however, also inevitable language-specific difficulties in children with language disorders, depending on how particular language properties shape the "problem space" of the individual language (Bates, 2004; Leonard, 2022) – for example, clitics in the Romance languages, and verb morphology in some Germanic and Slavic languages as well as in British Sign Language.

The review of the research published between 2010 and 2022 showed that studies addressing error patterns are limited, and we call for future research to incorporate structure and error analysis beyond the analysis of SRep totals. A further step would then be a more transparent and systematic comparison of error patterns across languages and across syndromes. Such comparisons would allow the evaluation of different theoretical approaches (see Leonard, 2014) and allow more specific hypotheses to be formed to examine individual theoretical approaches (Ambridge & Wagner, 2021). The SRep tasks reviewed in the current chapter were used across different languages; some were novel tasks and some were part of a standardized assessment battery. Therefore, the documented cross-linguistic differences in error patterns could potentially be an artifact of the SRep task and of task construction principles. When making such comparisons, researchers need to make sure that they are comparing like with like. In that vein, LITMUS-SRep tasks are a step in this direction because they were designed to be comparable across languages (Marinis & Armon-Lotem, 2015). This way of purposefully constructing assessments that capture specific characteristics of individual languages, while still being comparable across languages, is a potential solution to the complexity of cross-linguistic comparisons. In addition, we believe that comparing performance on different SRep tasks within the same clinical populations could shed light on how the rationale behind task construction affects performance.
For example, some SRep tasks included only simple sentences, while others included different levels of complexity; different SRep tasks included different clinical markers of DLD; and the tasks were not necessarily constructed on the same theoretical principles. Further insights could also be gained by comparing performance (including error patterns) on the LITMUS-SRep tasks – which were constructed to be comparable across languages – across children with typical and atypical language development and across different languages. Here, we have attempted to compare SRep performance across syndromes and across languages to the extent that data were available, but a more systematic approach is required to draw firm conclusions. Finally, comparing SRep task performance and error patterns in different languages, and across syndromes, in bi-/multilingual individuals should offer further insights into the processes involved in typical and atypical language acquisition as well as into cross-linguistic interference.
REFERENCES Abbeduto, L., McDuffie, A., Thurman, A. J., & Kover, S. T. (2016). Language development in individuals with intellectual and developmental disabilities: From phenotypes to treatments. In R. M. Hodapp, & D. J. Fidler (Eds.), International review of research in developmental disabilities (Vol. 50, pp. 71–118). Academic Press. Abd Al Raziq, M., Meir, N., & Saigh-Haddad, E. (submitted). Morphosyntactic skills of Palestinian-Arabic speaking children with and without Autism Spectrum Disorder (ASD): Evidence from a Sentence Repetition Tasks. Autism & Developmental Language Impairments. Abed Ibrahim, L., & Fekete, I. (2019). What machine learning can tell us about the role of language dominance in the diagnostic accuracy of German LITMUS non-word and sentence repetition tasks. Frontiers in Psychology, 9, 2757. https://doi.org/10.3389/ fpsyg.2018.02757. https://www.frontiersin. org/articles/10.3389/fpsyg.2018.02757/full Al-Hassan, M. A., & Marinis, T. (2021). Sentence repetition in children with autism spectrum disorder in Saudi Arabia. In L. Tommi TszCheung & D. Ntelitheos (Eds.), Experimental Arabic linguistics (Vol. 10, p. 143). John Benjamins. Ambridge, B., & Wagner, L. (2021). Testable theories of core first language acquisition. Journal of Child Language, 48(5), 859–861. https://doi.org/10.1017/S0305000921000581 American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Andreou, G., & Chartomatsidou, E. (2020). A review paper on the syntactic abilities of
individuals with Down syndrome. Open Journal of Modern Linguistics, 10(05), 480. Bates, E. A. (2004). Explaining and interpreting deficits in language development across clinical groups: Where do we go from here? Brain and Language, 88(2), 248–253. Behrens, H. (2021). Constructivist approaches to first language acquisition. Journal of Child Language, 48(5), 959–983. Bishop, D. V. (2010). Overlaps between autism and language impairment: Phenomimicry or shared aetiology? Behavior Genetics, 40(5), 618–629. Bishop, D. V. (2017). Why is it so hard to reach agreement on terminology? The case of developmental language disorder (DLD). International Journal of Language & Communication Disorders, 52(6), 671–680. Bishop, D. V., Snowling, M. J., Thompson, P. A., Greenhalgh, T., & The Catalise‐2 Consortium. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology and Psychiatry, 58(10), 1068–1080. Bogliotti, C., Aksen, H., & Isel, F. (2020). Language experience in LSF development: Behavioral evidence from a sentence repetition task. PloS one, 15(11), e0236729. Boons, T., De Raeve, L., Langereis, M., Peeraer, L., Wouters, J., & Van Wieringen, A. (2013). Narrative spoken language skills in severely hearing impaired school-aged children with cochlear implants. Research in Developmental Disabilities, 34(11), 3833–3846.
Cross-Linguistic Perspectives on Morphosyntax in Child Language Disorders 269 Chapman, R. S., Hesketh, L. J., & Kistler, D. J. (2002). Predicting longitudinal change in language production and comprehension in individuals with Down syndrome. Journal of Speech, Language, and Hearing Research, 45(5), 902–915. Christiansen, M. H., Contreras Kallens, P., & Trecca, F. (2022). Toward a comparative approach to language acquisition. Current Directions in Psychological Science, 31(2), 131–138. Conti-Ramsden, G., Botting, N., & Faragher, B. (2001). Psycholinguistic markers for specific language impairment (SLI). Journal of Child Psychology and Psychiatry, 42(6), 741–748. Crago, M., Paradis, J., & Menn, L. (2008). Cross-linguistic perspectives on the syntax and semantics of language disorders. In M. J. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 275–289). Blackwell. Durrleman, S., & Delage, H. (2016). Autism spectrum disorder and specific language impairment: Overlaps in syntactic profiles. Language Acquisition, 23(4), 361–386. Friedmann, N., & Szterman, R. (2011). The comprehension and production of Wh-questions in deaf and hard-of-hearing children. Journal of Deaf Studies and Deaf Education, 16(2), 212–235. Frizelle, P., & Fletcher, P. (2014). Relative clause constructions in children with specific language impairment. International Journal of Language & Communication Disorders, 49(2), 255–264. Frizelle, P. & Fletcher, P. (2017). Syntax in child language disorder. In R. G. Schwartz (Ed.), Handbook of child language disorders (2nd ed., pp. 416–440). Taylor & Francis Group. Gavarró, A. (2017). A sentence repetition task for Catalan-speaking typically-developing children and children with specific language impairment. Frontiers in Psychology, 8, 1865. Grech, H. (2022). The association of sentence imitation with other language domains in bilingual children. Journal of Child Science, 12(01), e15–e23. Haug, T., Batty, A. O., Venetz, M., Notter, C., Girard-Groeber, S., Knoch, U., & Audeoud, M. (2020). Validity evidence for a sentence repetition test of Swiss German Sign Language. Language Testing, 37(3), 412–434. Kjelgaard, M. M., & Tager-Flusberg, H. (2001). An investigation of language impairment in autism: Implications for genetic subgroups. Language and Cognitive Processes, 16(2–3), 287–308. Klem, M., Melby-Lervåg, M., Hagtvet, B., Lyster, S. A. H., Gustafsson, J. E., & Hulme, C. (2015).
Sentence repetition is a measure of children’s language skills rather than working memory limitations. Developmental Science, 18(1), 146–154. Leclercq, A. L., Quémart, P., Magis, D., & Maillart, C. (2014). The sentence repetition task: A powerful diagnostic tool for French children with specific language impairment. Research in Developmental Disabilities, 35(12), 3423–3430. Leonard, L. B. (2014). Children with specific language impairment. MIT press. Leonard, L. B. (2022). Developmental Language Disorder and the role of language typology. Enfance, 1(1), 25–39. MacWhinney, B. (2021). The competition model: Past and future. In J. Gervain, G. Csibra, & K. Kovács (Eds.), A life in cognition (pp. 3–16). Springer Nature. Marinis, T., & Armon-Lotem, S. (2015). Sentence repetition. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 95–124). Multilingual Matters. Marshall, C., Mason, K., Rowley, K., Herman, R., Atkinson, J., Woll, B., & Morgan, G. (2015). Sentence repetition in deaf children with specific language impairment in British Sign Language. Language Learning and Development, 11(3), 237–251. Martins, A., Santos, A. L., & Duarte, I. (2017). Syntactic complexity in children with Autism Spectrum Disorder and Specific Language Impairment. In L. Escobar, V. Torrens, & T. Parodi (Eds.), Language processing and disorders (pp. 291–313). Cambridge Scholars Publishing. Meir, N., & Novogrodsky, R. (2020). Syntactic abilities and verbal memory in monolingual and bilingual children with High Functioning Autism (HFA). First Language, 40(4), 341–366. Meir, N., Walters, J., & Armon-Lotem, S. (2016). Disentangling SLI and bilingualism using sentence repetition tasks: The impact of L1 and L2 properties. International Journal of Bilingualism, 20(4), 421–452. Meir, N., Walters, J., & Armon-Lotem, S. (2017). Bi-directional cross linguistic influence in bilingual Russian-Hebrew children. Linguistic Approaches to Bilingualism, 7(5), 514–554. Nag, S., Snowling, M. J., & Mirković, J. (2018). The role of language production mechanisms in children’s sentence repetition: Evidence from an inflectionally rich language. Applied Psycholinguistics, 39(2), 303–325. Novogrodsky, R., Meir, N., & Michael, R. (2018). Morphosyntactic abilities of toddlers with
hearing impairment and normal hearing: Evidence from a sentence-repetition task. International Journal of Language & Communication Disorders, 53(4), 811–824. Oetting, J. B., McDonald, J. L., Seidel, C. M., & Hegarty, M. (2016). Sentence recall by children with SLI across two nonmainstream dialects of English. Journal of Speech, Language, and Hearing Research, 59(1), 183–194. Penke, M., & Rothweiler, M. (2018). Comparing specific language impairment and hearing impairment: Different profiles in German verbal agreement morphology. Language Acquisition, 25(1), 39–57. Pham, G., & Ebert, K. D. (2020). Diagnostic accuracy of sentence repetition and nonword repetition for developmental language disorder in Vietnamese. Journal of Speech, Language, and Hearing Research, 63(5), 1521–1536. Polišenská, K., Chiat, S., & Roy, P. (2015). Sentence repetition: What does the task measure? International Journal of Language & Communication Disorders, 50(1), 106–118. Prévost, P. (2015). Elicited production of object clitics. In S. Armon-Lotem, J. de Jong, & N. Meir (Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment (pp. 55–75). Multilingual Matters. Quinto-Pozos, D., Singleton, J. L., & Hauser, P. C. (2017). A case of specific language impairment in a deaf signer of American Sign Language. The Journal of Deaf Studies and Deaf Education, 22(2), 204–218. Ramscar, M. (2021). How children learn to communicate discriminatively. Journal of Child Language, 48(5), 984–1022. Rice, M. L. (2016). Specific language impairment, nonverbal IQ, attention-deficit/hyperactivity disorder, autism spectrum disorder, cochlear implants, bilingualism, and dialectal variants: Defining the boundaries, clarifying clinical conditions, and sorting out causes. Journal of Speech, Language, and Hearing Research, 59(1), 122–132. Riches, N. G. (2012). Sentence repetition in children with specific language impairment: An investigation of underlying mechanisms. International Journal of Language & Communication Disorders, 47(5), 499–510. Riches, N. G., Loucas, T., Baird, G., Charman, T., & Simonoff, E. (2010). Sentence repetition in adolescents with specific language impairments and autism: An investigation of complex syntax. International Journal of
Language & Communication Disorders, 45(1), 47–60. Rinaldi, P., Caselli, M. C., Lucioli, T., Lamano, L., & Volterra, V. (2018). Sign language skills assessed through a sentence reproduction task. The Journal of Deaf Studies and Deaf Education, 23(4), 408–421. Ruigendijk, E., & Friedmann, N. (2017). A deficit in movement-derived sentences in German-speaking hearing-impaired children. Frontiers in Psychology, 8, 689. Rujas, I., Mariscal, S., Murillo, E., & Lázaro, M. (2021). Sentence repetition tasks to detect and prevent language difficulties: A scoping review. Children, 8(7), 578. Schaeffer, J. (2018). Linguistic and cognitive abilities in children with specific language impairment as compared to children with high-functioning autism. Language Acquisition, 25(1), 5–23. Schönström, K., & Hauser, P. C. (2022). The sentence repetition task as a measure of sign language proficiency. Applied Psycholinguistics, 43(1), 157–175. Seeff-Gabriel, B., Chiat, S., & Dodd, B. (2010). Sentence imitation as a tool in identifying expressive morphosyntactic difficulties in children with severe speech difficulties. International Journal of Language & Communication Disorders, 45(6), 691–702. Shaalan, S. (2010). Investigating grammatical complexity in Gulf Arabic-speaking children with specific language impairment [Unpublished doctoral dissertation]. University College London. Silleresi, S., Tuller, L., Delage, H., Durrleman, S., Bonnet-Brilhault, F., Malvy, J., & Prévost, P. (2018). Sentence repetition and language impairment in French-speaking children with ASD. In A. Gavarró (Ed.), On the acquisition of the syntax of Romance (Language acquisition and language disorders) (pp. 235–258). John Benjamins. Slobin, D. I. (1985). Crosslinguistic evidence for the language-making capacity. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition, Vol. 2: Theoretical issues (pp. 1157–1256). Erlbaum. Smolík, F., & Matiasovitsová, K. (2021). Sentence imitation with masked morphemes in Czech: Memory, morpheme frequency, and morphological richness. Journal of Speech, Language, and Hearing Research, 64(1), 105–120. Smolík, F., & Vávrů, P. (2014). Sentence imitation as a marker of SLI in Czech: Disproportionate impairment of verbs and clitics. Journal of Speech, Language, and Hearing Research, 57(3), 837–849.
Snyder, W. (2021). A parametric approach to the acquisition of syntax. Journal of Child Language, 48(5), 862–887. Sukenik, N., & Friedmann, N. (2018). ASD is not DLI: Individuals with autism and individuals with syntactic DLI show similar performance level in syntactic tasks, but different error patterns. Frontiers in Psychology, 9, 279. Taha, J., Stojanovik, V., & Pagnamenta, E. (2021). Sentence repetition as a clinical marker of Developmental Language Disorder: Evidence from Arabic. Journal of Speech, Language, and Hearing Research, 64(12), 4876–4899. Theodorou, E., Kambanaros, M., & Grohmann, K. K. (2017). Sentence repetition as a tool for screening morphosyntactic abilities of bilectal children with SLI. Frontiers in Psychology, 8, 2104. Vang Christensen, R. (2019). Sentence repetition: A clinical marker for developmental language disorder in Danish. Journal of Speech, Language, and Hearing Research, 62(12), 4450–4463. Wang, D., Zheng, L., Lin, Y., Zhang, Y., & Sheng, L. (2022). Sentence repetition as a clinical
marker for Mandarin-speaking preschoolers with developmental language disorder. Journal of Speech, Language, and Hearing Research, 65(4), 1–18. Wiig, E. H., Semel, E., & Secord, W. A. (2013). Clinical evaluation of language fundamentals – Fifth edition (CELF-5). NCS Pearson. Williams, E. W. (2019). Sentence repetition in Norwegian children with developmental language disorder: An investigation of morphosyntax [Unpublished master’s dissertation]. University of Oslo. Witecy, B., Tolkmit, T., & Penke, M. (2020). Sentence repetition in German-speaking individuals with Down syndrome. In A. Botinis (Ed.), Proceedings of the 11th International Conference on Experimental Linguistics (ExLing 2020) (pp. 221–224). ExLing Society. https://doi.org/10.36505/ExLing-2020/11/0055/000470
20 The Complex Relationship between Cognition and Language: Illustrations from Acquired Aphasia
LYNDSEY NICKELS, BRUNA TESSARO, SOLÈNE HAMEAU, AND CHRISTOS SALIS
“Without language, thought is a vague, uncharted nebula.” (Ferdinand de Saussure, 1959).
“My language to describe things in the world is very small, limited. My thoughts when I look at the world are vast, limitless and normal, same as they ever were. My experience of the world is not made less by lack of language but is essentially unchanged.” (Tom Lubbock, 2010).
20.1 Preamble
The relationship between language and thought has been hotly debated for decades (e.g., Evans, 2014; Fodor, 1975; Langland-Hassan et al., 2021). Some authors, like the linguist Ferdinand de Saussure quoted above, believe that thought is barely possible without language. However, there are many arguments suggesting that this simply cannot be the case, including the quote above from Tom Lubbock, who documented his language loss as a result of a brain tumor. Other examples include that babies show clear intent by reaching for things before they have acquired language, and that non-human animals have been demonstrated to show complex cognition and problem-solving behavior in the absence of language (e.g., Taylor et al., 2010). It has also been demonstrated that many cognitive processes (e.g., arithmetic, solving complex problems, listening to music, thinking about other people’s mental states, navigating in the world) engage distinct brain regions from, and do not depend on, language (Fedorenko & Varley, 2016). The relationship between language and cognition is clearly critical when thinking of individuals with language impairment – does a language impairment cause impairment in
non-linguistic cognition, and/or vice versa? We would contend that there is clear evidence that the answer is “no”, as over the years, cases have been reported where these abilities have dissociated (for review see Fedorenko & Varley, 2016). For example, Varley (2002) reported a man with severe impairments in both comprehension and production of language following brain damage. Yet, this man could drive, play chess against both human and machine opponents, and took responsibility for financial planning for his family. He also scored in the top 10% for his age and education on the Wisconsin Card Sorting Test (Grant & Berg, 1948), a complex task that is argued to tap problem-solving abilities to figure out the changing rules of the game, requiring executive functions as well as working memory and attention. Traditionally, this separation between language and cognition was clear in the definition of some language disorders. For example, “intellect” was generally considered to be unimpaired in aphasia (e.g., Broca, 1861), and children with specific language impairment were diagnosed on the basis that there were no other cognitive impairments. (Note: the broader term Developmental Language Disorder, which does not include this criterion, is now preferred; Bishop et al., 2017). Nevertheless, while there is no necessary causal relationship between language and cognitive impairment, cognition and language likely function together (Perlovsky, 2011). Moreover, the language areas of the brain lie in close proximity to those associated with cognitive functions (Varoquaux et al., 2018), and consequently impairments to cognition and language are likely to co-occur following brain damage. In order for language intervention to be most appropriately targeted, there is a need to determine the extent to which non-linguistic cognition (i.e., attention, executive functions, memory, visuospatial abilities) is preserved in individuals with language impairment. In addition, the extent to which an individual may benefit from language intervention may also be related to their broader cognitive profile (Gilmore et al., 2019; Simic et al., 2020). For example, Gilmore et al. (2019) showed that enhanced executive functions predicted better gains following treatment in aphasia and suggested that the commonly seen heterogeneity of therapy outcomes among people with aphasia might be related to the varied profiles of non-linguistic cognition before treatment. Crucially, once again, this does not necessarily imply a causal relationship between linguistic and executive factors, but instead that cognitive training is likely to provide people with aphasia with strategies that might enhance their performance (Brownsett et al., 2014; Kohnert, 2004). For example, the ability to monitor for errors, apply a newly learned grammatical rule in the correct circumstances, or use a strategy to improve word retrieval all require attention, memory and executive function skills. However, other interventions may be successful despite cognitive impairments, for example, when word retrieval treatment results in priming of existing lexical representations, thereby making them more accessible. In this chapter, we address several issues relating to cognition and language, focusing first on the difficulties with assessment.
We then look in more detail at two aspects of cognition (attention and executive functions), and conclude the chapter with a discussion of cognition in bilingual individuals. While, in each section, we focus on examples from people with acquired language disorder (aphasia) to illustrate the points raised, the issues raised also apply to individuals with other language disorders.
20.2 Measuring Cognitive Processing in Individuals with Language Impairment
As noted above, in the context of language impairment, there is a clear clinical need to assess non-linguistic cognition (henceforth referred to as “cognition”). However, the presence of a language disorder makes this a complex task, as both linguistic and non-linguistic abilities can be required to successfully perform cognitive assessments: even those
that are purportedly “non-linguistic” often require language to understand the instructions (e.g., Rey Complex Figure; Rey, 1941), or performance may be supported by the use of language skills (e.g., verbal rehearsal in memory tasks, verbal scaffolding in problem-solving tasks; e.g., Raven’s Progressive Matrices, Raven, 1962; Baldo et al., 2005). It is, therefore, vital to carefully consider the role of language in such assessments. Before we proceed, it is important to make the distinction between constructs and measures. A construct is the aspect of cognition that we wish to understand and quantify, while measures are “a quantified record, or datum, taken as an empirical analogue to a construct” (Edwards & Bagozzi, 2000, p. 156). For example, short-term memory is a construct, and accuracy on a forward digit span task (in which individuals are asked to repeat a string of numbers in the order they heard them) is a measure of this construct. This example also illustrates the intractable issue of task (or measure) “impurity”. Cognitive tasks often assess more than one construct, including attention, memory, and executive functions. Indeed, finding a task that requires only one cognitive function is likely to be impossible (Miyake et al., 2000). Critically, the task impurity problem means that impaired performance on a cognitive measure does not necessarily indicate an impairment of the construct that the measure aims to tap (Miyake et al., 2000): how accurately a measure reflects a construct will vary because of the potential impact on performance of a co-occurring impairment in another cognitive domain. For instance, an individual will perform poorly on the digit span task if they cannot accurately distinguish between different number words. However, their poor performance would not reflect their short-term memory ability, and the clinician would be unwise to conclude that the individual has a short-term memory impairment. However, with other measures the picture is more complex, and it is not uncommon for results to be reported in clinical (neuro)psychology reports with little thought given to the impact of language disorder on the results. For example, the Mini Mental State Examination (Folstein et al., 1975) is commonly used to provide an indication of the presence of Alzheimer’s disease. It purports to cover a variety of cognitive domains including orientation to time and place, short- and long-term memory, recall, constructional ability and language. However, only one of the 30 questions does not require language (copying a figure), and even this requires comprehension of the instructions to accurately complete the task. Hence, this assessment cannot be considered appropriate for individuals with language disorder. Similarly, assessments that are commonly used in developmental populations, such as the Wechsler Intelligence Scale for Children (e.g., WISC-V; Wechsler, 2014), also have significant language demands. Another frequently used task is letter fluency, in which participants are given one minute to produce words starting with a given letter. This is often considered a test of executive functions (Amunts et al., 2020), and a sensitive measure of overall cognitive processing (McDonnell et al., 2020). However, there is strong evidence that this fluency task is highly language dependent relative to its executive load (Wall et al., 2017; Whiteside et al., 2016).
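To make the distinction between constructs and measures concrete, the sketch below shows one hypothetical way a forward digit span score could be derived from task responses. It is illustrative only: the scoring rule used here (span = longest list length repeated in correct serial order) is a simplification, and published tests each define their own administration, discontinuation, and scoring rules.

```python
# Illustrative only: a minimal scoring sketch for a forward digit span task.
# The rule below (span = longest sequence length with a fully correct
# repetition) is a simplification of how standardized tests score this task.

def forward_digit_span(trials):
    """trials: list of (presented, repeated) digit-string pairs."""
    correct_lengths = [
        len(presented)
        for presented, repeated in trials
        if presented == repeated  # exact serial recall required
    ]
    return max(correct_lengths, default=0)

# A hypothetical session: the participant fails at length 5.
trials = [
    ("372", "372"),
    ("8146", "8146"),
    ("59283", "59823"),  # order error: not credited
]
print(forward_digit_span(trials))  # 4
```

Even this trivial score illustrates the impurity problem: a span of 4 could reflect a short-term memory limitation, difficulty perceiving or producing number words, or incomplete comprehension of the instructions, and the number alone cannot distinguish between these.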
Despite speech and language therapists/clinical linguists being acutely aware of these issues, this problem remains significant for diagnosis and intervention. We recently examined this issue in relation to aphasia in a literature review of research published between 2010 and 2020 (Tessaro et al., 2023b). We found that of the 480 studies that included cognitive assessments in people with aphasia, 54% used tests that relied, at least partially, on unimpaired linguistic abilities such as language production and/or comprehension, and/or the test required participants to map stimuli (e.g., pictures or words) to their meaning to be successfully performed. Similarly, in a survey completed by professionals working with people with aphasia (Tessaro et al., 2023a), a major barrier for cognitive assessment in aphasia was that most of the available tests were highly language dependent. This survey also indicated that role clarity was an issue: whose role is it to assess cognition in people with aphasia?
Clinicians indicated concern that no one professional would have all the requisite skills to assess cognition in people with aphasia: speech and language therapists are aware of the linguistic difficulties of patients but may lack training in cognition and cognitive assessment, while other professionals, such as neurologists and (neuro)psychologists, have experience with cognitive assessments but lack training in assessing patients with linguistic impairment. There is great variability in how people with aphasia perform on cognitive tests, and clinicians need to be aware that this is critically dependent on how much language is involved in test administration (e.g., verbal instructions/stimuli) and performance (e.g., verbal responses), as well as the severity of the language disorder. In the assessment of all language-disordered populations, clinicians and/or researchers need to remain alert to the requirement for appropriate tools, and careful critique of the language demands of whatever tools they use, before concluding that an individual is impaired in a particular cognitive domain. In addition to language disorder impacting on cognitive assessments, cognitive impairment can also impact on what are traditionally thought of as “pure” language tasks. For example, a sentence-picture matching task (where one sentence is matched to the target from one of four pictures) is thought of as a measure of sentence comprehension. However, it requires far more than accurate comprehension of the sentence. The content of each picture must also be evaluated, the concept evoked by the picture held in memory and compared to the concept evoked by the sentence, and a judgement made as to whether these two match. This process must be repeated for each of the other pictures and the results of these comparisons held in memory, and finally the “best” match chosen. This complex process requires visuospatial skills (to scan and interpret the pictures), attention (to focus on each picture and switch attention to the next), memory (to hold the concepts and the results of concept comparison), problem solving (to make a decision, and if no suitable match is found to rescan the picture choices), and more. In sum, assessments of both language and other cognitive domains suffer from the “task impurity” problem, and it is therefore incumbent on the clinician to carefully evaluate the task demands of whatever measure they choose and to take these complexities into account when interpreting the results.
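The decomposition of sentence-picture matching above can be made explicit as a schematic. The sketch below is not a processing model: the functions are toy stand-ins (concepts are represented as plain strings) chosen only to show how many non-linguistic steps surround the single “pure” comprehension step.

```python
# A schematic of the component demands of sentence-picture matching.
# Every function here is a toy placeholder for a cognitive operation;
# the point is the number of steps, not the implementation.

def comprehend(sentence: str) -> str:
    # Linguistic step: sentence -> concept (toy: a normalized gloss).
    return sentence.lower().rstrip(".")

def interpret(picture_description: str) -> str:
    # Visuospatial step: scan and interpret the picture (toy: identity).
    return picture_description.lower()

def match_sentence_to_picture(sentence: str, pictures: list[str]) -> int:
    target = comprehend(sentence)            # comprehension
    results = []                             # memory: hold comparison outcomes
    for pic in pictures:                     # attention: focus, then switch
        concept = interpret(pic)             # visuospatial interpretation
        results.append(concept == target)    # memory + judgement of match
    # problem solving: choose the "best" match from the held results
    return results.index(True) if True in results else -1

pictures = ["the dog chases the cat", "the cat chases the dog",
            "the dog chases the bird", "the cat sleeps"]
print(match_sentence_to_picture("The cat chases the dog.", pictures))  # 1
```

An error on such a task could therefore arise at any of these steps, not only in sentence comprehension itself.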
20.3 Language Production in Aphasia and Stroke from a Dual-task Perspective
Focusing on what one wishes to say while doing other activities requires a great deal of attention. Attention is a major non-linguistic cognitive skill and one that interacts with other cognitive skills. This section explores the role of attention in language production through studies employing a dual-task paradigm. In a dual-task language study, a language task is carried out while simultaneously performing another task (e.g., storytelling while walking). Performance as measured by linguistic variables (e.g., utterance length, grammatical complexity) is compared between the dual-task condition and the language task performed alone (e.g., storytelling without walking). For example, one might compare whether, in storytelling while walking, there is a reduction in utterance length or lexical diversity compared with storytelling without walking. Change in the linguistic variables of interest is known as the dual-task effect: when this effect is negative, it is referred to as a dual-task cost. A dual-task condition is thought to place higher attentional demands on language production than when language tasks are performed on their own. This theoretical assumption is based on the premise that people’s attentional capacity is limited and when they perform two tasks their attention is divided between the two tasks (Murray, 2000). Divided attention refers to the performance of more than one process at a time (Cristofori & Levin, 2015).
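As a minimal illustration, a dual-task effect for any linguistic measure can be computed as the change from the single-task baseline, with negative values counting as costs. The sketch below uses invented numbers for the storytelling-while-walking example; percentage change relative to baseline is one common way to normalize across measures, though studies differ in how they operationalize the effect (raw change, percentage change, z-scores).

```python
# A minimal sketch of computing dual-task effects for linguistic measures.
# All numbers are invented for illustration.

def dual_task_effect(single: float, dual: float) -> float:
    """Percentage change from the single-task baseline (negative = cost)."""
    return 100 * (dual - single) / single

# Hypothetical storytelling data: alone vs. while walking.
measures = {
    "mean length of utterance": (7.2, 5.9),
    "words per minute":         (110.0, 88.0),
    "lexical diversity (TTR)":  (0.48, 0.41),
}
for name, (single, dual) in measures.items():
    effect = dual_task_effect(single, dual)
    label = "cost" if effect < 0 else "benefit"
    print(f"{name}: {effect:+.1f}% ({label})")
```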
Here, we discuss studies from the aphasia and wider stroke literature to illustrate dual-task costs on language production (at single word and discourse levels), and discuss their theoretical implications. Murray (2000) was the first to investigate dual-task effects on the word retrieval difficulties of people with aphasia following stroke (although other participant groups were also included, we focus only on the subgroup of 14 people with mild aphasia). In the single-task condition, participants had to complete phrases with a single word. The phrases primed the production of a word from either a closed set of possible responses (e.g., he milked the …) or an open set (e.g., he carried the …). In the dual-task condition, participants were instructed to first carry out a tone discrimination task (deciding whether pure tones were high or low) and then complete the phrase. The expected dual-task cost was evident: performance dropped in both completion conditions. However, the “open” set was particularly affected, dropping from an average accuracy of 86% in the single-task condition to 58% (closed set: 94% vs 88%). Murray suggested that the dual-task cost for the open set in particular may stem from insufficient attention when carrying out the extended searches within the mental dictionary that are required for a satisfactory response for these stimuli (see also Laganaro et al. (2019) for similar results and conclusions). Researchers have also used word fluency tasks to investigate dual-task costs on spoken word retrieval in people with stroke who did not have language impairment (e.g., Haggard et al., 2000). In (semantic) word fluency tasks, the person is asked to name as many words as they can from a specific semantic category (e.g., animals) within one minute. Haggard et al. (2000) examined semantic fluency in a dual-task condition where the person was walking. While neurotypical individuals showed no dual-task decrement and produced more words than the stroke group in single- and dual-task conditions, the stroke group had a small yet statistically significant reduction in the average number of correct, unique words produced in one minute in the dual-task condition. The evidence we have discussed so far shows that carrying out two tasks simultaneously diminishes the accuracy of word finding in people with aphasia. In people affected by stroke who do not have aphasia, word finding under time pressure slows under dual-task conditions. Therefore, both the accuracy and the speed of word finding are vulnerable to divided attention as measured with dual tasks. Dual-task effects have also been investigated on discourse production in aphasia. Harmon et al. (2019) asked people with (either mild or moderate) aphasia and neurotypical controls (healthy older adults) to recall stories after audio-visual presentation. In the dual-task condition, while recalling the stories, participants were also presented with tones that they had to identify as either “high” or “low” in frequency. Dual-task effects were examined on the number of correct information units (CIUs), defined as intelligible, accurate, relevant, and informative words. Dual-task costs were also found in speech rate (words per minute) of both groups with aphasia, and in the mild group in terms of pause rate (pauses per utterance). It is worth noting that for neurotypical controls dual-task costs were only evident in speech and pause rate, but not in CIUs or speech efficiency.
These findings show that the greater the aphasia severity, the more evident the dual-task costs in some linguistic measures. The CIU measure can also be conceived of as a measure of word-finding efficiency in a discourse context, and hence these findings converge with the dual-task studies that focused on single word production. In the non-aphasia stroke literature, Kemper et al. (2006) examined the effect of walking on linguistic measures elicited from spoken discourse, such as descriptions of events or people. They used a variety of different measures that they proposed tapped different aspects of language processing: (a) fluency (mean length of utterance, words per minute, proportion of grammatical sentences, proportion of utterances without fillers); (b) sentence complexity
(mean clauses per utterance, morphological complexity); (c) semantic content (mean number of propositions per 100 words, type-token ratio). Kemper et al. reported dual-task costs in all fluency and sentence complexity measures. However, in the semantic content measures, only mean number of propositions showed dual-task costs. As in Harmon et al., above, this study by Kemper and colleagues shows that dual-task costs are evident in several linguistic measures, including sentence (or syntactic) complexity, even in the absence of language disorder. From the dual-task paradigm perspective, the simultaneous production of language during another task provides a glimpse as to how the person’s language production abilities may fare in real-life settings, which are inherently distracting (e.g., driving while telling a story to a fellow passenger while the radio is on). The studies discussed here show that producing language while doing something else heightens attentional demands. Consequently, linguistic measures of accuracy and speed are affected. Earlier, we introduced the notion of divided attention and the limits on its capacity when processing demands increase across two cognitive tasks. Implicit here are further questions about which of the two tasks one should prioritize when planning, executing, and monitoring language, a key requirement of the dual-task paradigm. Some of these issues regarding attentional demands and the ability to divide attention between tasks are explored in the next two sections.
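Before moving on, it is worth noting that several of the discourse measures used in these studies are straightforward to compute once utterances have been transcribed. The sketch below, on a toy transcript, implements three of them (mean length of utterance in words, words per minute, and type-token ratio); clause counts and propositional analysis require linguistic annotation and are not attempted here, and fillers such as “um” are retained for simplicity.

```python
# Computing a few of the discourse measures discussed above from a toy
# transcript. Utterance segmentation and timing are taken as given here;
# in practice, both require careful transcription decisions.

transcript = [
    "the woman is washing the dishes",
    "and the boy um reaches for the cookie jar",
    "the stool is tipping over",
]
duration_minutes = 0.5  # hypothetical elapsed speaking time

tokens = [word for utt in transcript for word in utt.split()]

mlu_words = len(tokens) / len(transcript)   # mean length of utterance (words)
wpm = len(tokens) / duration_minutes        # words per minute
ttr = len(set(tokens)) / len(tokens)        # type-token ratio

print(f"MLU (words): {mlu_words:.2f}")
print(f"Words per minute: {wpm:.1f}")
print(f"Type-token ratio: {ttr:.2f}")
```

One design caveat: the type-token ratio is highly sensitive to sample length, so single- and dual-task samples would need to be matched in size for the comparison to be meaningful.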
20.4 The Relationship between Executive Function and Language/Language Impairment
This section focuses on a complex cognitive domain, and one of the least well understood: executive functions. Executive functions are also sometimes called cognitive control or executive control, and this term refers to higher-order processes that are important in daily functioning, such as planning, initiating and executing actions, inhibiting behaviors, organizing and/or prioritizing tasks, switching between tasks, self-monitoring and avoiding distraction (Diamond, 2013; Jurado & Rosselli, 2007). Patterns of behavior that are attributed to impairment of executive functions are often observed after frontal lobe damage (Stuss et al., 2002). A well-known case dates from 1848, when the railroad worker Phineas Gage suffered a traumatic brain injury from an accident that drove an iron bar through his frontal lobes (Harlow, 1868). Phineas survived the accident but showed symptoms that were surprising at the time: he was able to walk and talk, but his colleagues reported that he was “no longer Gage” because his personality and temperament had changed drastically. He demonstrated severe problems, which affected his daily life, in controlling and regulating his behavior, in rational decision-making, and in processing emotions. Dysexecutive symptoms, such as those exhibited by Phineas Gage, are commonly observed among other individuals with neurological disorders, such as Parkinson’s disease, or following traumatic brain injury, brain tumor or stroke. They may also be present in neurodevelopmental conditions such as autism spectrum disorder, Attention Deficit Hyperactivity Disorder and Tourette’s syndrome (Johnston et al., 2019; Peng & Wallace, 2017) or psychiatric disorders such as schizophrenia, bipolar disorder, or depression. Some people with aphasia have also been reported to have deficits in executive functions that have been argued to negatively affect communication skills (Keil & Kaszniak, 2002). For example, Frankel et al. (2007) reported the case of MS, who had aphasia following stroke and showed difficulties on cognitive tests that required shifting attention. Frankel et al. argued that this executive dysfunction was linked to her difficulties in conversation, where she had difficulty shifting attention to a different conversational topic or changing from an unsuccessful strategy for conversational repair.
The relationship between linguistic and cognitive impairment is particularly evident in the literature on traumatic brain injury (TBI). Cognitive problems are often present as a consequence of TBI, and they have been considered to be the underlying cause of communication difficulties in these individuals (Togher et al., 2013a), as is evident in the use of the term cognitive-communication disorders in this population. Common difficulties observed in people with TBI include excessive talkativeness with unexpected and sometimes incoherent topic changes and poor turn taking, whilst they also present with impaired attention, memory and executive functions. Cognition can also be seen to influence communication when working memory is required to integrate world knowledge (retrieved from long-term memory) with what is being said in a discourse (Togher et al., 2013b). When working memory is impaired, full understanding of a discourse is also impaired (e.g., Barnes & Dennis, 2001). Some researchers have made more controversial claims, suggesting that impairment in executive functions may be the underlying cause of meaning-related errors in comprehension and production in people with aphasia (e.g., Jefferies & Lambon Ralph, 2006). These symptoms would previously have been attributed to impaired access to semantic representations. This literature attributes difficulties in picture naming (with semantically related errors) and in tasks requiring semantic judgments (such as selecting an item associated with the stimulus) to deregulated “semantic control” mechanisms (e.g., Jefferies & Lambon Ralph, 2006; Thompson et al., 2015). In selecting people with aphasia with impaired processing of meaning, the authors used a task that requires selection of one of two or more items that are associated with a stimulus (e.g., is a palm tree or a fir tree associated with a pyramid?). However, this task requires executive control to perform; thus, by selecting individuals who perform poorly on this task, rather than just selecting individuals with impaired processing of meaning, there will be a bias toward selecting individuals who also (or alternatively) have poor executive control (Tessaro et al., 2022). Consequently, the task impurity problem may have led to a biased sample, and perhaps difficulties in processing meaning in stroke aphasia are not always associated with impaired executive control. Moreover, other research groups have not been able to replicate the evidence for a semantic control impairment (Chapman, 2019; Chapman et al., 2020; Heikkola et al., 2022). Additionally, as discussed throughout this chapter, one needs to be very careful as to which tests are used to determine a cognitive impairment in people with language disorders. While the “semantic control” studies use seemingly non-linguistic cognitive tasks (e.g., Wisconsin Card Sorting Test; Grant & Berg, 1948), these tasks still require participants to understand complex verbal instructions and have been previously shown to engage language in the form of inner speech (Baldo et al., 2005).
20.5 Cognitive Control and Bilingual Language Processing
We now turn our attention to a particularly fruitful area of research for interrogating the links between cognitive control and language: bilingual language processing (see also Chapter 18). There is now widespread agreement that both languages are activated, at least to some extent, at any given time when understanding and producing language (e.g., De Bot, 1992; Costa et al., 1999). Individuals who speak several languages constantly “choose” the right language for successful communication or to use in a given task and, simultaneously, disregard or suppress any language that is not appropriate for the situation. Yet, bilinguals are usually very successful at managing this competition and seldom make errors in selecting the correct language. This indicates that bilinguals must be equipped with an efficient mental control mechanism that prevents them from using the wrong language.
20.5.1 Bilingual Cognitive Control
Theories of bilingual language processing have postulated mechanisms by which successful language selection can be achieved. For example, the intention to speak one language and not the other might trigger differential activation of a bilingual’s two language subsystems (Grosjean’s language mode theory: for example, Grosjean, 2001; Paradis’ activation threshold theory: Paradis, 2004). Some theories posit that this differential activation is governed by cognitive mechanisms outside of the language system. For example, in Green’s inhibitory control model of bilingual language control (Green, 1998), the “Supervisory Attentional System”, involved in the planning, regulation, and verification of (all kinds of) voluntary actions, is responsible for retrieving language task schemas. A language task schema specifies how language(s) need to be used for a specific task (e.g., word production, word translation), and the Supervisory Attentional System specifies the schemas that are relevant for that task. In turn, the selected schemas activate relevant lexical representations and inhibit irrelevant representations. In Abutalebi and Green’s (2007) influential neurocognitive model of language representation and control, the two languages of a bilingual person are represented in the same areas of the brain, but a language control apparatus regulates which language should be used and which should not. This system involves a coordinated network of regions that are not specifically involved in language and are mostly located outside of language areas. In bilingual research, cognitive control is often tested using non-verbal tasks, like the flanker task. In the flanker task, participants must respond to a stimulus (e.g., an arrow) that points either to the left or to the right. This stimulus is “flanked” by other similar stimuli that point either in the same direction (congruent condition) or in the opposite direction (incongruent condition). Typically, responses are slower in the incongruent condition. Tasks such as the flanker task require resistance to interference from irrelevant information, and a bilingual advantage on these tasks has often been interpreted as the consequence of the bilingual experience of inhibiting the irrelevant language (e.g., Costa et al., 2009). There are, however, criticisms of this literature. First, the bilingual advantage on these tasks has not always been replicated (see, e.g., Hilchey & Klein, 2011, for a review). Second, if the bilingual advantage is due to better inhibition of competing information, then bilinguals should show a smaller difference between congruent and incongruent trials when compared to monolinguals. Yet, this is mostly not the case: when a bilingual advantage is observed, it is generally in the form of a more global advantage on both congruent and incongruent trials (e.g., Nickels et al., 2019). Consequently, although specific language control processes are a feature of many theories of bilingualism, supporting behavioral evidence is at times equivocal.
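The logic of this argument can be made explicit with a toy calculation (all reaction times below are invented): an inhibition account predicts a smaller congruency effect (incongruent minus congruent RT) in bilinguals, whereas the pattern more often reported is a similar-sized congruency effect combined with faster overall responding.

```python
# Toy illustration of the two patterns discussed above (all RTs invented,
# in milliseconds). The congruency (interference) effect is the RT
# difference between incongruent and congruent flanker trials.

def congruency_effect(congruent_rt: float, incongruent_rt: float) -> float:
    return incongruent_rt - congruent_rt

groups = {
    # pattern predicted by an inhibition account: smaller effect
    "bilinguals (inhibition account)": (470.0, 500.0),  # effect = 30 ms
    # pattern more often observed: global speed-up, same-sized effect
    "bilinguals (observed pattern)":   (450.0, 510.0),  # effect = 60 ms
    "monolinguals":                    (500.0, 560.0),  # effect = 60 ms
}
for group, (con, incon) in groups.items():
    print(f"{group}: mean RT {(con + incon) / 2:.0f} ms, "
          f"congruency effect {congruency_effect(con, incon):.0f} ms")
```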
20.5.2 Bilingual Aphasia: The Result of a Failure of Language Control?
Data from bilingual speakers with aphasia have been used to support the proposal that cognitive control mechanisms regulate language use in bilinguals. There are a variety of scenarios regarding the impact aphasia has on each of the languages of a bilingual. Most bilinguals with aphasia (e.g., 65%: Fabbro, 2001; 95%: Peñaloza et al., 2020) experience parallel impairment: aphasia affects both languages to the same extent. Other bilingual individuals with aphasia may have a disproportionate impairment in one of the languages. At the extreme end of the spectrum is the case of selective impairment, in which one language becomes unavailable for communication post-stroke. Several authors have hypothesized that, rather than a selective loss of language representations, these differential aphasia presentations are caused by a failure of language control through temporary or permanent inhibition of a language (e.g., Green & Abutalebi, 2008; Paradis, 2004).
20.5.3 Investigating the Link between Cognitive Control and Bilingual Aphasia
Following the theory whereby differential impairment in each language in bilingual aphasia is related to a disruption of cognitive control, some bilingual individuals with aphasia have demonstrated impaired non-linguistic cognitive control as measured on tasks such as the flanker task (for recent reviews see Mooijman et al., 2022; Nair et al., 2021). However, given that such impairments of non-linguistic control are observed not only in individuals whose languages are differentially impaired (e.g., Van der Linden et al., 2018; Verreyt et al., 2013) but also in parallel impairment (Green et al., 2010), the precise involvement of non-linguistic cognitive control in bilingual aphasia is still unclear. Interestingly, rather than showing the role of cognitive control impairments in bilingual aphasia, another line of research has investigated whether being bilingual confers some “advantage” or protective effect in the case of aphasia. While bilingualism does not seem to affect the incidence of aphasia (Alladi et al., 2016), several authors (Ardila et al., 2023; Lahiri et al., 2021; Paplikar et al., 2019) have found evidence of a reduced severity of aphasia in bilinguals compared to monolinguals.1 Other researchers (e.g., Dash et al., 2020; Dekhtyar et al., 2020; Penn et al., 2010) found that, compared to monolinguals with aphasia, bilinguals with aphasia performed better on a range of measures of non-linguistic cognition. Penn et al. (2010) also noted that bilinguals with aphasia made use of stronger executive functions in discourse and, as a consequence, had better success in conversational speech. “Pathological” language switching/mixing is a disconcerting behavior that is observed in some bilinguals with aphasia. Pathological switching refers to instances in which the speaker involuntarily switches languages between utterances, while pathological mixing refers to cross-language intrusions within utterances (Fabbro, 2001). Although switching and mixing can be normal behaviors in some bilingual communities (e.g., Grosjean, 1989), with brain damage these behaviors may occur at a higher rate or in inappropriate situations (for example, when speaking with a monolingual person). Both pathological switching and mixing have been hypothesized to result from a failure of language control (Green & Abutalebi, 2008). Pathological mixing has also been linked to a disruption of the language control network (e.g., in subcortical aphasia, Abutalebi et al., 2000). However, more recently, a number of case studies of bilingual aphasia emphasize the possibility that an increase in language mixing might be the consequence of word-finding difficulties and reflect a strategy (conscious or unconscious) to maximize communication effectiveness (e.g., Ansaldo et al., 2010; Goral et al., 2019; Hameau et al., 2022). Although most of these case studies do not report performance on non-linguistic cognitive measures, Hameau et al. (2022) reported an individual with aphasia who demonstrated frequent “pathological” code-mixing, yet had normal scores on a Simon task (similar to a flanker task) and had no obvious deficit in executive functions. Therefore, a disruption of cognitive control may not necessarily be the cause of “pathological” language mixing behaviors, and more research is needed to understand the implications of cognitive control for pathological switching and mixing in bilingual aphasia.
Finally, the assumption that cognitive control impairments are central to many of the deficits experienced by bilinguals with aphasia leads to the expectation that training non-linguistic cognitive control in these individuals would lead to improvements in language processing. However, to date there is very limited evidence that this is the case. Kohnert (2004) provided a “cognitive” intervention consisting of card sorting, math computations, and sustained attention tasks to a bilingual individual with severe chronic aphasia. This individual did show some cognitive difficulties (e.g., on the Wisconsin Card Sorting Test), and, following the intervention, showed some improvements in the treated skills
(e.g., card sorting, visual field searches, math computations). Importantly, general language assessment, comprising word and sentence production and comprehension tasks, was administered pre- and post-intervention, and showed “modest” gains in both languages (e.g., picture naming of 12 items improved from 3 to 9 correct in English, and from 4 to 7 correct in Spanish). A further argument for targeting cognitive control in intervention is that it contributes to successful use of compensatory strategies (for example, by self-cueing in the other language: Ansaldo et al., 2010). Further research is warranted to determine the impact of non-linguistic cognitive control training on language outcomes in bilingual aphasia. In sum, the most recent theories of bilingual language processing feature cognitive control mechanisms that regulate the activation of a bilingual’s languages. However, behavioral evidence used to demonstrate the role of cognitive control in language processing in unimpaired bilinguals shows some inconsistencies, and it is still unclear what the precise involvement of cognitive control is in the specific manifestations of bilingual aphasia. More research is needed to better understand the link between cognitive control and language in bilinguals with and without aphasia. Nevertheless, a fact that remains undisputed is that strong cognitive control abilities support better communication outcomes in bilingual individuals with aphasia.
20.6 Conclusion
In this chapter we have aimed to provide insights into the complexities of the relationship between language and other cognitive processes, which is of great relevance when considering language impairment. We focused on examples from aphasia, examining the difficulties with assessment, aspects of attention, executive functions, and cognitive control in bilingual individuals. It is clear that there is an association between linguistic processing (i.e., semantic and lexical processing, syntax) and other cognitive domains in aphasia (i.e., attention, executive functions, memory and visuospatial abilities) (e.g., Baldo et al., 2005; Murray, 2000, 2012) and other language disorders (e.g., Developmental Language Disorder; Bishop et al., 2017). However, there is also evidence of clear dissociations between these abilities (e.g., Varley, 2002), and consequently we believe that it is unwarranted to conclude that there is a causal relationship between the two. Nevertheless, it is virtually impossible to assess any cognitive domain (including language) in isolation from the others. Moreover, it is also true that much of our daily life involves tasks which are facilitated by the ability to use language. Consequently, clinicians and researchers need to be mindful of the inherent complexities of choosing a cognitive assessment and interpreting the results of such assessments carried out with individuals with language impairments.
NOTE
1. Note that even highly proficient bilinguals should not be expected to perform equivalently to monolinguals on language assessments (a bilingual is not “two monolinguals in one person”; Grosjean, 1989; see Hope et al. (2015) for an example of this assumption leading to an erroneous conclusion).
REFERENCES
Abutalebi, J., & Green, D. (2007). Bilingual language production: The neurocognition of language representation and control. Journal of Neurolinguistics, 20(3), 242–275. Abutalebi, J., Miozzo, A., & Cappa, S. F. (2000). Do subcortical structures control “language selection” in polyglots? Evidence from pathological language mixing. Neurocase, 6(1), 51–56. Alladi, S., Bak, T. H., Mekala, S., Rajan, A., Chaudhuri, J. R., Mioshi, E., Krovvidi, R., Surampudi, B., Duggirala, V., & Kaul, S. (2016). Impact of bilingualism on cognitive outcome after stroke. Stroke, 47(1), 258–261. Amunts, J., Camilleri, J. A., Eickhoff, S. B., Heim, S., & Weis, S. (2020). Executive functions predict verbal fluency scores in healthy participants. Scientific Reports, 10(1), 1–11. Ansaldo, A. I., Saidi, L. G., & Ruiz, A. (2010). Model-driven intervention in bilingual aphasia: Evidence from a case of pathological language mixing. Aphasiology, 24(2), 309–324. Ardila, A., Lahiri, D., & Mukherjee, A. (2023). Bilingualism as a protective factor in aphasia. Applied Neuropsychology: Adult, 30(5), 512–520. https://doi.org/10.1080/23279095.2021.1960837 Baldo, J. V., Dronkers, N. F., Wilkins, D., Ludy, C., Raskin, P., & Kim, J. (2005). Is problem solving dependent on language? Brain and Language, 92(3), 240–250. Barnes, M. A., & Dennis, M. (2001). Knowledge-based inferencing after childhood head injury. Brain and Language, 76(3), 253–265. https://doi.org/10.1006/brln.2000.2385 Bishop, D. V. M., Snowling, M. J., Thompson, P. A., & Greenhalgh, T. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology and Psychiatry, 58(10), 1068–1080. Broca, P. (1861). Remarques sur le siège de la faculté du langage articulé, suivies d’une observation d’aphémie (perte de la parole). Bulletin et Mémoires de la Société Anatomique de Paris, 6, 330–357. Brownsett, S. L., Warren, J. E., Geranmayeh, F., Woodhead, Z., Leech, R., & Wise, R. J. (2014). Cognitive control and its impact on recovery from aphasic stroke. Brain, 137(1), 242–254. Chapman, C. A. (2019). How do word frequency and semantic diversity affect selection of representations in word processing? [Unpublished doctoral dissertation]. Rice University.
Chapman, C. A., Hasan, O., Schulz, P. E., & Martin, R. C. (2020). Evaluating the distinction between semantic knowledge and semantic access: Evidence from semantic dementia and comprehension-impaired stroke aphasia. Psychonomic Bulletin & Review, 27, 607–639. https://doi.org/10.3758/s13423-019-01706-6 Costa, A., Hernández, M., Costa-Faidella, J., & Sebastián-Gallés, N. (2009). On the bilingual advantage in conflict processing: Now you see it, now you don’t. Cognition, 113(2), 135–149. Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual’s two lexicons compete for selection? Journal of Memory and Language, 41(3), 365–397. https://doi.org/10.1006/JMLA.1999.2651 Cristofori, I., & Levin, H. S. (2015). Traumatic brain injury and cognition. Handbook of Clinical Neurology, 128, 579–611. https://doi.org/10.1016/B978-0-444-63521-1.00037-6 Dash, T., Masson-Trottier, M., & Ansaldo, A. I. (2020). Efficiency of attentional processes in bilingual speakers with aphasia. Aphasiology, 34(11), 1363–1387. De Bot, K. (1992). A bilingual production model: Levelt’s “speaking” model adapted. Applied Linguistics, 13(1), 1–24. Dekhtyar, M., Kiran, S., & Gray, T. (2020). Is bilingualism protective for adults with aphasia? Neuropsychologia, 139, 107355. https://doi.org/10.1016/j.neuropsychologia.2020.107355 de Saussure, F. (1959). Course in general linguistics (W. Baskin, Trans.). Philosophical Library. Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64(1), 135–168. Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5(2), 155–174. Evans, V. (2014). The language myth: Why language is not an instinct. Cambridge University Press. Fabbro, F. (2001). The bilingual brain: Bilingual aphasia. Brain and Language, 79(2), 201–210. Fedorenko, E., & Varley, R. (2016). Language and thought are not the same thing: Evidence from neuroimaging and neurological patients. Annals of the New York Academy of Sciences, 1369(1), 132–153. Fodor, J. A. (1975). The language of thought (Vol. 5). Harvard University Press.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198. Frankel, T., Penn, C., & Ormond-Brown, D. (2007). Executive dysfunction as an explanatory basis for conversation symptoms of aphasia: A pilot study. Aphasiology, 21(6–8), 814–828. Gilmore, N., Meier, E. L., Johnson, J. P., & Kiran, S. (2019). Nonlinguistic cognitive factors predict treatment-induced recovery in chronic poststroke aphasia. Archives of Physical Medicine and Rehabilitation, 100(7), 1251–1258. Goral, M., Norvik, M., & Jensen, B. U. (2019). Variation in language mixing in multilingual aphasia. Clinical Linguistics & Phonetics, 33(10–11), 915–929. Grant, D. A., & Berg, E. (1948). A behavioral analysis of degree of reinforcement and ease of shifting to new responses in a Weigl-type card-sorting problem. Journal of Experimental Psychology, 38(4), 404. Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1(2), 67–81. Green, D. W., & Abutalebi, J. (2008). Understanding the link between bilingual aphasia and language control. Journal of Neurolinguistics, 21(6), 558–576. Green, D. W., Grogan, A., Crinion, J., Ali, N., Sutton, C., & Price, C. J. (2010). Language control and parallel recovery of language in individuals with aphasia. Aphasiology, 24(2), 188–209. Grosjean, F. (1989). Neurolinguists, beware! The bilingual is not two monolinguals in one person. Brain and Language, 36(1), 3–15. Grosjean, F. (2001). The bilingual’s language modes. In J. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 1–22). Blackwell. Haggard, P., Cockburn, J., Cock, J., Fordham, C., & Wade, D. (2000). Interference between gait and cognitive tasks in a rehabilitating neurological population. Journal of Neurology, Neurosurgery and Psychiatry, 69(4), 479–486. https://doi.org/10.1136/jnnp.69.4.479 Hameau, S., Dmowski, U., & Nickels, L. (2022). Factors affecting cross-language activation and language mixing in bilingual aphasia: A case study. Aphasiology, 37(8), 1149–1172. Harlow, J. M. (1868). Recovery from the passage of an iron bar through the head. Publications of the Massachusetts Medical Society, 2(3), 327–347. Harmon, T. G., Jacks, A., Haley, K. L., & Bailliard, A. (2019). Dual-task effects on story retell for participants with moderate, mild, or no aphasia:
Quantitative and qualitative findings. Journal of Speech, Language, and Hearing Research, 62(6), 1890–1905. Heikkola, L. M., Kuzmina, E., & Jensen, B. U. (2022). Predictors of object naming in aphasia: Does cognitive control mediate the effects of psycholinguistic variables? Aphasiology, 36(11), 1275–1292. Hilchey, M. D., & Klein, R. M. (2011). Are there bilingual advantages on nonlinguistic interference tasks? Implications for the plasticity of executive control processes. Psychonomic Bulletin & Review, 18(4), 625–658. Hope, T. M., Parker Jones, Ō., Grogan, A., Crinion, J., Rae, J., Ruffle, L., Leff, A. P., Seghier, M. L., Price, C. J., & Green, D. W. (2015). Comparing language outcomes in monolingual and bilingual stroke patients. Brain, 138(4), 1070–1083. Jefferies, E., & Lambon Ralph, M. A. (2006). Semantic impairment in stroke aphasia versus semantic dementia: A case-series comparison. Brain, 129(8), 2132–2147. Johnston, K., Murray, K., Spain, D., Walker, I., & Russell, A. (2019). Executive function: Cognition and behaviour in adults with autism spectrum disorders (ASD). Journal of Autism and Developmental Disorders, 49(10), 4181–4192. Jurado, M. B., & Rosselli, M. (2007). The elusive nature of executive functions: A review of our current understanding. Neuropsychology Review, 17(3), 213–233. Keil, K., & Kaszniak, A. W. (2002). Examining executive function in individuals with brain injury: A review. Aphasiology, 16(3), 305–335. Kemper, S., McDowd, J., Pohl, P., Herman, R., & Jackson, S. (2006). Revealing language deficits following stroke: The cost of doing two things at once. Aging, Neuropsychology, and Cognition, 13(1), 115–139. Kohnert, K. (2004). Cognitive and cognate-based treatments for bilingual aphasia: A case study. Brain and Language, 91(3), 294–302. Laganaro, M., Bonnas, C., & Fargier, R. (2019). Word form encoding is under attentional demand: Evidence from dual-task interference in aphasia. Cognitive Neuropsychology, 36(1-2), 18–30. Lahiri, D., Ardila, A., Dubey, S., Mukherjee, A., Chatterjee, K., & Ray, B. K. (2021). Effect of bilingualism on aphasia recovery. Aphasiology, 35(8), 1103–1124. Langland-Hassan, P., Faries, F. R., Gatyas, M., Dietz, A., & Richardson, M. J. (2021). Assessing abstract thought and its relation to language with a new nonverbal paradigm: Evidence from aphasia. Cognition, 211, 104622.
The Complex Relationship between Cognition and Language 285 Lubbock, T. (2010, November 7). Tom Lubbock: A memoir of living with a brain tumour. The Guardian, International Edition. https:// www.theguardian.com/books/2010/nov/07/ tom-lubbock-brain-tumour-language McDonnell, M., Dill, L., Panos, S., Amano, S., Brown, W., Giurgius, S., Small, G., & Miller, K. (2020). Verbal fluency as a screening tool for mild cognitive impairment. International Psychogeriatrics, 32(9), 1055–1062. Miyake, A., Emerson, M. J., & Friedman, N. P. (2000). Assessment of executive functions in clinical settings: Problems and recommendations. Seminars in Speech and Language, 21(2), 169–183. Mooijman, S., Schoonen, R., Roelofs, A., & Ruiter, M. B. (2022). Executive control in bilingual aphasia: A systematic review. Bilingualism: Language and Cognition, 25(1), 13–28. Murray, L. L. (2000). The effects of varying attentional demands on the word-retrieval skills of adults with aphasia, right hemisphere brain-damage or no brain-damage. Brain and Language, 72(1), 40–72. Murray, L. L. (2012). Attention and other cognitive deficits in aphasia: Presence and relation to language and communication measures. American Journal of Speech-Language Pathology, 21(2), S51–S64. Nair, V. K., Rayner, T., Siyambalapitiya, S., & Biedermann, B. (2021). Domain-general cognitive control and domain-specific language control in bilingual aphasia: A systematic quantitative literature review. Journal of Neurolinguistics, 60, 101021. Nickels, L., Hameau, S., Nair, V. K., Barr, P., & Biedermann, B. (2019). Ageing with bilingualism: Benefits and challenges. Speech, Language and Hearing, 22(1), 32–50. Paplikar, A., Mekala, S., Bak, T. H., Dharamkar, S., Alladi, S., & Kaul, S. (2019). Bilingualism and the severity of poststroke aphasia. Aphasiology, 33(1), 58–72. Paradis, M. (2004). A neurolinguistic theory of bilingualism. John Benjamins. Peñaloza, C., Barrett, K., & Kiran, S. (2020). The influence of prestroke proficiency on poststroke lexical-semantic performance in bilingual aphasia. Aphasiology, 34(10), 1223–1240. Peng, C. S., & Wallace, G. L. (2017). Profiles of executive control in autism spectrum disorder, attention deficit hyperactivity disorder, and Tourette’s syndrome: Performance-based versus real-world measures. In L. C. Centifanti & D. M. Williams (Eds.), The Wiley handbook of developmental psychopathology (pp. 91–137). Wiley Blackwell.
Penn, C., Frankel, T., Watermeyer, J., & Russell, N. (2010). Executive function and conversational strategies in bilingual aphasia. Aphasiology, 24(2), 288–308. Perlovsky, L. (2011). Language and cognition interaction neural mechanisms. Computational Intelligence and Neuroscience, 2011, 454587. https://doi.org/10.1155/2011/454587 Raven, J. (1962). Coloured progressive matrices. Psychological Corporation. Rey, A. (1941). L’examen psychologique dans les cas d’encéphalopathie traumatique. Archives de Psychologie, 28, 286–340. Simic, T., Bitan, T., Turner, G., Chambers, C., Goldberg, D., Leonard, C., & Rochon, E. (2020). The role of executive control in post-stroke aphasia treatment. Neuropsychological Rehabilitation, 30(10), 1853–1892. Stuss, D. T., Alexander, M. P., Floden, D., Binns, M. A., Levine, B., McIntosh, A. R., Rajah, N., & Hevenor, S. J. (2002). Fractionation and localization of distinct frontal lobe processes: Evidence from focal lesions in humans. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 392–407). Oxford University Press. Taylor, A. H., Elliffe, D., Hunt, G. R., & Gray, R. D. (2010). Complex cognition and behavioural innovation in New Caledonian crows. Proceedings of the Royal Society B: Biological Sciences, 277(1694), 2637–2643. Tessaro, B., Brownsett, S.L.E., Hameau, S., Simic, T., Nickels, L., Gilmore, N., Penaloza, C., Robson, H., & Salis, C. (2023a). Assessment of cognition in aphasia: Perspectives from clinicians and researchers. Manuscript in preparation. Tessaro, B., Hameau, S., Salis, C., & Nickels, L. (2022). Semantic impairment in aphasia: A problem of control? International Journal of Speech-Language Pathology, on line ahead of print, 1–12. https://doi.org/10.1080/17549507. 2022.2125072 Tessaro, B., Salis, C., Hameau, S., & Nickels, L. (2023b). How cognition has been assessed in research with people with aphasia: A systematic literature review. Manuscript submitted for publication. Thompson, H. E., Robson, H., Lambon Ralph, M. A., & Jefferies, E. (2015). Varieties of semantic “access” deficit in Wernicke’s aphasia and semantic aphasia. Brain, 138(12), 3776–3792. Togher, L., McDonald, S., & Code, C. (2013a). Social and communication disorder following traumatic brain injury. In S. McDonald, L. Togher, & C. Code (Eds.), Social and communication disorders following traumatic brain injury (pp. 1–25). Taylor & Francis.
286 Lyndsey Nickels, Bruna Tessaro, Solène Hameau, and Christos Salis Togher, L., McDonald, S., Coelho, C. A., & Byom, L. (2013b). Cognitive communication disability following TBI: Examining discourse, pragmatics, behaviour and executive function. In S. McDonald, L. Togher, & C. Code (Eds.), Social and communication disorders following traumatic brain injury (pp. 89–118). Taylor & Francis. Van der Linden, L., Dricot, L., De Letter, M., Duyck, W., de Partz, M. P., Ivanoiu, A., & Szmalec, A. (2018). A case study about the interplay between language control and cognitive abilities in bilingual differential aphasia: Behavioral and brain correlates. Journal of Neurolinguistics, 46, 37–68. Varley, R. (2002). Science without grammar: Scientific reasoning in severe agrammatic aphasia. In P. Carruthers, S. Stich, & M. Siegal (Eds.), The cognitive basis of science (pp. 99–116). Cambridge University Press. Varoquaux, G., Schwartz, Y., Poldrack, R. A., Gauthier, B., Bzdok, D., Poline, J. B., & Thirion,
B. (2018). Atlases of cognition with large-scale human brain mapping. PLoS Computational Biology, 14(11), e1006565. Verreyt, N., De Letter, M., Hemelsoet, D., Santens, P., & Duyck, W. (2013). Cognate effects and executive control in a patient with differential bilingual aphasia. Applied Neuropsychology: Adult, 20(3), 221–230. Wall, K. J., Cumming, T. B., & Copland, D. A. (2017). Determining the association between language and cognitive tests in poststroke aphasia. Frontiers in Neurology, 8(149), 1–9. Wechsler, D. (2014). Wechsler intelligence scale for children (5th ed.). Pearson. Whiteside, D. M., Kealey, T., Semla, M., Luu, H., Rice, L., Basso, M. R., & Roper, B. (2016). Verbal fluency: Language or executive function measure? Applied Neuropsychology: Adult, 23(1), 29–34. https://doi.org/10.1080/23279095.2015. 1004574
21 Linguistic and Motoric Disorders in the Sign Modality

MARTHA E. TYRONE

21.1 Preamble

Sign languages are natural languages, which have developed as deaf people have had the opportunity to form social and linguistic communities. Sign languages have the lexical and grammatical complexity of spoken languages, but they use the hands and arms as their main articulators, which results in formational differences between sign and speech. As with speech, the production of sign language can be disrupted as the result of either a language disorder or a movement disorder. Traditionally, dysarthria and other speech motor disorders have been viewed as specific to the speech production mechanism, so they have not been explored much in the context of sign languages.

There are multiple sign languages worldwide. Different sign languages have distinct lexicons and distinct grammars, but some structural commonalities have been observed across languages. This chapter will focus on research from American Sign Language (ASL) and British Sign Language (BSL), because these are the sign languages that have been studied most in the context of neurogenic language and motor deficits.

Sign languages can convey morphological, syntactic, and discourse information through the placement of signs in the physical space in front of the signer (Johnston & Schembri, 2007; Klima & Bellugi, 1979; Sandler & Lillo-Martin, 2006; Sutton-Spence & Woll, 1999). For instance, some verbs can be inflected for person and number based on the direction and number of repetitions of a sign’s movement: the ASL sign GIVE can be inflected to mean “you give me,” “she gives him,” or any other combination of subject and object, as indicated by the direction in which the hands move. Similarly, if a sign representing an agent is placed at a given position in signing space, verbs and pronouns that refer to the same agent can move toward or away from that position to specify that reference (Klima & Bellugi, 1979). This type of grammatical use of the signing space has been documented in several sign languages.

Sign languages, cross-linguistically, also make use of head movement and facial expression to convey linguistic information. For example, in Greek Sign Language, there is a specific type of head movement used to mark negation (Antzakas & Woll, 2002). Other sign languages also use head and face movements to mark negation, topicalization, and wh-questions, among other functions (Herrmann & Steinbach, 2013; Klima & Bellugi, 1979; Sutton-Spence & Woll, 1999). Researchers have debated whether these non-manual components of sign language serve primarily a syntactic or intonational role, or some combination of the two (Wilbur, 2009), but there is general agreement that the use of face and head actions in sign language is rule-governed and is an integral part of sign structure.
Regarding phonology, four major phonological parameters have been identified that can differentiate signs from each other: handshape, location, movement, and orientation (Battison, 1978; Stokoe, 1960). Handshape refers to the configuration of the hand(s). Location refers to where the hands are located on the body or in front of the body as a sign is produced. In ASL, there are 12 contrastive locations on the body, plus the space in front of the body, which is referred to as neutral space. For example, the ASL signs FATHER, MOTHER, and FINE are contrastive because they differ in location (Figure 21.1). Movement refers to the shape, direction, and number of repetitions of the arm’s movements. Finally, orientation refers to the direction that the palm of the hand is facing. Orientation seems to play a less important role than the other parameters in differentiating signs. Like the linguistic uses of space and of facial expression, the major phonological parameters of sign language have been documented across multiple sign languages (Crasborn, 2001; Johnston & Schembri, 2007; Sandler & Lillo-Martin, 2006).

Sign phonetics refers to the physical transmission of the linguistic signal through the manual–visual channel by the movement of the hands, arms, upper body, and head (Tyrone, 2020). A growing body of research suggests that sign languages undergo some of the same phonetic processes as spoken languages. Signs in ASL and in other sign languages can exhibit undershoot and coarticulation, so that the handshape or location of a sign becomes more like the handshape or location of the signs that surround it (Cheek, 2001; Grosvald, 2009; Mauk, 2003; Mauk et al., 2008). Likewise, signs can undergo phonetic reduction, such that movement trajectories are reduced in amplitude, as an effect of rate or phonetic environment (Tyrone & Mauk, 2010). In addition, phonetic reduction in sign language can take the form of sign movements originating from more distal articulators (e.g. the finger rather than the wrist), as observed when a signer is close to their interlocutor (Crasborn, 2001).

Sign languages include lexical signs that are similar to spoken words. In addition to these lexical signs, sign languages use systems of fingerspelling to borrow words from spoken/written languages. During fingerspelling, the fingers or hands represent the individual letters of a written word. Some languages (such as ASL) have a one-handed fingerspelling system and others (such as BSL) have a two-handed system, but for both types of systems, the movements are small compared to the movements that are used for lexical signs. Because multiple elements have to be produced to represent a single borrowed word, and because the movements are small, fingerspelled letters are produced at a faster rate than lexical signs.
Figure 21.1 The ASL signs FATHER, MOTHER, and FINE.
There have been many studies of the neural basis of sign language, in part because such studies allow examination of language independent of a specific physical production system and set of perceptual organs (Poizner et al., 1987). Based on clinical case studies and normative imaging data, it is clear that the same neural structures underlie both sign language and spoken language. Numerous studies have demonstrated the activation of traditional language areas during sign language processing and production. These areas include the inferior frontal lobe (Levanen et al., 2001; MacSweeney et al., 2002; Neville et al., 1998; Petitto et al., 2000) and the superior, posterior temporal lobe (Braun et al., 2001; MacSweeney et al., 2002; Nishimura et al., 1999). The latter finding is noteworthy because the superior temporal gyrus has traditionally been associated with auditory function, but in deaf signers it serves a role in perceiving visual–manual language. In addition, Corina et al. (2003) found activity in secondary motor areas of the left hemisphere during the production of ASL signs, irrespective of which hand was used to produce the sign. This suggests that the linguistic nature of the movements influenced which motor areas of the brain were recruited during sign production.
21.2 Comparing Sign and Speech

The most obvious difference between sign and speech is probably the size and configuration of the articulators used for each system. Sign language uses the hands and arms as its primary articulators, while spoken language uses the larynx and vocal tract as its primary articulators. The sign articulators are paired on opposite sides of the body, and some signs require bimanual coordination (all sign languages include both one-handed and two-handed signs). By contrast, the speech articulators are located along the midline of the body, and they produce speech sounds by means of a source-filter mechanism. Because sign languages use large articulators and large movement trajectories, their production rate tends to be slower than the typical rate of speech production. Despite this, sign language users are able to communicate the same amount of information in the same amount of time as users of spoken languages. While producing an individual sign takes more time than producing an individual spoken word, sign language grammar employs fewer function words, and its structure relies more on information presented simultaneously in the signing space and less on information that is sequential (Bellugi & Fischer, 1972; Vermeerbergen et al., 2007).

One socio-cultural difference between signed and spoken language is that most sign language users do not acquire their language natively. Most deaf people are born into hearing families and do not acquire sign language until they go to school or come in contact with other deaf children. The effect of this is that most signers are non-native users of their primary language, and, unlike hearing speakers, many deaf signers do not have full exposure to any language early in infancy. Furthermore, sign languages are minority languages: they always co-exist with a spoken language that is used by the majority of people in any given country or language region. As a result, almost all sign language users are bilingual, using a sign language in the deaf community and a spoken or written language at work, with family, and in other contexts.
21.3 Sign Production Disorders: Aphasia and Apraxia

Early research on motor deficits and sign language focused on how the characteristics of aphasia and limb motor disorders differed in the sign modality (Brentari et al., 1995; Loew et al., 1995; Poizner & Kegl, 1992; Poizner et al., 1987). Prior to those studies, not much consideration had been given to linguistic as opposed to non-linguistic deficits in sign production, and a differentiation between the two could constitute evidence that sign languages were distinct from non-linguistic gesture or pantomime.
Another point to consider about the study of aphasia in a sign language is that aphasia often co-occurs with limb apraxia following left hemisphere stroke. In a hearing speaker with a left hemisphere lesion, the two deficits can be differentiated based on which body part is affected. In the case of sign language users, the two deficits would affect the same articulators. Along similar lines, non-speech oral apraxia and apraxia of speech affect the same articulators as aphasia in a spoken language, but there is a more substantial literature laying out the distinctions among these disorders (Haley & Martin, 2011; Miller, 2002; Ziegler, 2002). While there has been extensive research differentiating sign aphasia from apraxia, there have been no documented cases of apraxia in the complete absence of aphasia in a sign language user.

Multiple studies suggest that the same types of aphasia occur in sign language as occur in spoken language, and that those types of aphasia correspond to similarly located lesions (Corina et al., 1992; Hickok et al., 1996; Marshall et al., 2004; Poizner et al., 1987). For both language modalities, Broca’s aphasia is characterized by limited and non-fluent language production, disruptions to phonology, and somewhat intact language comprehension. By contrast, Wernicke’s aphasia is characterized by fluent production that lacks semantic content, and severely disrupted comprehension.

Poizner et al. (1987) were the first to examine sign aphasia and apraxia and to show that the two could be differentiated in sign language users. The study included three signers with left hemisphere lesions and aphasia. Only one signer was impaired on pantomime production and imitation, and none of the signers was impaired on pantomime recognition. Thus, the signers did not have the same severity of disorder in sign and in gesture. Corina et al. (1992) and Kegl and Poizner (1997) also identified dissociations between aphasia and apraxia in deaf signers with left hemisphere damage. Corina et al. (1992) described a signer with a left posterior lesion who had limited sign comprehension and fluent but non-grammatical sign production. At the same time, the signer could produce and understand non-linguistic gestures, and imitate sequences of gestures, suggesting that his representation of symbolic movements was largely preserved and that his deficit was fundamentally linguistic in nature. Kegl and Poizner (1997) described a signer who had a lesion in the left parietal lobe and exhibited severe comprehension deficits and mild production deficits. Despite his signing deficits, he performed within the normal range on tests of ideomotor apraxia, pantomime recognition, and joint coordination. Similarly, Hickok et al. (1996) studied a group of ASL signers with left hemisphere damage and found no correlation between their aphasia scores and their apraxia scores. Apraxia and aphasia were assessed by the Kimura gesture task (Kimura, 1993) and by an ASL translation of the Boston Diagnostic Aphasia Examination (Goodglass & Kaplan, 1975).

Findings from studies of British deaf signers following stroke are consistent with many of the findings from similar case studies in the USA. Marshall et al. (2004) and Marshall et al. (2005) described two sign language users who had left hemisphere damage and aphasia, each of whom showed different degrees of impairment on sign and gesture tasks. Marshall et al.
(2005) described one deaf signer with a left anterior lesion whose aphasia was severe, with extensive comprehension deficits and no spontaneous language production. Although her performance on the Kimura box task and the Kimura gesture task suggested apraxia (Kimura, 1993), her gesture comprehension was far superior to her comprehension of BSL signs. Marshall et al. (2004) described a signer who had good comprehension of single signs, but who exhibited anomia and used a large amount of non-linguistic gesture. Like the other BSL signer with aphasia, he exhibited apraxia, but his production and comprehension of gesture were much better than his production and comprehension of BSL signs.

Notably, these studies controlled for the potential role of iconicity in sign comprehension and production. In particular, their tests of sign comprehension included potential visual distractors, so that if a signer was using iconic information to perceive signs, they might choose
the distractor rather than the BSL sign. It is interesting that both of the individuals with aphasia described by Marshall and colleagues showed better performance on gesture comprehension tasks than on sign comprehension tasks, but neither signer confused BSL signs with iconic gestures representing the same objects. Thus, even though signers could use an iconic strategy to comprehend gestures, they apparently did not use this strategy for sign comprehension.
21.4 Sign Production Disorders: Right Hemisphere Damage

Like apraxia and aphasia, right hemisphere damage can provide an interesting contrast to other sign production disorders. Some well-documented effects of right hemisphere damage are hemispatial neglect and a more general deficit in processing visuospatial information. In terms of motor deficits, individuals with right hemisphere damage often exhibit paresis or paralysis on the left side of the body, affecting voluntary hand and arm movement as well as other movements. Additionally, individuals with right hemisphere damage in some cases experience language-related deficits, such as aprosodia, pragmatic disorders, and discourse processing deficits. These types of deficits have been observed in sign language as well as spoken language.

Poizner et al. (1987) demonstrated that language function could be preserved in signers who had right hemisphere damage, in spite of disruptions to visuospatial processing. This finding was important because sign languages use the spatial relationships between signs to mark how those signs are related grammatically or in discourse. Consequently, signers must keep track of signs’ positions in space in order to know how they are related in the discourse. The two signers with right hemisphere damage and visuospatial processing deficits described by Poizner et al. (1987) were able to comprehend and produce complex sentences in ASL, despite the visuospatial processing demands of the task. Similarly, Marshall et al. (2003) compared BSL signers with left hemisphere damage and with right hemisphere damage and found that the signers with right hemisphere damage were impaired in their comprehension of spatial information but not in their general comprehension of BSL sentences or their ability to match signs to pictures.

Most studies of right hemisphere damage and sign language have focused on signers’ receptive skills or on the grammaticality of their signing (Emmorey et al., 1996; Loew et al., 1997; Poizner et al., 1987). However, a few studies have examined the phonetics of sign production in individuals with right hemisphere damage. Poizner and Kegl (1993) described a hearing ASL signer with right hemisphere damage who exhibited a mild articulatory deficit. Specifically, she had difficulty coordinating the two arms during two-handed signs, and the authors interpreted this as motor neglect on the side of the body affected by stroke. They also pointed out that the signer with right hemisphere damage showed movement lagging: when she produced two-handed signs, the initiation of movement in the left hand was delayed relative to the right hand.

One British signer with right hemisphere damage was studied in terms of his sign production (Tyrone, 2005). His signing and fingerspelling were compared to his performance on a range of non-linguistic movement tasks. In addition, both his linguistic and non-linguistic movement behaviors were compared to those of an age-matched control signer. In comparison to the control signer, the signer with right hemisphere damage exhibited lowering of high signs, laxed handshapes, and difficulty with fine motor control. The fine motor control deficit occurred across tasks but was more pronounced during signing. Finally, this signer showed minimal deficits with coordination, as would be expected for an individual with unilateral right hemisphere damage.
21.5 Sign Production Disorders: Hypokinesia

21.5.1 Parkinson’s Disease

Hypokinetic movement disorders are characterized primarily by reduced movement amplitude and speed, and by difficulty in initiating voluntary movements. The most common hypokinetic disorder is Parkinson’s disease (PD), which results from the loss of dopaminergic neurons in the substantia nigra. The hallmark symptoms of Parkinson’s disease are slowed movement, reduced movement size, resting tremor, muscle rigidity, and postural instability (Fahn & Elton, 1987). In addition, individuals with Parkinson’s disease often experience stooped posture, shuffling gait, and dementia in the later stages of the disease. Dysarthria from Parkinson’s disease is characterized by reduced loudness, reduced pitch range, strained and breathy voice quality, and, for some speakers, short, rapid bursts of speech. This last characteristic is of interest because it is unlike hypokinetic limb movement, which appears markedly slow.

Several studies in the 1990s focused on sign production in ASL users with Parkinson’s disease (Brentari & Poizner, 1994; Brentari et al., 1995; Loew et al., 1995; Poizner & Kegl, 1993). The goals of these studies were to characterize the effects of the disease on signing and fingerspelling and to compare the motoric deficits from Parkinson’s disease to linguistic deficits resulting from left hemisphere damage. ASL signers with Parkinson’s disease were compared to healthy deaf controls and to signers with aphasia on descriptive and kinematic measures of sign production, and on kinematic measures of non-linguistic motor tasks. Results from these studies suggested that the signers with PD produced signs using mostly distal articulators, used laxed articulatory configurations, reduced the size of the signing space, showed minimal facial expression, and decoupled the coordinated movements of the hand and arm. This last deficit was apparent in signs that included a handshape change as the arm was in motion, such as in the ASL sign ASK (Figure 21.2). Loew et al. (1995) emphasized that PD signing was spatially reduced but preserved crucial linguistic contrasts. By contrast, the errors produced by signers with aphasia consisted largely of phonological substitutions.
Figure 21.2 The ASL sign ASK.
Tyrone et al. (1999) discussed similar phenomena but examined fingerspelling rather than signing. This study suggested that signers with PD exhibited incoordination, articulatory undershoot, blending of handshapes, and irregular pausing in fingerspelling. Moreover, it suggested that the rapid, sequential nature of fingerspelling made it extremely challenging for signers with Parkinson’s disease. All of these studies emphasized that PD signing deficits were phonetic rather than phonological in nature and that incoordination was a hallmark characteristic of PD signing.

Tyrone and Woll (2008a) described a BSL signer with Parkinson’s disease. He was tested on individual sign production, fingerspelling, and a range of non-linguistic movement tasks. In addition, because of the motoric side effects of some PD medications, he was tested both on- and off-medication for all tasks. The signer with PD exhibited some of the same patterns previously reported for speech motor control in Parkinson’s disease. The disease was not at an advanced stage when he was tested, and his sign production deficits were not severe. Moreover, he did not exhibit significant incoordination in signing or fingerspelling, either on- or off-medication. This is consistent with findings from earlier dysarthria research suggesting that PD dysarthria does not specifically affect inter-articulator coordination. However, this finding is in contrast to the results of Brentari et al. (1995), which suggested that coordination in particular was impaired in PD sign production.

The BSL signer with PD also exhibited irregular pauses and difficulty initiating movement, but less so in signing than in other movement tasks. In some ways, the signs produced by the BSL signer with PD resembled dysarthric speech resulting from PD, while his non-linguistic limb movements resembled the limb movements of hearing speakers with PD. This suggests that the nature of the specific movement deficit depended more on the function for which the articulator was used than on the anatomical and physiological properties of the articulator itself.

The BSL signer with PD often produced signs with laxed handshapes and sometimes laxed orientations. In addition, he produced slow movements both during signing and during other movement tasks. This signer did not exhibit the coordination deficits that were emphasized in the earlier research on PD and signing in the USA (Brentari et al., 1995; Poizner & Kegl, 1993); nor did he lower sign locations (Kegl et al., 1999). One final distinction between the BSL signer and the ASL signers with PD is that the BSL signer did not use more distal articulators during sign production. It may be that the patterns observed in ASL signers with Parkinson’s disease resulted not only from the movement disorder but also from normal aging, since the BSL signer with Parkinson’s disease was relatively young. Unfortunately, there have been no studies of age-related changes in motor control for signing, and the studies on ASL and Parkinson’s disease did not include age-matched controls.
21.5.2 Progressive Supranuclear Palsy

Only one case of hypokinetic dysarthria not resulting from Parkinson’s disease has been identified in a sign language user. The individual was a British deaf man who developed progressive supranuclear palsy (PSP; Tyrone & Woll, 2008b). He was 79 years old at the time of testing. He was born deaf, began learning BSL as a child, and it became his primary language. Following the onset of PSP, he showed limited mobility, slow and reduced spontaneous movement, intention tremor, and stooped posture. His score on the Mini-Mental State Examination (Folstein et al., 1975) suggested mild dementia, but his comprehension and production of BSL were intact at the time of testing (Atkinson et al., 2005).

Progressive supranuclear palsy is similar to Parkinson’s disease in that it causes reduced movement amplitude and speed. Unlike Parkinson’s disease, PSP typically results in reduced eye movement and a severe form of dysarthria early in the course of the disease. PSP speech is characterized by reduced loudness, limited pitch range, articulatory undershoot, and palilalia.
The signer with PSP produced movements that were small, hypoarticulated, and gradual. When producing individual signs, he often used laxed handshapes and palm orientations, and sign locations were often lowered. He also exhibited incoordination in the production of two-handed signs. Unlike other signers with hypokinesia, the signer with PSP produced involuntary movements and palilalia during signing. Palilalia in spoken language is defined as the repetition of an entire word, with decreasing volume over multiple repetitions. The BSL signer with PSP met this definition: entire signs were repeated, and the repetitions had decreasing movement amplitude.

In several ways, the signer with PSP produced signs similarly to the signers with Parkinson’s disease described earlier. Like them, he produced slow, small movements with laxed articulators. Unlike signers with Parkinson’s disease, he exhibited palilalia during signing, but he had no analogous error during fingerspelling or non-linguistic movement tasks. As in hearing speakers with PSP, his spontaneous repetition disorder was specific to the production of words. In other words, his sign production disorder was somewhat distinct from what has been reported for signers with PD, and similar to what has been reported for speakers with PSP.
21.6 Sign Production Disorders: Ataxia

Ataxia refers to the motor deficits that result from cerebellar damage. These often include movement inaccuracy and incoordination (Timmann et al., 2001), intention tremor, dysdiadochokinesis (disturbance to rapidly alternating movements), dysrhythmia, and dysmetria (movement undershoot or overshoot) (Bastian, 2002; Topka et al., 1998). Ataxic speech has been characterized as slow, distorted, and imprecise, with a scanning rhythm and irregular variation in pitch and loudness (Kent et al., 2000). Both clinical and experimental research suggests that ataxic dysarthria affects multiple speech articulators, instead of affecting articulators in isolation (Kent et al., 1997, 2000).

There has only been one documented case of ataxia in a sign language user, a deaf BSL signer (Tyrone et al., 2009). The individual was 36 years old at the time of testing. He had developed ataxia due to extensive hemorrhaging in the cerebellum during surgery to correct an arteriovenous malformation. He was born deaf and began acquiring BSL at age five at an oral school for the deaf. Following the onset of ataxia, he was tested on sign comprehension, sign production, and fingerspelling tasks, over the course of multiple sessions. In addition, he was tested on a range of non-linguistic limb movement tasks, such as pointing, reach and grasp, and the Kimura box (Kimura, 1993).

The BSL signer with ataxia was quite different from the other signers with movement disorders. In contrast to signers with Parkinson’s disease, who have been reported to use laxed handshapes, his handshapes during signing were hyperextended, so that his fingers extended backwards from the base knuckle of the hand. The signer with ataxia also had a tendency to use articulators that were proximal to those normally used for a given sign (for example, flexing the wrist instead of the fingers). This is the opposite of what has been reported for ASL signers with Parkinson’s disease, who sometimes produced signs using more distal articulators. The signer with ataxia also showed intention tremor during signing and non-linguistic tasks, and he exhibited incoordination of the movements of proximal and distal articulators (such as the elbow and the fingers) and incoordination of the two hands during signing. The ataxic signer’s motor symptoms more often affected aspects of sign structure that changed over the course of a sign’s production (e.g. the configuration of the hand in a sign with handshape change). In some cases, he added movements to signs where they were not required.

One movement pattern that occurred across linguistic and non-linguistic tasks for the signer with ataxia was the tendency to perform one-handed tasks with two hands. In BSL
and in other sign languages, some signs are produced with one hand, and others are produced with two hands. The signer with ataxia tended to spontaneously produce one-handed BSL signs using two hands, mirroring the right hand’s actions with his left hand. Similarly, on the reach and grasp task, he was asked to grasp cylinders of different sizes and move them a short distance forward. On this task as well, he used both hands to accomplish the movement.
21.7 Discussion

The studies outlined here suggest that dysarthria, as distinct from disruptions to simpler movement tasks, occurs in sign language as well as spoken language. Moreover, sign and non-sign movements may be affected differently by the same movement disorder in the same individual, in terms of the severity of the symptoms exhibited or in terms of which specific symptoms are present. Speech production deficits are often described in articulator-specific terms (Ackermann et al., 1997; Yunusova et al., 2008), but the fact that similar deficits occur in signed as well as spoken language suggests that the articulators may not be the most important factor for differentiating speech movement deficits from other types of movement deficits.

Just as dysarthria is not articulator-specific, it is not fundamentally linguistic in nature. Dysarthria can occur in either an oral or a manual language because both modalities use rapid, complex, coordinated movements. The movement speed and complexity are necessary for the information transfer required by a linguistic system, but this does not imply that motoric disruptions to language output are inherently linguistic. Individuals with dysarthria would probably also show impairments in any task with similar motor demands, but since few movement tasks require such a combination of speed and precision, production deficits appear predominantly in speech or sign.

The sign production deficits that have been identified so far show the patterns that would be expected, based on the form that dysarthria takes in spoken language for the same movement disorders. For instance, the signer with progressive supranuclear palsy showed a distinctive production deficit, characterized in part by palilalia, and this is what would be predicted based on the form that dysarthria takes in hearing speakers with PSP. In the same way, the signer with ataxia exhibited incoordination and exaggerated or imprecise sign movements, similar to the speech production deficits exhibited by hearing speakers with ataxic dysarthria. In light of findings like these, research on sign production should move beyond oppositions between motor disorders and linguistic disorders to consider more nuanced comparisons within each of those categories.
21.7.1 Differences and Similarities across Modalities

There are certain types of production deficits that occur in both sign and speech, as well as some that do not, and these may provide insight into the nature of signed and spoken language structure and into the nature of speech motor disorders. Specific deficits that occur in both sign and speech include palilalia, incoordination, reduced movement size, and slowed movement. Palilalia is an interesting example of a production deficit that can occur across modalities, because the movement sequences produced by the hands or by the vocal mechanism are lengthy and complex: they are the combinations of movements required to repeat an entire word. With respect to reduced movement size across modalities, the size of an articulatory movement in a sign language is described only in those terms, whereas reduced movement displacements in speech can also be discussed in terms of their acoustic consequences (for example, acoustic undershoot or reduced amplitude).
In terms of deficits that do not occur cross-modally, none of the studies so far suggests that a sign equivalent to festination occurs in Parkinson’s disease. Hearing speakers with Parkinson’s disease produce rapid bursts of speech (which may be an adaptation to altered expiratory control for speech), even though their targeted limb movements tend to be slow. It does not seem that signers with Parkinson’s disease produce rapid, brief bursts of signing. This could be related to the fact that individual signs are produced more slowly than individual spoken words to begin with (Bellugi & Fischer, 1972). The difference in production rate across modalities could create differences in the production strategies that language users employ.

There are aspects of sign and speech that cannot be easily compared, either for individuals with motor disorders or more generally. Differences in the articulators themselves or in their innervation may make it counter-productive to search for analogues across modalities. For example, there is no obvious sign analogue to phonation, nasality, or respiration. Likewise, with the exception of the vocal folds, the speech mechanism does not employ articulators that are paired across the midline of the body. The primary sign articulators are positioned on opposite sides of the body and are controlled by opposite sides of the brain, whereas the speech articulators are mostly unitary structures positioned along the midline of the body, and many of them receive bilateral innervation. For these reasons, it seems best not to search for parallels between sign and speech where the comparisons are too strained.
21.7.2 Methodological Considerations

In addition to considering the similarities and differences in production mechanisms and deficits for sign and speech, it is worth considering the limitations of current knowledge and procedures in the field. Firstly, one issue in comparing groups with a disorder is the validity of the existing methods of assessing that disorder. This is particularly problematic for sign language research, because the pool of research participants is quite small, especially in research studies on disordered signing. Moreover, standard assessments of language, cognition, and motor function must be translated into a sign language if they are used with deaf signers, which can threaten measurement accuracy and validity. The cross-linguistic validity of standardized assessments is an issue for all minority languages, but sign languages present a unique challenge because they employ a different language modality (Atkinson et al., 2005). To illustrate this point, one need only consider established assessments of apraxia and aphasia. In administering these assessments, researchers must give instructions to participants in a sign language without using iconic signs or gestures that may demonstrate the gesture or sign that an individual is being asked to produce. Other assessments can pose similar challenges.

Similarly, there are no well-established measurements or procedures for analyzing phonetic variation in typical signers. Current measures in sign language research are based on units from sign phonology, so there is no framework for describing aspects of production that are not linguistically contrastive. For example, one observation from research on sign production disorders has been that signers with certain disorders tend to use laxed handshapes (e.g., Loew et al., 1995; Tyrone et al., 1999). While several studies concur on this point, there are no established measures or normative data for laxing (or hyperextension) in typical signers. Quantitative, gradient measures of sign production were only developed recently (Cheek, 2001; Mauk et al., 2008; Tyrone & Mauk, 2010). These measures are not widely used, and they only exist for a few aspects of sign structure. Moreover, studies from many decades ago examined the physiological basis of speech in typical speakers (Lisker & Abramson, 1964; Stevens & House, 1955), but few studies have investigated anatomical and physiological factors that influence sign production or perception (Ann, 1996; Mandel, 1981).

Another limitation in comparisons of sign and speech production disorders is that there has been only minimal investigation of limb movement in studies of speech and language disorders. Several studies have examined speech movements and non-linguistic limb movements in typical speakers (McNeill, 1992; Meister et al., 2009; Rochet-Capellan et al., 2008)
and in speakers who stutter (Max et al., 2003; Olander et al., 2010), but limb movement in individuals with dysarthria has received less attention. Studies such as the one by Ackermann et al. (1997) discussed speech motor deficits in dysarthria in light of central motor functions underlying both speech and limb movement, but the researchers did not directly compare the two types of movement in the same participants. It could be informative to analyze both limb movements in isolation and gestures that accompany speech in hearing speakers with dysarthria. Similar patterns may arise in gesture and in speech, particularly since co-speech gestures are typically timed to coordinate with speech production (Krivokapic et al., 2017; Leonard & Cummins, 2011; McNeill, 1992). Deficits that appear in both speech and speech-accompanying gesture could suggest new avenues for diagnosis and therapy.
21.8 Directions for Future Research

One challenge for the study of production deficits in sign language is that the numbers of cases of the different disorders are very small, so it is difficult to obtain instrumented measures of impaired production. More problematically, there is insufficient normative sign production data available for comparison. A few studies have collected motion capture data from signers with neurogenic motor deficits (Brentari et al., 1995; Poizner et al., 1987), but there is no sizable body of normative motion capture data. Moreover, the normative data that do exist were collected mostly from young adult signers and not from signers who would be more closely age-matched to signing individuals with acquired motor disorders. For this reason and others, there is a clear need for more normative signing data, collected from a broader range of the deaf community.

As outlined above, most sign language users are bilingual and have extensive experience with the dominant spoken language as well as with their sign language. This being the case, future studies of acquired production disorders could focus more on comparing sign and speech skills in the same individuals, particularly now that the wide availability of video data can assist with pre-morbid assessments of both signing and speech. Marshall et al. (2005) studied a deaf signer with aphasia who had strong bilingual skills, so the researchers compared her skills in BSL and in English. With respect to motor disorders, Kegl et al. (1999) compared findings on speech production and signing in Parkinson’s disease, but the data were cross-sectional, so it was not possible to control for individual variation.

Finally, one goal of further research into sign language and speech motor disorders should be to develop better therapies for individuals with sign production deficits. Research in the UK suggests that speech and language services for deaf signers are severely lacking (Atkinson et al., 2002; Marshall et al., 2003). This question has received scant attention in other parts of the world, but the situation elsewhere is likely to be the same. It is hoped that better insight into the sign production mechanism will help to inform future diagnosis and treatment for individuals with motor disorders in general and for sign language users in particular.
REFERENCES

Ackermann, H., Hertrich, I., Daum, I., Scharf, G., & Spieker, S. (1997). Kinematic analysis of articulatory movements in central motor disorders. Movement Disorders, 12(6), 1019–1027.
Ann, J. (1996). On the relation between the difficulty and the frequency of occurrence of
handshapes in two sign languages. Lingua, 98, 19–41.
Antzakas, K., & Woll, B. (2002). Head movements and negation in Greek Sign Language. In I. Wachsmuth & T. Sowa (Eds.), Gesture and sign language in human-computer interaction (pp. 193–196). Springer-Verlag.
Atkinson, J., Marshall, J., Thacker, A., & Woll, B. (2002). When sign language breaks down: Deaf people’s access to language therapy in the UK. Deaf Worlds, 18, 9–21.
Atkinson, J. R., Marshall, J., Woll, B., & Thacker, A. (2005). Testing comprehension abilities in users of British Sign Language following CVA. Brain and Language, 94, 233–248.
Bastian, A. J. (2002). Cerebellar limb ataxia: Abnormal control of self-generated and external forces. Annals of the New York Academy of Sciences, 978, 16–27.
Battison, R. (1978). Lexical borrowing in American sign language. Linstok Press.
Bellugi, U., & Fischer, S. (1972). A comparison of sign language and spoken language. Cognition, 1, 173–200.
Braun, A. R., Guillemin, A., Hosey, L., & Varga, M. (2001). The neural organization of discourse: An H₂¹⁵O-PET study of narrative production in English and American sign language. Brain, 124(10), 2028–2044.
Brentari, D., & Poizner, H. (1994). A phonological analysis of a Deaf Parkinsonian signer. Language and Cognitive Processes, 9, 69–99.
Brentari, D., Poizner, H., & Kegl, J. (1995). Aphasic and Parkinsonian signing: Differences in phonological disruption. Brain and Language, 48, 69–105.
Cheek, A. (2001). The phonetics and phonology of handshape in American sign language. PhD dissertation, Department of Linguistics, University of Texas at Austin.
Corina, D. P., Poizner, H., Bellugi, U., Feinberg, T., & O’Grady-Batch, L. (1992). Dissociation between linguistic and nonlinguistic gestural systems: A case for compositionality. Brain and Language, 43, 414–447.
Corina, D. P., San Jose-Robertson, L., Guillemin, A., High, J., & Braun, A. R. (2003). Language lateralization in a bimanual language. Journal of Cognitive Neuroscience, 15(5), 718–730.
Crasborn, O. (2001). Phonetic implementation of phonological categories in sign language of the Netherlands. PhD dissertation, Leiden University. Landelijke Onderzoekschool Taalwetenschap.
Emmorey, K., Hickok, G., & Klima, E. S. (1996). The neural organization for sign language: Insights from right hemisphere damaged signers. Brain and Cognition, 32(2), 212–215.
Fahn, S., & Elton, R. L. (1987). Unified Parkinson’s disease rating scale. In S. Fahn, C. D. Marsden, D. B. Calne, & M. Goldstein (Eds.), Recent developments in Parkinson’s disease
(Vol. 2, pp. 153–163). Macmillan Healthcare Information.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-mental state: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198.
Goodglass, H., & Kaplan, E. (1975). The assessment of aphasia and related disorders. Lea & Febiger.
Grosvald, M. A. (2009). Long-distance coarticulation: A production and perception study of English and American sign language. PhD thesis, Department of Linguistics, University of California at Davis.
Haley, K. L., & Martin, G. (2011). Production variability and single word intelligibility in aphasia and apraxia of speech. Journal of Communication Disorders, 44, 103–115.
Herrmann, A., & Steinbach, M. (Eds.). (2013). Nonmanuals in sign languages. John Benjamins.
Hickok, G., Bellugi, U., & Klima, E. S. (1996). The neurobiology of sign language and its implications for the neural basis of language. Nature, 381, 699–702.
Johnston, T., & Schembri, A. (2007). Australian sign language (Auslan): An introduction to sign language linguistics. Cambridge University Press.
Kegl, J., Cohen, H., & Poizner, H. (1999). Articulatory consequences of Parkinson’s disease: Perspectives from two modalities. Brain and Cognition, 40, 355–386.
Kegl, J., & Poizner, H. (1997). Crosslinguistic/crossmodal syntactic consequences of left-hemisphere damage: Evidence from an aphasic signer and his identical twin. Aphasiology, 11, 1–37.
Kent, R. D., Kent, J. F., Duffy, J. R., Thomas, J. E., Weismer, G., & Stuntebeck, S. (2000). Ataxic dysarthria. Journal of Speech, Language, and Hearing Research, 43, 1275–1289.
Kent, R. D., Kent, J. F., Rosenbek, J. C., Vorperian, H. K., & Weismer, G. (1997). A speaking task analysis of the dysarthria in cerebellar disease. Folia Phoniatrica et Logopaedica, 49, 63–82.
Kimura, D. (1993). Neuromotor mechanisms in human communication. Oxford University Press.
Klima, E. S., & Bellugi, U. (1979). The signs of language. Harvard University Press.
Krivokapic, J., Tiede, M. K., & Tyrone, M. E. (2017). A kinematic study of prosodic structure in articulatory and manual gestures: Results from a novel method of data collection. Laboratory Phonology, 8(1), 3.
Leonard, T., & Cummins, F. (2011). The temporal relation between beat gestures and speech. Language and Cognitive Processes, 26, 1457–1471.
Levanen, S., Uutela, K., Salenius, S., & Hari, R. (2001). Cortical representation of sign language: Comparison of Deaf signers and hearing non-signers. Cerebral Cortex, 11, 506–512.
Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422.
Loew, R. C., Kegl, J. A., & Poizner, H. (1995). Flattening of distinctions in a Parkinsonian signer. Aphasiology, 9(4), 381–396.
Loew, R. C., Kegl, J. A., & Poizner, H. (1997). Fractionation of the components of role play in a right-hemispheric lesioned signer. Aphasiology, 11(3), 263–281.
MacSweeney, M., Woll, B., Campbell, R., McGuire, P. K., David, A. S., Williams, S. C. R., Suckling, J., Calvert, G. A., & Brammer, M. J. (2002). Neural systems underlying British Sign Language and audiovisual English processing in native users. Brain, 125, 1583–1593.
Mandel, M. A. (1981). Phonotactics and morphophonology in ASL. PhD dissertation, Department of Linguistics, University of California.
Marshall, J., Atkinson, J., Smulovitch, E., Thacker, A., & Woll, B. (2004). Aphasia in a user of British Sign Language: Dissociation between sign and gesture. Cognitive Neuropsychology, 21(5), 537–554.
Marshall, J., Atkinson, J., Thacker, A., & Woll, B. (2003). Is speech and language therapy meeting the needs of language minorities? The case of Deaf people with neurological impairments. International Journal of Language and Communication Disorders, 38, 85–94.
Marshall, J., Atkinson, J. R., Woll, B., & Thacker, A. (2005). Aphasia in a bilingual user of British Sign Language and English: Effects of cross-linguistic cues. Cognitive Neuropsychology, 22, 719–736.
Mauk, C. E. (2003). Undershoot in two modalities: Evidence from fast speech and fast signing. PhD dissertation, Department of Linguistics, University of Texas at Austin.
Mauk, C. E., Lindblom, B., & Meier, R. P. (2008). Undershoot of ASL locations in fast signing. In J. Quer (Ed.), Signs of the time: Selected papers from TISLR 2004 (pp. 3–24). Signum.
Max, L., Caruso, A. J., & Gracco, V. L. (2003). Kinematic analyses of speech, orofacial nonspeech, and finger movements in
stuttering and nonstuttering adults. Journal of Speech, Language, and Hearing Research, 46, 215–232.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press.
Meister, I. G., Buelte, D., Staedtgen, M., Boroojerdi, B., & Sparing, R. (2009). The dorsal premotor cortex orchestrates concurrent speech and fingertapping movements. European Journal of Neuroscience, 29, 2074–2082.
Miller, N. (2002). The neurological bases of apraxia of speech. Seminars in Speech and Language, 23, 223–230.
Neville, H., Bavelier, D., Corina, D., Rauschecker, J., Karni, A., Lalwani, A., Braun, A., Clark, V., Jezzard, P., & Turner, R. (1998). Cerebral organization for language in deaf and hearing subjects: Biological constraints and effects of experience. Proceedings of the National Academy of Sciences, 95, 922–929.
Nishimura, H., Hashikawa, K., Doi, K., Iwaki, T., Watanabe, Y., Kusuoka, H., Nishimura, T., & Kubo, T. (1999). Sign language “heard” in the auditory cortex. Nature, 397, 116.
Olander, L., Smith, A., & Zelaznik, H. N. (2010). Evidence that a motor timing deficit is a factor in the development of stuttering. Journal of Speech, Language, and Hearing Research, 53, 876–886.
Petitto, L. A., Zatorre, R. J., Gauna, K., Nikelski, E. J., Dostie, D., & Evans, A. C. (2000). Speech-like cerebral activity in profoundly deaf people processing signed languages: Implications for the neural basis of human language. Proceedings of the National Academy of Sciences, 97(25), 13961–13966.
Poizner, H., & Kegl, J. (1992). Neural basis of language and motor behaviour: Perspectives from American Sign Language. Aphasiology, 6(3), 219–256.
Poizner, H., & Kegl, J. (1993). Neural disorders of the linguistic use of space and movement. In P. Tallal, A. Galaburda, R. Llinas, & C. von Euler (Eds.), Temporal information processing in the nervous system (Annals of the New York Academy of Sciences, Vol. 682, pp. 192–213). New York Academy of Sciences.
Poizner, H., Klima, E., & Bellugi, U. (1987). What the hands reveal about the brain. MIT Press.
Rochet-Capellan, A., Laboissiere, R., Galvan, A., & Schwartz, J. L. (2008). The speech focus position effect on jaw-finger coordination in a pointing task. Journal of Speech, Language, and Hearing Research, 51, 1507–1521.
300 Martha E. Tyrone Sandler, W., & Lillo-Martin, D. (2006). Sign language and linguistic universals. Cambridge University Press. Stevens, K. N., & House, A. S. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America, 27, 484–493. Stokoe, W. C. (1960). Sign language structure: An outline of the visual communication systems of the American deaf. Linstok Press. Sutton-Spence, R., & Woll, B. (1999). The linguistics of British sign language. Cambridge University Press. Timmann, D., Citron, R., Watts, S., & Hore, J. (2001). Increased variability in finger position occurs throughout overarm throws made by cerebellar and unskilled subjects. Journal of Neurophysiology, 86, 2690–2702. Topka, H., Konczak, J., Schneider, K., Boose, A., & Dichgans, J. (1998). Multijoint arm movements in cerebellar ataxia: Abnormal control of movement dynamics. Experimental Brain Research, 119, 493–503. Tyrone, M. E. (2005). An investigation of sign dysarthria. PhD Thesis, Department of Language and Communication Science, City University London. Tyrone, M. E. (2020). The phonetics of sign language. Oxford Research Encyclopedia of Linguistics. https://doi.org/10.1093/ acrefore/9780199384655.013.744 Tyrone, M. E., Atkinson, J. R., Marshall, J., & Woll, B. (2009). The effects of cerebellar ataxia
on sign language production: A case study. Neurocase, 15, 419–426. Tyrone, M. E., Kegl, J., & Poizner, H. (1999). Interarticulator co-ordination in Deaf signers with Parkinson’s disease. Neuropsychologia, 37, 1271–1283. Tyrone, M. E., & Mauk, C. E. (2010). Sign lowering and phonetic reduction in American Sign Language. Journal of Phonetics, 38, 317–328. Tyrone, M. E., & Woll, B. (2008a). Sign phonetics and the motor system: Implications from Parkinson’s disease. In J. Quer (Ed.), Signs of the time: Selected papers from TISLR 2004 (pp. 43–68). Signum. Tyrone, M. E., & Woll, B. (2008b). Palilalia in sign language. Neurology, 70(2), 155–156. Vermeerbergen, M., Leeson, L., & Crasborn, O. (2007). Simultaneity in signed languages: Form and function. John Benjamins. Wilbur, R. B. (2009). Effects of varying rate of signing on ASL manual signs and nonmanual markers. Language and Speech, 52, 245–285. Yunusova, Y., Weismer, G., Westbury, J. R., & Lindstrom, M. J. (2008). Articulatory movements during vowels in speakers with dysarthria and healthy controls. Journal of Speech, Language, and Hearing Research, 51(3), 596–611. Ziegler, W. (2002). Task-related factors in oral motor control: Speech and oral diadochokinesis in dysarthria and apraxia of speech. Brain and Language, 80, 556–575.
Part 3: Phonology
22 Phonology and Clinical Phonology

ELENA EVEN-SIMKIN

22.1 Developmental and Clinical Phonology

Phonology is the study of systems of sounds in a language. Different theories have been proposed to describe and explain the acquisition, development, and production of sound systems. A phonological system can be described in terms of the phonemic inventory of a specific language, its allophonic variations, phonotactic rules, and morphophonemic changes. Children's acquisition of the sound systems of their language or languages proceeds through the phonological stages of speech development: pre-linguistic phases of vocalization, perception, and communication through crying and gestures, that is, phonation, cooing, expansion, reduplicated babbling, and variegated babbling; and linguistic stages, which include first words, phonemic development, stabilization of the phonological system, morphophonemic development, and spelling. Normative data accumulated across various studies provide essential information about typical phonological development, information needed for assessment and for clinical intervention programs (Grunwell, 1987). All children produce various phonological deviations in their speech during language acquisition, but children with phonological disorders also use atypical deviation patterns. The most common categories of phonological disorders, as reported in Ball et al. (2010), Even-Simkin (2019), Gordon-Brannan and Weiss (2007), Grunwell (1987), and Ingram (1990), are:

1. Syllable Structure Deviations, in which the produced sequence or number of consonants and vowels differs from the standard form of the word; these include syllable deletion/omission/reduction, initial consonant deletion/omission, final consonant deletion/omission, cluster reduction or consonant sequence omission, coalescence, epenthesis, diminutization, doubling, metathesis, migration, and glottal replacement;

2. Assimilation or Harmony Deviations, in which a sound or a syllable becomes more similar either to the preceding sound (progressive assimilation) or to a following sound that influences the earlier sound in the word (regressive assimilation), with the following main types: voicing harmony/assimilation, consonant harmony/assimilation, and syllable harmony/assimilation;

3. Substitution Processes or Feature Contrast, that is, the replacement of sounds of one class of phonemes by another class, a substitution not influenced by the features of the surrounding sounds; this category includes fronting, backing, stopping, affrication, deaffrication, palatalization, depalatalization, denasalization, gliding,
neutralization, and vocalization/vowelization; and

4. Articulation Shifts, a process considered typical of development, in which there is a shift in the place of articulation, as in the production of /s, z, f, v/ in place of /θ, ð/.

As outlined above, different types of phonological deviation may be found in a child's productions, and this phonological information can support the differential diagnosis between delayed and deviant speech development. The task of the speech-language therapist is thus to identify the deviations and to provide appropriate treatment procedures, which can be based on the various frameworks of phonological analysis.
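To make the taxonomy above concrete, a minimal computational sketch in Python follows. It is illustrative only, not a published assessment tool: the vowel set, the orthography-like transcriptions, and the three detection rules (final consonant deletion, cluster reduction, velar fronting) are simplified assumptions made for this example.

# Python sketch: screening paired target/child forms for a few deviation types.
# The segment inventory and the rules are deliberately simplified.

VOWELS = set("aeiou")  # placeholder vowel set, not a full phoneme inventory

def classify_deviations(target, child):
    """Flag simplified instances of deviation categories from Section 22.1."""
    found = []
    # Syllable structure: final consonant deletion.
    if target[-1] not in VOWELS and (not child or child[-1] in VOWELS):
        found.append("final consonant deletion")
    # Syllable structure: cluster reduction (initial CC realized as C).
    if (len(target) > 1 and target[0] not in VOWELS and target[1] not in VOWELS
            and len(child) > 1 and child[1] in VOWELS):
        found.append("cluster reduction")
    # Substitution: velar fronting, checked only when the forms align
    # segment by segment (a naive alignment assumption).
    if len(target) == len(child):
        for t, c in zip(target, child):
            if {"k": "t", "g": "d"}.get(t) == c:
                found.append(f"fronting ({t} -> {c})")
    return found

print(classify_deviations(list("ston"), list("don")))  # ['cluster reduction']
print(classify_deviations(list("ki"), list("ti")))     # ['fronting (k -> t)']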
22.2 Phonological Theories

Effective clinical intervention programs and treatment processes require essential knowledge and understanding of the phonological system and of the various aspects of typical and atypical speech development. One of the leading theoreticians of the Prague School and a pioneer in the development of the phonological approach and its clinical application to child language, Jakobson (1971, pp. 12–13), defines the phonological system as:

a stratified structure, that […] is formed of superimposed layers. The hierarchy of these layers is practically universal and invariable. It occurs in the synchrony of language; consequently we have to do with a panchronic ordering. If there exists a relationship of irreversible solidarity between two phonological values, the secondary value cannot exist without the primary and the primary cannot be eliminated without the secondary. This ordering is to be found in any existing phonological system and it governs all its mutations; the same ordering determines […] the acquisition of language, a system in the process of build up; and […] it reappears in language disorders.
That is, any clinical intervention program presupposes an understanding of the formation of speech sounds and linguistic knowledge of the sound patterns and sound system, that is, an understanding of the phonological system of a language. It is therefore important to note the crucial difference between phonology and phonetics, and between a phonological deficit and a phonetic deviation, in the classification of language pathologies. Phonology accounts for the system and production of speech sounds; as Stoel-Gammon and Dunn (1985) define it, phonological analysis includes the following aspects: (a) the inventory of phonemes, (b) a description of the produced patterns of the phonemes, (c) a description of the phonemes as used in allophonic variations, and (d) a description of morphophonemic variations in sound patterns. Phonetic analysis, by contrast, involves three aspects: (a) articulation, (b) acoustic components, and (c) how the sounds are perceived. The foundation for the distinction between phonetics and phonology was originally laid by Ferdinand de Saussure (1966[1916]), who is considered the father of modern linguistics (Tobin, 1997). The dichotomy between phonology and phonetics was further developed in the framework of the communication-oriented functional approach of the Prague School by Nikolai Trubetzkoy (1939, 1949, 1969) and Roman Jakobson (1962), who brought the functional notion of communication to the forefront of phonology. The first fundamental work on phonological development, Jakobson's Kindersprache, Aphasie und allgemeine Lautgesetze (Child Language, Aphasia, and Phonological Universals), was published in 1941 in Sweden; most subsequent research in developmental phonology and in general and clinical phonological theory was either based on it or inspired by it (Atkinson et al., 1988; de Villiers & de Villiers, 1978; Tobin, 1997).
The following sections present a brief description of the six major current developmental and clinical phonological theories: Generative Phonology, Natural Phonology, Nonlinear Phonology, Cognitive Phonology, Structuralist Phonological Theory, and the Theory of Phonology as Human Behavior.
22.2.1 Generative Phonology

The generative approach to phonological analysis, introduced in Chomsky and Halle's (1968) study of English phonology, The Sound Pattern of English, focuses on abstract phonological representations mapped to surface pronunciations by a set of phonological rules, dispensing with a phonemic analysis; as Grunwell (1987) concisely summarizes, "[t]he formal techniques of generative phonological analysis involves descriptions of the sound patterns of language in terms of phonological rules" (p. 169), and she further elaborates that "the phonological specifications of the 'underlying representations' are frequently very similar to the orthographic forms of the words" (p. 169). In generative phonological analyses of a child's speech, the pronunciation patterns are described in relation to the adult phonological pronunciation rules, that is, "[t]he adult pronunciations are represented by the phonemic forms of the words or segments and these form the 'input' to the phonological rules; the 'output' is the child's pronunciation" (Grunwell, 1987, p. 171). In other words, phonological rules describe: (a) the underlying representations, which are not uttered; (b) the surface form that the speaker pronounces; and (c) the conditions under which it is pronounced. The process of applying the phonological rules to the underlying form to derive the surface structure in a certain context is called derivation. In applying this analytical procedure to clinical and developmental analysis, one needs to be aware of its presupposition that the descriptive rules have some kind of "reality" in the child's production system and in language processing in general. According to Maxwell (1979): "[i]t is an unsupported claim, however, that [children with deviant phonologies] … are perceiving precisely the same signals or interpreting them in precisely the same way as normal speakers" (p. 189). Generative theory posits that phonemes are generated from a set of distinctive features, which constitute the underlying representation of a phoneme (Ferguson & Garnica, 1975), and although there is no consensus on a universal set of these features, the order of distinctive feature acquisition, applicable to both typical and clinical phonological systems, has been explicitly defined, for example, by Dinnsen (1992), according to the following levels: A – inclusion of vowels, glides, obstruent stops, and nasals (without voicing and manner distinctions); B – addition of the voice distinction; C – addition of fricatives and affricates; D – addition of a liquid, either /r/ or /l/; and E – addition of all sound classes. The principles of the generative approach, in the framework of child speech, are that phonological rules, which compare two systems (the child's and the adult's) and which are descriptive statements of developmental error patterns, can: (a) change segments or feature specifications (substitutions and distortions); (b) delete segments (omissions); (c) insert segments (additions); (d) interchange segments (metathesis, transpositions); and (e) coalesce segments; this explains their ready application to the analysis of child speech. Another principle of generative theory is rule ordering, which implies a particular sequence of application of the phonological rules; this principle, however, is subject to considerable controversy (Grunwell, 1987; Hyman, 1975).
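As an illustration of derivation and rule ordering, the following minimal Python sketch applies two ordered rewrite rules to an underlying representation. The rules and the transcription are hypothetical simplifications chosen for this example, not an analysis of any particular child.

# Python sketch: a generative derivation with ordered rewrite rules.
import re

# Each rule is (name, regex pattern, replacement); order of application matters.
RULES = [
    ("stopping", "s", "t"),                      # fricative /s/ realized as stop [t]
    ("final consonant deletion", "[ptk]$", ""),  # a word-final stop is deleted
]

def derive(underlying):
    """Apply the rules in order, printing each step of the derivation."""
    form = underlying
    print(f"UR: /{form}/")
    for name, pattern, replacement in RULES:
        new_form = re.sub(pattern, replacement, form)
        if new_form != form:
            print(f"  {name}: {form} -> {new_form}")
        form = new_form
    print(f"Surface: [{form}]")
    return form

# With this ordering, stopping feeds final consonant deletion:
# /bas/ -> bat -> [ba]. Reversing the rule order would instead yield [bat],
# one way in which rule ordering makes empirically different predictions.
derive("bas")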
It should be noted that one of the formal clinical assessment procedures based on generative phonological principles, the Compton–Hutton Phonological Assessment (Compton & Hutton, 1978), includes a predetermined set of common deviant phonological rules. This set, however, excludes investigation of the phonological organization of the child's pronunciation of complex patterns, thus providing a very limited view of the potential characteristics of variability in children's pronunciation patterns, which is essential information for developmental and clinical analysis.
22.2.2 Natural Phonology

The major tenet of Natural Phonology, originally proposed by David Stampe (1969, 1979) and by Donegan and Stampe (1979), is that the phonological organization of speech patterns is governed by certain innate, universal phonological processes. According to Stampe, at the very beginning of speech development children have a fully developed "innate" phonological system, which they must revise, in terms of phonological contrasts and structures, on the basis of auditory perception of their language in order to learn its language-specific pronunciation system. That is, the underlying mental representation (UR) of the child's speech is taken to be no different from the adult surface form, implying a correct UR even in cases of incorrect speech production. As Stampe (1979) notes, "[a] phonological process is a mental operation that applies in speech to substitute for a class of sounds or sound sequences presenting a common difficulty to the speech capacity of the individual, an alternative class identical but lacking the difficult property" (p. 1). In Stampe's view, children unlearn most of the universal natural processes to achieve the more complex language-specific pronunciation system of their language, which eventually matches the adult model, for example, as a result of suppression, limitation, and/or reordering of the phonological processes. This approach presents a deterministic account of children's phonological development as a kind of passive suppression of innate phonological processes, rather than active acquisition of the phonological system of their language. It is important to highlight the impact of Natural Phonology on the explanation and description of phonological development and, consequently, on clinical phonology and clinical assessment procedures. The major clinical assessment procedures based on natural phonological process analysis are: Phonological Process Analysis – PPA (Weiner, 1979), Assessment of Phonological Processes – APP (Hodson, 1980), Natural Process Analysis – NPA (Shriberg & Kwiatkowski, 1980), Procedures for the Phonological Analysis of Children's Language – PPACL (Ingram, 1981), and Phonological Assessment of Child Speech – PACS (Grunwell, 1985). For example, the NPA analysis includes the following eight natural processes, intended for the analysis of spontaneous speech: (1) Final Consonant Deletion; (2) Velar Fronting (initial and final); (3) Stopping (initial and final); (4) Palatal Fronting (initial and final); (5) Liquid Simplification (initial and final); (6) Assimilation (progressive and regressive); (7) Cluster Reduction (initial and final); and (8) Unstressed Syllable Deletion. The NPA procedure thus provides a description of normal developmental patterns, but it does not include unusual processes that are characteristic of developmental phonological disorders, such as denasalization (which is, however, described in the PPACL and PPA procedures) or glottal replacement (described in the PACS, PPA, and APP procedures). According to Grunwell (1987, p. 232):

Phonological process analysis of disordered child speech […] has firmly established that the children's pronunciation patterns have a systematic relationship to the target adult pronunciation patterns. […] [T]hat processes can be easily identified and […] there is much that is normal in the processes that are discovered. […] However, differences have been found between normal […] and disordered child speech.
The differences between typical and atypical child speech development, following Grunwell (1987), relate to the use of the following processes: (a) Persisting Normal Processes; (b) Chronological Mismatch; (c) Systematic Sound Preference; (d) Idiosyncratic Processes; and (e) Variable Use of Phonological Processes. As Edwards and Shriberg (1983) claim, "some segments, sound classes, consonant and vowel systems, types of syllables and rules and processes are more natural than others" (p. 87), and these more natural sounds, such as the most common stop /p, t, k/ and vowel /i, a, u/ systems, are the most common ones across many languages and are thus learned sooner than others by children. The natural phonology approach therefore describes developmental phonology as a natural progression from the innate phonological system, while excluding almost any other factors. Nevertheless, it is evident that an assessment of children's disordered speech within the Natural Phonology framework can provide important clinical information concerning developmental deviations and the identification of the complex language-specific phonological patterns targeted in remediation.
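One clinically relevant application of process analysis is flagging Grunwell's Persisting Normal Processes: a normal process still in use past its typical age of suppression. The minimal Python sketch below illustrates the logic; the age cut-offs are placeholder values invented for the example, and real cut-offs must come from published normative data.

# Python sketch: flagging persisting normal processes against (hypothetical) norms.
TYPICAL_SUPPRESSION_AGE_MONTHS = {  # placeholder values, for illustration only
    "final consonant deletion": 39,
    "velar fronting": 42,
    "cluster reduction": 48,
    "stopping": 54,
}

def flag_persisting(observed_processes, age_months):
    """Return the observed processes expected to be suppressed by this age."""
    return [
        process for process in observed_processes
        if age_months > TYPICAL_SUPPRESSION_AGE_MONTHS.get(process, float("inf"))
    ]

# A five-year-old (60 months) still showing velar fronting and cluster
# reduction would have both flagged as persisting normal processes:
print(flag_persisting(["velar fronting", "cluster reduction"], 60))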
22.2.3 Nonlinear Phonology

Nonlinear Phonology, which has been an active research field since the late 1970s, focuses on the hierarchical relationships among phonological units and addresses different aspects of sound structure, comprising the following major layers: phrase and word structure, segments (that is, speech sounds), and features. This theory embraces the theories termed metrical phonology, prosodic phonology, and autosegmental phonology (Bernhardt & Stoel-Gammon, 1994, 1997) to address issues such as long-distance assimilation processes, the interaction of phonology with morphology, the typology of stress systems, and tone. One of the main concepts to emerge from nonlinear phonology is autosegmentalism (Goldsmith, 1990), which holds that a distinctive feature, as a free-standing entity, is the atom of phonological representation and is independent of its segmental host. As McCarthy (2001, p. 393) notes, "the central insight of autosegmental phonology: speech sounds consist of independent components which are coordinated in time." Tone is an example of such an independent component, and its analysis, following autosegmentalism, may include: (a) Persistence under deletion – when a vowel deletes, its tone attaches to a nearby vowel; (b) Floating tones – when a vowel deletes, its tone is expressed indirectly by affecting following tones; (c) Toneless syllables – the pitch of syllables without tonal specification is determined by linear interpolation between syllables with tones; (d) Tone shift – when every tone shifts one syllable to the right; and (e) Tone melodies – when words are limited to a set of fixed tone melodies, which may mark certain morphological distinctions (McCarthy, 2001). Moreover, this theory has been applied to nontonal phenomena, such as assimilation (when one segment takes on the features of a nearby segment) and harmony (long-distance assimilation between segments that are not adjacent). Nonlinear analysis allows an interpretation of assimilation and harmony processes in an organized graphical mode that is claimed to be universal. However, it should be mentioned that the most radical representation, with every distinctive feature on a separate layer completely independent of other features, may be rather misleading, since there are significant functional associations among different subsets of the features. Another concept addressed in the framework of nonlinear theory is the prosodic hierarchy, which organizes segments into larger constituents such as syllables and words. The study of prosody refers to the rhythmic, durational, and phrasing aspects of speech and focuses on stress, syllables, the size of words, and the ways that morphology and syntax affect pronunciation. According to McCarthy (2001, p. 394):

A key insight into the nature of prosody is the idea of a small, universal set of prosodic categories arranged into a hierarchy […] proceeding from the bottom of the hierarchy, a syllable is a grouping of segments: the head of the syllable is the peak of acoustic energy, usually a vowel. A foot is a grouping of (usually two) syllables, one of which is more prominent than the other. The more prominent syllable is the locus of linguistic stress. A phonological word is a grouping of feet, one of which is also the most prominent (the locus of main stress). A phonological phrase is a grouping of phonological words that also includes a most prominent member and so on.
In other words, in the nonlinear approach, segmental and prosodic units are represented hierarchically; that is, the various tiers of speech sounds and the features of phonemes are arranged in a hierarchical order rather than as unordered bundles of features. Two other concepts related to nonlinear theory are specified and underspecified features. Some default features are predictable, like voicing in the production of sonorant sounds in English, since sonorants are voiced in English. Thus, in the UR of vowels, nasals, liquids, and glides, +voice is the default feature and consequently is not specified. For obstruents, however, the +/-voice feature has to be specified, since there are voiced and voiceless stops, fricatives, and affricates in English, with the default (underspecified) feature being -voice and the nondefault (specified) feature being +voice. Other features postulated for consonants are: -continuant (default/underspecified) for stops versus +continuant (nondefault/specified) for fricatives; or, for place of articulation, coronal (+anterior) as the default/underspecified versus coronal (-anterior), labial, and dorsal (velar) as the nondefault/specified. Thus, describing phonological acquisition in nonlinear terms, children begin with the production of underspecified segmental tiers, which are then complemented by the specified features until their language-specific phonological system is acquired. It should be noted, however, that this specified versus underspecified preference is child-specific, rather than language-specific or universal, since there is great variability in the order of development (Bernhardt & Stoel-Gammon, 1997). Based on nonlinear theory, Bernhardt and Stemberger (2000) developed a clinical application of the nonlinear phonological approach that includes the following types of goals: (1) New syllables and word structures; (2) New individual segments and features; (3) New simultaneous combinations of old features; and (4) New places for old segments. These four types of goals are based on "(a) syllable and word structure versus segments and features and (b) 'new stuff' versus 'old stuff' in new combinations or places" (Bernhardt & Stemberger, 2000, p. 51). This approach addresses sound production in the contexts of phonemic sequences, syllable and word structures, individual speech sounds, and stress patterns, and can be applied in the treatment of multiple articulation errors, although it is recommended for the treatment of disordered phonological systems rather than for phonetic errors. Another limitation that should be mentioned is that the analysis is time-consuming, although the time required can be significantly reduced with the assistance of a computer program, the CAPES (Masterson & Bernhardt, 2001), and with mastery of the analysis (Bernhardt & Stemberger, 2000).
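Returning to the default/underspecified distinction described above, the following minimal Python sketch shows default fill-in for a few English feature values. The segment classes and the feature set are simplified assumptions made for this illustration.

# Python sketch: default feature fill-in under underspecification.
SONORANTS = set("aeioumnlrjw")  # simplified sonorant class, for illustration

def fill_defaults(segment, specified):
    """Complete a possibly underspecified feature dictionary with defaults."""
    features = dict(specified)
    if segment in SONORANTS:
        features.setdefault("voice", "+")       # redundant for sonorants, so default
    else:
        features.setdefault("voice", "-")       # obstruent default is voiceless
        features.setdefault("continuant", "-")  # and the default manner is a stop
    return features

# /n/ needs no stored [voice] value; /z/ must specify the nondefault values;
# /t/ can be left fully underspecified, surfacing as [-voice, -continuant]:
print(fill_defaults("n", {"nasal": "+"}))
print(fill_defaults("z", {"voice": "+", "continuant": "+"}))
print(fill_defaults("t", {}))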
22.2.4 Cognitive Phonology

Since the late 1980s there has been growing attention to phonology from the Cognitive Linguistics perspective. This approach attempts to explain linguistic processes in terms of the cognitive mechanisms of the human mind, suggesting that language is motivated by both linguistic and non-linguistic factors. In other words, on the Cognitive Linguistics approach, the phonological component of languages is explained by mechanisms of the human brain, various environmental aspects, and cognitive or semiotic systems (Radden & Panther, 2004). Thus, one of the basic ideas underlying Cognitive Phonology, as Mompean (2014, p. 264) summarizes, is:

Linguistic and non-linguistic categories are not abstract, human-independent and objectively "out there" in the world but they are rooted or grouped in people's concrete physical, social and cultural experiences and under the constraints imposed by their bodies (Johnson
1987; Lakoff 1987; Lakoff and Johnson 1980, 1999; Rohrer 2007).
That is, categorization, as a cognitive process, is considered an essential issue in Cognitive Linguistics and one of the primary principles of linguistic organization (Taylor, 2003), with phonological units being considered categories (Fraser, 2006). The cognitive approach challenges the classical view of grammatical and semantic categories, as well as the classical view of phonological categories (Coleman & Kay, 1981; Corrigan, 1991; Dirven & Taylor, 1988). For example, in cognitive phonology, phonemes and allophones are defined as mental categories of sounds classified as "the same" (Nathan, 1989). However, this definition has raised certain challenges, as Mompean (2014, p. 255) notes:

[R]esearch has shown that these segment-sized categories can hardly ever be defined by necessary and sufficient conditions. The various phoneme categories discussed in the literature, from the oral plosives /t, d/ in English (Eddington, 2007; Mompean, 2004; Nathan, 1986, 1996, 2007; Taylor, 2003) to the nasal plosives /n/ and /m/ in Spanish (Cuenca & Hilferty, 1999; Fraser, 2004; Mompean and Mompean, 2012) appear to lack features shared by all category members.
To refine the "radial" view of phoneme categories, the instance-schema network view of category structure was proposed, referring to speakers' ability to extract schemas at different levels of specificity; on this view, phoneme categories have a central member (the prototype) and context-induced extensions (Bybee, 1999; Cuenca & Hilferty, 1999; Langacker, 2007; Nathan, 1989; Taylor, 2002). For instance, the phoneme /t/ (in English, the voiceless alveolar stop [t]) is the prototype, while its other allophones are regarded as extensions. This kind of category cohesion, relating various category members to a prototype member, can cause phoneme categories to overlap, as with the labiodental nasal [ɱ], which belongs to both of the nasal phonemes /m/ and /n/ in English and Spanish (Cuenca & Hilferty, 1999; Taylor, 2002). Like phonemes, cognitive feature categories lack defining features, and for this reason some contrasting feature categories overlap, like the approximants /j/ and /w/, which can be categorized as consonants or as vowels. This overlap among some category members is explained by the common view that "feature categories are abstracted from speakers' encounters with language-specific events in the course of cognitive development and language acquisition rather than being hard-wired universals" (Taylor, 2002; Mompean, 2014, p. 256). Another key issue of the cognitive approach is the usage-based conception of language, with the main idea that grammar is not only a repository of knowledge for language use but also a continuously redefined outcome of language use. For example, Bybee (1994, 1999), who has explored the usage-based conception of phonology, shows the significant influence of language use on the phonetic properties of lexical items over time. That is, according to the usage-based approach, language is grounded in and driven by people's encounters with language, and frequency of occurrence is the variable that most clearly reveals this dynamic process. According to Langacker (1987), more frequently occurring expressions are more likely to become entrenched, thus acquiring unit status, as in the th-fronting process (the use of /f, v/ instead of /ð, θ/) in more frequently used words in east-central Scotland (Clark & Trousdale, 2009). Similarly, Taylor (2002) claims that the phonological structure of a word may undergo several modifications in the stream of speech, and as Mompean (2014) notes, "these changes can be accounted for by a type of schema known as phonological processes" (p. 262), as in the process of "alveolar stop deletion" in English – a phonological schema that can predict the deletion of the alveolar stops /t/ and /d/ between consonants at external or internal morpheme boundaries. Indeed, studies in the framework of cognitive linguistics have shown the relationship between phonology and other levels of linguistic analysis. For example, Bergen (2004) found a tendency toward meaning associations for some non-morphemic strings of phonemes. Other studies have looked at the relationship between phonology and morphology, that is, morphonology, as in studies of vowel alternations between present and past tense forms of irregular verbs in English (Bybee & Moder, 1983; Bybee & Slobin, 1982) or of regular plural and past tense formation (Croft & Cruse, 2004; Kumashiro & Kumashiro, 2006). On the cognitive approach, these morphonological schemas (referred to as morphophonological rules in other theories) are generalizations that refer to a particular morphological context and are the left-overs of earlier phonetically based processes that have been lost over time (Langacker, 1988; Nathan, 2007, 2008). In applying the cognitive phonology approach to clinical phonology, according to Mompean (2014, p. 270), one should keep in mind that

any comprehensive and truly explanatory account of phonology in Cognitive Linguistics should take into account the general cognitive processes that shape and give rise to phonological units as well as the various factors – linguistic and language-independent – which motivate those units (phonetic, usage-based, sociocultural, ecological, etc.).
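As a toy illustration of the prototype view of phoneme categories discussed above, the following Python sketch assigns a phone to the category whose prototype shares the most feature values with it. The feature vectors are simplified placeholders invented for this example, not a worked phonological analysis.

# Python sketch: prototype-based categorization by feature overlap.
PROTOTYPES = {
    "/t/": {"voice": "-", "place": "alveolar", "manner": "stop"},
    "/d/": {"voice": "+", "place": "alveolar", "manner": "stop"},
}

def categorize(phone_features):
    """Return the phoneme whose prototype overlaps most with the input."""
    def overlap(prototype):
        return sum(phone_features.get(f) == v for f, v in prototype.items())
    return max(PROTOTYPES, key=lambda p: overlap(PROTOTYPES[p]))

# A flap (voiced and alveolar, but not a canonical stop) is still drawn into
# the /t/–/d/ space by partial overlap; here it lands closer to /d/:
print(categorize({"voice": "+", "place": "alveolar", "manner": "flap"}))  # /d/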
22.2.5 Structuralist Theory

Structuralist phonological theory was developed by Jakobson (1929) and Trubetzkoy (1931, 1939) in the first decades of the twentieth century, within the formation of the broader movement known as "structuralism" in general linguistics. Both Jakobson and Trubetzkoy recognized their intellectual debt to Ferdinand de Saussure, who, as Culler (1976, pp. 7–8) notes:

re-organized the systematic study of language and languages in such a way as to make possible the achievements of twentieth-century linguistics […] by his methodological example and by various prophetic suggestions which he offered, and structuralism, which has been an important trend in contemporary anthropology and literary criticism as well as linguistics.
As Harris (1983) adds in his translator's introduction to de Saussure's (1983[1916]) Course (Cours de linguistique générale), "Saussure placed modern linguistics in the vanguard of twentieth-century structuralism" (p. x). Most post-Saussurian twentieth-century linguistic thought may be characterized as a structuralist paradigm, with qualitative and quantitative models developed to describe and explain concrete data (parole) by abstract theory (langue). Jakobson and Trubetzkoy adapted the notion of system and the dichotomy between parole and langue, with particular emphasis on the functional communicative role of the phonemic units and their mutual distinctiveness, which was subsequently developed into binary articulatory and acoustic distinctive features (Jakobson et al., 1952; Jakobson & Halle, 1956; Trubetzkoy, 1931). Jakobson and Halle (1956) presented twelve binary oppositions (see Table 22.1) as universal distinctive features sufficient for the analysis of all languages. Although these inherent features reflect the extreme versatility of the speech organs (Ivić, 2001), according to Jakobson (1962) this "tentative list of distinctive features … encountered in the languages of the world is intended just as a preliminary draft, open to additions and rectifications" (p. 654). The phonological developmental theory based on distinctive feature analysis, offered by Jakobson (1968[1941]), is one of the most influential theories in developmental phonology, though much of it is now considered inaccurate (Atkinson et al., 1988). De Villiers and de Villiers (1978, pp. 38–39) summarized Jakobson's major claims regarding developmental phonology as follows:

1. Babbling is essentially unrestricted and bears no relation to the child's later acquisition of adult phonology.
2. Phonological development is best described in terms of the mastery of distinctive features.
Table 22.1 The twelve binary universal distinctive features (according to Jakobson & Halle, 1956/John Wiley & Sons).

1. vocalic vs. nonvocalic
2. consonantal vs. nonconsonantal
3. compact vs. diffuse
4. tense vs. lax
5. voiced vs. voiceless
6. nasal vs. oral
7. discontinuous vs. continuant
8. strident vs. mellow
9. checked vs. unchecked
10. grave vs. acute
11. flat vs. plain
12. sharp vs. plain
3. The child does not approximate the adult's phonemes one by one, but he/she develops his/her own system of phonemic contrasts, not always using the same features as adults to distinguish between words.
4. Finally, the pattern of phonological development in all children is systematic and universal.

From this structuralist perspective, Jakobson's first claim, which implies that meaningful speech and babbling are independent and unrelated stages, has not been supported by research in the field. As studies show, first-word production and canonical babbling share various commonalities, pointing to babbling as a significant period in the acquisition of the phonological system (Oller, 1980; Oller et al., 1985; Stoel-Gammon & Otomo, 1986). In this respect, Kent (1990) emphasizes the prognostic importance for speech development of canonical syllable production during the babbling stage. Another significant contribution of Jakobson's study of developmental phonology is "his stress on the orderliness of the child's development and his concentration on the child's own phonological system" (de Villiers & de Villiers, 1978, p. 42). According to structuralist theory, there is an innate order of phonemic developmental stages in the learning of all languages, an order which is nonetheless variable and proceeds at an individual rate of progression. These universals apply across languages, to typical language development and to clinical cases such as aphasia, in which the stages of breakdown of the sound system appear in the reverse order of phonological development; as Ivić (2001) notes, "[p]henomena which are the last to appear in childhood are the first to be lost in aphasia" (p. 88). That is, the sequence of stages in phonemic development is based on sound feature contrasts and their gradual development. This notion can explain why the number of sound contrasts increases as the complexity of the developing phonological system rises. Ivić (2001) points out the significance of the "fact that children master the sound system gradually, by acquiring binary oppositions, [… and adds that] it would be dangerous to equate the natural and unavoidable successive steps of the child's development with the rules governing the system of features occurring in a bundle simultaneously" (p. 96). Indeed, gradual phonological acquisition begins with initial phonemic development at the labial stage, usually with the acquisition of a wide vowel /ɑ/ and an anterior labial stop /p/, thus establishing the first contrast, consonant versus vowel. In the next stage there is the addition of a nasal sound as a contrast to the oral stop, followed by a labial versus alveolar contrast. In further stages of phonological development, acquisition continues with fricatives, affricates, liquids, and velars, thus adding more contrasts to the phonemic system the children have acquired.
22.2.6 Phonology as Human Behavior Theory

The theory of Phonology as Human Behavior (PHB), developed by William Diver (1979) in an analysis of the non-random distribution of initial consonant clusters in English, has been applied in the field of developmental and clinical phonology in different languages (Azim, 1993; Davis, 1987[1984]; Even-Simkin, 2016, 2017; Flores, 1997; Miyakoda, 2003; Tobin, 1990). Phonology as Human Behavior in the framework of the Columbia School (Andrews & Tobin, 1996; Contini-Morava, 1989; Contini-Morava & Sussman Goldberg, 1995; Davis, 1987[1984]; García, 1975; Givón, 1979; Gorup, 1987; Huffman, 1985; Kirsner, 1979; Klein-Andreu, 1983; Reid, 1991; Tobin, 1997) developed as part of the historical development of structural, natural, and cognitive phonological theories in the twentieth century, beginning with de Saussure's (1966[1916]) concept of system and the dichotomy between parole (phonetics) and langue (phonology), which was further developed by Trubetzkoy (1939) and Jakobson (1968[1941]) in the framework of the communication-oriented structural/functional Prague School. In the classification of sounds according to their acoustic and articulatory features, the communication factor was then supplemented by André Martinet's (1955) postulation of the asymmetrical arrangement of phonological systems in a way that reflects the search for harmony and equilibrium within the system. According to the PHB theory, the phonological system is shaped by a continuous striving for a minimal number of phonemes requiring the least amount of effort to produce, that is, by the human factor, which, following Sampson (1980), can be referred to as a "therapeutic view of a sound change" (p. 112). Diver adopts both the human and the communication factors into his phonological analysis and proposes that there is a constant striving for maximum communication (the communication factor) which is in conflict with the desire for minimum effort (the human factor). This notion is reflected in the fact that "there is a similar number (20–30) of phonemes of varied proportional degrees of difficulty acquired in a similar order in the languages of the world: less than 20 phonemes would reduce the communication potential and more than 30 would be too difficult to learn, remember and produce" (Diver, 1979; Tobin, 2002, p. 4). According to the PHB theory, the different distinctive features of phonology are closely related to human perception, physiology, behavior, and cognition, and thus different sets of phonemes can be classified in terms of the degree of effort required to perceive, learn, and produce them. Four principles, as Diver (1995, p. 67) shows, underlie the order of phonological acquisition and are proposed to be the same as those in the construction of morphemes in speech:

For the "primary" units we thus get a picture of an imagined sequence of development ordered in terms of the need for precision of control. The units are here symbolized in ways that foreshadow their later phonological status: (1) The single-cavity /a/, furnishing undifferentiated resonance for the excitation of the vocal folds. (2) The development of a two-cavity system, using the dorsum and lips as articulators, introducing /i/ and /u/ and thereby converting [a] to a third member of a system, maximally differentiated from the other two, rather than a unique unit.
(3) The use of the apex in a fairly undemanding way, giving another shape to the cavity, /l/, without recourse to dorsum and lips, still with excitation by the vocal folds.
(4) The development of fine motor control over the apex, as it is brought into use as a means of excitation, as well as shaping, of the cavity, in the maximally differentiated positions of /t/ and /s/.
It is also important to mention the distinctive methodology of the PHB approach, which applies diverse procedures for data collection: developmental and clinical studies are based not only on clinical exercises and recordings of spontaneous speech, but also on lexical analysis of standard dictionaries and on the analysis of discourse and various texts, including neologisms (Tobin, 1997). The major tenets of this phonological approach posit a dynamic interplay between the human and communication factors that motivates and controls language change, pointing to the following diachronic and synchronic phonological conclusion: language, and phonology in particular, is the result of a mini-max struggle between the communication and human factors in an attempt to achieve maximum communication with minimal effort (Tobin, 1990). Thus, developmental and clinical speech errors are viewed as extreme versions of this synergetic struggle, in which communication becomes less efficient either because of a lack of control over the articulatory mechanisms or because of extreme minimization of effort; such errors consequently require clinical intervention in order to restore a balance. Another aspect of the PHB theory is that it explains the natural functional processes (Dressler et al., 1987; Stampe, 1979) found in functional language acquisition (Grunwell, 1987; Ingram, 1990), which usually serve as norms for child language acquisition and for labeling functional disorders. As Tobin (2002, p. 19) claims, the PHB theory "can explain, in a principled way, the connection and interrelationship between the phylogeny, the ontogeny, and the pathology of the development of sound systems in human languages".
22.3 Conclusion

Phonology is concerned with all aspects of the development and production of the sound system; therefore, an understanding of typical and atypical phonological development is crucial for identifying the phonological patterns or disorders in which the produced sounds differ from those expected at each stage of development. The different theories developed over the years present different perspectives on these phenomena. As discussed in the sections above, some theories emphasize the innate capacities of language learners and some focus on outside influences, while others integrate both internal and external factors. Despite, or perhaps because of, this variety of theoretical approaches, the current state of phonological science can provide very helpful data for the identification of typical and atypical phonological patterns.
REFERENCES

Andrews, E., & Tobin, Y. (Eds.). (1996). Towards a calculus of meaning: Studies in markedness, distinctive features and deixis. John Benjamins.
Atkinson, M., Kilby, D., & Roca, I. (1988). Foundations of general linguistics. Unwin Hyman.
Azim, A. (1993, August 24). Some problems in the phonology of modern standard Urdu. Paper presented at the First International Columbia School Conference on Linguistics, New York, United States.
Ball, M. J., Perkins, M. R., Müller, N., & Howard, S. (Eds.). (2010). The handbook of clinical linguistics. Wiley-Blackwell.
Bergen, B. (2004). The psychological reality of phonaesthemes. Language, 80(2), 290–311.
Bernhardt, B., & Stemberger, J. P. (2000). Workbook in nonlinear phonology for clinical application. Pro-Ed.
Bernhardt, B., & Stoel-Gammon, C. (1994). Nonlinear phonology: Introduction and clinical application. Journal of Speech and Hearing Research, 37(1), 123–143.
Bernhardt, B., & Stoel-Gammon, C. (1997). Grounded phonology: Application to the analysis of disordered speech. In M. J. Ball & R. D. Kent (Eds.), The new phonologies: Developments in clinical linguistics (pp. 163–210). Singular Publishing Group.
Bybee, J. L. (1994). A view of phonology from a cognitive and functional perspective. Cognitive Linguistics, 5(4), 285–305.
Bybee, J. L. (1999). Usage-based phonology. In M. Darnell, E. A. Moravcsik, M. Noonan, F. J. Newmeyer, & K. Wheatley (Eds.), Functionalism and formalism in linguistics, volume I: General papers (pp. 211–242). John Benjamins.
Bybee, J. L., & Moder, C. L. (1983). Morphological classes as natural categories. Language, 59(2), 251–270.
Bybee, J. L., & Slobin, D. I. (1982). Rules and schemas in the development and use of the English past tense. Language, 58(2), 265–289.
Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper and Row.
Clark, L., & Trousdale, G. (2009). The role of frequency in phonological change: Evidence from th-fronting in east-central Scotland. English Language and Linguistics, 13(1), 33–55.
Coleman, L., & Kay, P. (1981). Prototype semantics: The English word lie. Language, 57(1), 26–44.
Compton, A. L., & Hutton, J. S. (1978). Compton–Hutton phonological assessment. Carousel House.
Contini-Morava, E. (1989). Discourse pragmatics and semantic categorization: The case of negation and tense-aspect with special reference to Swahili. Mouton de Gruyter.
Contini-Morava, E., & Sussman Goldberg, B. (Eds.). (1995). Meaning as explanation: Advances in linguistic sign theory. Mouton de Gruyter.
Corrigan, R. (1991). Sentences as categories: Is there a basic-level sentence? Cognitive Linguistics, 2(4), 339–356.
Croft, W., & Cruse, D. A. (2004). Cognitive linguistics. Cambridge University Press.
Cuenca, M. J., & Hilferty, J. (1999). Introducción a la Lingüística Cognitiva. Ariel Lingüística.
Culler, J. (1976). Saussure. Fontana.
Davis, J. C., Jr. (1987[1984]). A combinatory phonology of Italian. Columbia University Working Papers in Linguistics, 8, 1–99.
de Villiers, J. G., & de Villiers, P. A. (1978). Language acquisition. Harvard University Press.
Dinnsen, D. (1992). Variation in developing and fully developed phonetic inventories. In C. A. Ferguson, L. Menn, & C. Stoel-Gammon (Eds.), Phonological development: Models, research, implications (pp. 191–210). York Press.
Dirven, R., & Taylor, J. (1988). The conceptualization of vertical space in English: The case of tall. In B. Rudzka-Ostyn (Ed.), Topics in cognitive linguistics (pp. 379–402). John Benjamins.
Diver, W. (1979). Phonology as human behavior. In D. Aaronson & R. Rieber (Eds.), Psycholinguistic research: Implications and applications (pp. 161–186). Lawrence Erlbaum Associates.
Diver, W. (1995). The theory. In E. Contini-Morava & B. Sussman Goldberg (Eds.), Meaning as explanation: Advances in linguistic sign theory (pp. 43–114). Mouton de Gruyter.
Donegan, P. J., & Stampe, D. (1979). The study of natural phonology. In D. A. Dinnsen (Ed.), Current approaches to phonological theory (pp. 126–173). Indiana University Press.
Dressler, W. U., Mayerthaler, W., Panagl, O., & Wurzel, W. U. (1987). Leitmotifs in natural morphology. John Benjamins.
Eddington, D. (2007). Flaps and other variants of /t/ in American English: Allophonic distribution without constraints, rules, or abstractions. Cognitive Linguistics, 18(2), 23–46.
Edwards, M. L., & Shriberg, L. D. (1983). Phonology: Applications in communicative disorders. College-Hill Press.
Even-Simkin, E. (2016). An iconic and systematic feature of irregular forms in English. In C. Periñán-Pascual & E. M. Mestre-Mestre (Eds.), Understanding meaning and knowledge representation: From theoretical and cognitive linguistics to natural language processing (pp. 209–228). Cambridge Scholars Publishing.
Even-Simkin, E. (2017). A morpho-phonological past tense processing as a clinical marker in SLI EFL learners. Clinical Linguistics & Phonetics, 31(7–9), 542–556.
Even-Simkin, E. (2019). Clinical phonology. In M. J. Ball & J. S. Damico (Eds.), The SAGE encyclopedia of communication sciences and disorders (pp. 359–362). SAGE Publications.
Ferguson, C. A., & Garnica, O. K. (1975). Theories of phonological development. In E. H. Lenneberg & E. Lenneberg (Eds.), Foundations of language development (pp. 153–180). Academic Press.
Flores, N. (1997, February 15). The distribution of post-vocalic phonological units in Spanish. Paper presented at the Fifth International Columbia School Conference on Linguistics, Rutgers University, New Jersey, United States.
Fraser, H. (2004). Constraining abstractness: Phonological representation in the light of color terms. Cognitive Linguistics, 15(3), 239–288.
Fraser, H. (2006). Phonological concepts and concept formation: Metatheory, theory, and application. International Journal of English Studies, 6(2), 55–75.
García, E. C. (1975). The role of theory in linguistic analysis: The Spanish pronoun system. North-Holland.
Givón, T. (1979). On understanding grammar. Academic Press.
Goldsmith, J. (1990). Autosegmental and metrical phonology. Blackwell.
Gordon-Brannan, M. E., & Weiss, C. E. (2007). Clinical management of articulatory and phonologic disorders (3rd ed.). Williams & Wilkins.
Gorup, R. J. (1987). The semantic organization of the Serbo-Croatian verb. Francke.
Grunwell, P. (1985). Phonological assessment of child speech (PACS). NFER-Nelson; College-Hill Press.
Grunwell, P. (1987). Clinical phonology (2nd ed.). Williams and Wilkins.
Harris, R. (1983). See Saussure, F. de. (1983).
Hodson, B. W. (1980). The assessment of phonological processes. Interstate.
Huffman, A. (1985). The semantic organization of the French clitic pronouns lui and le [Doctoral dissertation]. Columbia University.
Hyman, L. M. (1975). Phonology: Theory and analysis. Holt, Rinehart and Winston.
Ingram, D. (1981). Procedures for the phonological analysis of children's language. University Park Press.
Ingram, D. (1990). Phonological disability in children. Whurr.
Ivić, P. (2001). Roman Jakobson and the growth of phonology. In C. W. Kreidler (Ed.), Phonology: Critical concepts (pp. 69–107). Taylor and Francis Group.
Jakobson, R. (1929). Remarques sur l'évolution phonologique du russe. Travaux du Cercle Linguistique de Prague, 2, 161–183.
Jakobson, R. (1962). Selected writings, vol. I: Phonological studies. Mouton.
Jakobson, R. (1968[1941]). Kindersprache, Aphasie und allgemeine Lautgesetze. Uppsala Universitets Årsskrift. Translated as Child language, aphasia and phonological universals (Vol. 1, pp. 1–83). Mouton.
Jakobson, R. (1971). Selected writings, vol. II: Word and language. Mouton.
Jakobson, R., Fant, G., & Halle, M. (1952). Preliminaries to speech analysis. MIT Press.
Jakobson, R., & Halle, M. (1956). Fundamentals of language. Mouton.
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason. University of Chicago Press.
Kent, R. D. (1990). The emergence of pediatric phonetic science: Implications for the 0–3 population. Paper presented at the American Speech-Language-Hearing Association Convention, Seattle, WA.
Kirsner, R. S. (1979). The problem of presentative sentences in modern standard Dutch. North-Holland.
Klein-Andreu, F. (Ed.). (1983). Discourse perspectives on syntax. Academic Press.
Kumashiro, F., & Kumashiro, T. (2006). Interlexical relations in English stress. International Journal of English Studies, 6(2), 77–106.
Lakoff, G. (1987). Women, fire and dangerous things: What categories reveal about the mind. University of Chicago Press.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. University of Chicago Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge to western thought. Basic Books.
Langacker, R. W. (1987). Foundations of cognitive grammar, vol. 1: Theoretical prerequisites. Stanford University Press.
Langacker, R. W. (1988). A usage-based model. In B. Rudzka-Ostyn (Ed.), Topics in cognitive linguistics (pp. 127–161). John Benjamins.
Langacker, R. W. (2007). Cognitive grammar. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook of cognitive linguistics (pp. 421–462). Oxford University Press.
Martinet, A. (1955). Économie des changements phonétiques: Traité de phonologie diachronique. A. Francke.
Masterson, J., & Bernhardt, B. (2001). Computerized articulation and phonology evaluation system. The Psychological Corporation.
Maxwell, E. M. (1979). Competing analysis of a deviant phonology. Glossa, 13(2), 181–214.
McCarthy, J. J. (2001). Nonlinear phonology. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopedia of the social and behavioral sciences (Vol. 17, pp. 11392–11395). Pergamon.
Miyakoda, H. (2003). Speech errors in normal and pathological speech: Evidence from Japanese. Journal of Multilingual Communication Disorders, 1(3), 210–221.
Mompean, J. A. (2004). Category overlap and neutralization: The importance of speakers' classifications in phonology. Cognitive Linguistics, 15(4), 429–469.
Mompean, J. A. (2014). Cognitive linguistics and phonology. In The Bloomsbury companion to cognitive linguistics. https://doi.org/10.5040/9781472593689.CH-015
Mompean, J. A., & Mompean, P. (2012). La fonología cognitiva. In I. Ibarretxe-Antuñano & J. Valenzuela (Eds.), Lingüística Cognitiva (pp. 305–326). Anthropos.
Nathan, G. S. (1986). Phonemes as mental categories. Proceedings of the Annual Meeting of the Berkeley Linguistics Society, 12, 212–223.
Nathan, G. S. (1989). Preliminaries to a theory of phonological substance: The substance of sonority. In R. Corrigan, F. Eckman, & M. Noonan (Eds.), Linguistic categorization (pp. 55–67). John Benjamins.
Nathan, G. S. (1996). Steps towards a cognitive phonology. In B. Hurch & R. Rhodes (Eds.), Natural phonology: The state of the art (pp. 107–120). Mouton de Gruyter.
Nathan, G. S. (2007). Phonology. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook of cognitive linguistics (pp. 611–631). Oxford University Press.
Nathan, G. S. (2008). Phonology: A cognitive grammar introduction (Cognitive Linguistics in Practice [CLP] 3). John Benjamins.
Oller, D. K. (1980). The emergence of the sounds of speech in infancy. In G. Yeni-Komshian, J. Kavanaugh, & C. A. Ferguson (Eds.), Child phonology (Vol. 1, pp. 93–112). Academic Press.
Oller, D. K., Eilers, R., Bull, D., & Carney, A. (1985). Pre-speech vocalizations of a deaf infant: A comparison with normal metaphonological development. Journal of Speech and Hearing Research, 28(1), 47–63.
Radden, G., & Panther, K. U. (2004). Introduction: Reflections on motivation. In G. Radden & K. U. Panther (Eds.), Studies in linguistic motivation (Cognitive Linguistics Research [CLR] 28) (pp. 1–46). Mouton de Gruyter.
Reid, W. (1991). Verb and noun number in English. Longman.
Rohrer, T. (2007). Embodiment and experientialism. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook of cognitive linguistics (pp. 25–47). Oxford University Press.
Sampson, G. (1980). Schools of linguistics. Stanford University Press.
Saussure, F. de. (1966[1916]). Course in general linguistics (W. Baskin, Trans.). Philosophical Library/McGraw-Hill.
Saussure, F. de. (1983[1916]). Course in general linguistics (C. Bally, A. Sechehaye, & A. Riedlinger, Eds.; R. Harris, Trans.). Duckworth.
Shriberg, L. D., & Kwiatkowski, J. (1980). Natural process analysis (NPA). John Wiley.
Stampe, D. (1969). The acquisition of phonetic representation. Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, University of Chicago Department of Linguistics.
Stampe, D. (1979). A dissertation on natural phonology. Garland Publishing.
Stoel-Gammon, C., & Dunn, C. (1985). Normal and disordered phonology in children. University Park Press.
Stoel-Gammon, C., & Otomo, K. (1986). Babbling development of hearing-impaired and normally hearing subjects. Journal of Speech and Hearing Disorders, 51(1), 33–41.
Taylor, J. R. (2002). Cognitive grammar. Oxford University Press.
Taylor, J. R. (2003). Linguistic categorization: Prototypes in linguistic theory (3rd ed.). Oxford University Press.
Tobin, Y. (1990). Semiotics and linguistics. Longman.
Tobin, Y. (1997). Phonology as human behavior: Theoretical implications and clinical applications. Duke University Press.
Tobin, Y. (2002). Phonology as human behavior: Theoretical implications and cognitive and clinical applications. In E. Fava (Ed.), Linguistic theory, speech and language pathology, speech therapy (pp. 3–22). John Benjamins.
Trubetzkoy, N. S. (1931). Die phonologischen Systeme. Travaux du Cercle Linguistique de Prague, 4, 96–116.
Trubetzkoy, N. S. (1939). Grundzüge der Phonologie. Travaux du Cercle Linguistique de Prague, 7.
Trubetzkoy, N. S. (1949). Principes de phonologie. C. Klincksieck.
Trubetzkoy, N. S. (1969). Principles of phonology (C. A. M. Baltaxe, Trans.). University of California Press.
Weiner, F. (1979). Phonological process analysis. University Park Press.
23 Constraints-based Nonlinear Phonological Theories in Clinical Phonology Across Languages
BARBARA MAY BERNHARDT, JOSEPH P. STEMBERGER, GLENDA MASON, AND DANIEL BÉRUBÉ
23.1 Introduction
Speech-language pathology has tracked changes in phonological theory for over 80 years. The current chapter situates constraints-based nonlinear phonological theories within the history of phonological theory and then describes clinical applications, primarily for children but, where available, also for adults with neurogenic speech disorders (apraxia and paraphasia).
23.2 Phonological Theories in Clinical Application
Early structuralist theories (e.g., Hockett, 1955) viewed phonological representations as (linear) strings of segments (consonants, vowels), with place (e.g., labial), manner (e.g., frication) and voicing characteristics. Following this perspective, in articulation therapy for children, a speech-language pathologist (SLP) targets individual speech sounds one at a time (Van Riper & Irwin, 1959), often using speech sound acquisition charts as a guide to the order of intervention. For adults with speech impairments, segments have also been a major focus, with cues sometimes given that highlight the phonetic properties of segments (Duffy, 2020). Generative phonological theories then expanded the description of syllable structure and phonological features. Features delineated natural classes of segments with similar properties. Although Trubetzkoy (1939) argued that, depending on the language, features could be in privative, equipollent or gradual oppositions with one another, Jakobson (1941/1968) and
Chomsky and Halle (1968) argued for binary (+/−) oppositions, one value being universally more "marked" (less common, more complex) than the other. Implicational universals were proposed, in which complex forms (e.g., CVC syllables) implied the existence of simpler forms (CV syllables). To account for outputs (pronunciations [X]) that differed from the assumed input (underlying representations /X/), generative phonologists proposed sets of phonological rules/processes that could delete, insert or re-order input, stated in the form /A/ → [B] / __ C ("A is pronounced B in context C"). One consequence of linearity was that interactions between adjacent segments are simpler than interactions between non-adjacent segments (because the intervening material must be spelled out), leading to the expectation that distant interactions should be less common, because complexity is disfavored. However, some types of distant interactions (e.g., vowel harmony across consonants) were surprisingly common. We return to this issue in 23.2.1. For child phonology, clinical applications of generative phonology were based on distinctive features (and their contrasts/oppositions and implicational universals, for example, Gierut, 1990) and phonological rules (e.g., Dinnsen & Elbert, 1984; McReynolds & Engmann, 1975) or processes (e.g., Edwards & Bernhardt, 1973; Grunwell, 1985; Hodson, 1986; Ingram, 1976; Yavaş et al., 1991). For adults with speech impairment, rule/process-based approaches were generally less illuminating, because of the variable and often unpredictable nature of paraphasic or apraxic errors, although intervention might address common features or transcend the segment, targeting syllables and/or rimes (e.g., Fisher et al., 2009). Rule/process-based intervention generally targets groups of speech sounds with common features or structural properties (e.g., position in the syllable), providing more opportunities for system-wide generalization than segment-by-segment approaches. However, theories continued to evolve, providing new opportunities for clinical application.
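To make the rule formalism above concrete, a process such as velar fronting or word-final devoicing can be stated as a rewrite of a target segment in a stated context. The following is a minimal sketch (our illustration, with deliberately simplified one-character transcriptions), not a clinical analysis tool:

```python
# Minimal sketch of generative rule/process application: A -> B / context.
# Rules and one-character transcriptions are simplified for illustration.

def apply_rule(segments, target, change, context):
    """Rewrite `target` as `change` wherever `context` licenses the rule."""
    return [change if seg == target and context(segments, i) else seg
            for i, seg in enumerate(segments)]

everywhere = lambda segs, i: True              # context-free process
word_final = lambda segs, i: i == len(segs) - 1

# Velar fronting: /k/ -> [t], e.g., cat /kat/ -> [tat]
print(apply_rule(list("kat"), "k", "t", everywhere))   # ['t', 'a', 't']
# Final devoicing: /g/ -> [k] / __ #, e.g., dog /dog/ -> [dok]
print(apply_rule(list("dog"), "g", "k", word_final))   # ['d', 'o', 'k']
```

As the text notes, such rules operate on a single linear string; the nonlinear representations introduced next distribute this information over multiple tiers.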
23.2.1 Nonlinear Phonology
A number of challenges for rule/process-based generative phonology (briefly noted above) led to further evolution in phonological theory. The first major shift concerned the structure of phonological representations. Rather than a single line of segmental elements, phonologists posited multiple lines (levels) of representation from prosodic structure to features (Figures 23.1, 23.2), which could explain distant phenomena more easily than previous accounts (e.g., Goldsmith, 1976). In the hierarchy, each element has its own place on a tier (level) of like units; that is, it is underlyingly adjacent (contiguous) to like elements, providing a context for surface-distant interactions (e.g., consonant harmony). Each tier (line) proceeds across time in a multilinear (so-called "nonlinear") fashion, as in a symphonic score. Using tonal data, Goldsmith (1976) demonstrated that features can be as short as half a segment or extend across many segments; that is, different elements in the hierarchy are not necessarily time-locked to begin or end at the same time. Further considerations, particularly concerning word stress and compensatory lengthening, suggested the need for a timing tier. Both general timing units (x-units) and more specific rime timing units ("moras") were proposed (Hayes, 1989). Assigning two moras to diphthongs, long consonants and long vowels, and one to short vowels (and codas in "weight-sensitive" languages) accounts for the assignment of stress to heavy syllables (those with long vowels, diphthongs, or short vowels plus codas). If a coda consonant deletes, compensatory vowel lengthening can maintain the mora (VC > VV). Feature descriptions also evolved (McCarthy, 1988; Sagey, 1986). As Figure 23.2 shows, manner, place, and laryngeal features are situated on their own tiers in the hierarchy, but are grouped under privative Root (manner), Laryngeal, and Place nodes.
Figure 23.1 Phonological hierarchy from the prosodic phrase to the segmental tier: phrase > prosodic word > foot > syllable > onset and rime > nucleus and coda > timing tier (x-units; moras) > segmental tier > features.

Figure 23.2 Feature hierarchy: the Root Node (manner features such as [consonantal], [sonorant], [continuant], [nasal], [lateral]) dominates the Laryngeal Node ([constricted glottis], [spread glottis], [voiced]) and the Place Node, which dominates [Labial] ([round], [labiodental]), [Coronal] ([anterior], [grooved]), and [Dorsal] ([high], [low], [back]).

Features can be linked to one segmental slot (e.g., [Labial] in the onset only of bee /ˈbi/), or to multiple slots,
as in boom /ˈbum/ ([Labial] in /b/, /u/, and /m/). Phonological changes can pertain to one sub-type of [Labial] (e.g., eliminating [+round]: no /w/, /u/, /o/, e.g., boom [ˈbam]), to all labials (e.g., eliminating [Labial]: boom [ˈden]), or, more broadly, to the consonant Place node (e.g., eliminating not just [Labial] but all supralaryngeal place: boom [ˈʔuʔ], debuccalization).
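These levels of delinking can be made concrete with a small data-structure sketch (our illustration, not a published implementation), representing the Place subtree of Figure 23.2 as nested dictionaries:

```python
import copy

# Sketch: the Place subtree of Figure 23.2 as nested dictionaries.
# Delinking (removing) a node at different heights of the tree yields
# the three 'boom' patterns described in the text.

u_place = {"Place": {"Labial": {"round": True}}}   # place specification of /u/

def delink(segment, *path):
    """Return a copy of `segment` with the node at `path` removed."""
    seg = copy.deepcopy(segment)
    node = seg
    for key in path[:-1]:
        node = node[key]
    node.pop(path[-1], None)
    return seg

print(delink(u_place, "Place", "Labial", "round"))  # lose [+round] only: boom > [bam]
print(delink(u_place, "Place", "Labial"))           # lose all [Labial]: boom > [den]
print(delink(u_place, "Place"))                     # debuccalization: boom > [?u?]
```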
Perspectives on markedness also developed further. The term default describes a speaker's most frequently used forms in contrasting pairs or sets of competing forms. At the prosodic level, bimoraic or CV syllables and trochaic feet are common default structures (Bernhardt & Stemberger, 1998). Default consonant features for English (and many languages) are [−continuant], [−voiced] and [Coronal, +anterior] (i.e., the features of /t/: Paradis & Prunet, 1991). Default features generally align with the notion "unmarked," in terms of greater simplicity and higher frequency (Bernhardt & Stemberger, 1998). Defaults tend to be acquired early (Bernhardt & Stemberger, 1998; Stemberger & Stoel-Gammon, 1991) and substitute for lower-frequency (marked) non-defaults both in child phonology and in adult speech disorders (Romani et al., 2017). However, defaults can be replaced by lower-frequency non-default (more marked) features under certain conditions (see below). For example, default [Coronal] may replace a non-default [Dorsal] (velar, e.g., cow /ˈkaʊ/ > [ˈthaʊ]), but [Dorsal] may replace [Coronal] in consonant harmony (e.g., duck /ˈdʌk/ > [ˈɡʌk]); double linking/repetition of [Dorsal] makes its output more likely. Treating [Coronal] as an unspecified default feature unifies the "processes" of Velar Fronting and Velar Harmony: default [Coronal] only appears when something else (in this case non-default [Dorsal]) does not. Systematic sound preferences have been described for child phonology (Edwards & Shriberg, 1983), and align with the concept of default. Child defaults may or may not match the defaults of adult systems or markedness expectations. For example, the place default for "Colin," a five-year-old (Bernhardt & Stemberger, 1998), was [Dorsal] before intervention: [gak], [gaga] and [gagak] were common productions. After treatment focusing on coronals /d/ and /n/, his substitution patterns changed, with the more typical [Coronal] default replacing dorsals.
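The underspecification account can itself be sketched computationally (our illustration, with radically simplified representations): coronals carry no place of their own; a grammar that bans [Dorsal] delinks it, and empty slots are then filled either by a surviving, spreading [Dorsal] (harmony) or by the default (fronting):

```python
# Sketch: [Coronal] as an unspecified default place. Coronals carry no place
# of their own; velars carry [Dorsal]. Banned [Dorsal] delinks, and empty
# slots are filled by a surviving [Dorsal] (harmony) or the default (fronting).

DEFAULT = "Coronal"

def realize(places, dorsal_allowed):
    # Step 1: delink [Dorsal] if the child's grammar bans it
    places = [None if (p == "Dorsal" and not dorsal_allowed) else p
              for p in places]
    # Step 2: a surviving [Dorsal] spreads to unspecified slots (velar harmony)
    if "Dorsal" in places:
        places = ["Dorsal" if p is None else p for p in places]
    # Step 3: remaining empty slots receive the default (velar fronting)
    return [DEFAULT if p is None else p for p in places]

# cow /kaʊ/ with [Dorsal] banned: /k/ surfaces with default place, i.e., [t]
print(realize(["Dorsal"], dorsal_allowed=False))        # ['Coronal']
# duck /dʌk/ with [Dorsal] allowed: it spreads to /d/, giving [gʌk]
print(realize([None, "Dorsal"], dorsal_allowed=True))   # ['Dorsal', 'Dorsal']
```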
23.2.2 Constraints-based Theories
Another major theoretical development concerns phonological constraints, initially expressed as well-formedness constraints (Goldsmith, 1976). Paradis (1987) suggested that phonological rules/processes are driven by output constraints. If a constraint prohibits production of clusters, a repair process must be invoked to prevent that output: cluster deletion, cluster reduction or vowel epenthesis. In mature phonological systems, repair processes usually involve minimal changes that are just sufficient to prevent constraint violation (e.g., epenthesis, a common cluster process across adult languages). In child phonology, however, repairs may be non-minimal; if clusters are not permitted, they may be fully deleted (Bernhardt & Stemberger, 1998). Paradis's (1987) model included both constraints and repair processes, but Prince and Smolensky (1993) and McCarthy and Prince (1993) proposed a constraint-based but process-free model, Optimality Theory (OT). In their model, constraints occur in competing pairs: a faithfulness constraint promoting output of an underlying element versus a markedness constraint prohibiting its output. Depending on the relative ranking (importance) of the faithfulness and markedness constraints for a given element, that element will be produced or prohibited. If no component of the element can be produced, the element will simply not surface ("be deleted"). If some component(s) of the element can be produced, another element with allowable characteristics could appear. In either case, no rule/process has effected a change; rather, the output is the result of the intended target passing through a type of "filter" of competing constraints. Output patterns are not represented directly or due to targeted changes but arise in a distributed fashion from interactions in the whole system of constraints. The origin of constraints is not clear. Stampe (1972) suggested that rules/processes were "natural/innate/universal" but provided minimal evidence to support the claim. OT
researchers have been attempting to ground constraints in phonetics and cognition (whether they are innate or emergent), utilizing data from acoustic and articulatory phonetics, speech motor programming and cognitive processing. A discussion of constraint grounding exceeds the scope of this chapter, but it is an important topic for clinical populations, where underlying deficits in perception, cognition and motor systems can strongly affect outputs. See Bernhardt and Stemberger (1998) for discussion of constraints for children with physical issues such as cleft palate.
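A minimal sketch of OT evaluation may clarify how reranking, rather than a rule, decides whether a cluster surfaces. The constraint names *COMPLEX and MAX are standard in the OT literature; the violation counting below is our deliberately simplified illustration:

```python
# Sketch of OT evaluation: each candidate receives a violation profile over
# a ranked constraint list; the winner has the lexicographically smallest
# profile. *COMPLEX and MAX are standard constraint names; the violation
# counting below is simplified for illustration.

VOWELS = set("aeiou")

def star_complex(inp, cand):   # markedness: penalize consonant clusters
    return sum(1 for a, b in zip(cand, cand[1:])
               if a not in VOWELS and b not in VOWELS)

def max_io(inp, cand):         # faithfulness: penalize deleted segments
    return max(len(inp) - len(cand), 0)

def evaluate(inp, candidates, ranking):
    return min(candidates, key=lambda c: tuple(con(inp, c) for con in ranking))

candidates = ["stap", "tap", "sap"]  # candidate outputs for input /stap/ 'stop'
# Child-like ranking: markedness >> faithfulness -> cluster reduction
print(evaluate("stap", candidates, [star_complex, max_io]))  # 'tap' ('sap' ties)
# Adult-like ranking: faithfulness >> markedness -> faithful cluster
print(evaluate("stap", candidates, [max_io, star_complex]))  # 'stap'
```

Note that no repair process is stated anywhere: the reduced output simply emerges as the most harmonic candidate under the child-like ranking.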
23.3 Clinical Applications of Constraints-Based Nonlinear Phonological Theories
Jakobson (1941/1968) and Romani et al. (2017) outlined similarities and differences between child phonological development and breakdown of phonological systems in adults with neurogenic disorders. A detailed description of the similarities and differences is beyond the scope of this chapter. The following sections note highlights and key citations from the two research areas.
23.3.1 Developmental Applications
Spencer (1984) was the first to apply nonlinear concepts in explaining reduplication patterns in a six-year-old with protracted phonological development (PPD). Further research for English and other languages followed: (1) group studies on word structure (e.g., Bernhardt et al., 2015a; Mason et al., 2015), sound classes (e.g., Bernhardt et al., 2015b), and interactions between word structure and segments (e.g., Bérubé et al., 2020); (2) case profiles (Volume 36[7–9], Clinical Linguistics and Phonetics); (3) clinical tool development, for example, Bernhardt and Stemberger (2000), Bérubé et al. (2015), Mason (2018); and (4) intervention studies, for example, Barlow and Gierut (1999), Bernhardt (1990, 1992), Chung et al. (2022), Combiths et al. (2022), Dinnsen and Gierut (2008), Edwards (1995), Feehan et al. (2015), Major and Bernhardt (1998), Másdóttir and Bernhardt (2022), Ozbič and Bernhardt (2022), and Von Bremen (1990). Constraints-based nonlinear approaches have provided a comprehensive framework for analyzing speech data from children with typical development (TD) and PPD, and for designing intervention programs when needed. Although segments remain a key focus, analyzing data from other levels of the phonological hierarchy has resulted in a deeper understanding of children's developmental patterns, particularly because word structure constraints can affect segment production. Children can show asynchronous mastery of word structure versus segments; multilevel analyses provide insights into their greatest developmental needs, but also into phonological strengths that can be exploited when intervention is indicated (see 23.4.3). Further detail on assessment and intervention applications follows below.
23.3.1.1 Developmental Phonology: Assessment Tools for Nonlinear Analysis
Single-word elicitation tools have been developed in 18 languages for an ongoing cross-linguistic study in phonological acquisition (Bernhardt & Stemberger, 2016, 2022, phonodevelopment.sites.olt.ubc.ca). Each tool incorporates the same design principles. The word sets probe all levels of the phonological hierarchy (with at least two exemplars per element). Word lists are divided into a screening set (30–50 words) and a more in-depth evaluation (approximately 100 total words). Words have a variety of lengths and syllable structures and
test both earlier- and later-acquired segments to be applicable across a developmental range (two to nine years of age). The full list provides sufficient data for analysis of a child's strengths and needs in word structure, segments/features and their interactions, but is still relatively quick to administer (a story format aids elicitation). Other parameters for developmental tools are also considered, for example, word familiarity, imageability, and cultural suitability of the words and pictures for younger and older children. For French, for example, the screening list contains 46 words and the full list, 111 words. Each segment is targeted at least twice in words that vary from one to four syllables and comprise common syllable structures and stress patterns of French. The test takes about 20 minutes to administer (using a story, "Julie's Day"), and most preschoolers produce the words spontaneously (Bérubé & MacLeod, 2022). A form is available (SCAN) for analyzing a child's strengths and needs in word structure and segmental inventories (consonants, vowels), consonant and vowel sequences, and mismatch patterns. Multisyllabic words can be analyzed using the Multisyllabic Nonlinear Analysis in Phon (Mason, 2018), a computerized tool (Hedlund & Rose, 2020). A quick measure, Whole Word Match (WWM), can be helpful for identifying children with PPD. WWM equals the proportion of a child's words that match the adult targets, ignoring minor deviations in voicing or exact placement of articulators (Bernhardt et al., 2020).
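The computation behind WWM can be sketched as follows (our illustration; the published measure's tolerance rules for minor deviations are more detailed than the toy equivalence set here):

```python
# Sketch of a Whole Word Match (WWM) score: the proportion of a child's
# productions that match the adult targets. The published measure ignores
# minor voicing and placement deviations; the equivalence set here is a
# toy stand-in for those tolerance rules.

MINOR_DEVIATIONS = {("g", "k"), ("d", "t"), ("b", "p")}  # (target, production)

def segments_match(target_seg, produced_seg):
    return (target_seg == produced_seg
            or (target_seg, produced_seg) in MINOR_DEVIATIONS)

def whole_word_match(pairs):
    """pairs: list of (target, production) transcription strings."""
    matches = sum(
        len(t) == len(p) and all(segments_match(x, y) for x, y in zip(t, p))
        for t, p in pairs
    )
    return matches / len(pairs)

sample = [("kaet", "kaet"),   # 'cat' matched exactly
          ("dog", "dok"),     # final devoicing only: still a whole-word match
          ("spun", "pun")]    # cluster reduction: a mismatch
print(round(whole_word_match(sample), 2))   # 0.67
```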
23.3.1.2 Developmental Applications: Planning Intervention
Intervention studies in English and other languages have applied various concepts from constraints-based nonlinear phonology: (1) the phonological hierarchy; (2) autonomy versus interaction of phonological elements; (3) exploiting strengths (high-ranked faithfulness constraints) when addressing needs (low-ranked faithfulness constraints; high-ranked markedness constraints); (4) syllable structure (moras; onset-rime divisions); and (5) feature defaults versus non-defaults. In 8- to 18-week treatment studies conducted in British Columbia, Canada, treatment targets addressed word structure and segments/features in separate mini-blocks. Targets were addressed using established elements from the other major part of the system as support (new features in established word structures and vice versa). For example, a boy aged 5;10 (Bernhardt, 1992) mastered the following treatment targets in two six-week treatment blocks, each of which comprised two mini-blocks of three weeks (nine 45-minute sessions).
1. New word structure: Word-initial stop clusters using segments in his phonetic inventory: stop-/w/, stop-/j/ and, in Block 2, also stop-/l/ and /st/. Timing unit match increased for all clusters (CC(C)), even if segments were still developing (/s/, /ɹ/);
2. New features in existing word structures, CVV, CVC(V)C: (a) [+lateral] for /l/; (b) Coronal [−anterior] for palatoalveolars /ʃ/, /ʤ/. Rate of acquisition was compared between features from higher versus lower positions in the feature hierarchy, in this case a higher-level manner feature, [+lateral], which progressed faster than the lower-level place feature, [Coronal, −anterior] (see Figure 23.2). There was generalization to the untrained voiceless affricate in Block 2.
For some children in the studies by Bernhardt and colleagues, targets addressed constraints on interactions of phonological units. A common interactive target was expansion of available segmental/feature content into an existing word position where that feature/segment did not yet appear. For example, a Slovenian child showed a positional constraint for the feature [continuant]: fricatives could appear only in coda, and stops only in onset; one of her major treatment targets was mastery of positional faithfulness for the feature [continuant], whatever its value (Ozbič & Bernhardt, 2022), a goal attained after a three-month treatment period with ten weekly sessions.
An Icelandic child showed a different kind of prosodic-segmental interaction constraint (Másdóttir & Bernhardt, 2022): at age 4;8, the child matched syllable timing faithfully, but through compensatory vowel lengthening rather than consonant production; after a five-month weekly treatment program (17 sessions) targeting word-medial pre-aspirated stops, consonant moras were re-linked to the consonant tier, with a notable reduction in compensatory lengthening. Within segments, there can also be constraints on interactions; that is, certain feature combinations can be prohibited. A French-speaking child's pre-treatment inventory included alveolar fricatives and labial stops but no labiodental fricatives (he produced default [t] for many targets); one of his four treatment targets was the feature combination [Labial] & [+continuant] & [−sonorant], which he acquired after 7 weeks of treatment (14 30-minute sessions), mastering first /v/ and then /f/ (Bérubé & Spoor, 2022).
23.3.1.3 Developmental Applications: Treatment Strategies for Syllable Structure
In treatment blocks focusing on prosodic structure, treatment outcomes were compared for strategies that presented stimuli as onsets and rimes versus as timing/weight units (moras) (Bernhardt, 1990, 1994a, 1994b). For treatment using onset-rime divisions, onsets and rimes were presented separately if pronounceable (e.g., for the /sn/ cluster in snap, "sn" versus "ap"), or segments were moved from coda to onset or vice versa through alternating repetition (/asp asp aspasp-a-spa/) (Bernhardt, 1990, 1994a). When taking a moraic approach to target CVC(V), for example, lax (short) vowels were used in stimuli, because codas are obligatory in that context for English (i.e., open monosyllables with lax vowels, e.g., */bɪ/, do not occur in adult English). Rhythmic support was provided to highlight each of the moras. Although the six children (aged three to six years) in Bernhardt (1990) showed no significant differences in rate of acquisition by syllabic treatment strategy, the presentation of structural stimuli in two ways may have facilitated the faster rate of change for syllable and word structure compared with segments, by drawing attention to different aspects of syllable structure and timing. As noted in Section 23.3.1.2, moras were relevant for treatment outcomes in Másdóttir and Bernhardt (2022), but in that case the language has a length distinction, increasing the likelihood that moras are relevant.
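The weight logic behind the moraic strategy can be sketched as follows (our illustration, following the moraic assignments attributed to Hayes, 1989, in Section 23.2.1):

```python
# Sketch of moraic weight assignment (after Hayes, 1989): long vowels and
# diphthongs project two moras, short (lax) vowels one; in weight-sensitive
# languages a coda consonant adds a mora ("weight by position"). Syllables
# with two or more moras are heavy and attract stress.

def moras(long_nucleus, coda_consonants, weight_by_position=True):
    count = 2 if long_nucleus else 1
    if weight_by_position:
        count += coda_consonants
    return count

def is_heavy(long_nucleus, coda_consonants):
    return moras(long_nucleus, coda_consonants) >= 2

print(is_heavy(long_nucleus=True, coda_consonants=0))   # True: 'bee' (tense/long vowel)
print(is_heavy(long_nucleus=False, coda_consonants=1))  # True: 'bit' (lax vowel + coda)
print(is_heavy(long_nucleus=False, coda_consonants=0))  # False: */bI/ is subminimal,
# which is why lax-vowel stimuli make codas obligatory in CVC treatment.
```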
23.3.1.4 Developmental Applications: Defaults and Nondefaults in Intervention
Intervention studies have also addressed default/nondefault contrasts. In the studies by Bernhardt and colleagues, to determine a child's defaults, highly frequent forms were considered defaults, especially if they substituted for other elements except in consonant harmony contexts (e.g., [Coronal] /t/ overridden by [Dorsal] [k] in take [keɪk]). With the assumption that marked (lower-frequency, non-default) features and structures are more challenging than defaults, treatment typically targeted adult non-defaults. Exceptions occurred when a child had different defaults from those of adults, as we indicated above for "Colin." Barlow, Gierut and colleagues (e.g., Barlow & Gierut, 1999; Combiths et al., 2022) also target marked non-defaults, with the perspective (and evidence) that this can accelerate development of less marked elements through implicational universals (following earlier generative phonological theories). There can also be interactions between segments/features and word structure relative to defaults/non-defaults. In adult and child phonology, default features occur more frequently than non-defaults in marked word positions (Bernhardt & Stemberger, 1998; Hammond, 1999). Thus, non-default structures might be more readily learned if using well-established (default) features, and vice versa, but this has not yet been tested in treatment.
23.3.1.5 Outcomes and Future Applications with Children
Overall, children made significant gains in the intervention studies by Bernhardt and colleagues that employed a constraints-based nonlinear phonological framework. Children attained age-level phonology or moved from a severe to a mild-moderate level of impairment. In studies by Barlow, Gierut and colleagues, successful short-term outcomes were also observed from applications of constraints-based and generative phonological theories. In a longer-term outcome evaluation for English (Bernhardt & Major, 2005), ten out of twelve participants achieved age-level language and literacy scores, and five showed only minor age-appropriate articulatory mismatches, even though all children had shown moderate to severe phonological impairments pre-treatment. The results suggest that a concentrated focus on the various aspects of the phonological system may have long-term benefits for speech and literacy. All major concepts in the theoretical framework appeared relevant:
1. Phonological hierarchy: Targeting phonological content at various levels of the hierarchy stimulated system-wide change in multiple ways: strengthening word structure provided more robust locations for segments/features; strengthening the segmental system provided content to populate the various structures. An element's location in the hierarchy appeared generally predictive of its rate of change, further suggesting the relevance of hierarchy: children showed faster rates of change for higher-level prosodic targets than for lower-level segmental/feature targets, and for higher- versus lower-level features (Bernhardt, 1990);
2. Tier interactions: Targeting tier interactions (prosodic-segmental; feature combinations) can create system-wide effects, for example, the Slovenian and Icelandic cases;
3. Defaults/non-defaults and markedness constraints: Targeting non-defaults allowed complex elements to be addressed with the support of treatment, allowing less complex (unmarked) elements to mature independently or through generalization (Barlow & Gierut, 1999);
4. Markedness versus faithfulness constraints: Discovering what a child can do (faithfulness constraints) in addition to what they cannot (markedness constraints) provides opportunities to support phonological needs with phonological strengths (Bernhardt & Stemberger, 2000).
23.3.2 Applications for Adults with Neurogenic Speech Impairments
Several studies have addressed prosodic structure in neurogenic impairments. Fewer have addressed notions about features deriving from nonlinear phonology. The first section below describes general findings about prosodic structure and features. The second reviews intervention studies. Because there has been less research applying constraints-based nonlinear phonology to evaluation and treatment of adult speech impairments, not all the same topics are addressed in this section as in 23.3.1.
23.3.2.1 Applications for Adults with Neurogenic Impairments: Structure versus Segments
Nickels and Howard (2004) provide an overview of the literature on prosodic structure in people with aphasic disorders, showing that more marked syllable structures are subject to higher error rates. They argue, however, that the effects reported up to that point, and in their own study of nine patients with a variety of speech production disorders, could just derive
from complexity in terms of the number of phonemes per word: words with more phonemes are subject to higher error rates, however the phonemes may be arranged (as singletons or in clusters). Romani and Galluzzi (2005) argue that there are indeed effects separate from complexity. They introduce a scale that combines structural and segmental markedness and show that it correlates with error rates, separately from complexity by segment number. Although it is not unexpected that words with many phonemes would be less accurate than words with few, the perspective of Romani and Galluzzi (2005) is more in line with current phonological theories in incorporating representational levels other than the segmental, an approach also taken in the research of Aichert, Ziegler and colleagues, described below.
23.3.2.2 Applications for Adults: Integrating Nonlinear and Gestural Phonology
Computational models based on theories of nonlinear phonology and articulatory gestures (NLG) were used by Aichert, Ziegler and colleagues to evaluate apraxic speech errors in adult German speakers. Articulatory gestures, comprising abstract representations of action units of vocal tract constriction, were viewed as the smallest speech motor planning units. During childhood motor learning, phonetic targets were considered the result of coordinated and potentially overlapping functional groupings of articulatory movements with temporal duration. Generalization of a component of a complex target from a practiced to an unpracticed syllable implied an organized store of differentiated motor units, as opposed to holistic syllable storage (Levelt et al., 1999). As would be expected from a nonlinear perspective on the phonological hierarchy, errors were better predicted by a model that incorporated metrical foot and syllable structure than by linear syllable sequences or number of gestures (Ziegler, 2005; Ziegler & Aichert, 2015). In other studies, Aichert and Ziegler (2013a, 2013b, 2016: German) and Bailey et al. (2019: English) found prosodic structure influences on segmental accuracy in adults with apraxia of speech (A-AOS), similar to findings for child phonology, for example, more frequent segmental errors in iambs than trochees, highest in word-initial unstressed syllables, and, for stressed syllables, higher word-finally than word-initially.
23.3.2.3 Applications for Adults: Feature-structure Interactions
Béland et al. (1993) addressed feature-structure interactions in adults with neurogenic impairments. Consonant clusters containing two non-default place features ([Labial], [Dorsal], and [−anterior]) had a higher error rate than clusters with default [Coronal, +anterior] plus one non-default place feature. These data resonate with the child data concerning the relevance of default/non-default status for both structure and features, where default features will be more successful in a marked structural context.
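The counting logic of this generalization can be sketched as follows (our simplified illustration, not the metric used by Béland et al.):

```python
# Sketch of the generalization in Béland et al. (1993): clusters with more
# non-default place features should be more error-prone. The feature chart
# and scoring are a simplified illustration, not the authors' metric.

NON_DEFAULT_PLACE = {
    "p": "Labial", "b": "Labial", "f": "Labial", "v": "Labial",
    "m": "Labial", "w": "Labial",
    "k": "Dorsal", "g": "Dorsal",
}  # /t d s n l/ etc. count as default [Coronal, +anterior]

def non_default_count(cluster):
    return sum(1 for seg in cluster if seg in NON_DEFAULT_PLACE)

for cluster in ("sn", "sp", "kw"):
    print(cluster, non_default_count(cluster))
# sn 0, sp 1, kw 2: predicted error rates rise with the count.
```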
23.3.2.4 Applications for Adults: Intervention
Similar to data reported in 23.4.1.1, intervention studies have shown metrical and syllabic "downstream" effects, suggesting robust effects on phonetic planning and speech motor learning. Furthermore, for both a German-speaking group with A-AOS and an aphasic group with phonological impairment, natural speech rhythm primes enhanced articulatory accuracy in more marked iambs, and even more so in less marked trochees, even though error rates were higher for the A-AOS group (Aichert et al., 2019; Ziegler et al., 2017). In terms of generalization from treatment targets, singleton consonant practice did not generalize to syllables, but syllable practice did generalize to unpracticed trochees (Aichert & Ziegler, 2013a, 2013b), implying less demanding motor planning requirements for learned monosyllables (coarticulated phonetic units) than for segments. For segmentally complex syllables, for example, CCVCCs, accuracy improved by practicing CVCs containing
consonants from the target onset and coda (Aichert & Ziegler, 2008). Finally, consonant accuracy generalized to untrained syllables when in the same position as in the practiced syllables (Schor et al., 2012). Maas et al. (2002) evaluated intervention targeting complex (marked) non-defaults. For two patients, treatment with more marked (complex) syllables with triconsonantal onsets (e.g., /str/) led to improvement in all onsets regardless of number of consonants, whereas treatment with simple syllables (with single-consonant onsets) affected only syllables with singleton onsets. This could be interpreted as showing that treatment of both structural and segmental targets (as in /str/) leads to improvement in both structure and segments, whereas treatment of only segmental targets (the singleton onset) leads to improvement only of segments. It should be noted, however, that these authors focus on the fact that more complex (non-default) treatment targets had more effect overall, similar to what has been observed in phonological intervention with children (Barlow, Gierut, and colleagues); the results are ambiguous, however, between effects of syllables versus effects of segmental complexity of the treatment syllables. Overall, applications of constraints-based nonlinear phonological theories have been fewer for adult speech impairments, but the key tenets of the theories that have been applied show effects similar to those for children: (1) influence of higher-level prosodic constraints on lower-level segmental output; (2) relevance of the notion of defaults/non-defaults in the patterns observed and for intervention programming and outcomes.
23.5 Clinical Application: Current and Future Practice
For clinical purposes, it is important to find efficient and effective methods for assessment and analysis, and to evaluate various intervention methodologies. Some research has been conducted on applications of constraints-based nonlinear phonological theories, including on intervention for people with speech impairments, more for children than for adults. One of the innovations of the adult-oriented research programs is the linking of nonlinear phonological and gestural profiles. Key findings are summarized below, indicating implications for future research and practice. Findings for both children and adults suggest that it is valuable to probe all levels of the phonological hierarchy in assessment, targeting all phonemes and features of the language across relevant word positions and in words with different lengths, stress patterns, and word shapes in CV sequences (see Bérubé et al., 2015). The Bernhardt and Stemberger (2016, 2022) website phonodevelopment.sites.olt.ubc.ca provides free-access assessment tools in 17 languages, and tutorials demonstrating transcription practices, nonlinear phonological analysis and treatment activities. In addition, a set of case studies (Stemberger & Bernhardt, 2022) illustrates the application of constraints-based nonlinear phonological constructs in assessment and intervention planning for 16 languages. If quantitative analysis is desired (to establish baselines and/or evaluate outcomes), free computer software may be utilized, for example Phon (Hedlund & Rose, 2020), which has a number of measures useful for nonlinear analysis. For instance, a specialized routine in Phon analyzes multisyllabic words (Multisyllabic Nonlinear Analysis: MNLA, Mason, 2018), outputting a whole-word mismatch total, and subtotals for word structure components (stress, syllables), consonant and vowel features, and assimilations. The MNLA analysis facilitates individually tailored selection of intervention stimuli for promoting phonological and/or speech motor learning. The procedures mentioned above have primarily been applied to child data to date, but SLPs may also find them helpful in the characterization of adult speech disturbances, whether of a more phonological or apraxic nature. As noted, the available resources extend beyond English.
An open-access application developed for adult data also exists for German one- to four-syllable words (Ziegler et al., 2021), www.neurophonetik.de/gesten-koeffizientenrechner, with the possibility of incorporating additional languages. The application evaluates the challenges of target words, giving nonlinear gestural (NLG) scores and displays of gestural word structure. Additionally, NLG quantifications of mismatches between target words and client productions provide a tool for monitoring speech accuracy.

Speech-language pathologists may need to assess phonology in an unfamiliar language, for example when an English-speaking SLP examines an Arabic-speaking child. In this multilingual context, in-depth assessment using the full nonlinear framework may be daunting at first. We recommend that clinicians screen the child's phonological abilities using the available screening word lists and a WWM measure. In this context, SLPs only need to listen to the unfamiliar language, without transcribing speech, and determine the proportion of child productions that match the adult target. A lower WWM indicates a greater phonological need. The WWM is a promising tool to identify children who are at risk of PPD within a multilingual context (Bernhardt et al., 2020). If the child appears to warrant intervention, as a second step, further transcription and analysis could be undertaken, using the tools available.

In the 2008 edition of the Handbook of Clinical Linguistics, Bernhardt and Stemberger commented that it would be a remarkable computer program that could take the raw audio files of a speaker, transcribe them reliably, analyze them according to the most elegant phonological theory, and present a ranked set of hierarchical targets for implementation and outcomes evaluation. Fifteen years have passed, and no automated system has yet replaced the expertise of SLPs. However, automation can quicken certain elements of assessment and intervention planning. One computer-assisted program shows promise: AutoPATT (Combiths et al., 2022). Once the target and child productions are entered into Phon (Rose & MacWhinney, 2014), a plug-in for Phon allows quick relational analyses (comparing child productions to adult targets) and independent analyses (phonetic and phonemic inventories). These two types of analyses have been conducted for research purposes through Phon. The AutoPATT program identifies gaps in the child's phonological knowledge and helps select relatively complex treatment targets. The SLP can then more quickly identify and select intervention strategies. Perhaps, over time, other plug-ins based on nonlinear phonology could be added to Phon or other platforms to identify treatment targets for children who demonstrate more complex phonological needs. Of course, such a computer program would have to be able to take all of a child's other needs into account, including family support, hearing status, perception, phonological awareness, oral-motor structures, physical abilities, personality, motivation, and so on. For the foreseeable future, the fuzzy logic of human beings appears to be needed, as flawed as it sometimes is, and as poor as we may be at predicting outcomes of treatment. As phonological theories develop, researchers in child phonology may find better ways of interpreting phonological patterns that will lead to more informed choices for analysis and intervention.
In the interim, readers are encouraged to delve further into the literature and to visit the Bernhardt and Stemberger (2016, 2022) website: phonodevelopment.sites.olt.ubc.ca.
REFERENCES
Aichert, I., Lehner, K., Falk, S., Späth, M., & Ziegler, W. (2019). Do patients with neurogenic speech sound impairments benefit from auditory priming with a regular metrical pattern? Journal of Speech, Language, and Hearing Research, 62(8S), 3104–3118. https://doi.org/10.1044/2019_JSLHR-S-CSMC7-18-0172
Aichert, I., Späth, M., & Ziegler, W. (2016). The role of metrical information in apraxia of speech. Perceptual and acoustic analyses of word stress. Neuropsychologia, 82(1), 171–178. https://doi.org/10.1016/j.neuropsychologia.2016.01.009
Aichert, I., & Ziegler, W. (2008). Learning a syllable from its parts: Cross-syllabic generalisation effects in patients with apraxia of speech. Aphasiology, 22(11), 1216–1229. https://doi.org/10.1080/02687030701820303
Aichert, I., & Ziegler, W. (2013a). Segments and syllables in the treatment of apraxia of speech: An investigation of learning and transfer effects. Aphasiology, 27(10), 1180–1199. https://doi.org/10.1080/02687038.2013.802285
Aichert, I., & Ziegler, W. (2013b). Word position effects in apraxia of speech: Group data and individual variation. Journal of Medical Speech-Language Pathology, 20(4), 7–11.
Bailey, D. J., Bunker, L., Mauszycki, S., & Wambaugh, J. L. (2019). Reliability and stability of the metrical stress effect on segmental production accuracy in persons with apraxia of speech. International Journal of Language & Communication Disorders, 54(6), 902–913. https://doi.org/10.1111/1460-6984.12493
Barlow, J., & Gierut, J. (1999). Optimality theory in phonological acquisition. Journal of Speech, Language and Hearing Research, 42(6), 1482–1498. https://doi.org/10.1044/jslhr.4206.1482
Béland, R., Paradis, C., & Bois, M. (1993). Constraints and repairs in aphasic speech: A group study. Canadian Journal of Linguistics, 38(2), 279–302. https://doi.org/10.1017/S000841310001478X
Bernhardt, B. (1990). Application of nonlinear phonological theory to intervention with six phonologically disordered children. Unpublished PhD dissertation, University of British Columbia.
Bernhardt, B. (1992). The application of nonlinear phonological theory to intervention. Clinical Linguistics and Phonetics, 6(4), 283–316. https://doi.org/10.3109/02699209208985537
Bernhardt, B. (1994a). The prosodic tier and phonological disorders. In M. Yavas (Ed.), First and second language acquisition (pp. 149–172). Singular Press.
Bernhardt, B. (1994b). Phonological intervention techniques for syllable and word structure development. Clinics in Communication Disorders, 4(1), 54–65. PMID: 8019551.
Bernhardt, B., & Major, E. (2005). Speech, language and literacy skills three years later: Long-term outcomes of nonlinear phonological intervention. International Journal of Language and Communication Disorders, 40(1), 1–27. https://doi.org/10.1080/13682820410001686004
Bernhardt, B., & Stemberger, J. P. (1998). Handbook of phonological development: From a nonlinear constraints-based perspective. Academic Press.
Bernhardt, B., & Stemberger, J. P. (2007). Phonological impairment. In P. de Lacy (Ed.), The Cambridge handbook of phonology (pp. 575–594). Cambridge University Press.
Bernhardt, B., & Stemberger, J. P. (2008). Constraints-based nonlinear phonological theories: Application and implications. In M. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 423–438). Blackwell. https://doi.org/10.1002/9781444301007.ch26
Bernhardt, B., & Stemberger, J. P. (2016, 2022). Phonological developmental tools and crosslinguistic phonology project. https://phonodevelopment.sites.olt.ubc.ca
Bernhardt, B. H., & Stemberger, J. P. (2000). Workbook in nonlinear phonology for clinical application. Pro-Ed.
Bernhardt, B. M., Hanson, R., Perez, D., Ávila, C., Lleó, C., Stemberger, J. P., Carballo, G., Mendoza, E., Fresneda, D., & Chávez-Peón, M. (2015a). Word structures of Granada Spanish-speaking preschoolers with typical versus protracted phonological development. International Journal of Language & Communication Disorders, 50(3), 298–311. https://doi.org/10.1111/1460-6984.12133
Bernhardt, B. M., Másdóttir, T., Stemberger, J. P., Leonhardt, L., & Hansson, G. O. (2015b). Fricative development in Icelandic and English-speaking children with protracted phonological development. Clinical Linguistics and Phonetics, 29(8–10), 642–665. https://doi.org/10.3109/02699206.2015.1036463
Bernhardt, B. M., Stemberger, J. P., Bérubé, D., Ciocca, V., Freitas, M.-J., Ignatova, D., Kogošek, D., Lundeborg Hammarström, I., Másdóttir, T., Ozbič, M., Perez, D., & Ramalho, A. M. (2020). Identification of protracted phonological development across languages: The Whole Word Match and basic mismatch measures. In E. Babatsouli, M. Ball, & N. Müller (Eds.), An anthology of bilingual child phonology (pp. 274–308). Multilingual Matters. https://doi.org/10.21832/BABATS8410
Bérubé, D., Bernhardt, B. M., & Stemberger, J. P. (2015). A test of Canadian French phonology: Construction and use. Canadian Journal of Speech-Language Pathology and Audiology, 39(1), 61–100. https://cjslpa.ca/files/2015_CJSLPA_Vol_39/No_01/CJSLPA_Spring_2015_Vol_39_No_1_Berube_et_al.pdf
Bérubé, D., Bernhardt, B. M., Stemberger, J. P., & Ciocca, V. (2020). Development of singleton consonants in French-speaking children with typical versus protracted phonological development: The influence of word length, word shape and stress. International Journal of Speech-Language Pathology, 22(6), 637–647. https://doi.org/10.1080/17549507.2020.1829706
Bérubé, D., & MacLeod, A. (2022). A comparison of two phonological screening tools for French-speaking children. International Journal of Speech-Language Pathology, 24(1), 22–32. https://doi.org/10.1080/17549507.2021.1936174
Bérubé, D., & Spoor, J. (2022). Word structure and consonant interaction in a French-speaking child with protracted phonological development. Clinical Linguistics and Phonetics, 36(8), 696–707. https://doi.org/10.1080/02699206.2021.2019313
Chomsky, N., & Halle, M. (1968). The sound pattern of English. MIT Press.
Chung, S., Bernhardt, B. M., & Stemberger, J. P. (2022). When codas trump onsets: An English-speaking child with atypical phonological development before and after intervention. Clinical Linguistics and Phonetics, 36(9), 779–792. https://doi.org/10.1080/02699206.2021.2025432
Combiths, P., Amberg, R., Hedlund, G., Rose, Y., & Barlow, J. A. (2022). Automated phonological analysis and treatment target selection using AutoPATT. Clinical Linguistics and Phonetics, 36(2–3), 203–218. https://doi.org/10.1080/02699206.2021.1896782
Dinnsen, D. A., & Elbert, M. (1984). On the relationship between phonology and learning. In M. Elbert, D. A. Dinnsen, & G. Weismer (Eds.), Phonological theory and the misarticulating child, ASHA Monographs, 22 (pp. 59–68). American Speech-Language-Hearing Association. PMID: 6466409.
Dinnsen, D. A., & Gierut, J. A. (2008). The predictive power of Optimality Theory for phonological treatment. Asia Pacific Journal of Speech, Language and Hearing, 11(4), 239–249. https://doi.org/10.1179/136132808805335608
Duffy, J. R. (2020). Motor speech disorders: Substrates, differential diagnosis and management. Elsevier.
Edwards, M. L., & Bernhardt, B. (1973). Phonological analyses of the speech of four children with language disorders. Unpublished MS. The Scottish Rite Institute for Childhood Aphasia, Stanford University.
Edwards, M. L., & Shriberg, L. (1983). Phonology: Applications in communicative disorders. College-Hill Press.
Edwards, S. M. (1995). Optimal outcomes of nonlinear phonological intervention. Unpublished MSc thesis, University of British Columbia.
Feehan, A., Francis, C., Bernhardt, B. M., & Colozzo, P. (2015). Outcomes of phonological and morphosyntactic intervention for twin boys with protracted speech and language development. Child Language Teaching and Therapy, 31(1), 53–69. https://doi.org/10.1177/0265659014536205
Fisher, C. A., Wilshire, C. E., & Ponsford, J. L. (2009). Word discrimination therapy: A new technique for the treatment of a phonologically based word-finding impairment. Aphasiology, 23(6), 676–693. https://doi.org/10.1080/02687030801987382
Gierut, J. A. (1990). Differential learning of phonological oppositions. Journal of Speech and Hearing Research, 33(3), 540–549. https://doi.org/10.1044/jshr.3303.540
Goldsmith, J. (1976). Autosegmental phonology. PhD dissertation, MIT. Garland Press, 1979.
Grunwell, P. (1985). Phonological assessment of child speech. College-Hill Press.
Hammond, M. (1999). The phonology of English: A prosodic Optimality-Theoretic approach. Oxford University Press.
Hayes, B. (1989). Compensatory lengthening in moraic phonology. Linguistic Inquiry, 20(2), 253–306. https://www.jstor.org/stable/i390235
Hedlund, G., & Rose, Y. (2020). Phon 3.1 [Computer software]. https://phon.ca
Hockett, C. (1955). A manual of phonology. Waverly Press.
Hodson, B. (1986). Assessment of phonological processes – Revised. Interstate Publishers.
Ingram, D. (1976). Phonological disabilities in children. Elsevier.
Jakobson, R. (1941/1968). Child language, aphasia, and phonological universals (A. R. Keiler, Trans.). Mouton. Originally published as Kindersprache, Aphasie, und allgemeine Lautgesetze. Almqvist and Wiksell, 1941.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1–38. https://dx.doi.org/10.1017/S0140525X99001776
Maas, E., Barlow, J., Robin, D., & Shapiro, L. (2002). Treatment of sound errors in aphasia and apraxia of speech: Effects of phonological complexity. Aphasiology, 16(4–6), 609–622. https://doi.org/10.1080/02687030244000266
Major, E., & Bernhardt, B. (1998). Metaphonological skills of children with phonological disorders before and after phonological and metaphonological intervention. International Journal of Language and Communication Disorders, 33(4), 413–444. https://doi.org/10.1080/136828298247712
Másdóttir, T., & Bernhardt, B. M. (2022). Uncommon timing variation in the speech of an Icelandic-speaking child with protracted phonological development. Clinical Linguistics and Phonetics, 36(9), 806–819. https://doi.org/10.1080/02699206.2021.2011959
Mason, G. K. (2018). School-aged children's phonological accuracy in multisyllabic words on a whole-word metric. Journal of Speech, Language, and Hearing Research, 61(12), 2869–2883. https://doi.org/10.1044/2018_JSLHR-S-17-0137
Mason, G. K., Bérubé, D., Bernhardt, B. M., & Stemberger, J. P. (2015). Evaluation of multisyllabic word production in Canadian English- and French-speaking children within a nonlinear phonological framework. Clinical Linguistics and Phonetics, 29(8–10), 666–685. https://doi.org/10.3109/02699206.2015.1040894
McCarthy, J. J. (1988). Feature geometry and dependency: A review. Phonetica, 43(2–4), 84–108. https://doi.org/10.1159/000261820
McCarthy, J. J., & Prince, A. S. (1993). Prosodic morphology I: Constraint interaction and satisfaction. Rutgers University Center for Cognitive Science Technical Report 3.
McReynolds, L. V., & Engmann, D. (1975). Distinctive feature analysis of misarticulations. University Park Press.
Nickels, L. A., & Howard, D. (2004). Dissociating effects of number of phonemes, number of syllables and syllabic complexity in aphasia: It's the number of phonemes that counts. Cognitive Neuropsychology, 21(1), 57–78. https://doi.org/10.1080/02643290342000122
Ozbič, M., & Bernhardt, B. M. (2022). Complexity resolved: Profile of a Slovenian child with protracted phonological development over an intervention period. Clinical Linguistics and Phonetics, 36(9), 765–778. https://doi.org/10.1080/02699206.2021.2010241
Paradis, C. (1987). On constraints and repair strategies. Linguistic Review, 6(1), 71–97. https://doi.org/10.1515/tlir.1987.6.1.71
Paradis, C., & Prunet, J.-F. (Eds.). (1991). The special status of coronals: Internal and external evidence. Academic Press.
Prince, A. S., & Smolensky, P. (1993). Optimality theory: Constraint interaction in generative grammar. Rutgers University Center for Cognitive Science Technical Report 2.
Romani, C., & Galluzzi, C. (2005). Effects of syllabic complexity in predicting accuracy of repetition and direction of errors in patients with articulatory and phonological difficulties. Cognitive Neuropsychology, 22(7), 817–850. https://doi.org/10.1080/02643290442000365
Romani, C., Galluzzi, C., Guariglia, C., & Goslin, J. (2017). Comparing phoneme frequency, age of acquisition, and loss in aphasia: Implications for phonological universals. Cognitive Neuropsychology, 34(7–8), 449–471. https://doi.org/10.1080/02643294.2017.1369942
Rose, Y., & MacWhinney, B. (2014). The PhonBank initiative. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford handbook of corpus phonology (pp. 380–401). Oxford University Press.
Sagey, E. (1986). The representation of features and relations in non-linear phonology. PhD dissertation, MIT. Garland Press, 1991.
Schor, A., Aichert, I., & Ziegler, W. (2012). A motor learning perspective on phonetic syllable kinships: How training effects transfer from learned to new syllables in severe apraxia of speech. Aphasiology, 26(7), 880–894. https://doi.org/10.1080/02687038.2012.660458
Spencer, A. (1984). A nonlinear analysis of phonological disability. Journal of Communication Disorders, 17(5), 325–384. https://doi.org/10.1016/0021-9924(84)90035-2
Stampe, D. (1972). How I spent my summer vacation: A dissertation on Natural Phonology. PhD dissertation, University of Chicago. Garland Press, 1981.
Stemberger, J. P., & Bernhardt, B. M. (2022). Individual profiles in protracted phonological development across languages: Introduction to the special issue. Clinical Linguistics & Phonetics, 36(7), 597–616. https://doi.org/10.1080/02699206.2022.2057871
Stemberger, J. P., & Stoel-Gammon, C. (1991). The underspecification of coronals: Evidence from language acquisition and performance errors. In C. Paradis & J.-F. Prunet (Eds.), The special status of coronals (pp. 181–199). Academic Press.
Trubetzkoy, N. (1939). Grundzüge der Phonologie [Fundamentals of phonology]. Travaux du Cercle Linguistique de Prague, 7.
Van Riper, C., & Irwin, J. V. (1959). Voice and articulation. Pitman Medical Publishing Company.
Von Bremen, V. (1990). A nonlinear phonological approach to intervention with severely phonologically disordered twins. Unpublished MSc thesis, University of British Columbia.
Yavaş, M., Hernandorena, C. L. M., & Lamprecht, R. R. (1991). Avaliação fonológica da criança: Reeducação e terapia [Phonological assessment of the child: Rehabilitation and therapy]. Artes Médicas.
Ziegler, W. (2005). A nonlinear model of word length effects in apraxia of speech. Cognitive Neuropsychology, 22(5), 603–623. https://doi.org/10.1080/02643290442000211
Ziegler, W., & Aichert, I. (2015). How much is a word? Predicting ease of articulation planning from apraxic speech error patterns. Cortex, 69(1), 24–39. https://doi.org/10.1016/j.cortex.2015.04.001
Ziegler, W., Aichert, I., & Staiger, A. (2017). When words don't come easily: A latent trait analysis of impaired speech motor planning in patients with apraxia of speech. Journal of Phonetics, 64(1), 145–155. https://doi.org/10.1016/j.wocn.2016.10.002
Ziegler, W., Lehner, K., Pfab, J., & Aichert, I. (2021). The nonlinear gestural model of speech apraxia: Clinical implications and applications. Aphasiology, 35(4), 462–484. https://doi.org/10.1080/02687038.2020.1727839
24 Articulatory Phonology and Speech Impairment
CHRISTINA HAGEDORN AND ARAVIND NAMASIVAYAM
24.1 Introduction
In the field of Speech and Language Pathology, the distinction between "phonetics" and "phonology" has long been of interest, and much attention has been devoted to debating whether the breakdown accounting for a variety of speech sound errors and disorders falls at one level or the other. In this chapter, we present accounts of speech impairment based on the theory of Articulatory Phonology (AP), which attempts to unify phonetics and phonology. As demonstrated in the following sections, Articulatory Phonology in many cases offers a parsimonious account of impaired speech patterns based on principles of Task Dynamics (TD) and motor control, and, specifically, motor simplification, without the need to appeal to arbitrary rule-based processes.
24.2 Articulatory Phonology
Articulatory Phonology is a theoretical framework developed by Catherine Browman and Louis Goldstein beginning in 1986. AP aims to unify the physical and cognitive-linguistic levels of speech production (traditionally classified as "phonetic" and "phonological" levels, respectively), considering them to be low- and high-dimensional domains of a single system. The AP framework posits that the basic units of speech production are gestures, which serve both as units of lexical contrast (at the cognitive-linguistic level) and as units of articulatory movement (at the physical level) (Browman & Goldstein, 1989, 1992; Goldstein et al., 2006). These gestures consist of the formation and release of constrictions in the vocal tract and are described and modeled in terms of task dynamics and dynamical systems (Saltzman, 1986; Saltzman & Munhall, 1989). Dynamical systems are used to understand and model, using an equation or set of equations, quantitative changes in a given variable (e.g., position) over time, and can be characterized by the state of the system (e.g., tongue tip constriction degree) and a rule denoting how the state changes, depending on the current state. Within the AP framework, a gesture is defined as a dynamical system with set parameter values for defined vocal tract variables, such as constriction location and constriction degree. Both constriction location and degree can be conceptualized as the system's targets or end goals and are
334 Christina Hagedorn and Aravind Namasivayam mathematically modeled as attractors in the system. Importantly, what such a system characterizes dynamically is change in vocal tract variables, such as constriction degree at a given constriction location, rather than the motion of individual articulators. In this way, given target attractors of the system, articulatory constriction formation can be modeled regardless of the start position of the articulators. A particular gesture is specified using one of the five possible sets of tract variables outlined in Table 24.1. The control regime for a given tract variable is comprised of the set of articulators used to form the constriction and release, as well as the parameter values in the dynamic equation that characterizes its movement and gives rise to the spatiotemporal unfolding of the constriction formation. Among these parameters are target (described above) and stiffness, which determines the rate at which the target is approached. The articulators in the control regime work synergistically toward the achievement of the specified parameter values, and are organized into a coordinative structure (Fowler et al., 1980; Turvey, 1977). Given that gestures serve as basic units of phonological contrast, lexical items will contrast if they differ in gestural composition. These differences may involve (i) the presence or absence of a particular gesture; (ii) gestural parameters, such as target values for constriction degree; or (iii) the organization of the gestures. The organization and coordination of gestures within a particular lexical item, therefore, must be specified in critical ways to ensure perceptual recoverability. In this way, the relative timing of gestures that this organization gives rise to is information-bearing. It is this organization that AP posits comprises the phonological structure of speech. Each gesture is associated with a planning oscillator, or clock, that is responsible for triggering its activation (Browman & Goldstein, 2000; Goldstein et al., 2006; Nam & Saltzman, 2003; Saltzman & Byrd, 2000). Articulatory studies have revealed that there are general principles that define how the activation of certain classes of gestures are organized or phased with respect to one another (Löfqvist & Gracco, 1999). As speech is being planned, each gesture’s clock is set into motion at random phases and coupling forces specific to the gestural constellation at hand cause each gesture’s clock to stabilize at specific relative phases before the triggering of each gesture’s activation begins (Saltzman & Byrd, 2000).1 The in-phase mode of coupling is the most intrinsically simple and stable of all modes, and can be mastered with relative ease across modalities (e.g., drumming, speaking, etc.) (Haken et al., 1985). When coordinated in-phase, actions’ activations are synchronous; one gesture’s activation is triggered at 0º with respect to the other’s activation. In CV sequences, Table 24.1 Tract variable categories and articulators involved. Tract variable
Table 24.1 Tract variable categories and articulators involved.

Tract variable                                                          Articulators involved
Lip Aperture; Lip Protrusion                                            Upper Lip, Lower Lip, Jaw
Tongue Tip Constriction Location; Tongue Tip Constriction Degree       Tongue Tip, Tongue Body, Jaw
Tongue Body Constriction Location; Tongue Body Constriction Degree     Tongue Body, Jaw
Velic Aperture                                                          Velum
Glottal Aperture                                                        Glottis
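For concreteness, the point-attractor dynamics assumed for each tract variable in Table 24.1 can be written out explicitly. The following display is our gloss of the standard Task Dynamics formulation (Saltzman & Munhall, 1989), not an equation reproduced from this chapter:

$$ m\,\ddot{z} + b\,\dot{z} + k\,(z - z_{0}) = 0 $$

Here $z$ is a tract variable (e.g., lip aperture), $z_{0}$ is its gestural target (the attractor), $k$ is the stiffness parameter that determines how quickly the target is approached, and $b$ is a damping coefficient, conventionally set for critical damping ($b = 2\sqrt{mk}$) so that $z$ settles at $z_{0}$ without oscillating, from any initial articulator configuration.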
In CV sequences, the gestures pertaining to the onset consonant and the following vowel begin synchronously with one another (Browman & Goldstein, 2000; Goldstein et al., 2006; Nam, 2007), suggesting that the clocks associated with these gestures are coupled in-phase. Despite being triggered simultaneously, the gestures associated with CV syllables are recoverable due to differences in constriction degree, in dynamic stiffness (causing the vocalic gesture to take longer to reach its target), and in activation duration (allowing the vocalic gesture to remain active beyond the offset of the consonant gesture). That the in-phase mode is the most intrinsically simple and stable is consistent with cross-linguistic data suggesting that CV sequences are the first to be mastered developmentally (Nam et al., 2009; Vihman & Ferguson, 1987).

The anti-phase mode of coupling is the second most stable, and slightly less accessible than the in-phase mode (Haken et al., 1985). AP posits that the gestures involved in VC combinations are organized in the anti-phase mode, as the gesture(s) pertaining to the coda consonant begin later than the vocalic gestures. This is consistent with the clock associated with the coda consonant’s gestures being activated at 180° with respect to that of the vowel, resulting in sequential production. In accordance with the anti-phase mode being slightly less stable than the in-phase mode, consonants in coda position (i.e., VC) are acquired by infants after those in onset position (i.e., CV) across languages (Vihman & Ferguson, 1987).

The aforementioned phase relations can be depicted using a coupling graph, in which nodes represent the gestures and their respective planning oscillators and edges represent coupling relations between pairs of planning oscillators. For example, in the word “mad” /mæd/, the labial closure gesture and the velic widening gesture corresponding to onset /m/ are coupled in-phase with each other and with the tongue body gesture corresponding to the vowel nucleus /æ/, as represented by the solid connecting lines in Figure 24.1. The gesture corresponding to the vowel is coupled anti-phase with the tongue tip closure gesture corresponding to coda /d/, represented by the dashed line connecting them.

Coupling graphs give rise to gestural scores, which are used to generate motor commands for the speech articulators. Gestural scores (Figure 24.2) display the activation duration of each gesture and therefore make observable any potential temporal overlap among gestures. The width of the box corresponding to a given gesture denotes the duration of time for which its set of values for the dynamic parameters is active. For example, in the word “mad” /mæd/, the lip closure gesture and the velic widening gesture corresponding to the onset consonant begin synchronously, along with the tongue body gesture corresponding to the vowel.
Figure 24.1 Coupling graph corresponding to the words “mad,” “bad,” and “pad.” In-phase gestures are connected by solid lines whereas anti-phase gestures are connected by dashed lines.
Figure 24.2 Gestural scores corresponding to the words “mad,” “bad,” and “pad.”
The activation duration of the lip closure gesture is shortest, that of the velic widening gesture is slightly longer, and that of the tongue body gesture corresponding to the vowel is longest. The tongue tip closure gesture, in its anti-phase relation to the tongue body gesture, is activated last in the sequence, overlapping only slightly in time with tongue body activation. The gestural scores for the words “mad” /mæd/ and “pad” /pæd/ are identical to the gestural score for “bad” /bæd/ except for the addition of the velic widening gesture and of the laryngeal opening gesture, respectively.

These gestural scores exemplify how Articulatory Phonology is able to capture and unify both low-dimensional (e.g., phonological contrast) and high-dimensional (e.g., context-dependent variation) aspects of speech production that are otherwise attributed to “phonology” and “phonetics,” respectively. The distinct gestural scores capture low-dimensional lexical contrast (i.e., between “mad,” “bad,” and “pad”) based on the presence or absence of a single gesture (i.e., the velic widening gesture distinguishing “mad” from “bad,” and the laryngeal opening gesture distinguishing “pad” from “bad”). Additionally, the relative timing of the laryngeal opening gesture, the labial closure gesture, and the tongue body gesture in “pad” results in the well-attested aspiration of voiceless stops in English, a high-dimensional, context-dependent pattern (Goldstein & Fowler, 2003).

Tract variables specify the goals of a gesture in terms of constriction location and constriction degree, hence controlling context-independent constriction trajectories. Gestural scores, together with their tract variable specifications (Table 24.1), are used to generate motor commands for speech articulators that work synergistically toward achievement of the articulatory goals specified by those tract variables. These articulatory movements have aerodynamic and acoustic consequences.

In sum, AP posits that lexical items comprise gestures with intergestural coupling information. Tract variables specify the goals of a gesture in terms of constriction location and constriction degree. The oscillators (clocks) associated with each gesture are coupled in a pairwise manner, are activated at random phases, and ultimately stabilize at their specified relative phases. The activation of these oscillators gives rise to gestural scores, which represent sets of invariant gestures in the form of context-independent sets of dynamical parameter specifications, and which specify the temporal intervals during which constriction tasks actively control the vocal tract articulators. These gestural scores are used to generate motor commands for speech articulators that work synergistically toward achievement of the articulatory goals specified by the tract variables. The resulting articulatory movements produce the aerodynamic and acoustic output ultimately perceived by listeners.

Interarticulator coordination encompasses tract variable specification, which controls the context-independent constriction trajectories, and the actual synergistic movements of articulators toward such goals. Intergestural coordination, on the other hand, is determined by the coupling information that yokes a set of gestures together, determining the gestural score that specifies each gesture’s activation interval and relative timing.
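To make the planning-oscillator story concrete, the following toy simulation is offered as our own illustration: the gesture names and coupling targets follow Figure 24.1, but the Kuramoto-style relaxation dynamics, the coupling strength, and the step size are illustrative assumptions standing in for the coupled-oscillator models cited above (e.g., Nam & Saltzman, 2003; Saltzman & Byrd, 2000). The four clocks for “mad” start at random phases and are pulled by pairwise coupling forces toward their target relative phases:

```python
import numpy as np

# Toy relaxation of planning-oscillator phases for "mad". Gesture names and
# coupling targets follow Figure 24.1; the dynamics, coupling strength, and
# step size are our own illustrative assumptions.

gestures = ["labial closure", "velic widening", "tongue body", "tongue tip"]

# Edges: (i, j, target relative phase); 0 = in-phase, pi = anti-phase.
edges = [(0, 1, 0.0), (0, 2, 0.0), (1, 2, 0.0), (2, 3, np.pi)]

rng = np.random.default_rng(1)
phase = rng.uniform(0.0, 2 * np.pi, len(gestures))  # clocks start at random phases

dt, coupling = 0.01, 2.0
for _ in range(5000):
    dphase = np.zeros_like(phase)
    for i, j, target in edges:
        err = phase[j] - phase[i] - target   # deviation from target relative phase
        dphase[i] += coupling * np.sin(err)  # nudge each clock of the pair
        dphase[j] -= coupling * np.sin(err)  # toward the target relation
    phase += dt * dphase

# Report each clock's settled phase relative to the vowel (tongue body) clock,
# wrapped to the interval [-180, 180) degrees.
rel = (phase - phase[2] + np.pi) % (2 * np.pi) - np.pi
for name, r in zip(gestures, np.degrees(rel)):
    print(f"{name:15s} {r:7.1f} deg")
```

Run as-is, the labial and velic clocks settle at 0° relative to the vowel’s clock and the tongue tip clock settles at ±180°, mirroring the coupling graph in Figure 24.1.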
24.3 Accounting for Patterns Exhibited in Developmental Speech of Typical Children and Those with Speech Delay or Impairment

While AP and Task Dynamics (TD) have been used primarily to account for patterns in typical speech, recent work has demonstrated their utility in accounting for patterns exhibited by individuals with speech impairment as well (Hagedorn et al., 2017, 2021, 2022; Namasivayam et al., 2020; van Lieshout et al., 2008). In the following sub-sections, we summarize how Articulatory Phonology and Task Dynamics can account for the atypical/developmental patterns observed, and we identify the level(s) of the motor speech system at which the breakdown likely occurs, based on the existing evidence.
24.3.1 Weak Syllable Deletion

Weak syllable deletion refers to the omission of an unstressed syllable in speech (e.g., [ˈpju.ɾɚ] for /kʌm.ˈpju.ɾɚ/ “computer”). This can be accounted for by breakdown at the level of the gestural planning oscillators corresponding to the gestures of the omitted syllable. If the gestural planning oscillators are absent or not appropriately activated, the triggering of the gestures that depends on those oscillators will also be absent.
24.3.2 Epenthesis

While epenthesis refers to the insertion of any non-target speech segment, neutral vowels are most frequently epenthesized, and tend to surface between consonants (e.g., [bə.ˈlæk] for /blæk/ “black”) and word-finally, following coda consonants (e.g., [ˈbɔ.lə] for /bɔl/ “ball”). Most likely, the percept of an epenthetic vowel surfaces due to erroneous relative timing among the target gestures. For example, the complex phase patterns pertaining to consonant clusters (CCV), specified at the levels of intergestural coupling and gestural planning oscillators, may be inaccurate or unstable while still being acquired and attuned in the child’s system. Children may consequently produce the gestures corresponding to the consonants of a cluster with too great a lag (i.e., without sufficient temporal overlap), which results in an acoustic and perceptual vocoid due to the absence of a narrow vocal tract constriction during the period between the target consonantal constrictions. Similarly, it is possible that continued phonation beyond the offset of the final consonantal gesture(s) underlies the vocoids that surface following voiced coda consonants. Such miscoordination between the offsets of the laryngeal and supralaryngeal gestures would arise at the level of the gestural planning oscillators or of gestural score timing.
24.3.3 Final Consonant Deletion

Final consonant deletion refers to the omission of coda consonants (e.g., [ɹɛ] for /ɹɛd/ “red”). While it is possible that all gestures corresponding to a particular segment are not entailed by a child’s target representation of a particular lexical item,2 in most cases the pattern likely arises from motor simplification at a lower level.3 At the level of intergestural coordination, failure of the oscillator(s) associated with the final consonant’s gestures to be triggered would result in complete omission of the corresponding gestures. It is also possible that the gesture(s)’ oscillators are triggered, but at an erroneous phase with respect to the preceding segments’ gestures. Alternatively, the coda gestures’ oscillators being triggered too early (e.g., between 0° and 30° with respect to the preceding onset consonant and vowel nucleus
gestures) would likely result in erroneous co-production of the target coda gesture(s) with the target onset gesture(s). Depending on the timing of each gesture’s formation and release, the coda gesture(s) may be covert, or not detectable in the acoustic signal, resulting in the perception of its omission (Surprenant & Goldstein, 1998). Similarly, depending on the target constriction degree of the final consonant segment, it is possible that the perception of final consonant deletion arises due to miscoordination of the supralaryngeal articulators with the laryngeal articulators that control phonation. It is also possible that breakdown occurs at the level of tract variable specification, and that the constriction degree tract variable specification of the final consonant is too wide. Finally, acoustic and perceptual omission of the final consonant could also arise from inaccuracies at the level of articulatory synergy; if one or all articulators fail to create a sufficient constriction, no final consonant will be detected in the acoustic signal.
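The timing accounts in the last two sub-sections can be illustrated with a deliberately simple toy (entirely our construction; the millisecond values and gesture labels are invented for illustration). Representing gestural activations as intervals, an excessive lag between consonantal constrictions leaves an open, vocoid-like stretch (the epenthesis percept), whereas the converse mistiming, excessive overlap, can acoustically hide a gesture (the deletion percept):

```python
# Toy gestural-score timing: activation intervals (onset_ms, offset_ms) for
# two consonantal constriction gestures. All numbers are invented.

def lag(first, second):
    """Signed gap between two activation intervals; positive means the second
    gesture starts only after the first has ended (no temporal overlap)."""
    return second[0] - first[1]

cases = {
    "target /bl/ cluster (overlapped)": ((0, 60), (40, 120)),
    "child /bl/ with excessive lag": ((0, 60), (95, 155)),
}
for label, (c1, c2) in cases.items():
    g = lag(c1, c2)
    percept = "open vocal tract -> vocoid (epenthesis percept)" if g > 0 else "fluent cluster"
    print(f"{label}: lag = {g} ms -> {percept}")

# The converse mistiming: a coda gesture triggered so early that it is fully
# overlapped by the onset and vowel gestures can be acoustically hidden
# (covert), yielding the percept of final consonant deletion.
```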
24.3.4 Cluster Reduction and Deletion

Cluster reduction refers to the elimination of a subset of the target segments in a consonant cluster (e.g., [pa] for /spa/ “spa”), while cluster deletion refers to the omission of all segments in a target cluster (e.g., [a] for /spa/ “spa”). Onset clusters pose a particular challenge because of the complex coordination patterns that underlie their target production; we refer the reader to Marin and Pouplier (2010) for a detailed explanation of these competitive coupling patterns. Given the complex intergestural coupling patterns required for consonant clusters, breakdown at the level of intergestural coordination (i.e., intergestural coupling information, gestural planning oscillators, and gestural score activation) is most likely. Such breakdown could cause all or some of the target gestures’ activations not to be triggered, resulting in complete or partial omission of the cluster; similarly, it could cause gestures to be triggered at inappropriate phases, resulting in the perception of only a subset of the consonants in a cluster due to gestural overlap. Alternatively, a child’s target not entailing all the appropriate gestures will result in omission of all or some of the segments in the cluster. Breakdown at lower levels may also produce these percepts: the constriction degree tract variable specification of one or more gestures being too wide, or constrictions at the articulatory synergy level that are not sufficiently narrow to render an acoustic percept, may likewise result in the percept of cluster reduction or deletion.4
24.3.5 Voicing, Nasal, and Place Assimilation

Voicing, nasal, and place assimilation in atypical speech refer to segments being erroneously produced with specific attributes of nearby segments. Errors of assimilation can be most straightforwardly accounted for by simplification of the motor patterns produced (e.g., through deletion of gestures or shifting of the phase relations of gestures to an intrinsically simpler, more stable mode). The most prevalent voicing assimilation pattern observed in developmental speech is prevocalic voicing (e.g., [bat] for /pat/ “pot”), which refers to target voiceless consonants in onset position being produced with voicing under the influence of the following vowel. Articulatory Phonology posits that since the resting state of the vocal folds is adduction, glottal gestures are required for the production of target voiceless segments but not target voiced segments (Goldstein & Browman, 1986). Prevocalic voicing can be accounted for by breakdown at the level of intergestural coordination (i.e., intergestural coupling information, gestural planning oscillators, and gestural score activation) which prevents
the activation of the onset consonant’s glottal opening gesture from being triggered at all, or from being triggered at the appropriate time. All gestures corresponding to the onset consonant and the vowel are coordinated in-phase, such that their activations are triggered synchronously. It may be that, despite the in-phase mode being the most accessible, the developing system is taxed by the presence of multiple gestures needing to be planned and executed synchronously.

Similarly, nasal assimilation, by which target oral segments are produced as nasal in the presence of a target nasal segment (e.g., [mæ̃n] for /mæd/ “mad”), likely arises due to breakdown at the level of intergestural coordination causing the onset or offset of a single, target velic gesture to be mistimed with respect to the other target oral gestures in the sequence. Conversely, denasalization (e.g., [bæd] for /mæd/ “mad”) likely arises due to breakdown at the level of intergestural coordination which prevents the activation of the velic lowering gesture from being appropriately triggered.

Place assimilation refers to the constriction location of a target segment being influenced by the constriction location of a segment in the vicinity (e.g., [bɛb] for /dɛb/ “Deb”). While Articulatory Phonology and the Task Dynamics model do not have mechanisms by which long-distance anticipatory assimilation5 can be straightforwardly accounted for, recent extensions to the model proposed by Tilsen (2016, 2019a) do. In the selection–coordination framework, gestures associated with upcoming speech segments are sequenced through competitive queuing, in which the motor plans associated with each target initially have stable relative levels of excitation which, over time, rise until one plan reaches a selection threshold and is therefore executed, while the excitations of its competitors (plans for other segments in the vicinity) are temporarily gated. Achievement of a particular target induces suppression of that plan and de-gating of competing plans, allowing the plan with the next highest excitation level to reach the selection threshold. This continues iteratively until all plans have been selected (Tilsen, 2019b). Plans being executed prematurely, as in cases of anticipatory place assimilation errors, could be caused by error in enforcing appropriate selection thresholds (i.e., setting the threshold too low) or by erroneous assignment of the relative activation level for a particular plan itself. Alternatively, inappropriate specification of the parameters defining the “leaky gating” function – the mechanism by which even gestures that are not selected can exert influence on vocal tract shaping – may account for certain plans exerting premature and excessive influence on the vocal tract.
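A minimal sketch of competitive queuing may help fix ideas. This is entirely our toy: the plan names, excitation values, growth rate, noise level, and threshold are invented, and the actual selection–coordination model (Tilsen, 2016, 2019b) is considerably richer:

```python
import numpy as np

# Toy competitive queuing in the spirit of the selection-coordination
# framework (Tilsen, 2016, 2019b). All numbers are invented for illustration.

rng = np.random.default_rng(0)
plans = ["d (onset)", "vowel", "b (coda)"]      # intended serial order
excitation = np.array([3.0, 2.0, 1.0])          # graded activation encodes order
threshold = 4.0
gated = np.zeros(len(plans), dtype=bool)        # a plan is gated once achieved

produced = []
while len(produced) < len(plans):
    active = ~gated
    # Active plans' excitations rise over time, with a little noise.
    excitation[active] += 0.1 + 0.05 * rng.standard_normal(active.sum())
    winner = int(np.argmax(np.where(active, excitation, -np.inf)))
    if excitation[winner] >= threshold:
        produced.append(plans[winner])          # plan is selected and executed,
        gated[winner] = True                    # then suppressed (de-gating the rest)

print(produced)  # typically ['d (onset)', 'vowel', 'b (coda)']

# Setting the threshold too low, or assigning "b (coda)" the highest initial
# excitation, lets the coda plan win early -- one way to model an anticipatory
# place assimilation error such as [bɛb] for /dɛb/ "Deb".
```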
24.3.6 “Substitution” Patterns: Stopping, Fronting, Backing, Deaffrication, Palatalization, Depalatalization, Gliding, and Vocalization

Several error patterns exist that have historically been categorized as “substitution” patterns. Such categorization reflects the presumption that target segments are substituted with non-target segments differing in place or manner of articulation. Among these patterns are stopping, fronting, backing, deaffrication, palatalization, depalatalization, gliding, and vocalization. Based on the existing evidence and the predictions made by Articulatory Phonology and Task Dynamics, we propose that these patterns most likely emerge due to breakdown at the level of interarticulator coordination (i.e., tract variable specification and articulator movement and synergy) rather than due to substitution of targets with non-target segments at higher (i.e., gestural) levels.

Stopping refers to the replacement of a target fricative segment with a homorganic oral stop (e.g., [tɪp] for /sɪp/ “sip”). Fricatives are a notoriously difficult class of speech
segments to master, given that they require the articulators to form constrictions with very particular aperture specifications, necessitating meticulous articulatory control (Stevens, 1971, 1972), and it has been suggested that constriction degree targets for fricatives are likely specified with more precision than those for other sounds (MacNeilage, 1970; Saltzman & Byrd, 2000). Stopping can therefore be accounted for by breakdown at the level of tract variable specification, at the level of synergistic articulatory movement execution, or both.

Velar fronting refers to target velar segments being produced at more anterior constriction locations on the palate (e.g., [tæp] for /kæp/ “cap”). The converse, coronal backing, refers to target coronal segments being produced more posteriorly (e.g., [kæp] for /tæp/ “tap”). Several studies have revealed that children who exhibit these patterns produce undifferentiated lingual gestures, in which the movements of the tongue tip, tongue body, tongue dorsum, and lateral margins of the tongue are not independently controlled (Gibbon, 1999; Gibbon & Wood, 2002; Goozée et al., 2007). It has been speculated that this pattern is driven primarily by developmental constraints on the independent movement of the jaw and tongue (Byun, 2012; Cleland & Scobbie, 2021; Davis & MacNeilage, 1995; Green et al., 2002). Additionally, Gibbon and Wood (2002) observed that undifferentiated lingual gestures are unlikely to be completely released at any one moment in time, resulting in articulatory drift, the direction of which determines which percept is formed: if an anterior constriction is released before a posterior constriction, a velar is likely to be perceived, while if a posterior constriction is released before an anterior constriction, an alveolar is likely to be perceived (Cleland & Scobbie, 2021). Additionally, studies have demonstrated that many children exhibit covert contrast for these segments, in which subtle acoustic or articulatory differences exist between speakers’ attempts at each target even when no detectable perceptual differences exist, and that perceptually acceptable /k/ is acquired in a gradient manner (Cleland & Scobbie, 2021; McAllister Byun et al., 2016; Scobbie et al., 1996). These findings suggest that (i) fronting and backing patterns do not reflect selection errors at the planning stage involving substitutions of entire target segments, (ii) children do have distinct articulatory targets for the contrastive segments, and (iii) such errors arise from breakdown at lower levels, affecting articulatory movement and synergies in the case of undifferentiated lingual gestures, and/or at the level of intergestural coordination in the case of mistimed release gestures.

Palatalization refers to target alveolar fricatives being produced at a post-alveolar constriction location (e.g., [ʃi] for /si/ “see”), while depalatalization refers to target post-alveolar fricatives or affricates being produced at a more anterior, alveolar constriction location (e.g., [si] for /ʃi/ “she”). It is possible that the acoustic percepts of palatalization and depalatalization arise due to undifferentiated lingual movement compromising constriction location accuracy, reflecting breakdown at the level of articulator synergy and movement.
Given the complexity of articulation and the turbulent airflow necessary for sibilant production (Narayanan & Alwan, 2000; Narayanan et al., 1995; Proctor et al., 2010; Shadle et al., 1996; Stevens, 1971), the control of distinct lingual regions poses a substantial motoric challenge to the speech system, one that is especially apparent during development (Cheng et al., 2007; Denny & McGowan, 2012; Green et al., 2000). It is also possible that these patterns are caused by erroneous constriction location or constriction degree tract variable specification, or by temporal miscoordination of the multiple lingual gestures required (i.e., breakdown at the levels of gestural planning oscillators or gestural score activation).

Deaffrication refers to target affricate segments (e.g., /tʃ/, /dʒ/) being produced as either fricative or stop segments (e.g., [wɪt] or [wɪʃ] for /wɪtʃ/ “witch”). In the AP/TD framework, affricates
are composed of a stop constriction with a fricative release. As described above, fricative production poses a challenge to the motor speech system due to the very narrow range of permissible aperture values involved. Cases of fricative release omission could be accounted for by erroneous specification of the constriction degree tract variable corresponding to the release of the stop, omission of the constriction release specifications altogether, or difficulty at the level of articulatory movement and synergy. Cases of stop constriction omission (in which only a fricative remains) could likewise be accounted for by erroneous specification of the constriction degree tract variable.

Gliding refers to target liquids (e.g., /l/ and /ɹ/) being produced as glides (e.g., /j/, /w/) (e.g., [ˈjɛ.woʊ] for /ˈjɛ.loʊ/ “yellow”; [wɛd] for /ɹɛd/ “red”), while vocalization refers to these same target segments being produced as vowels (e.g., [ˈsæ.dʊ] for /ˈsæd.l̩/ “saddle”). Both /l/ and /ɹ/ require multiple lingual constrictions, which pose a motoric challenge for the developing motor speech system due to the requirement of lingual differentiation (Cheng et al., 2007; Gibbon, 1999; Green et al., 2000; Lin & Demuth, 2015; Studdert-Kennedy & Goldstein, 2003). American English /l/ is produced with both anterior and posterior lingual constrictions, and possibly with active lateral channel formation (Browman & Goldstein, 1995; Ying et al., 2021). Prior to mastery of target /l/, individuals tend to simplify the production by omitting either the anterior or the posterior constriction, resulting in percepts approximating /w/ if the anterior constriction is omitted, or /j/ if the posterior constriction is omitted (Lin & Demuth, 2015). Vocalization will arise if the constrictions formed are not sufficiently narrow.

American English /ɹ/ is also produced with multiple simultaneous lingual constrictions. Although /ɹ/ production varies substantially among typical speakers (Zhang et al., 2003), Preston et al. (2020) outline five articulatory requirements for accurate production: an oral constriction involving raising of some portion of the front half of the tongue, tongue root retraction to create a pharyngeal constriction, lowering of the midline of the posterior tongue body, contact of the lateral margins of the tongue body with the back teeth or gums, and slight lip rounding. Like /l/, during development and in individuals with speech impairment, /ɹ/ may be produced with only a subset of the required gestures. These patterns give rise to the percepts of gliding (e.g., [w]) or of general rhotic distortion.

The simplification of the complex target liquids described above may be caused by breakdown at one or more of the following levels: the defining of required gestures, intergestural coordination, the specification of constriction location and degree tract variables for all gestures, the formation of articulatory synergies (including differential control of distinct parts of the tongue), and articulatory movement.
24.4 Accounting for Patterns Exhibited in Articulation Impairment

“Articulation impairment” typically refers to errors in the production of rhotics or sibilants (e.g., derhotacization, [hʌ] for /hɝ/ “her”; s-distortions, [s̪i] or [ɬi] for /si/ “sea”), and is presumed to involve breakdown at the level of speech motor specification and implementation (McLeod & Baker, 2017; Namasivayam et al., 2020; Preston et al., 2013). Within the AP/TD framework, it likely involves breakdown at the levels of defining the required gestures, specifying the constriction location and degree tract variables for all gestures, the relative timing of all gestures (intergestural coordination), and articulatory synergies (including differential control of distinct parts of the tongue).
24.5 Accounting for Patterns Exhibited in Childhood Apraxia of Speech (CAS)

Childhood Apraxia of Speech (CAS) is a developmental motor speech disorder associated with purported deficits in the planning and programming of speech motor commands. CAS characteristics include (i) impaired movement transitions between articulatory configurations and in coarticulation, (ii) groping or trial-and-error behavior, (iii) vowel distortions, (iv) impaired prosody, (v) voicing errors, (vi) consonant distortions due to “blending” of manner, and (vii) inconsistent errors across repetitions of the same word or phrase (Shriberg et al., 2017; Strand et al., 2013).

Several recent experimental studies have demonstrated that children with CAS exhibit increased articulatory movement variability as compared to both children with typically developing speech and children with other speech sound disorders (Case & Grigos, 2020; Grigos et al., 2015; Moss & Grigos, 2012; Terband et al., 2012). Moreover, they have been observed to produce longer articulatory movement durations and larger movement amplitudes (Case & Grigos, 2016, 2020; Grigos & Case, 2018), as well as atypical behavior of single articulatory movements and atypical interarticulator and intergestural coordination (Grigos et al., 2015; Munson et al., 2003; Nijland et al., 2002; Terband et al., 2011).

Within Articulatory Phonology and Task Dynamics, the speech motor planning presumed to be affected in CAS is associated with intergestural coupling information, planning oscillators, and gestural score activations. The speech motor programming affected corresponds to interarticulator coordination and encompasses tract variable specification and articulatory movement/synergy formation.
24.6 Accounting for Patterns Exhibited in Apraxia of Speech (AOS)

Apraxia of Speech (AOS) is a neurogenic motor speech disorder that often occurs concomitantly with aphasia. While the exact neural substrates of AOS have yet to be unequivocally identified, it has traditionally been assumed that lesions to the left posterior inferior frontal gyrus (Broca’s area, BA 44) and the ventral premotor cortex (BA 6) are implicated (Richardson et al., 2012; Ziegler et al., 2021). However, recent work suggests that the middle precentral gyrus plays a unique role in speech motor planning and execution and that injury to this area results in pure apraxia of speech (Silva et al., 2022).

AOS is classically defined as a disorder affecting the spatial and temporal planning and programming of the speech motor commands specified in a target sequence (Ballard et al., 2015; Ziegler et al., 2012) and is generally characterized by speech production errors, reduced rate of speech, increased segment durations, increased intersegment (i.e., transition) durations, and other prosodic difficulties (Duffy, 2019; McNeil, 2000; McNeil et al., 1997, 2009; Ogar et al., 2005; van Lieshout et al., 2007; Wambaugh et al., 2006). These difficulties may exist alongside behaviors including articulatory groping, attempts to repair errors, and difficulty with speech initiation (Duffy, 2019; Wambaugh et al., 2006). AOS has been shown to result in increased variability of individual articulators (Bartle-Meyer et al., 2009a, 2009b; Itoh et al., 1979; McNeil et al., 1989, 1991; Hoole et al., 1997) and in impairment of interarticulator and intersegmental coordination (Itoh et al., 1980, 1982; van Lieshout et al., 2007; Ziegler & Von Cramon, 1985). Several studies have revealed that many errors in apraxic speech impressionistically characterized as “substitutions” can more accurately be described as intrusions, in which intrusive gestures are co-produced with the gestures pertaining to the target segment (Bartle-Meyer et al., 2009a; Hagedorn et al., 2017; Hardcastle et al., 1985; Pouplier & Hardcastle, 2005; Sugishita et al., 1987).

Articulatory breakdown in AOS likely results from the loss of the procedural memories required to produce individual gestures as well as gestural combinations of varying sizes
(e.g., segments, clusters, syllables) (Ziegler, 2008; Ziegler et al., 2021). These procedural memories have been proposed to serve as a kind of “glue” providing cohesion among gestural components, both within and across segments, and specifying coordination patterns. When this “glue” is lost, the speaker must assemble all movement components from a tabula rasa, giving rise to the many clinical manifestations of AOS (Ziegler et al., 2021). In AP and TD, this breakdown occurs at the levels of intergestural coupling, planning oscillator activation, and gestural score activation, which determine how gestures are coordinated with each other in space and time, as well as at the level of articulator movement and synergy, which determines the relative contribution of the various articulators to a goal.
24.7 Accounting for Patterns Exhibited in Dysarthria

Dysarthria refers to a class of several neurogenic speech disorders that can be further characterized based on the physiological level of breakdown implicated and on the characteristics of the resulting movement disorder. The locus of pathophysiology may be the central or peripheral nervous system or the articulatory organ itself, and may be congenital, as in the case of Cerebral Palsy, or acquired, as in cases of Amyotrophic Lateral Sclerosis, Parkinson’s Disease, demyelinating or inflammatory diseases, or surgical or radiological trauma resulting from treatment for head and neck cancer.

The articulatory patterns observed in speakers with dysarthria tend to vary by dysarthria type, though some characteristics are shared. Individuals with Parkinson’s Disease (PD), Multiple Sclerosis (MS), and Amyotrophic Lateral Sclerosis (ALS) have been observed to exhibit impaired segment durations, reduced movement amplitude, and reduced speed (Connor & Abbs, 1991; Forrest et al., 1989; Hirose et al., 1981; Liss et al., 2009; Mefferd et al., 2019; Yunusova et al., 2008). These patterns can be accounted for straightforwardly by breakdown of the dynamical parameter specification for stiffness (see Kim et al. (2021) for evidence supporting lower articulatory stiffness in speakers with ALS and MS, and Goozee et al. (2000) for evidence that control of articulatory speed is the locus of impairment in dysarthria secondary to traumatic brain injury (TBI)). Individuals with ALS, PD, and TBI also exhibit patterns consistent with impaired intergestural and intragestural coordination, including reduced spatiotemporal coupling between lingual regions (Kuruvilla et al., 2012) as well as altered relative contributions of the jaw and tongue to lingual constrictions (Bartle et al., 2006; Mefferd & Dietrich, 2019; Mefferd et al., 2012; Rong & Green, 2019). The impairment of lingual flexibility observed in individuals with Parkinson’s Disease (Whalen et al., 2014) and following partial glossectomy (Hagedorn et al., 2021) may be attributable to breakdown at the level of interarticulator coordination or articulator movement.
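The stiffness account can be made explicit. The following is our worked example, using the standard critically damped solution of the tract-variable equation given in Section 24.2 rather than an equation reproduced from this chapter. With critical damping, a tract variable released at rest from $z_{\text{init}}$ approaches its target as

$$ z(t) = z_{0} + \bigl(z_{\text{init}} - z_{0}\bigr)\,(1 + \omega t)\,e^{-\omega t}, \qquad \omega = \sqrt{k/m}, $$

with peak movement speed proportional to $\omega\,\lvert z_{\text{init}} - z_{0}\rvert$, reached at $t = 1/\omega$. A lowered stiffness $k$ therefore yields both a slower approach to the target (longer durations) and a reduced peak speed, which is the sense in which a single mis-set dynamical parameter can capture the duration and speed abnormalities reported above.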
24.8 Conclusion

In this chapter, we have provided accounts of a number of speech error patterns from an Articulatory Phonology and Task Dynamics perspective. We have demonstrated that error patterns traditionally classified as “motoric” (“phonetic”) or “phonological” in nature can be accounted for by positing breakdown or simplification at one or more levels of the Articulatory Phonology and Task Dynamics model, which reconciles the phonetics–phonology dichotomy. Adopting this framework has the additional merits of enabling characterization of the disorders in a more fine-grained manner (for example, distinguishing between impairment of articulator movement speed and impairment of the control of speed in certain sub-types of dysarthria), as well as offering explanations of error patterns using basic concepts in task dynamics and
motor control that have also been used to account for numerous behavioral phenomena across fields. By approaching speech disorders and speech error patterns in this way, not only can error patterns and the disorders underlying them be more effectively characterized, but through such characterization clinical intervention for these disorders can be better informed, and thus optimized.

While the Articulatory Phonology and Task Dynamics model offers new insight into speech error patterns, it does have limitations. For example, it does not include any mechanism by which auditory feedback can be incorporated, rendering it unable to account for errors based on impaired auditory feedback (e.g., Houde et al., 2019; Sangtian et al., 2021) or for the effects of perturbed feedback on the system’s behavior (e.g., Niziolek & Parrell, 2021). And, unlike some other models of speech production (e.g., Tourville & Guenther, 2011), the Articulatory Phonology and Task Dynamics model does not explicitly specify the neurological structures or neurophysiological processes involved in each component, though extensions of the model (e.g., Tilsen, 2016, 2019b) do so to some degree. It is our intent that this chapter also serve to inspire future directions of research focused on testing theory-based hypotheses regarding speech impairment, which will ultimately give rise to more complete characterization of the disorders at hand, more refined treatment strategies, and possibly refinement and extension of the theory itself.
NOTES

1 Importantly, the relative phase of gestures’ clocks, which determines the time of each gesture’s triggering, is controlled by the coupling relations between the clocks of individual pairs of gestures rather than by a master clock. For evidence, see Byrd (1996).
2 Omission of the segment in the target form could result from the child never having perceived the segment in modeled productions, such as in cases of hearing loss.
3 Evidence that all target gestures are likely present in the lexical representations of children without hearing loss includes their ability to discriminate between adult target forms and forms attempted by the child as reproduced by adults.
4 Breakdown at the tract variable specification level would be expected to affect the same segments in simple onset and coda positions as well.
5 Here, we refer to long-distance assimilation in which intervening segments are unaffected.
REFERENCES

Ballard, K. J., Wambaugh, J. L., Duffy, J. R., Layfield, C., Maas, E., Mauszycki, S., & McNeil, M. R. (2015). Treatment for acquired apraxia of speech: A systematic review of intervention research between 2004 and 2012. American Journal of Speech-Language Pathology, 24(2), 316–337.
Bartle, C. J., Goozée, J. V., Scott, D., Murdoch, B. E., & Kuruvilla, M. (2006). EMA assessment of tongue–jaw co-ordination during speech in dysarthria following traumatic brain injury. Brain Injury, 20(5), 529–545.
Bartle-Meyer, C. J., Goozée, J. V., Murdoch, B. E., & Green, J. R. (2009a). Kinematic analysis of articulatory coupling in acquired apraxia of speech post-stroke. Brain Injury, 23(2), 133–145.
Bartle-Meyer, C. J., Murdoch, B. E., & Goozée, J. V. (2009b). An electropalatographic investigation of linguopalatal contact in participants with acquired apraxia of speech: A quantitative and qualitative analysis. Clinical Linguistics & Phonetics, 23(9), 688–716.
Browman, C. P., & Goldstein, L. (1989). Articulatory gestures as phonological units. Phonology, 6(2), 201–251.
Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180. https://doi.org/10.1159/000261913
Browman, C. P., & Goldstein, L. (1995). Gestural syllable position effects in American English. In F. Bell-Berti & L. J. Raphael (Eds.), Producing speech: Contemporary issues (pp. 19–33). American Institute of Physics.
Browman, C. P., & Goldstein, L. (2000). Competing constraints on intergestural coordination and self-organization of phonological structures. Les Cahiers de l’ICP, Bulletin de la Communication Parlée, 5, 25–34.
Byrd, D. (1996). Influences on articulatory timing in consonant sequences. Journal of Phonetics, 24(2), 209–244.
Byun, T. M. (2012). Positional velar fronting: An updated articulatory account. Journal of Child Language, 39(5), 1043–1076.
Case, J., & Grigos, M. I. (2016). Articulatory control in childhood apraxia of speech in a novel word-learning task. Journal of Speech, Language, and Hearing Research, 59(6), 1253–1268.
Case, J., & Grigos, M. I. (2020). A framework of motoric complexity: An investigation in children with typical and impaired speech development. Journal of Speech, Language, and Hearing Research, 63(10), 3326–3348.
Cheng, H. Y., Murdoch, B. E., Goozée, J. V., & Scott, D. H. (2007). Physiologic development of tongue–jaw coordination from childhood to adulthood. Journal of Speech, Language, and Hearing Research, 50(2), 352–360.
Cleland, J., & Scobbie, J. M. (2021). The dorsal differentiation of velar from alveolar stops in typically developing children and children with persistent velar fronting. Journal of Speech, Language, and Hearing Research, 64(6S), 2347–2362.
Connor, N., & Abbs, J. (1991). Task-dependent variations in parkinsonian motor impairments. Brain, 114(1), 321–332.
Davis, B. L., & MacNeilage, P. F. (1995). The articulatory basis of babbling. Journal of Speech, Language, and Hearing Research, 38(6), 1199–1211.
Denny, M., & McGowan, R. S. (2012). Implications of peripheral muscular and anatomical development for the acquisition of lingual control for speech production: A review. Folia Phoniatrica et Logopaedica, 64(3), 105–115.
Duffy, J. R. (2019). Motor speech disorders: Substrates, differential diagnosis, and management. Elsevier Health Sciences.
Forrest, K., Weismer, G., & Turner, G. S. (1989). Kinematic, acoustic, and perceptual analyses of connected speech produced by parkinsonian and normal geriatric adults. The Journal of the Acoustical Society of America, 85(6), 2608–2622.
Fowler, C., Rubin, P., Remez, R., & Turvey, M. T. (1980). Implications for speech production of a general theory of action. In B. Butterworth (Ed.), Language production (pp. 373–420). Academic Press.
Gibbon, F. (1999). Undifferentiated lingual gestures and their implications for speech disorders in children. In Proceedings of the XIVth International Congress of Phonetic Sciences.
Gibbon, F. E., & Wood, S. E. (2002). Articulatory drift in the speech of children with articulation and phonological disorders. Perceptual and Motor Skills, 95(1), 295–307.
Goldstein, L., & Browman, C. (1986). Representation of voicing contrasts using articulatory gestures. Journal of Phonetics, 14, 339–342.
Goldstein, L., Byrd, D., & Saltzman, E. (2006). The role of vocal tract gestural action units in understanding the evolution of phonology. In M. A. Arbib (Ed.), Action to language via the mirror neuron system (1st ed., pp. 215–249). Cambridge University Press. https://doi.org/10.1017/CBO9780511541599.008
Goldstein, L., & Fowler, C. A. (2003). Articulatory phonology: A phonology for public language use. In N. O. Schiller & A. S. Meyer (Eds.), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 159–207). Mouton de Gruyter.
Goozée, J., Murdoch, B., Ozanne, A., Cheng, Y., Hill, A., & Gibbon, F. (2007). Lingual kinematics and coordination in speech-disordered children exhibiting differentiated versus undifferentiated lingual gestures. International Journal of Language & Communication Disorders, 42(6), 703–724.
Goozee, J. V., Murdoch, B. E., Theodoros, D. G., & Stokes, P. D. (2000). Kinematic analysis of tongue movements in dysarthria following traumatic brain injury using electromagnetic articulography. Brain Injury, 14(2), 153–174.
Green, J. R., Moore, C. A., Higashikawa, M., & Steeve, R. W. (2000). The physiologic development of speech motor control: Lip and jaw coordination. Journal of Speech, Language, and Hearing Research, 43(1), 239–255.
Green, J. R., Moore, C. A., & Reilly, K. J. (2002). The sequential development of jaw and lip control for speech. Journal of Speech, Language, and Hearing Research, 45(1), 66–79. https://doi.org/10.1044/1092-4388(2002/005)
Grigos, M. I., & Case, J. (2018). Changes in movement transitions across a practice period in childhood apraxia of speech. Clinical Linguistics & Phonetics, 32(7), 661–687.
Grigos, M. I., Moss, A., & Lu, Y. (2015). Oral articulatory control in childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 58(4), 1103–1118.
Hagedorn, C., Kim, J., Sinha, U., Goldstein, L., & Narayanan, S. S. (2021). Complexity of vocal tract shaping in glossectomy patients and typical speakers: A principal component analysis. The Journal of the Acoustical Society of America, 149(6), 4437–4449.
Hagedorn, C., Lu, Y., Toutios, A., Sinha, U., Goldstein, L., & Narayanan, S. (2022). Variation in compensatory strategies as a function of target constriction degree in post-glossectomy speech. JASA Express Letters, 2(4), 045205.
Hagedorn, C., Proctor, M., Goldstein, L., Wilson, S. M., Miller, B., Gorno-Tempini, M. L., & Narayanan, S. S. (2017). Characterizing articulation in apraxic speech using real-time magnetic resonance imaging. Journal of Speech, Language, and Hearing Research, 60(4), 877–891.
Haken, H., Kelso, J. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51(5), 347–356.
Hardcastle, W. J., Morgan Barry, R., & Clark, C. (1985). Articulatory and voicing characteristics of adult dysarthric and verbal dyspraxic speakers: An instrumental study. British Journal of Disorders of Communication, 20(3), 249–270.
Hirose, H., Kiritani, S., Ushijima, T., Yoshioka, H., & Sawashima, M. (1981). Patterns of dysarthric movements in patients with Parkinsonism. Folia Phoniatrica et Logopaedica, 33(4), 204–215.
Hoole, P., Schröter-Morasch, H., & Ziegler, W. (1997). Patterns of laryngeal apraxia in two patients with Broca’s aphasia. Clinical Linguistics & Phonetics, 11(6), 429–442.
Houde, J. F., Gill, J. S., Agnew, Z., Kothare, H., Hickok, G., Parrell, B., Ivry, R. B., & Nagarajan, S. S. (2019). Abnormally increased vocal responses to pitch feedback perturbations in patients with cerebellar degeneration. The Journal of the Acoustical Society of America, 145(5), EL372–EL378.
Itoh, M., Sasanuma, S., Hirose, H., Yoshioka, H., & Ushijima, T. (1980). Abnormal articulatory dynamics in a patient with apraxia of speech: X-ray microbeam observation. Brain and Language, 11(1), 66–75.
Itoh, M., Sasanuma, S., Tatsumi, I. F., Murakami, S., Fukusako, Y., & Suzuki, T. (1982). Voice onset time characteristics in apraxia of speech. Brain and Language, 17(2), 193–210.
Itoh, M., Sasanuma, S., & Ushijima, T. (1979). Velar movements during speech in a patient with apraxia of speech. Brain and Language, 7(2), 227–239.
Kim, D., Kuruvilla-Dugdale, M., de Riesthal, M., Jones, R., Bagnato, F., & Mefferd, A. (2021). Articulatory correlates of stress pattern disturbances in talkers with dysarthria. Journal of Speech, Language, and Hearing Research, 64(6S), 2287–2300.
Kuruvilla, M. S., Green, J. R., Yunusova, Y., & Hanford, K. (2012). Spatiotemporal coupling of the tongue in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 55(6), 1897–1909. https://doi.org/10.1044/1092-4388(2012/11-0259)
Lin, S., & Demuth, K. (2015). Children’s acquisition of English onset and coda /l/: Articulatory evidence. Journal of Speech, Language, and Hearing Research, 58(1), 13–27.
Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, A. J., Spitzer, S. M., & Caviness, J. N. (2009). Quantifying speech rhythm abnormalities in the dysarthrias. Journal of Speech, Language, and Hearing Research, 52(5), 1334–1352. https://doi.org/10.1044/1092-4388(2009/08-0208)
Löfqvist, A., & Gracco, V. L. (1999). Interarticulator programming in VCV sequences: Lip and tongue movements. The Journal of the Acoustical Society of America, 105(3), 1864–1876.
MacNeilage, P. F. (1970). Motor control of serial ordering of speech. Psychological Review, 77(3), 182.
Marin, S., & Pouplier, M. (2010). Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control, 14(3), 380–407.
McAllister Byun, T., Buchwald, A., & Mizoguchi, A. (2016). Covert contrast in velar fronting: An acoustic and ultrasound study. Clinical Linguistics & Phonetics, 30(3–5), 249–276.
McLeod, S., & Baker, E. (2017). Children’s speech: An evidence-based approach to assessment and intervention. Pearson.
McNeil, M. (2000). Apraxia of speech: A treatable disorder of motor planning and programming. In S. E. Nadeau, B. A. Crosson, & L. Gonzalez-Rothi (Eds.), Aphasia and language: Theory to practice (1st ed., pp. 221–266). The Guilford Press.
McNeil, M. R., & Adams, S. (1991). A comparison of speech kinematics among apraxic, conduction aphasic, ataxic dysarthric, and normal geriatric speakers. Clinical Aphasiology, 19, 279–294.
McNeil, M. R., Caligiuri, M., & Rosenbek, J. C. (1989). A comparison of labiomandibular kinematic durations, displacements, velocities, and dysmetrias in apraxic and normal adults. Clinical Aphasiology, 18, 173–193.
McNeil, M. R., Robin, D. A., & Schmidt, R. A. (1997). Apraxia of speech: Definition, differentiation, and treatment. In M. R. McNeil (Ed.), Clinical management of sensorimotor speech disorders (pp. 311–344).
McNeil, M. R., Robin, D. A., & Schmidt, R. A. (2009). Apraxia of speech: Definition and differential diagnosis. In M. R. McNeil (Ed.), Clinical management of sensorimotor speech disorders (2nd ed., pp. 249–267).
Mefferd, A. S., & Dietrich, M. S. (2019). Tongue- and jaw-specific articulatory underpinnings of reduced and enhanced acoustic vowel contrast in talkers with Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 62(7), 2118–2132.
Mefferd, A. S., Green, J. R., & Pattee, G. (2012). A novel fixed-target task to determine articulatory speed constraints in persons with amyotrophic lateral sclerosis. Journal of Communication Disorders, 45(1), 35–45.
Mefferd, A. S., Lai, A., & Bagnato, F. (2019). A first investigation of tongue, lip, and jaw movements in persons with dysarthria due to multiple sclerosis. Multiple Sclerosis and Related Disorders, 27, 188–194.
Moss, A., & Grigos, M. I. (2012). Interarticulatory coordination of the lips and jaw in childhood apraxia of speech. Journal of Medical Speech-Language Pathology, 20(4), 127.
Munson, B., Bjorum, E. M., & Windsor, J. (2003). Acoustic and perceptual correlates of stress in nonwords produced by children with suspected developmental apraxia of speech and children with phonological disorder. Journal of Speech, Language, and Hearing Research, 46(1), 189–202. https://doi.org/10.1044/1092-4388(2003/015)
Nam, H. (2007). A gestural coupling model of syllable structure. Yale University.
Nam, H., Goldstein, L., & Saltzman, E. (2009). Self-organization of syllable structure: A coupled oscillator model. In Approaches to phonological complexity (pp. 297–328). De Gruyter Mouton. https://doi.org/10.1515/9783110223958.297
Nam, H., & Saltzman, E. (2003). A competitive, coupled oscillator model of syllable structure. In M.-J. Solé (Ed.), Proceedings of the 15th International Congress of Phonetic Sciences (Vol. 1, pp. 2253–2256). ICPhS Organizing Committee.
Namasivayam, A. K., Coleman, D., O’Dwyer, A., & van Lieshout, P. (2020). Speech sound disorders in children: An articulatory phonology perspective. Frontiers in Psychology, 10, 2998. https://doi.org/10.3389/fpsyg.2019.02998
Narayanan, S., & Alwan, A. (2000). Noise source models for fricative consonants. IEEE Transactions on Speech and Audio Processing, 8(3), 328–344.
Narayanan, S. S., Alwan, A. A., & Haker, K. (1995). An articulatory study of fricative consonants using magnetic resonance imaging. The Journal of the Acoustical Society of America, 98(3), 1325–1347.
Nijland, L., Maassen, B., van der Meulen, S., Gabreëls, F., Kraaimaat, F. W., & Schreuder, R. (2002). Coarticulation patterns in children with developmental apraxia of speech. Clinical Linguistics & Phonetics, 16(6), 461–483.
Niziolek, C. A., & Parrell, B. (2021). Responses to auditory feedback manipulations in speech may be affected by previous exposure to auditory errors. Journal of Speech, Language, and Hearing Research, 64(6S), 2169–2181.
Ogar, J., Slama, H., Dronkers, N., Amici, S., & Gorno-Tempini, M. L. (2005). Apraxia of speech: An overview. Neurocase, 11(6), 427–432.
Pouplier, M., & Hardcastle, W. (2005). A re-evaluation of the nature of speech errors in normal and disordered speakers. Phonetica, 62(2–4), 227–243.
Preston, J. L., Hitchcock, E. R., & Leece, M. C. (2020). Auditory perception and ultrasound biofeedback treatment outcomes for children with residual /ɹ/ distortions: A randomized controlled trial. Journal of Speech, Language, and Hearing Research, 63(2), 444–455.
Preston, J. L., Hull, M., & Edwards, M. L. (2013). Preschool speech error patterns predict articulation and phonological awareness outcomes in children with histories of speech sound disorders. American Journal of Speech-Language Pathology, 22(2), 173–184. https://doi.org/10.1044/1058-0360(2012/12-0022)
Proctor, M. I., Shadle, C. H., & Iskarous, K. (2010). Pharyngeal articulation in the production of voiced and voiceless fricatives. The Journal of the Acoustical Society of America, 127(3), 1507–1518.
Richardson, J. D., Fillmore, P., Rorden, C., LaPointe, L. L., & Fridriksson, J. (2012). Re-establishing Broca’s initial findings. Brain and Language, 123(2), 125–130.
Rong, P., & Green, J. R. (2019). Predicting speech intelligibility based on spatial tongue–jaw coupling in persons with amyotrophic lateral sclerosis: The impact of tongue weakness and jaw adaptation. Journal of Speech, Language, and Hearing Research, 62(8S), 3085–3103.
Saltzman, E. (1986). Task dynamic coordination of the speech articulators: A preliminary model. In H. Heuer & C. Fromm (Eds.), Generation and modulation of action patterns (pp. 129–144). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-71476-4_10
Saltzman, E., & Byrd, D. (2000). Task-dynamics of gestural timing: Phase windows and multifrequency rhythms. Human Movement Science, 19(4), 499–526.
Saltzman, E. L., & Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1(4), 333–382. https://doi.org/10.1207/s15326969eco0104_2
Sangtian, S., Wang, Y., Fridriksson, J., & Behroozmand, R. (2021). Impairment of speech auditory feedback error detection and motor correction in post-stroke aphasia. Journal of Communication Disorders, 94, 106163. https://doi.org/10.1016/j.jcomdis.2021.106163
Scobbie, J., Gibbon, F., Hardcastle, W., & Fletcher, P. (1996). Covert contrast as a stage in the acquisition of phonetics and phonology. QMC Working Papers in Speech and Language Sciences, 1, 43–62.
Shadle, C. H., Tiede, M., Masaki, S., Shimada, Y., & Fujimoto, I. (1996). An MRI study of the effects of vowel context on fricatives. Proceedings – Institute of Acoustics, 18(5), 187–194.
Shriberg, L. D., Strand, E. A., Fourakis, M., Jakielski, K. J., Hall, S. D., Karlsson, H. B., Mabie, H. L., McSweeny, J. L., Tilkens, C. M., & Wilson, D. L. (2017). A diagnostic marker to discriminate childhood apraxia of speech from speech delay: I. Development and description of the pause marker. Journal of Speech, Language, and Hearing Research, 60(4), S1096–S1117.
Silva, A. B., Liu, J. R., Zhao, L., Levy, D. F., Scott, T. L., & Chang, E. F. (2022). A neurosurgical functional dissection of the middle precentral gyrus during speech production. The Journal of Neuroscience, 42(45), 8416–8426. https://doi.org/10.1523/JNEUROSCI.1614-22.2022
Stevens, K. N. (1971). Airflow and turbulence noise for fricative and stop consonants: Static considerations. The Journal of the Acoustical Society of America, 50(4B), 1180–1192.
Stevens, K. N. (1972). The quantal nature of speech: Evidence from articulatory-acoustic data. In Human communication: A unified view (pp. 51–66).
Strand, E. A., McCauley, R. J., Weigand, S. D., Stoeckel, R. E., & Baas, B. S. (2013). A motor speech assessment for children with severe speech disorders: Reliability and validity evidence. Journal of Speech, Language, and Hearing Research, 56(2), 505–520. https://doi.org/10.1044/1092-4388(2012/12-0094)
Studdert-Kennedy, M., & Goldstein, L. (2003). Launching language: The gestural origin of discrete infinity. Studies in the Evolution of Language, 3, 235–254.
Sugishita, M., Konno, K., Kabe, S., Yunoki, K., Togashi, O., & Kawamura, M. (1987). Electropalatographic analysis of apraxia of speech in a left hander and in a right hander. Brain, 110(5), 1393–1417.
Surprenant, A. M., & Goldstein, L. (1998). The perception of speech gestures. The Journal of the Acoustical Society of America, 104(1), 518–529.
Terband, H., Maassen, B., van Lieshout, P., & Nijland, L. (2011). Stability and composition of functional synergies for speech movements in children with developmental speech disorders. Journal of Communication Disorders, 44(1), 59–74.
Terband, H., Van Brenk, F., Henriques, R. N., van Lieshout, P., Maassen, B., & Lowit, A. (2012). Speech rate strategies in younger and older adults. Motor Speech Conference.
Tilsen, S. (2016). Selection and coordination: The articulatory basis for the emergence of phonological structure. Journal of Phonetics, 55, 53–77.
Tilsen, S. (2019a). Space and time in models of speech rhythm. Annals of the New York Academy of Sciences, 1453(1), 47–66.
Tilsen, S. (2019b). Motoric mechanisms for the emergence of non-local phonological patterns. Frontiers in Psychology, 10, 2143.
Tourville, J. A., & Guenther, F. H. (2011). The DIVA model: A neural theory of speech acquisition and production. Language and Cognitive Processes, 26(7), 952–981.
Turvey, M. T. (1977). Preliminaries to a theory of action with reference to vision. In R. Shaw & J. Bransford (Eds.), Perceiving, acting and knowing (pp. 211–265). Lawrence Erlbaum Associates.
van Lieshout, P., Merrick, G., & Goldstein, L. (2008). An articulatory phonology perspective on rhotic articulation problems: A descriptive case study. Asia Pacific Journal of Speech, Language and Hearing, 11(4), 283–303.
van Lieshout, P. H., Bose, A., Square, P. A., & Steele, C. M. (2007). Speech motor control in fluent and dysfluent speech production of an individual with apraxia of speech and Broca’s aphasia. Clinical Linguistics & Phonetics, 21(3), 159–188.
Vihman, M. M., & Ferguson, C. A. (1987). The acquisition of final consonants. In Proceedings of the Eleventh International Congress of Phonetic Sciences (Vol. 1). Tallinn, Estonia: Academy of Sciences of the Estonian S.S.R.
guidelines for acquired apraxia of speech: A synthesis and evaluation of the evidence. Journal of Medical Speech-Language Pathology, 14(2), xv–xv. Whalen, D. H., Dawson, K. M., Carl, M., & Iskarous, K. (2014). Tongue shape complexity for liquids in Parkinsonian speech. The Journal of the Acoustical Society of America, 135(4), 2389–2389. Ying, J., Shaw, J. A., Carignan, C., Proctor, M., Derrick, D., & Best, C. T. (2021). Evidence for active control of tongue lateralization in Australian English/l. Journal of Phonetics, 86, 101039. Yunusova, Y., Weismer, G., Westbury, J. R., & Lindstrom, M. J. (2008, June). Articulatory movements during vowels in speakers with dysarthria and healthy controls. The Journal of Speech, Language, and Hearing Research, 51(3), 596–611. https://doi.org/10.1044/10924388(2008/043). PMID: 18506038. Zhang, Z., Boyce, S., Espy-Wilson, C., & Tiede, M. (2003, August). Acoustic strategies for production of American English “retroflex”/r. In Proceedings of the 15th International Congress of Phonetic Sciences (pp. 1125–1128). Universitat Autònoma, Barcelona, Spain. Ziegler, W. (2008). Apraxia of speech. Handbook of Clinical Neurology, 88, 269–285. Ziegler, W., Aichert, I., & Staiger, A. (2012). Apraxia of speech: Concepts and controversies. The Journal of Speech, Language, and Hearing Research, 55(5), S1485–S1501. https://doi.org/10.1044/1092-4388(2012/120128). PMID: 23033443. Ziegler, W., Lehner, K., Pfab, J., & Aichert, I. (2021). The nonlinear gestural model of speech apraxia: Clinical implications and applications. Aphasiology, 35(4), 462–484. Ziegler, W., & Von Cramon, D. (1985). Anticipatory coarticulation in a patient with apraxia of speech. Brain and Language, 26(1), 117–130.
25 Government Phonology and Speech Impairment
MARTIN J. BALL AND BEN RUTTER
25.1 Introduction
Government Phonology (Harris, 1990, 1994; Harris & Lindsey, 1995; Kaye et al., 1985, 1990) can be seen to some extent as a development of Dependency Phonology (Anderson & Durand, 1986, 1987; Anderson & Ewen, 1987), with the aim of constraining the generative power of this latter approach. Government Phonology (henceforth GovP) is seen by its proponents to be within the generative tradition (and thus part of "universal grammar"), and it shares with other approaches the insights and developments of feature geometry, autosegmental, and metrical phonology. In particular, we should note that the theory distinguishes a skeletal tier which contains the terminal nodes of syllabic constituents (termed "constituency") from a segmental one (termed "melody"), and that the equivalent of features ("elements") are thought to operate on a set of tiers as well. However, despite these similarities, there are major differences – in particular, as the name suggests, the idea of governing or licensing relations between units. Recent accounts of the theory are available in Scheer and Cyran (2017a, 2017b) and Scheer and Kula (2017). Jaskuła's (2021) study of English loanwords in a variety of Irish also provides a description of the theory. In this chapter, for reasons of space, we will concentrate on pointing out where GovP differs from traditional models of generative phonology, and then turn our attention to how GovP can inform our descriptions of disordered speech.
25.2 Constituency
We noted above that this area of GovP concerns the syllabic tier in traditional parlance. However, we should note that the theory does not, in actual fact, recognize syllables as constituents (although it does as a licensing relation). Harris (1994, p. 45) notes that the notion of the syllable has "no pre-theoretical standing," and Kula (2002, p. 23) states, "there is no notion of syllable as understood in the traditional sense; rather, phonological units are regarded as consisting of sequences of Onset–Nuclear (ON) pairs." The concept of the "phonological word" is used, however, and is deemed to consist of feet, which in turn consist of the units O (onset), N (nucleus) and rime (all of which may potentially be binary-branching, depending on the language concerned). These sequences are located on the tier P0, which dominates the timing tier (or skeleton) traditionally represented by timing slots x-x-x etc. Kula (2002, p. 23) notes that the skeleton links segmental information to the constituency level, and "the government and licensing relations that hold between them."
While the distinction between rime and nucleus (found in traditional accounts of the syllable) is retained, branching rimes do not contain a traditional coda unit, and in GovP the coda is not an accepted unit. Syllable-final singleton consonants are always considered to be onsets followed by empty nuclei (see Harris, 1994, for arguments in favor of this viewpoint; theory-independent, empirical reasons for considering final Cs to be onsets are found in Harris & Gussmann, 2002). We can show this in the following examples, where, traditionally, the empty nucleus is normally omitted from the diagram utterance-finally:

(1) [Tree diagrams: onset–rime–nucleus structures over skeletal x-slots, with word-final consonants represented as onsets followed by empty nuclei.]
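For readers who find a computational rendering helpful, this analysis can be sketched as a small program. The sketch below is purely illustrative – the consonant inventory, the function name, and the decision to ignore branching constituents are our own simplifications, not part of the GovP formalism:

```python
# Illustrative sketch only: group a flat segment string into Onset-Nucleus
# (ON) pairs, GovP-style, inserting empty nuclei (None) after consonants
# that are not followed by a vowel. Branching constituents are ignored.

CONSONANTS = set("ptkbdgmnfvszlrwjh")

def parse_on_pairs(segments):
    """Group segments into (onset, nucleus) pairs; a nucleus slot is always
    present, though it may be melodically empty (None)."""
    pairs, i = [], 0
    while i < len(segments):
        onset = segments[i] if segments[i] in CONSONANTS else None
        if onset is not None:
            i += 1
        if i < len(segments) and segments[i] not in CONSONANTS:
            nucleus = segments[i]   # a filled nucleus licenses the onset
            i += 1
        else:
            nucleus = None          # an empty nucleus licenses the onset
        pairs.append((onset, nucleus))
    return pairs

print(parse_on_pairs("kat"))   # 'kat' stands in for cat /kaet/:
                               # [('k', 'a'), ('t', None)] - final C = onset + empty N
print(parse_on_pairs("stop"))  # [('s', None), ('t', 'o'), ('p', None)]
```

Note that the same parse yields the treatment Harris (1994) suggests for /s/-initial clusters: the /s/ surfaces as an onset paired with an empty nucleus rather than as part of a branching onset.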
Branching of the nucleus denotes diphthongs and long vowels, while branching of the onset is used to mark complex onsets. This applies only to non-/s/-initial two-consonant clusters; with /s/-initial two- or three-consonant clusters the initial /s/ is deemed to be external to the branching onset. Harris (1994) suggests this /s/ may be deemed to be an onset with an empty nucleus. Word-final and word-medial consonant clusters are dealt with by allowing branching rimes. In these cases, the right-hand branch of the rime may contain a consonant, subject to certain restrictions in the case of heavy nuclei (long vowels or diphthongs): consonants can only be fricatives or sonorants, sonorants agree with the place of the following consonant, and the favored such place is coronal (see Harris, 1994, p. 77). These restrictions do not hold on light nuclei. This is illustrated in the following:

(2) [Tree diagrams: a branching nucleus, a branching onset, and a branching rime whose right-hand slot hosts a post-nuclear consonant.]
Clearly, English, for example, allows more complex final consonant clusters, and final two-consonant clusters that do not meet the conditions on branching rimes noted above. In these cases, empty nuclei are posited between the consonants (we have also removed the rime unit from this diagram, as a branching rime plays no part in this example):

(3) [Tree diagram: a word of the shape /liːpt/ (e.g., leaped), with the long vowel occupying a branching nucleus and /p/ and /t/ each analyzed as an onset followed by an empty nucleus.]
Final three- and four-consonant clusters are accounted for by combinations of branching rimes and empty nuclei, depending on which consonants are involved.
This approach to phonology is termed government phonology because its units enter into governing and licensing relations with each other. While this is perhaps most obvious on the melodic tier (see below), there are also relations between the various units we have been considering at the constituency level. We can look at some of the more important of these here. For example, the onset, rime and nucleus constituents are subject to the following general principles:

(4) Every nucleus can and must license a preceding onset.
(5) Every onset must be licensed by a following nucleus.

Furthermore, it is required that

(6) Every constituent licenser must dominate a skeletal point.

Given the above, we can derive the following principle:

(7) Every nucleus must dominate a skeletal point.

Related to these principles is the principle concerning codas, discussed above:

(8) Coda-licensing principle: Post-nuclear rhymal positions must be licensed by a following onset.

These principles are concerned with the three main units of constituent structure. We may also consider principles concerned with location and direction of government between them. Kula (2002, p. 25) notes that government is subject to the following conditions:

(9) Conditions on government
a. strict locality: only adjacent positions can constitute a government relation
b. strict directionality: constituent government goes from left to right and interconstituent government goes from right to left.

Further, she notes a proposal that all governing relations be licensed by a following nucleus. (9) constrains government relations at the constituency tier. So, for example, within a constituent, government is left-headed (the first element of an onset cluster is the head of the relation), whereas between constituents, government is right-headed (a nucleus governs its preceding onset). Finally, all governing relations are subject to the projection principle:

(10) Projection principle (Kaye, 1990, p. 321): Governing relations are defined at the level of lexical representation and remain constant throughout derivation.

The projection principle implies that "constituent categories may not be altered during the course of derivation; onsets remain onsets, nuclei remain nuclei and the licensing relation between nuclei and onsets remains stable" (Kula, 2002, p. 26). We have not had the space to examine in detail structures such as the phonological word and foot, or analyses of prosodic features such as tone. Readers should consult Harris (1994) and Kula (2002) for more information on these areas. We can also note that this summary excludes recent moves toward a more constrained theory, beginning with van der Hulst's work on Radical CV Phonology (1989, 2020).
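Principle (8) in particular lends itself to a brief computational illustration. The sketch below is our own encoding – the (constituent, melody) pairs and the label 'C' for a post-nuclear rhymal position are not GovP notation – and simply checks that every such position is followed by an onset:

```python
# Illustrative sketch of the coda-licensing principle (8): a post-nuclear
# rhymal position ('C') is well formed only if an onset ('O') follows it.

def coda_licensed(positions):
    for i, (constituent, _) in enumerate(positions):
        if constituent == "C":
            if i + 1 >= len(positions) or positions[i + 1][0] != "O":
                return False   # unlicensed rhymal position
    return True

# 'wind': the rhymal /n/ is licensed by the following onset /d/,
# which is itself followed by a word-final empty nucleus.
wind = [("O", "w"), ("N", "i"), ("C", "n"), ("O", "d"), ("N", None)]
print(coda_licensed(wind))  # True

# A word-final 'coda' with no following onset violates (8).
bad = [("O", "k"), ("N", "a"), ("C", "t")]
print(coda_licensed(bad))   # False
```

On this encoding, a true word-final "coda" can never be licensed, which is precisely why GovP reanalyzes final consonants as onsets to empty nuclei.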
25.3 Melody
The main difference at the segmental level between GovP and traditional generative approaches concerns the nature of the smallest phonological unit. The binary feature (that was claimed to be equipollent) was the smallest unit in generative phonology as outlined by Chomsky and Halle (1968) and, until comparatively recently, most theoretical developments within this tradition maintained binary features. In work within feature geometry (Clements, 1985; Halle, 1992; Sagey, 1986; see review in Roca & Johnson, 1999), strict binarity was relaxed to allow nodes with privative, unary features, mixed with binary equipollent ones.
25.3.1 Features
Phonological theories have for a long time posited the need for a phonological unit smaller than the segment. Such units are required if we wish to make statements about classes of sounds (all those that share a particular phonological aspect), or to describe rules that affect particular sounds or groups of sounds. Since the time of Trubetzkoy (e.g., 1969, originally published 1939), this unit has been the distinctive feature. However, while Trubetzkoy and the Prague School of linguistics considered a range of feature typologies, later work on distinctive features (e.g., Chomsky & Halle, 1968; Jakobson et al., 1952; Jakobson & Halle, 1956) opted for one particular type: binary, equipollent features. Equipollent features are those where each value (in this case a binary +/– set of values) specifies a particular property. For example, the commonly encountered feature [voice] (as proposed, for example, in Chomsky & Halle, 1968) has a plus value denoting vocal-fold vibration, and a minus value denoting open vocal folds (i.e., not simply "not vibrating vocal folds"). Privative features, on the other hand, have a distinction between one value denoting a particular property, and another denoting simply the absence of that property. (In fact, whereas in Chomsky and Halle's 1968 formulation of distinctive features the authors claim that their features are equipollent, some do not appear to fit within this classification. For example, the feature [high] has a plus value denoting high tongue position, and a minus value denoting not high (mid or low); this would appear to be a privative distinction.) Feature theory has developed considerably since Chomsky and Halle's groundbreaking work of 1968. We have seen work on the relationship between features (markedness and feature geometry: Clements, 1985; Halle, 1992; Roca & Johnson, 1999; Sagey, 1986), and on economy in segmental feature matrices (underspecification: Archangeli, 1988; Clements, 1988; Steriade, 1987). Some of the work in feature geometry has suggested that some nodes on a feature tree may be best described with privative, unary features rather than with binary equipollent ones (Roca & Johnson, 1999). It is to this notion we turn next.
25.3.2 Elements
Dependency Phonology (Anderson & Durand, 1986, 1987; Anderson & Ewen, 1987), related approaches such as Radical CV Phonology (van der Hulst, 1989), and Government Phonology have taken the developments in feature geometry just noted one step further, and have adopted elements rather than features as the basic unit of phonological analysis. These elements are unary (i.e., they are either present or absent in a description, and so also privative), and they are phonetically interpretable in isolation. The main advantage of unary elements is that their use constrains the phonology (see, e.g., van der Hulst, 2016). Binary features allow a large number of segment classes to be established (those sharing the plus value and those sharing the minus value of a feature); unary elements only allow a class of segments that have that element, not one that does not have it. Harris (1994) also sees unary
accounts as a means of reducing the range of phonological processes available to the theory to those that are observed in natural language, thus obviating the need for theoretical add-ons such as markedness conventions. The advantages claimed for phonetic interpretability of elements include freedom from the need to map non-interpretable distinctive features onto phonetic features late in a derivation, and the fact that we do not need underspecification (or to decide between different models of underspecification). Using phonetically interpretable elements results in all levels of derivation containing segments that are also phonetically interpretable. Harris (1994, p. 96) claims that this approach is arguably more psycholinguistically plausible than traditional ones, and that:

Since phonological representation uniformly adheres to the principle of full phonetic interpretability, there is no motivation for recognizing an autonomous level of systematic phonetic representation. Any phonological representation at any level of derivation can be directly submitted to articulatory or perceptual interpretation. Derivation is thus not an operation by means of which abstract phonological objects are transformed into increasingly concrete physical objects. Rather it is a strictly generative function which defines the grammaticality of phonological strings.
The appeal to psycholinguistically plausible models of phonology has echoes in recent work within what may be broadly termed cognitive models of linguistics; see, for example, Sosa and Bybee (Chapter 26 in this volume). From the point of view of clinical phonology, it might well be more insightful to posit phonetically interpretable phonological elements, rather than uninterpretable binary distinctive features.
25.3.3 Vowel Elements
In GovP the phonological primes are termed elements, and three elements are proposed for vowels. These, with their pronunciations, are:

(11) A [a]
I [i]
U [u]

A fourth symbol, @, is also used, but represents a default tongue position, or the carrier signal on which the modulations represented by elements are superimposed (Harris, 2005; Harris & Lindsey, 1995). We noted earlier that, as its name suggests, GovP uses governing relations between its units of description, and this is no less true of the melodic tier than of the constituency one. The combination of elements is regulated by the concept of Licensing Constraints. These constraints provide restrictions on the combinations of elements so that it is possible to derive the set of phonological representations that capture all and only those sound segments relevant to a particular language (Kula, 2002, p. 27). So, combinations of elements provide a wider vowel set, and in combinations one element is normally considered to be the head (or governor), and others are usually dependent on the head. In GovP formalism, the head element is shown underlined; where no element is underlined then the elements are in a non-governing relationship. English lax vowels illustrate these possibilities:

(12) [I, @] /ɪ/
[A, I, @] /ε/
[I, A] /æ/
[U, A] /ɒ/
[U, @] /ʊ/
[@] /ʌ/
[A, @] [ɐ]
These combinations illustrate the use of the neutral element [@] as governor of vowels we traditionally term lax. Long vowels, like diphthongs, are deemed to occupy two skeletal slots (as described earlier). Typical examples from English are seen in:

(13) [Diagrams: branching nuclei occupying two x-slots, illustrating the long vowels /iː/ and /ɔː/ and the diphthongs /aɪ/ and /ɔɪ/ in terms of the elements I, A, and U.]
The layering of the elements in these diagrams reflects the contention that these elements (and the consonant elements of the following subsection) can be thought of as operating on separate tiers.
25.3.4 Consonant Elements
The following are the elements most often used to characterize consonants, together with their phonetic exponence and description in more traditional phonological terms. It can be noted that the different exponence of A, I, and U results from their no longer being dominated by a nucleus node in word structure.

(14) ? [ʔ] stop or edge
h [h] noise or aperiodic energy on release
R [ɾ] coronality
I [j] palatality
U [w] labiality
@ [ɰ] neutral
A [ʁ] present in uvulars and pharyngeals
N [ŋ] nasality
There are two further, laryngeal-node, elements used mainly to distinguish voiced from voiceless consonants: [H] stiff vocal folds, aspiration, voicelessness, and [L] slack vocal folds, voicing. In the following examples we include only voiced sonorants and voiceless obstruents, so have no need of these last two elements. Researchers in GovP have sought ways to constrain the theory through the reduction of consonant elements from this original set to seven or five (see Ritter, 1996, as an example). Such reductions have included the removal of the [N] element and its replacement by a combination of [L] governing [?]. For our purposes, we retain a maximal set of elements. Illustrations of both place and manner distinctions in consonants can be seen in the following:
(15) [h, U, ?] [p]
[h, R, ?] [t]
[h, @, ?] [k]
[h, U] [f]
[h, R] [s]
[h, R] [θ]
[h, R, I] [ʃ]
[h, @] [x]
[h, A] [χ]
[h, A] [ħ]
[N, R, ?] [n]
[R, ?] [l]
[R, @] [ɹ]

(Pairs that share the same element set here, such as [s] and [θ] or [χ] and [ħ], are distinguished by which element is the head, marked by underlining in the original formalism.)
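A small computational sketch makes the constraining effect of unary elements concrete: a natural class is simply the set of segments whose representations contain a given element, and no complementary "[-element]" class can be stated. The element sets below are transcribed from (15) (head marking omitted); the encoding and the helper function are our own:

```python
# Element sets from (15); a natural class is the set of segments whose
# representation contains the given element(s). No '[-element]' class exists.

ELEMENTS = {
    "p": {"h", "U", "?"}, "t": {"h", "R", "?"}, "k": {"h", "@", "?"},
    "f": {"h", "U"},      "s": {"h", "R"},      "n": {"N", "R", "?"},
    "l": {"R", "?"},      "r": {"R", "@"},
}

def natural_class(*elements):
    """Segments containing every listed element."""
    wanted = set(elements)
    return sorted(seg for seg, els in ELEMENTS.items() if wanted <= els)

print(natural_class("R"))       # coronals: ['l', 'n', 'r', 's', 't']
print(natural_class("h", "?"))  # oral stops: ['k', 'p', 't']
```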
25.3.5 Element Geometry
We have referred to feature geometry above; those working with GovP have proposed element geometries for similar reasons. As Kula (2002, p. 30) notes,

Feature geometries have … been proposed in order to not only classify natural classes, but also to exclude unnatural ones. The GP view that elements are directly linked to the skeleton implies that they are individually accessible to phonological processing. True as this is, it has also been observed that particular phonological processes do indeed access more than one element at the same time and thus make it necessary for us to conceive of some geometric organisation of elements.
In other words, element geometries allow us to constrain the possible combinations of elements that can be accessed in phonological processes, in a way complementary to that in which licensing constraints restrict the possible combination of elements within the description of a single segment; both are language-specific. While various possible element geometries have been proposed in the literature, we can illustrate the concept with an element tree combined from proposals in Harris (1994) and in Harris and Lindsey (1995):

(16) [Tree diagram: a skeletal point x dominates a Root node; below the Root sit a Laryngeal node (dominating L and H) and a Resonance node (dominating A, I, and U), together with the elements N, h, ?, and R.]
25.4 Government Phonology in Derivation
The mechanisms we have looked at so far are used to describe a range of phonological processes in natural language. We have space here to consider just a couple of examples (both taken from Harris, 1994). First, we can consider vowel syncope. In fast, casual speech, unstressed vowels in certain words are subject to deletion. Examples include separate (/’sɛpəɹət/ vs. /’sɛpɹət/); camera (/’kæməɹə/ vs. /’kæmɹə/); opener (/’oʊpənə/ vs.
/’oʊpnə/); and definite (/’dɛfɪnət/ vs. /’dɛfnət/). The removal of the unstressed vowel might be thought to result in a resyllabification process. Considering definite, we could propose that, in the reduced form, the /f/ could be treated as post-nuclear in a branching rime. However, when we examine the example of opener, this solution is not open to us, as /p/ belongs to the class of stops that are not permitted in this position after a heavy nucleus. The solution best fitting the constituent-structure constraints of GovP for separate would be to treat /pɹ/ as a complex onset to the second syllable, as follows:

(17) [Tree diagram: separate with /pɹ/ as a branching onset to the second syllable after syncope of the unstressed vowel.]
However, in some instances such a strategy will produce onset clusters that are not otherwise found in the language (e.g., /pn/ in /’oʊpnə/, /mɹ/ in /’kæmɹə/, and /fn/ in /’dɛfnət/), or even ones that break the sonority sequencing principle (e.g., /nt/ in /’mɒntɹɪŋ/ monitoring). Harris (1994), therefore, argues that a better-motivated solution is to assume that the N slot for the deleted vowel remains in structure at the skeletal tier, but is phonetically empty; that is, there is no resyllabification, just the phonetic interpretation or non-interpretation of stable syllabic positions. This would give us, for the example separate, the following:

(18) [Tree diagram: separate with the syncopated vowel's N slot retained at the skeletal tier but left phonetically empty, so that no resyllabification takes place.]
At the melodic level we can consider a commonly occurring example of lenition. In many instances historically original /s/ in a language has weakened to [h] or even been deleted. This is a current change in Cuban Spanish and can also be seen in classical Greek as compared to its reconstructed ancestor language. We also know of many instances of /h/ deletion: in modern English dialects, in fast speech in English with /h/-initial function words, in several Romance languages historically. Lenition of /t/ has also been commonly reported
and, although this is more often seen as a change to [θ] (as in Welsh aspirate mutation), a change to [ts] or [s] may also be found (as in Merseyside English, and the German sound shift producing [ts] from earlier [t]). If we put all these lenitions together, we see that GovP provides in its combinations of elements an explanation of these changes through a gradual elimination of melodic material until an empty slot is obtained:

(19) [Diagram: the weakening chain t > s > h > zero as the progressive stripping of elements from an onset x-slot: [h, R, ?] > [h, R] > [h] > empty (x).]
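Rendered computationally, the chain in (19) is nothing more than successive set subtraction. The sketch below – our own illustration, including the order in which elements are stripped – makes this explicit:

```python
# Illustrative sketch: the lenition trajectory t > s > h > zero in (19),
# modeled as stepwise removal of elements from an onset slot.

STAGES = {
    frozenset({"h", "R", "?"}): "t",
    frozenset({"h", "R"}):      "s",
    frozenset({"h"}):           "h",
    frozenset():                "(empty slot)",
}

def lenite(elements, order=("?", "R", "h")):
    current = set(elements)
    chain = [STAGES[frozenset(current)]]
    for el in order:            # strip melodic material one element at a time
        current.discard(el)
        chain.append(STAGES[frozenset(current)])
    return chain

print(" > ".join(lenite({"h", "R", "?"})))  # t > s > h > (empty slot)
```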
25.5 Government Phonology and Disordered Speech
This model has been applied to disordered speech (see the work of Harris et al., 1999, and Ball, 2002, and the review in Ball et al., 2009) and to normal phonological acquisition (Ball, 1996; Harrison, 1996). More recent studies include Pathi and Mondal (2021) on the mental representation of sound in speech sound disorders, Prince and Ferré (2020) on typical and atypical phonological acquisition of French, and Prince (2023) on element theory in aphasia. Following the example of Ball (1997) regarding Dependency Phonology, we can examine here some of the more commonly reported phonological patterns in disordered speech and see how GovP accounts for them.
We can commence by considering some common patterns in disordered speech at the constituency level. Difficulties with onset clusters are commonly reported in the clinical literature (e.g., Bauman-Waengler, 2003), and indeed, simplifications of these clusters are found in normal phonological development as well. As we have noted earlier, GovP deals with onset clusters in English in two ways: non-/s/-initial clusters are accounted for through binary branching of the onset; /s/-initial clusters, on the other hand, have the /s/ as the onset to an empty nucleus (Harris, 1994, notes that alternative analyses are available, but the /s/ is never part of a branching onset). This distinction does reflect differences in the ways English initial clusters behave in both normal and disordered phonological development (see Gierut, 1999, for evidence of this). Harris (1994) points out that GovP adopts a principles and parameters approach to grammar and so, for the cluster simplification we have been looking at, a change in parameter setting to disallow branching onsets (as is found in many languages, such as Chinese) will account for loss of non-/s/-initial clusters. As the leftmost item in the cluster is the head, this also accounts for the usual pattern in cluster simplification of this type: the retention of the leftmost item, and loss of the right. To account for simplification in /s/-initial clusters, we have to look beyond the onset to P0 or even the skeletal tier. We need to ban onsets with empty nuclei to account for these clusters but, all other things being equal, this ban must work only with initial instances. The operation of such a prohibition, then, would remove the /s/ onset and its empty nucleus, leaving (in this case) the rightmost consonant of the (superficial) /s/-initial cluster, as is indeed found in most cases in disordered speech. In the normal development of /s/-clusters, and in delayed phonology, an epenthetic vowel may be encountered between the /s/ and the following consonant (e.g., stop being realized as [sətɒp]). GovP supplies an elegant account
of these forms, whereby we assume the constraint at initial position is not on onsets and their following empty nuclei, but just on empty nuclei following initial onsets; the empty nuclei must be phonetically realized, in this case through the addition of the default [@] element. Another commonly occurring simplification in both developmental and disordered phonology is the deletion of final consonants, whereby cat is realized as [kæ], and dog as [dɒ]. These, too, can be accounted for by a constraint on onsets and empty nuclei, this time in final position. If final consonant clusters are involved (and if all consonants are deleted), then the parameter setting allowing branching rimes will also need to be turned off. The label "final consonant deletion" may, however, be overused as, at least on some occasions, final consonants may be replaced by glottal stops. (It is probable that lack of training in detailed phonetic transcription has led to this overuse.) Final glottal replacement involves an interaction between constituency (as this realization is restricted to final position) and melody (in that these consonant slots have had all element material stripped from them except [?]). Turning now to disordered patterns at the melodic level, we will examine first the commonly reported pattern of velar fronting (we ignore for the purposes of this discussion the debate as to whether this pattern is mainly phonological or articulatory in origin). In traditional binary feature descriptions, a change from target /k/, /g/, /ŋ/ to [t], [d], [n] involves changing the values of the four features [high, back, anterior, coronal]. In GovP we can show that a much simpler account is available where the element [R] is substituted for [@]:
(20) [Diagram: velar fronting – target /k/ [h, @, ?] realized as [t] [h, R, ?], with coronal [R] replacing neutral [@].]
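In set terms, the whole pattern reduces to a single substitution, as the following short sketch (our own encoding) shows:

```python
# Illustrative sketch of (20): velar fronting as one element substitution,
# replacing the neutral/velar element @ with coronal R.

def front_velar(elements):
    els = set(elements)
    return (els - {"@"}) | {"R"} if "@" in els else els

print(front_velar({"h", "@", "?"}))  # /k/ -> the elements of [t]:
                                     # {'h', 'R', '?'} (set order may vary)
```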
Typical lisp patterns involve the realization of target /s/ and /z/ as dental fricatives or alveolar lateral fricatives. Both of these patterns can be accounted for through simple changes at the melodic level: for the dental fricative a change in head is all that is required, while for the lateral fricative the addition of the [?] element is all that is needed.

(21) [Diagram: target /s/ [h, R] realized as [θ] (the same elements with a different head) or as a lateral fricative (the same elements plus [?]).]
Whereas these lisping patterns (arguably motoric rather than phonological disruption) are relatively straightforward to account for in GovP, more obviously phonological patterns such as fricative simplification are not so easy to deal with. Fricative simplification is a pattern whereby (in English) target dentals are realized as labiodentals, and target postalveolars as alveolars (e.g., /θ, ð/ as [f, v], and /ʃ, ʒ/ as [s, z]). These two patterns can be seen in GovP formalism as follows:

(22) [Diagram: target /θ/ [h, R] realized as [f] [h, U], and target /ʃ/ [h, R, I] realized as [s] [h, R] via deletion of the [I] element.]
The realization of postalveolars as alveolars is neatly captured through the deletion of the [I] element, but the dental to labiodental change requires a switch of elements and a change of head pattern. This latter aims to reflect the change from a non-strident to a strident fricative (but, as argued in Ball & Howard, 2004, 2017, the classification of labiodentals as strident is not well motivated phonetically or developmentally, and so a simpler change would result if dentals and labiodentals were both classed as non-sibilant fricatives). Finally, we can briefly consider the work on vowel disorders reported in Ball (1996, 2002), Ball and Gibbon (2013), and Harris et al. (1999). Many of the realization patterns described in these publications involved a move to, or toward, the corner vowels [i, a, u]. This can elegantly be captured in GovP by a simplification of vowels to the three elements of [I, A, U]. Other commonly reported patterns in disordered speech (such as context-sensitive voicing, fricative stopping, and liquid gliding) can also, of course, be captured in GovP, but we do not have the space to explore all of these. However, what we do see when we look at accounts of disordered phonology with GovP is the economy of description when we are dealing with unary primes rather than binary features (for example, Grunwell, 1986, argues that a fully specified /r/ to [w] change requires at least six feature changes, whereas in GovP this is accomplished through the removal of two elements and their replacement by one other).
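This economy can be made concrete with a toy comparison. In the sketch below, the binary-feature rows are hypothetical placeholders (the exact features flipped depend on the feature system assumed), while the element sets follow (14) and (15):

```python
# Toy comparison of descriptive economy for /r/ -> [w]: binary-feature value
# flips versus unary-element edits. Feature rows are illustrative only.

FEATURES = {  # 1 = plus, 0 = minus (hypothetical values)
    "r": {"consonantal": 1, "coronal": 1, "anterior": 1,
          "high": 0, "back": 0, "round": 0, "continuant": 1, "sonorant": 1},
    "w": {"consonantal": 0, "coronal": 0, "anterior": 0,
          "high": 1, "back": 1, "round": 1, "continuant": 1, "sonorant": 1},
}
ELEMENTS = {"r": {"R", "@"}, "w": {"U"}}  # from the chapter's tables

feature_flips = sum(FEATURES["r"][f] != FEATURES["w"][f] for f in FEATURES["r"])
element_edits = len(ELEMENTS["r"] ^ ELEMENTS["w"])  # remove R and @, add U

print(feature_flips, element_edits)  # 6 feature flips vs 3 element edits
```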
25.6 Conclusion
Government Phonology, and especially the use of phonetically interpretable unary primes, clearly provides more elegant accounts of many aspects of disordered speech than traditional feature-based accounts. However, relatively few studies have yet applied this approach to clinical data, and we await with interest discussion on whether GovP has a role to play in informing intervention as well as analysis.
REFERENCES
Anderson, J., & Durand, J. (1986). Dependency phonology. In J. Durand (Ed.), Dependency and non-linear phonology (pp. 1–54). Croom Helm. Anderson, J., & Durand, J. (Eds.). (1987). Explorations in dependency phonology. Foris.
Anderson, J., & Ewen, C. (1987). Principles of dependency phonology. Cambridge University Press. Archangeli, D. (1988). Aspects of underspecification theory. Phonology, 5(2), 183–207.
Ball, M. J. (1996). An examination of the nature of the minimal phonological unit in language acquisition. In B. Bernhardt, J. Gilbert, & D. Ingram (Eds.), Proceedings of the UBC International Conference on Phonological Acquisition (pp. 240–253). Cascadilla Press. Ball, M. J. (1997). Monovalent phonologies. In M. J. Ball & R. D. Kent (Eds.), The new phonologies (pp. 127–162). Singular. Ball, M. J. (2002). Clinical phonology of vowel disorders. In M. J. Ball & F. E. Gibbon (Eds.), Vowel disorders (pp. 187–216). Butterworth-Heinemann. Ball, M. J., & Gibbon, F. (Eds.). (2013). Handbook of vowels and vowel disorders. Psychology Press. Ball, M. J., & Howard, S. (2004). Is Stridency Deletion really a phonological process? Paper presented at 10th ICPLA Symposium, Lafayette, LA. Ball, M. J., & Howard, S. J. (2017). Classifying disordered speech: "Stridency deletion" and phonological processes. Journal of Interactional Research in Communication Disorders, 8(2), 147–161. Ball, M. J., Müller, N., & Rutter, B. (2009). Phonology for communication disorders. Psychology Press. Bauman-Waengler, J. (2003). Articulatory and phonological impairments: A clinical focus (2nd ed.). Allyn and Bacon. Chomsky, N., & Halle, M. (1968). The sound pattern of English. MIT Press. Clements, G. (1985). The geometry of phonological features. Phonology Yearbook, 2, 225–252. Clements, G. (1988). Towards a substantive theory of feature specification. Proceedings of the North East Linguistics Society, 18(1), 79–93. Gierut, J. A. (1999). Syllable onsets: Clusters and adjuncts in acquisition. Journal of Speech, Language, and Hearing Research, 42(3), 708–746. Grunwell, P. (1986). Clinical phonology (2nd ed.). Croom Helm. Halle, M. (1992). Phonological features. In W. Bright (Ed.), International encyclopedia of linguistics (Vol. 3, pp. 207–212). Oxford University Press. Harris, J. (1990). Segmental complexity and phonological government. Phonology, 7(1), 255–300. Harris, J. (1994). English sound structure. Blackwell. Harris, J. (2005). Vowel reduction as information loss. In P. Carr, J. Durand, & C. Ewen (Eds.), Headhood, elements, specification and
contrastivity: Phonological papers in honour of John Anderson (pp. 119–132). Benjamins. Harris, J., & Gussmann, E. (2002). Codas, constraints, and coda constraints. UCL Working Papers in Linguistics, 14, 1–42. Harris, J., & Lindsey, G. (1995). The elements of phonological representation. In J. Durand & F. Katamba (Eds.), Frontiers of phonology (pp. 34–79). Longman. Harris, J., Watson, J., & Bates, S. (1999). Prosody and melody in vowel disorder. Journal of Linguistics, 35(3), 489–525. Harrison, P. (1996). The acquisition of melodic primes in infancy. Paper presented at the 4th Phonology Meeting, University of Manchester, May. Jakobson, R., Fant, G., & Halle, M. (1952). Preliminaries to speech analysis. MIT Press. Jakobson, R., & Halle, M. (1956). Fundamentals of language. Mouton. Jaskuła, K. (2021). English loanwords in the Irish of Iorras Aithneach – New vowels in a Government and Licensing analysis. Journal of Celtic Linguistics, 22(1), 1–14. Kaye, J. (1990). Coda-licensing. Phonology, 7(1), 301–330. Kaye, J., Lowenstamm, J., & Vergnaud, J.-R. (1985). The internal structure of phonological elements: A theory of charm and government. Phonology Yearbook, 2, 305–325. Kaye, J., Lowenstamm, J., & Vergnaud, J.-R. (1990). Constituent structure and government in phonology. Phonology, 7(1), 193–232. Kula, N. C. (2002). The phonology of verbal derivation in Bemba. Netherlands Graduate School of Linguistics. Pathi, S., & Mondal, P. (2021). The mental representation of sounds in speech sound disorders. Humanities and Social Sciences Communications, 8(1), 1–12. Prince, T. (2023). The case of substitutions in adult aphasia and in typical acquisition of French: Revisiting element theory. In F. Breit, B. Botma, M. van’t Veer, & M. van Oostendorp (Eds.), Primitives of phonological structure (pp. 280-304). Oxford University Press. Prince, T., & Ferré, S. (2020). French (A)typical L1 acquisition: Compensatory strategies in #sC sequences. In E. Babatsouli (Ed.), On underreported monolingual child phonology (pp. 179–200). Multilingual Matters. Ritter, N. (1996). An alternative means of expressing manner. Paper presented at the 4th Phonology Meeting, University of Manchester, May.
Roca, I., & Johnson, W. (1999). A course in phonology. Blackwell. Sagey, E. (1986). The representation of features and relations in non-linear phonology. PhD dissertation, MIT. Scheer, T., & Cyran, E. (2017a). Interfaces in government phonology. In S. J. Hannahs & A. Bosch (Eds.), The Routledge handbook of phonological theory (pp. 293–324). Routledge. Scheer, T., & Cyran, E. (2017b). Syllable structure in government phonology. In S. J. Hannahs & A. Bosch (Eds.), The Routledge handbook of phonological theory (pp. 262–292). Routledge. Scheer, T., & Kula, N. C. (2017). Government phonology: Element theory, conceptual issues and introduction. In S. J. Hannahs & A. Bosch (Eds.), The Routledge handbook of phonological theory (pp. 226–261). Routledge. Steriade, D. (1987). Redundant values. Chicago Linguistic Society, 23(2), 339–362. Trubetzkoy, N. (1969). Principles of phonology. University of California Press. (Originally published in 1939). van der Hulst, H. (1989). Atoms of segmental structure: Components, gestures and dependency. Phonology, 6(2), 253–284. van der Hulst, H. (2016). Monovalent "features" in phonology. Language and Linguistics Compass, 10(2), 83–102. van der Hulst, H. (2020). Principles of radical CV phonology: A theory of segmental and syllabic structure. Edinburgh University Press.
26 A Usage-based Approach to Clinical Phonology
ANNA V. SOSA AND JOAN L. BYBEE
26.1 Introduction: Usage-based Approaches to Language Acquisition
In the last two decades many studies of language acquisition have been framed in what has come to be called "usage-based theory" (Dąbrowska & Lieven, 2005; Tomasello, 2005). The main premise of this approach is that experience with language in both children and adults shapes the cognitive representations and processes that make production and perception possible. Rather than proposing innate structures or processes, this theory proposes that, with an array of mostly domain-general abilities operating on linguistic input, phonological and morphological units as well as constructions can emerge from the categorization of that input (Beckner et al., 2009; Bybee, 2001, 2006, 2010; Larsen-Freeman, 1997).
26.1.1 Experience with Language: Frequency Effects
Frequency effects offer strong evidence for the crucial role played by experience with language in determining the cognitive properties of linguistic representation. The role of frequency in language processing comprises a set of well-established phenomena, involving both lexical access and articulatory automation. Lexical access is the process by which a word is identified from perceptual input or chosen for production. Both processes have been shown to be much faster for frequent than infrequent words. This trend is explained by a model in which each token of language use strengthens the memory for the word, creating what is termed a higher level of resting activation, or a stronger and more entrenched representation (Bybee, 1985; Langacker, 1987). Entrenchment is described as a general psychological principle by which the occurrence of some event leaves behind a trace that facilitates its reoccurrence (Langacker, 2008). The more often an event is repeated, the more entrenched it becomes, making it more accessible. For example, grammatical morphemes or constructions, because they are used over and over again, are highly entrenched and easily accessed when needed. The impact on language structure is seen in the fact that highly entrenched items are not easily changed. For example, there are relatively few irregular past tense verbs in English. Many of those individual forms, however, are highly frequent. Because the high frequency past tense form went is easily accessed, it is unlikely to become regularized to goed. On the other hand, less frequent
irregular past tense forms such as wept or crept are much more likely to undergo the process of regularization, becoming weeped or creeped (Bybee, 1985). This conserving effect is also seen in morpho-syntax, where older constructions can survive in frequent contexts. For example, English has two different negative constructions, as seen in these two alternatives (Tottie, 1991):

(1) the fellowship has no funds
(2) the fellowship doesn't have any funds

The no-negative as in (1) is older but the not-negative in (2) is more productive (Tottie, 1991). However, the older construction continues to be used quite often in high frequency constructions, such as those with have and be (Bybee, 2006). A different sort of effect of high token frequency is the phonetic reduction that occurs in words and phrases that are often repeated. This effect has its source in the neuromotor system, which improves efficiency with practice. As speech is a highly practiced activity, using the same configurations of articulations over and over again, there is a natural trend toward increasing efficiency by overlapping and reducing some of the speech gestures (Browman & Goldstein, 1992; Lindblom, 1990). These changes occur as language is being used, so that high frequency words and phrases are exposed to reduction processes more often than low frequency items (Bybee, 2000; Pierrehumbert, 2001). In cases of variation, such as the deletion of the unstressed penultimate vowel in words such as every, camera, memory, family, the higher frequency words have more deletion (Hooper, 1976). In the case of ongoing sound change, it is usually the high frequency words or the words used most in the reducing contexts that undergo change earlier (Bybee, 2002; Phillips, 2006). A few examples are the deletion of intervocalic /d/ [ð] in Spanish dialects (D'Introno & Sosa, 1986); also in Spanish dialects, the reduction of syllable-final /s/ to [h] to ø (Brown, 2009) and the reduction of word-initial /s/ to [h] (Raymond & Brown, 2012). The reducing effect also has an impact on linguistic structure: grammatical morphemes are usually shorter than lexical morphemes; high frequency words are usually shorter than low frequency words (Kapatsinski, 2018; Zipf, 1965).
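The logic of entrenchment can be illustrated with a toy model; everything in the sketch below – the token counts, the logarithmic activation function, and the "regularization pressure" measure – is our own invention for expository purposes:

```python
# Toy model of entrenchment: token frequency raises resting activation
# (with diminishing returns); low-activation irregulars face the greatest
# pressure to regularize (wept -> weeped). Counts are invented.

import math
from collections import Counter

usage = Counter({"went": 5000, "wept": 12, "crept": 9})

def resting_activation(word):
    return math.log1p(usage[word])     # each token strengthens the trace

def regularization_pressure(word):
    return 1.0 / resting_activation(word)

for word in usage:
    print(word, round(resting_activation(word), 2),
          round(regularization_pressure(word), 3))
```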
26.1.2 Rich Memory
As argued in Bybee (2001), memory for linguistic experiences is like memory for other types of experience. That is, memories are grouped together into categories, but their similar features are not removed; rather, the memories are highly redundant. While generalizations can be made across categories, this does not entail that details of the linguistic experience are removed (Langacker, 1987). Categories formed in this way – by grouping tokens of experience with similar stored memories (or exemplars) – exhibit prototype effects: they are organized around a central member and linked by sets of family resemblances (Bybee & Eddington, 2006; Lakoff, 1987; Rosch & Mervis, 1975). In studying natural categories, it has been found that even redundant features are used in classifying objects. Along similar lines, work in construction grammar has shown the important link between constructions and the lexical items that occur in them, as well as the way children form the lexical categories for constructions (Goldberg, 1995, 2006, 2019). In cognitive and usage-based approaches, then, there is no strict distinction between lexicon and grammar: lexical items and constructions as well as idioms, formulas and prefabricated phrases are stored together in what has been called the "constructicon." Morpho-syntactic patterns are not fully abstracted away from the words that occur in them (see Section 26.1.4).
26.1.3 Phonology: Exemplar Models
Exemplar theory has been developed more for phonology than for morpho-syntax. Rather than mapping the phonetic tokens found in natural speech onto abstract, redundancy-free representations, exemplar theory proposes that phonetic tokens are categorized by similarity to previously occurring tokens to form exemplars. New tokens of experience that are highly similar to existing exemplars are categorized as the same, strengthening that exemplar; if the new token cannot be considered the same as an existing exemplar, a new exemplar is created and positioned (in a spatial metaphor) close to similar exemplars, creating a category or cloud. Of course, phonetic segments are rarely heard in isolation from other phonetic segments, so the unit upon which exemplar clouds are created is a word or, in some cases, a phrase. The words and phrases constitute nodes that also contain information about meaning and context, which also form exemplar clouds (Bybee, 2000, 2001, 2002; Pierrehumbert, 2001, 2003). The evidence for this type of representation is found in cases where subphonemic variation is tied to a particular word, phrase or construction. As mentioned above, in cases of variation and ongoing change, words can have different ranges of variation according to their frequency, frequency in the context for change, and other factors. Thus, every is a two-syllable word, memory has continuous variation in the schwa of the penultimate syllable, and mammary usually has three syllables. In another case, the intervocalic [ð] in Spanish lado "side" is frequently deleted while the [ð] in the same context in grado "degree" is less often deleted (Bybee, 2002; D'Introno & Sosa, 1986). Bybee (2000) explains that reduction processes occur opportunistically every time a word is used; thus words with more exposure to these processes reduce faster. This occurs because the phonetic clouds associated with the word are updated as language is used. Exemplar models allow the direct representation of variation as well as the association of that variation with factors in the context, which can include social factors (Foulkes & Docherty, 2006). They also allow ranges of variation to change over time, even in adults (Harrington, 2006; Sankoff & Blondeau, 2007), and thus can model sound change. In production, an exemplar is chosen based on a variety of factors, but one of them is the prior frequency of the exemplar. As sound change is occurring, an exemplar cloud can gradually shift as the changed exemplars are used more. Because representations are experience-based, the units they are based on may also be multiword sequences such as idioms (pull strings) and formulaic language (see you later), but also prefabs, such as good friends, choose words or broad daylight (Erman & Warren, 2000), and sequences that are frequent but have no other special properties, such as sequences of prepositions and articles, which fuse in languages such as Spanish (de + el > del; a + el > al). Phrases of extreme high frequency undergo more extreme reduction, as found in cases of grammaticalization (be going to > gonna) or phrases that are simply used a lot (I don't know) (Bybee et al., 2016; Bybee & Scheibman, 1999).
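The core mechanics of an exemplar model can be made explicit in a short sketch. All of the specifics below – the three-dimensional phonetic space, the Euclidean distance measure, and the similarity threshold – are illustrative stand-ins rather than claims of the theory:

```python
# Minimal exemplar-model sketch: a new phonetic token either strengthens a
# sufficiently similar stored exemplar or founds a new one, building the
# word's "cloud". Feature space and threshold are illustrative stand-ins.

from dataclasses import dataclass, field

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

@dataclass
class Exemplar:
    phonetics: tuple          # e.g., (duration_ms, F1, F2)
    strength: float = 1.0

@dataclass
class WordCloud:
    exemplars: list = field(default_factory=list)
    threshold: float = 50.0   # similarity radius (a free parameter)

    def hear(self, token):
        for ex in self.exemplars:
            if distance(ex.phonetics, token) < self.threshold:
                ex.strength += 1.0    # categorized as 'the same': entrench it
                return ex
        new = Exemplar(token)         # novel variant: found a new exemplar
        self.exemplars.append(new)
        return new

every = WordCloud()
every.hear((180, 400, 1900))  # a reduced, two-syllable token of 'every'
every.hear((182, 405, 1895))  # a similar token strengthens the same exemplar
every.hear((260, 420, 1850))  # a fuller pronunciation founds a new exemplar
print(len(every.exemplars), [e.strength for e in every.exemplars])  # 2 [2.0, 1.0]
```

Frequent, similar tokens thus entrench an exemplar, while a sufficiently different token founds a new one – the "cloud" structure described above.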
26.1.4 Categorization
Redundant storage does not mean that generalizations are not recognized. The human mind continually categorizes incoming and stored experiences. In fact, linguistic experience could not be decoded at all if there were not a powerful matching process that allows relations of similarity to be captured. Even when complex stretches of language are stored as units, their component parts are still matched to other parts that are similar in sound and meaning. Relations among stored items form a vast network ranging from phonetic features to semantic and contextual features. Figure 26.1 shows how the phonological segments and the morphological units emerge from categorizing the parts of words.
[Figure: a network diagram in which unbelievable is linked, through shared parts, to readable, washable, believe, unattractive, and unwarranted.]
Figure 26.1 Emergent morphological analysis of unbelievable (Bybee, 1998). Reproduced by permission of the Chicago Linguistics Society.

Support for this type of cognitive structuring comes from first language acquisition. As we will see in Sections 26.2 and 26.3, the acquisition of sounds is strongly interrelated with the acquisition of words. For morpho-syntax, careful study of children's utterances reveals that the development of constructions depends on rote learning of expressions of lexically based patterns (such as There's a ____, or I want a ____) and the superimposition of another unit that fills in the blank (Dąbrowska & Lieven, 2005). In an analysis of questions, Dąbrowska and Lieven (2005) found that 90% of questions asked by children that were not direct repetitions of immediately preceding adult questions could be derived from previously recorded utterances by the same child with minor alterations. Goldberg (2006, 2019) uses experimental methods to uncover the role of frequency of use and particular lexical items in the acquisition of the meaning of argument structure constructions. These studies support the claim that all stages of development, including those that show gains in productivity, are exemplar-based (Ambridge, 2018).
26.2 Consequences for the Acquisition of Phonology
As noted above, usage-based theory conceives of language as a complex adaptive system in which apparent structure emerges from human cognitive capacities as applied to usage events (Bybee, 2001; Hopper, 1998). From the perspective of diachronic change, it is evident that grammatical morphemes and constructions are created through the processes involved in grammaticalization: aspect, tense, voice, and modal constructions and many others develop from lexical items in particular constructions that are used repeatedly and expand their contexts of use in a way that leads them to signal grammatical rather than lexical meaning (Bybee et al., 1994). For example, cross-linguistically, past tenses and perfectives develop out of perfects (or anteriors, e.g., have done), VERB + finish, or come from + VERB; future constructions develop from verbs of movement toward a goal (be going to), from verbs of volition (will) or obligation (shall), and so on. Argument structure constructions also develop gradually over time through usage (Barðdal, 2008). Similarly, phoneme inventories and phonotactic patterns are created by sound change (Bybee & Easterday, 2022; Easterday, 2019; Greenberg, 1969).
The other sense in which language is emergent is in the individual's acquisition, in which structures are built up gradually as experience and practice with language lead to the creation of more and more complex and accurate cognitive and neuromotor representations. Recent approaches to the acquisition of phonology have identified some of the developing abilities that lead to the establishment of adult-like phonetics and phonology. First, there is the development of fine motor skills and the association of articulations with acoustic percepts. Motor skills require significant practice; as Redford (2015) puts it, they are both overlearned and automatic and take a very long time to develop. Second, these motor skills develop in the context of goal-oriented behavior. Motor patterns are directly associated with particular goals and constitute what Redford (2015) calls "schemas," which are embedded in pragmatic and performative goals. The development of the cross-modal association of context with motor and acoustic patterns is the basis of word learning. While these two aspects of language acquisition can be broken down into further steps, both are firmly based on experience with the ambient language and practice using linguistic material for the accomplishment of goals. It is often observed that early schemas are holistic in form, so that phonetic shapes are embedded within the pragmatic whole and often do not fully reproduce the adult form. The honing of motor skills occurs as more schemas develop, adding to the child's vocabulary. Sosa and Stoel-Gammon (2012), Noiray et al. (2019), and Cychosz et al. (2021) all show that reduced variability and more adult-like coarticulation correspond to a larger expressive vocabulary, indicating greater practice in production. All usage-based theorists recognize that, in addition to remembered exemplars, which can be accessed more or less directly, the human ability to categorize creates relations among these remembered tokens based on similarity of both phonetic substance and conceptual substance (Bybee, 1985, 2001, 2010; Goldberg, 1995, 2006; Langacker, 1987; Tomasello, 2005). Figure 26.1 is a rough sketch of how this might be visualized, but it is better to think of the categorization process as first grouping together instances of [ti] and also [tu], [ta], etc., and perhaps later creating a category for [t]. These categories are much more abstract than the early schemas because they, in themselves, are not associated with a particular goal. The problem of the acquisition of phonology is that it requires a disassociation of form and meaning. On the assumption that adult phonology is based on contrastive, segment-sized units (phonemes) and child language begins with holistic word-based or syllable-based units, researchers are faced with the task of describing a transition from holistic units to smaller segment-based units (Lindblom, 1992; Nittrouer et al., 1996; Noiray et al., 2019; Sosa & Stoel-Gammon, 2006; Studdert-Kennedy, 1998). In contrast, a theory that recognizes categorization of similar percepts but does not insist on the size of those units can view adult phonology as organized in terms of syllables, of rhymes or of segments all at once. Schemas may be formed at many different levels of generality.

The representation of a particular word, such as send, would be a very specific schema or local schema. A schema for the rhyme -end$ is at a more general level of representation.
Then there could be a more general schema -Vnd$, and a still more general – vowel-nasal-voiced stop$, or even a more general – vowel-sonorant-stop$, and so on. The presence of any of these levels of generality for a schema does not preclude the existence of others (Bybee, 2001, p. 32).

The transition from child speech, in which consonants and vowels are coarticulated to a greater extent than in adult speech, can be based on both increased motor skills and a growth of categorization options with increased vocabulary use. Noiray et al. (2019) find a move from C and V articulations being more simultaneous and overlapping to one in which they are more sequenced to be associated with vocabulary growth as well as an increase in phonological awareness. Cychosz et al. (2019) add to this list the practice that comes with vocalizing more. Noiray et al. conclude that what they have observed is not just a move
from syllables to segments as basic units, but an increasingly integrated system, supported by many lexical exemplars, which offers children greater flexibility in manipulating parts of the system. Findings that variability in word production decreases with vocabulary growth (Sosa & Stoel-Gammon, 2006) also support the interaction of lexical integration with improved motor control. The categorization processes discussed above allow phonetically similar articulations to be grouped together. This may help to entrench variants that are used more while weakening lesser used variants. Thus improvements in coarticulation and variation in child language may come about through increased practice as well as the natural ability to categorize that is necessary for increased vocabulary size and in turn contributes to further accuracy of phonetic output. We agree with Redford (2015, p. 145), who says "Traditional phonological units, defined as discrete and atemporal, are viewed as epiphenomena." In addition to the gradual emergence of the units of phonology, children also accomplish the remarkable task of reproducing with great accuracy the automatic coarticulatory patterns of their community and also the range of variation associated with different social groups. The studies just mentioned focus on the changes in children's consonant and vowel coarticulation in single words toward the adult pattern, but they do not go as far as considering the full range of adult coarticulation, which is especially prominent in multi-word utterances. It takes children a very long time to master the coarticulatory detail and how it varies in casual speech, as well as the special reduction that occurs in high frequency phrases such as I don't know, want to, have to, want you to, and the appropriate contexts for their use. Thus, we feel it is important to point out that the result of the acquisition process is a speaker who reflects with great detail and accuracy the articulatory gestures – their magnitude and timing – of the speech community to which she belongs.
Research focusing on the acquisition of sociolinguistic variation has undergone some changes over the decades. Early approaches assumed that children were acquiring variable rules (Roberts & Labov, 1995) but more recently researchers have identified a significant lexical effect. Chevrot et al. (2000) study the presence or absence of final /r/ in French children aged 6 to 7 and 10 to 12 and conclude that, given the prevalence of the absence of /r/, acquiring the variant with /r/ requires that information be encoded in the lexical representation of each word. Diaz-Campos (2004) also comes to the conclusion that children are not learning a variable rule for the presence or absence of intervocalic [ð] in Venezuelan Spanish: even younger children (42 to 47 months old) control the variation and favor deletion in high frequency words. Foulkes and Docherty (2006) also offer evidence that the children's use of sociophonetic variants closely follows the input received from their mothers and their peers once they have started school, indicating that "changes are taking place in learned linguistic experience" (2006, p. 425). They also note that input experienced by children intertwines both linguistic and non-linguistic, especially social, information.
They argue that an exemplar model, which registers phonetic tokens and the social information associated with them, is particularly appropriate for capturing the gradual acquisition of the phonetic and indexical values of community variation.

Recent work on typical language acquisition has thus focused on the emerging categories of speech, based on stored words, and on the growing integration of the phonological system, built up by sorting parts of words for similarity with other words, along with the growing motor skills required for producing words. It is also recognized that the child's experience is categorized along with the phonetics, and that even young children master much of the phonetic variation associated with the individuals and social groups they encounter.
26.3 Clinical Applications of Usage-based Phonology

The chapters in the phonology section of this volume demonstrate that many different phonological theories have been used to describe and analyze the speech production patterns of children with speech sound disorder (SSD) and to inform treatment. These include constraints-based theories, Articulatory Phonology, and Government Phonology, among others. In this section, we will further consider how the basic tenets of usage-based phonology, as described in the first part of the chapter, may be applied to the clinical practice of speech-language pathologists working with children with SSD.
26.3.1 General Considerations

Broadly speaking, a fundamental difference in clinical thinking within a usage-based framework is that phonological competence is not described entirely in terms of the mastery of individual features, contrasts, or sounds, as is typical in much clinical practice. More emphasis is placed instead on frequency, patterns, and context of use, stemming from the usage-based idea that phonology does not exist in isolation, but only in relation to stored lexical items, phrases, and constructions that are used in specific communicative contexts. Furthermore, the underlying representations for these items are thought to be concrete, as opposed to abstract, and production would not be described in terms of rules or processes that change a correct underlying form into the erred production. Thus, analysis would include examining existing networks of lexical items that are either sufficient or insufficient for the emergence of individual phonological patterns and units. This is consistent with Ball (2003), who notes that within a usage-based or Cognitive approach to clinical phonology, less emphasis would be placed on individual phonemic contrasts, as is typical of a minimal pairs approach for example, and more on the building up of networks of words that would allow phonological units to emerge.
26.3.2 Phonology and the Lexicon

Together with the de-emphasis on individual phonemes and phonemic contrasts, a usage-based approach to clinical phonology would highlight the important connections between phonological and lexical development. The field of clinical linguistics has historically differentiated "speech" from "language" and often viewed articulation/phonology as separate from the actual language system (Hoff & Parra, 2011), perhaps most clearly evidenced by the common reference to some children on a clinician's caseload as being just "artic kids." Nonetheless, the idea of an important link between phonology and the lexicon is not new in the field of clinical phonology and phonological development. In the 1970s, child phonologists began to acknowledge an important relationship between phonological and lexical development. Ferguson and Farwell (1975), for example, highlighted the importance of the "lexical parameter" in phonological acquisition, positing that phonological development is a gradual process that occurs based on generalization from the child's own "phonic core of remembered lexical items and articulations which produce them" (Ferguson & Farwell, 1975, p. 437). Since these early observations, a large number of studies of children with both typical and delayed linguistic development have confirmed this close relationship between phonology and the lexicon (e.g., Macrae & Sosa, 2015; Rescorla & Ratner, 1996; Sosa & Stoel-Gammon, 2012; Zamuner & Thiessen, 2018). Redford (2015) extends these observations of close interconnections between phonetics, phonology, and words and outlines a model of language production that aims to unify speech and language within the context of language development. In this model, the unit of linguistic
production is a schema consisting of "temporally structured sequences of remembered actions" (Redford, 2015, p. 141) that is executed in order to achieve a specific communicative goal. Thus, the actual vocal and articulatory routines produced are an integral component of the communicative act itself.

Several clinical implications arise from the notion that fluent speech emerges from schemas that are activated only in the context of achieving a specific communication goal. There has long been an emphasis on "functional" communication in a clinical context, particularly when targeting expressive language, but also to a certain extent when working on speech sound production: treatment may include specific work on the words, phrases, and even entire dialogues that are most useful in the daily communicative interactions of the individual child. The model outlined by Redford, which is framed within usage-based linguistic theory, would support an approach to speech sound intervention that emphasizes successful completion of communicative goals in meaningful contexts, given that phonological structure is seen as emerging from repeated practice of these underlying schemas. This contrasts with an approach that focuses on drilling sounds in isolation, in nonsense syllables, or in words and phrases that are not produced in the context of accomplishing the child's own communicative goal. Taken further, this perspective on speech production could provide an additional theoretical basis for rejecting the use of non-speech oral motor exercises to treat speech sound disorder (see Lof & Watson, 2008), since non-speech movements are not produced in the context of action schemas that are used to accomplish a specific communicative goal.
26.3.3 The Importance of Frequency

Perhaps the most central aspect of usage-based phonology is the emphasis on the role that frequency plays in the shaping of linguistic structure. Frequency, however, is not routinely considered in the clinical analysis of speech production patterns or in treatment planning. Take, for example, the observation that a child produces the word "gate" as [geɪt], the name of his new friend "Gavin" as [gævɪn], but the word "go" as [do]. A usual analysis would describe this as inconsistent use of the phonological process of velar fronting. A clinician may go further in looking for phonetic characteristics of the target words that might explain the inconsistent use of the process. A usage-based account, however, would consider the potential role of frequency. The word go is likely highly frequent in the child's speech. Within a usage-based account, the representation for this high frequency word would be entrenched and thus resistant to change via analogy to other occurrences of [g] in the child's lexicon. Thus, treatment would involve direct targeting of the high frequency words that may retain the erred production patterns.

Information regarding the role of frequency in typical phonological development also has implications for the treatment of speech sound disorder. A number of studies have identified frequency, both token and type, input and output, as an important factor in the acquisition of phonological structure. Several studies suggest that accurate productions may emerge first in high frequency words. An early study by Tyler and Edwards (1993) found that stable, correct production of aspirated voiceless stops appeared first in words that were produced frequently by the children. Ota (2006) and Ota and Green (2013) report higher accuracy of initial consonant clusters in high frequency words and lower occurrence of syllable deletion in high versus low frequency words. Additionally, Sosa and Stoel-Gammon (2012) found that high frequency words were produced more consistently by typically developing two-year-olds, although frequency did not influence overall accuracy of production. In addition to word frequency, the frequency and/or probability of individual sounds and sound sequences has also been found to facilitate accuracy of production (Beckman & Edwards, 2000; Leonard & Ritterman, 1971; Másdóttir & Stokes, 2015; Sosa & Stoel-Gammon, 2012; Zamuner et al., 2004).
These findings regarding the role of frequency in typical development have led to the investigation of frequency as a potential active ingredient in the treatment of speech sound disorder. Much of the work in this area comes from Gierut and colleagues in their investigations of how lexical factors such as word frequency and phonological neighborhood density affect patterns of change in the productive phonology of children with phonological delay. While there appears to be an interaction between word frequency and neighborhood density, in general the use of high frequency words in treatment results in greater phonological learning than the use of low frequency words (Gierut & Morrisette, 2012; Gierut et al., 1999; Morrisette & Gierut, 2002). From a developmental perspective, the facilitative effect of word frequency on accuracy and consistency of production in typical development also suggests that targeting high frequency words would be appropriate. See Storkel (2018) and Sosa (2016) for more detailed discussion of how clinicians might integrate these lexical factors into target selection for the treatment of speech sound disorder.
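Returning to the velar-fronting example above, a minimal sketch of a frequency-aware error analysis might look as follows. All words, transcriptions, and token counts are invented for illustration; the point is simply that sorting erred forms by frequency surfaces entrenched high frequency words as direct treatment targets.

```python
# Hypothetical data: target onset, produced onset, and token frequency
# of each /g/-initial word in a child's speech sample (all invented).
productions = {
    "go":    ("g", "d", 220),  # high frequency word retains the fronted form
    "gate":  ("g", "g", 12),
    "gavin": ("g", "g", 30),
}

def frequency_profile(productions):
    """Split words into accurate vs. erred sets, sorted by token frequency,
    so that entrenched high frequency errors stand out."""
    accurate, erred = [], []
    for word, (target, produced, freq) in productions.items():
        (accurate if produced == target else erred).append((word, freq))
    by_freq = lambda pair: -pair[1]
    return sorted(accurate, key=by_freq), sorted(erred, key=by_freq)

accurate, erred = frequency_profile(productions)
print("Accurate:", accurate)           # [('gavin', 30), ('gate', 12)]
print("Erred, by frequency:", erred)   # [('go', 220)] -> target directly
```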
26.3.4 Exemplar Models of Phonological Representation

As described in the previous section, usage-based phonological theory is consistent with an exemplar model of phonological representation, which proposes that multiple phonetic tokens of individual linguistic strings (e.g., words, phrases, etc.) are stored simultaneously, creating an exemplar cloud of possible productions of an individual word. This model of representation allows for variable realization of individual linguistic units; the factors that influence which exemplar is selected include frequency and context, among others. Clinicians working with children with SSD are acutely aware of the ubiquity of variable production of individual sounds and words. Once a new sound is established and easily produced by a child, the process of generalizing the new sound to words, words in phrases, words in conversation, and new communicative contexts is usually long and slow. Within a usage-based framework, generalization would be seen as the process of ensuring that the correct production of a word becomes the most central exemplar, making it more likely to be selected in a communicative context. Given that treatment time is often limited to 30–60 minutes per week, it is unlikely that the correct production practiced during therapy sessions alone would become the most frequent exemplar in the cloud and thus the one selected during communication in other contexts. Treatment would therefore need to be designed to build the frequency with which correct productions are experienced and produced, and to extend the use of correct productions to as many different contexts as possible, to increase the likelihood that the correct exemplar is selected from among those stored.
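A minimal sketch of this selection dynamic, with invented variants and token counts, might look as follows: each experienced token is stored in the cloud, production samples from the cloud in proportion to frequency, and massed correct practice gradually shifts which variant is most likely to be selected.

```python
import random
from collections import Counter

# Invented exemplar cloud for the word "go": the fronted form dominates.
cloud = Counter({"[do]": 180, "[go]": 20})

def produce(cloud):
    """Select an exemplar with probability proportional to its stored frequency."""
    variants, weights = zip(*cloud.items())
    return random.choices(variants, weights=weights)[0]

def practice(cloud, variant, reps):
    """Each correct production or experience adds tokens to the cloud."""
    cloud[variant] += reps

print(produce(cloud))          # "[do]" selected ~90% of the time initially
practice(cloud, "[go]", 300)   # correct practice across many contexts
print(produce(cloud))          # "[go]" now selected ~64% of the time
```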
26.4 Conclusion

There are many aspects of a usage-based approach to phonology that are likely to be attractive, and even intuitive, to experienced clinicians. For example, clinicians know that phonological learning in the clinical setting is not merely a matter of learning that two sounds are meaningfully contrastive, nor will a child who has learned the motor pattern for a new sound automatically use it consistently in all contexts. Furthermore, clinicians know that treatment should be designed to optimize the number of productions during a session, and that activities should be created that give children additional opportunities to practice their targets outside the therapy setting and in a variety of contexts. Usage-based phonology provides a theoretical framework that allows and encourages clinicians to think about phonological learning as an emergent process that develops through context and usage, while minimizing the emphasis on mastery of individual features, contrasts, and phonemes. Many clinicians are likely already thinking and practicing within this usage-based framework.
REFERENCES

Ambridge, B. (2018). Against stored abstractions: A radical exemplar model of language acquisition. https://doi.org/10.31234/osf.io/gy3ah
Ball, M. J. (2003). Clinical applications of a cognitive phonology. Logopedics Phoniatrics Vocology, 28(2), 63–69. https://doi.org/10.1080/14015430310011763
Barðdal, J. (2008). Productivity: Evidence from case and argument structure in Icelandic. John Benjamins Publishing.
Beckman, M. E., & Edwards, J. (2000). The ontogeny of phonological categories and the primacy of lexical learning in linguistic development. Child Development, 71(1), 240–249. https://doi.org/10.1111/1467-8624.00139
Beckner, C., Blythe, R., Bybee, J., Christiansen, M. H., Croft, W., & Schoenemann, T. (2009). Language is a complex adaptive system: Position paper. Language Learning, 59(s1), 1–26.
Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180.
Brown, E. (2009). A usage-based account of syllable- and word-final /s/ reduction in four dialects of Spanish. Studies in Hispanic and Lusophone Linguistics, 2(1), 249–250. https://doi.org/10.1515/shll-2009-1047
Bybee, J. (1998). The emergent lexicon. Chicago Linguistic Society, 34(2), 421–435.
Bybee, J. (2000). The phonology of the lexicon: Evidence from lexical diffusion. In M. Barlow & S. Kemmer (Eds.), Usage-based models of language (pp. 1–63). CSLI Publications.
Bybee, J. (2001). Phonology and language use. Cambridge University Press.
Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14(3), 261–290. https://doi.org/10.1017/s0954394502143018
Bybee, J. (2010). Language, usage and cognition. Cambridge University Press.
Bybee, J. L. (1985). Morphology: A study of the relation between meaning and form. John Benjamins Publishing.
Bybee, J. L. (2006). From usage to grammar: The mind's response to repetition. Language, 82(4), 711–733. https://doi.org/10.1353/lan.2006.0186
Bybee, J., & Easterday, S. (2022). Primal consonants and the evolution of consonant inventories. Language Dynamics and Change, 13(1), 1–33. https://doi.org/10.1163/22105832-bja10020
Bybee, J., & Eddington, D. (2006). A usage-based approach to Spanish verbs of "becoming". Language, 82(2), 323–355.
Bybee, J., File-Muriel, R. J., & Napoleão De Souza, R. (2016). Special reduction: A usage-based approach. Language and Cognition, 8(3), 421–446. https://doi.org/10.1017/langcog.2016.19
Bybee, J., Perkins, R., & Pagliuca, W. (1994). The evolution of grammar: Tense, aspect, and modality in the languages of the world. University of Chicago Press.
Bybee, J., & Scheibman, J. (1999). The effect of usage on degrees of constituency: The reduction of don't in English. Linguistics, 37(4). https://doi.org/10.1515/ling.37.4.575
Chevrot, J., Beaud, L., & Varga, R. (2000). Developmental data on a French sociolinguistic variable: Post-consonantal word-final /R/. Language Variation and Change, 12(3), 295–319. https://doi.org/10.1017/s095439450012304x
Cychosz, M., Edwards, J. R., Munson, B., & Johnson, K. (2019). Spectral and temporal measures of coarticulation in child speech. The Journal of the Acoustical Society of America, 146(6), EL516–EL522.
Cychosz, M., Munson, B., & Edwards, J. R. (2021). Practice and experience predict coarticulation in child speech. Language Learning and Development, 17(4), 366–396. https://doi.org/10.1080/15475441.2021.1890080
Dąbrowska, E., & Lieven, E. (2005). Towards a lexically specific grammar of children's question constructions. Cognitive Linguistics, 16(3). https://doi.org/10.1515/cogl.2005.16.3.437
Diaz-Campos, M. (2004). Acquisition of sociolinguistic variables in Spanish: Do children acquire individual lexical forms or variable rules? In T. Face (Ed.), Laboratory approaches to Spanish phonology (pp. 221–236). Walter de Gruyter.
D'Introno, F., & Sosa, J. M. (1986). Elisión de la /d/ en el español de Caracas: Aspectos sociolingüísticos e implicaciones teóricas. In R. Núñez Cedeño, I. Páez Urdaneta, & Y. J. Guitart (Eds.), Estudios sobre la fonología del español del Caribe (pp. 135–163).
Easterday, S. (2019). Highly complex syllable structure: A typological and diachronic study. Language Science Press.
Erman, B., & Warren, B. (2000). The idiom principle and the open choice principle. Text – Interdisciplinary Journal for the Study of Discourse, 20(1), 29–62. https://doi.org/10.1515/text.1.2000.20.1.29
Ferguson, C. A., & Farwell, C. B. (1975). Words and sounds in early language acquisition. Language, 51(2), 419. https://doi.org/10.2307/412864
Foulkes, P., & Docherty, G. (2006). The social life of phonetics and phonology. Journal of Phonetics, 34(4), 409–438. https://doi.org/10.1016/j.wocn.2005.08.002
Gierut, J. A., & Morrisette, M. L. (2012). Density, frequency and the expressive phonology of children with phonological delay. Journal of Child Language, 39(4), 804–834. https://doi.org/10.1017/s0305000911000304
Gierut, J. A., Morrisette, M. L., & Hust Champion, A. (1999). Lexical constraints in phonological acquisition. Journal of Child Language, 26(2), 261–294. https://doi.org/10.1017/s0305000999003797
Goldberg, A. E. (1995). Constructions: A construction grammar approach to argument structure. University of Chicago Press.
Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language. Oxford University Press.
Goldberg, A. E. (2019). Explain me this: Creativity, competition, and the partial productivity of constructions. Princeton University Press.
Greenberg, J. (1969). Some methods of dynamic comparison in linguistics. In J. Puhvel (Ed.), Substance and structure of language (pp. 147–203). University of California Press.
Harrington, J. (2006). An acoustic analysis of "happy-tensing" in the Queen's Christmas broadcasts. Journal of Phonetics, 34(4), 439–457. https://doi.org/10.1016/j.wocn.2005.08.001
Hoff, E., & Parra, M. (2011). Mechanisms linking phonological development to lexical development – A commentary on Stoel-Gammon's "Relationships between lexical and phonological development in young children." Journal of Child Language, 38(1), 46–50. https://doi.org/10.1017/s0305000910000462
Hooper, J. B. (1976). Word frequency in lexical diffusion and the source of morphophonological change. In W. Christie (Ed.), Current progress in historical linguistics (pp. 96–105). North Holland.
Hopper, P. J. (1998). Emergent grammar. In M. Tomasello (Ed.), The new psychology of language: Cognitive and functional approaches to language structure (pp. 155–175). Lawrence Erlbaum.
Kapatsinski, V. (2018). Changing minds changing tools: From learning theory to language acquisition to language change. MIT Press.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. University of Chicago Press.
Langacker, R. W. (1987). Foundations of cognitive grammar: Theoretical prerequisites. Stanford University Press.
Langacker, R. W. (2008). Cognitive grammar: A basic introduction. Oxford University Press.
Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition. Applied Linguistics, 18(2), 141–165. https://doi.org/10.1093/applin/18.2.141
Leonard, L. B., & Ritterman, S. I. (1971). Articulation of /s/ as a function of cluster and word frequency of occurrence. Journal of Speech and Hearing Research, 14(3), 476–485. https://doi.org/10.1044/jshr.1403.476
Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In W. J. Hardcastle & A. Marchal (Eds.), Speech production and speech modelling (pp. 403–439). Kluwer Academic Publishers.
Lindblom, B. (1992). Phonological units as adaptive emergents of lexical development. In C. A. Ferguson, L. Menn, & C. Stoel-Gammon (Eds.), Phonological development: Models, research, implications (pp. 131–163). York Press.
Lof, G. L., & Watson, M. M. (2008). A nationwide survey of nonspeech oral motor exercise use: Implications for evidence-based practice. Language, Speech, and Hearing Services in Schools, 39(3), 392–407. https://doi.org/10.1044/0161-1461(2008/037)
Macrae, T., & Sosa, A. V. (2015). Predictors of token-to-token inconsistency in preschool children with typical speech-language development. Clinical Linguistics & Phonetics, 29(12), 922–937. https://doi.org/10.3109/02699206.2015.1063085
Másdóttir, T., & Stokes, S. F. (2015). Influence of consonant frequency on Icelandic-speaking children's speech acquisition. International Journal of Speech-Language Pathology, 18(2), 111–121. https://doi.org/10.3109/17549507.2015.1060525
Morrisette, M. L., & Gierut, J. A. (2002). Lexical organization and phonological change in treatment. Journal of Speech, Language, and Hearing Research, 45(1), 143–159. https://doi.org/10.1044/1092-4388(2002/011)
Nittrouer, S., Studdert-Kennedy, M., & Neely, S. T. (1996). How children learn to organize their speech gestures: Further evidence from fricative-vowel syllables. Journal of Speech, Language, and Hearing Research, 39(2), 379–389. https://doi.org/10.1044/jshr.3902.379
Noiray, A., Popescu, A., Killmer, H., Rubertus, E., Krüger, S., & Hintermeier, L. (2019). Spoken language development and the challenge of skill integration. Frontiers in Psychology, 10, 2777. https://doi.org/10.3389/fpsyg.2019.02777
Ota, M. (2006). Input frequency and word truncation in child Japanese: Structural and lexical effects. Language and Speech, 49(2), 261–294. https://doi.org/10.1177/00238309060490020601
Ota, M., & Green, S. J. (2013). Input frequency and lexical variability in phonological development: A survival analysis of word-initial cluster production. Journal of Child Language, 40(3), 539–566.
Phillips, B. (2006). Word frequency and lexical diffusion. Springer.
Pierrehumbert, J. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In J. Bybee & P. Hopper (Eds.), Frequency and the emergence of linguistic structure (pp. 137–157). John Benjamins Publishing.
Pierrehumbert, J. B. (2003). Phonetic diversity, statistical learning, and acquisition of phonology. Language and Speech, 46(2–3), 115–154. https://doi.org/10.1177/00238309030460020501
Raymond, W., & Brown, E. (2012). Are effects of word frequency effects of context of use? An analysis of initial fricative reduction in Spanish. In S. T. Gries & D. Divjak (Eds.), Frequency effects in language learning and processing. Walter de Gruyter.
Redford, M. A. (2015). Unifying speech and language in a developmentally sensitive model of production. Journal of Phonetics, 53, 141–152. https://doi.org/10.1016/j.wocn.2015.06.006
Rescorla, L., & Ratner, N. B. (1996). Phonetic profiles of toddlers with specific expressive language impairment (SLI-E). Journal of Speech, Language, and Hearing Research, 39(1), 153–165. https://doi.org/10.1044/jshr.3901.153
Roberts, J., & Labov, W. (1995). Learning to talk Philadelphian: Acquisition of short a by preschool children. Language Variation and Change, 7(1), 101–112. https://doi.org/10.1017/s0954394500000910
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7(4), 573–605. https://doi.org/10.1016/0010-0285(75)90024-9
Sankoff, G., & Blondeau, H. (2007). Language change across the lifespan: /R/ in Montreal French. Language, 83(3), 560–588. https://doi.org/10.1353/lan.2007.0106
Sosa, A. (2016). Lexical considerations in the treatment of speech sound disorders in children. Perspectives of the ASHA Special Interest Groups, 1(1), 57–65. https://doi.org/10.1044/persp1.sig1.57
Sosa, A. V., & Stoel-Gammon, C. (2006). Patterns of intra-word phonological variability during the second year of life. Journal of Child Language, 33(1), 31–50. https://doi.org/10.1017/s0305000905007166
Sosa, A. V., & Stoel-Gammon, C. (2012). Lexical and phonological effects in early word production. Journal of Speech, Language, and Hearing Research, 55(2), 596–608. https://doi.org/10.1044/1092-4388(2011/10-0113)
Storkel, H. L. (2018). Implementing evidence-based practice: Selecting treatment words to boost phonological learning. Language, Speech, and Hearing Services in Schools, 49(3), 482–496. https://doi.org/10.1044/2017_lshss-17-0080
Studdert-Kennedy, M. (1998). The particulate origins of language generativity: From syllable to gesture. In J. R. Hurford, M. Studdert-Kennedy, & C. Knight (Eds.), Approaches to the evolution of language (pp. 202–221). Cambridge University Press.
Tomasello, M. (2005). Constructing a language: A usage-based theory of language acquisition. Harvard University Press.
Tottie, G. (1991). Lexical diffusion in syntactic change: Frequency as a determinant of linguistic conservatism in the development of negation in English. In D. Kastovsky (Ed.), Historical English syntax (pp. 439–468). Walter de Gruyter.
Tyler, A. A., & Edwards, M. L. (1993). Lexical acquisition and acquisition of initial voiceless stops. Journal of Child Language, 20(2), 253–273. https://doi.org/10.1017/s0305000900008278
Zamuner, T. S., Gerken, L., & Hammond, M. (2004). Phonotactic probabilities in young children's speech production. Journal of Child Language, 31(3), 515–536. https://doi.org/10.1017/s0305000904006233
Zamuner, T. S., & Thiessen, A. (2018). A phonological, lexical, and phonetic analysis of the new words that young children imitate. Canadian Journal of Linguistics/Revue canadienne de linguistique, 63(4), 609–632. https://doi.org/10.1017/cnj.2018.10
Zipf, G. K. (1965). Human behavior and the principle of least effort: An introduction to human ecology. Hafner. (Original work published 1949)
27 Typical and Nontypical Phonological Development

MICHELLE PASCOE

27.1 Introduction

Most children acquire the phonology of their language/s with ease. They usually follow a seemingly effortless trajectory that is unremarkable to those familiar with the language/s and with young children. However, a relatively small proportion of children experience difficulties with this process and may need additional support from their families and from professionals such as speech and language therapists and teachers. Understanding typical phonological acquisition therefore forms the foundation for identifying and supporting children with speech sound difficulties. This chapter aims to describe the process of typical phonological acquisition, showing how knowledge of this process can be used to identify children who are not acquiring phonology in a typical way, so that appropriate intervention can be offered to them. The chapter starts with a rationale for the study of typical and nontypical phonological acquisition before moving on to a brief discussion of theoretical and methodological aspects that set the scene for the remainder of the chapter. I then provide a detailed description of typical phonological development. An important distinction is made between aspects of acquisition that are general across languages and aspects that are language specific. Rather than providing a detailed description of typical phonological development in one language, the chapter provides a global description of the general process of phonological acquisition that could apply to all children, irrespective of the specific language or languages they are learning. Children acquiring specific languages or language combinations are described at various points throughout the chapter to illustrate how these general patterns are realized in specific languages, and readers are directed to sources for more information about acquisition in specific languages. The final section focuses on nontypical phonological acquisition. When children's phonological development is nontypical, what do we observe, and how do we account for the differences observed? Here I return to the developmental phase model introduced earlier in the chapter to account for nontypical phonology. Case vignettes illustrate nontypical acquisition in some languages and language combinations.
27.2 Why Is It Important to Study Typical and Nontypical Phonological Development?

Typical phonological acquisition refers to the expected, predictable process that children follow from the time they are born until their early school years, learning to use and understand the sounds and sound structures of the languages they are exposed to. In this sense, typical phonological development refers to statistical similarity or frequency; that is, a child is considered typical when their phonology falls within a delineated range of acceptability based on what is commonly found for their age group and language (McLeod & Baker, 2017). Statistical probability is used to determine the acceptable range, based on research documenting children's phonological knowledge and abilities at different ages and on the cut-off scores from normative tests that indicate typical acquisition. Information on whether a child's speech development is on par with their peers can be obtained from norm-referenced standardized tools, in which the clinician compares a child's performance to that of typically developing peers (De Lamo White & Jin, 2011).

Typically developing children will have mastered most of the sound structures of their language/s by the time they start formal schooling, making them easily intelligible to those around them, including unfamiliar adults. However, they will not have completed language acquisition by this age and will continue developing aspects of language such as vocabulary, syntax, pragmatics, and literacy skills for some time. Thus, phonology is one of the earliest acquired and mastered aspects of language and, for many children, a process so natural as to be unremarkable. Knowing what to expect – and roughly when to expect it – is important for speech and language therapists (and teachers, parents, and health professionals) so that action can be taken to identify children who are not developing typically and to provide them and their families with support. Conversely, families whose children are developing as expected can be reassured that all is "on track". Babatsouli et al. (2018) note:

The laborious ant race involved in collecting and documenting norms in human language acquisition serves to develop common yardsticks that sketch out typical language acquisition. It is primarily on these grounds that nontypical phenomena are evaluated as such, that is, whether they meet the criteria of what is typical or not. (p. xxvi)
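As a concrete illustration of the norm-referenced comparison just described, here is a minimal sketch. The normative means, standard deviations, age bands, and the cut-off threshold are all invented for illustration; real instruments publish their own norms and diagnostic criteria.

```python
from statistics import NormalDist

# Hypothetical norms: PCC mean and standard deviation per age band.
NORM_PCC = {"2;0-2;5": (70.0, 8.0), "4;0-4;5": (88.0, 5.0)}
CUTOFF_Z = -1.25  # example threshold in SD units; actual cut-offs vary by tool

def screen(age_band, child_pcc):
    """Return the child's z-score, percentile, and a within-norms flag."""
    mean, sd = NORM_PCC[age_band]
    z = (child_pcc - mean) / sd
    percentile = 100 * NormalDist().cdf(z)
    return z, percentile, z >= CUTOFF_Z

z, pct, within_norms = screen("4;0-4;5", 76.0)
print(f"z = {z:.2f}, percentile = {pct:.0f}, within typical range: {within_norms}")
```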
When compared with typically developing children, children with communication impairments have been shown to have poorer social skills, more behavioral problems, and difficulties forming friendships, as they are often bullied and socially isolated (McLeod & Bleile, 2007; Tempest & Wells, 2012). They may also face challenges with academic work and literacy acquisition. Early identification of speech and other communication difficulties is therefore critical, given the important role of speech and its relationship to later academic performance, psychosocial well-being, and quality of life. However, to identify children at risk, speech and language therapists must be able to differentiate between typical and nontypical speech sound development.

When assessing children for phonological difficulties and planning intervention, it is vital for speech and language therapists to fully understand how children acquire speech in their home language or languages (McLeod & Bleile, 2007). Languages differ in the size of their consonant and vowel inventories and in how consonants contrast with each other: for example, voicing is a contrastive feature for English, Spanish, and Welsh, while aspiration is contrastive in Putonghua and Cantonese, and Urdu has both aspiration and voicing as contrastive features (Hua, 2006). Syllable shapes also vary across languages, as does whether consonant clusters are used (and where they can be used). For further analysis of different phonologies, see Hua and Dodd (2006) and van der
Merwe and Le Roux (2014). Hua and Dodd (2006) note that the pair or set of languages being learned may affect the type of phonological errors made; that is, exposure to two specific phonological systems leads to specific developmental error patterns.

Obtaining this general picture of how children develop phonology across a range of languages requires detailed study of phonological development in many different languages, and then crosslinguistic comparison of the process children undergo when acquiring each language or a particular set of languages. Hua and Dodd (2006), Bernhardt and Stemberger (2018), McLeod and Goldstein (2012), and McLeod (2007, forthcoming) have collated information on phonological acquisition in a wide range of languages to yield a picture of general and language-specific aspects of phonological acquisition.

Focusing specifically on bilingual populations, Hambly et al. (2013) systematically reviewed 23 studies of bilingual phonological acquisition, revealing a complex picture across the languages under study. In some cases, bilingual children showed no differences compared with monolingual children acquiring the same languages; in other cases, bilingual acquisition was slower than monolingual acquisition; and in still others, bilingual acquisition was more rapid than for monolingual peers. Hambly et al. (2013) found that bilingual children follow a slightly different trajectory in their phonological development compared with monolingual peers, and that the languages acquired often influence each other, although the degree of influence varies.

In summary, it is important to understand typical phonological development because it is the baseline against which judgments about nontypical development are made. Some aspects of phonological development are common across all languages, but other aspects are language specific. Knowledge of the general process of typical phonological development is thus an important starting point, but language-specific information is also essential to understand whether a child's phonology is developing in the expected way.
27.3 Theoretical Frameworks

Several theoretical models can be applied to children's phonological development, for example, Optimality Theory (Barlow & Gierut, 1999; Prince & Smolensky, 1993) and other constraints-based nonlinear phonological theories (see Chapter 23), and Government Phonology (see Chapter 25). A descriptive-linguistic approach describes a child's phonological output in terms of their phonetic and phonemic inventories, and uses metrics such as percentage of consonants correct (PCC) and percentage of vowels correct (PVC), intelligibility, and phonological processes. The phonetic inventory refers to the consonants and vowels a child can produce, regardless of the adult target and whether they are used in correct contexts. The phonemic inventory, in contrast, refers to the range of phonemes a child can use appropriately in words to indicate meaning and make meaningful contrasts.

Psycholinguistic approaches consider phonological development to involve three main components: phonological input processing, stored representations, and phonological output (Dodd, 2013; Rvachew & Brosseau-Lapré, 2016; Stackhouse & Wells, 2001). Children build knowledge of their phonology based on the input they receive, which is stored as phonological knowledge (and may be linked to semantic and syntactic knowledge). Phonological output then involves retrieving the appropriate phonological information and combining it with motor and articulatory abilities to physically realize the word.

Stackhouse and Wells' (2001) phase model takes a developmental approach to describing phonology. In this chapter, we use this developmental phase model (Figure 27.1) to describe the phases of children's speech development. The phase model is helpful in its longitudinal perspective on the child's journey from reflexive sounds, cooing, and babbling to intelligible speech, and it can be applied together with the linguistic theories detailed above and with a descriptive-linguistic approach.
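To make the descriptive-linguistic metrics just mentioned concrete, here is a minimal sketch of a PCC/PVC calculation. The sample items, the vowel set, and the segment-by-segment alignment of target and production are simplifying assumptions; real clinical scoring works over phonetic transcriptions with careful alignment.

```python
# Minimal PCC/PVC sketch; transcriptions and targets are invented toy data.
VOWELS = set("aeiouə")

def pcc_pvc(samples):
    """Percentage of consonants/vowels matching the adult target,
    assuming target and production align segment by segment."""
    c_total = c_correct = v_total = v_correct = 0
    for target, produced in samples:
        for t, p in zip(target, produced):
            if t in VOWELS:
                v_total += 1
                v_correct += t == p
            else:
                c_total += 1
                c_correct += t == p
    return 100 * c_correct / c_total, 100 * v_correct / v_total

samples = [("go", "do"), ("kat", "tat"), ("mama", "mama")]
pcc, pvc = pcc_pvc(samples)
print(f"PCC = {pcc:.0f}%, PVC = {pvc:.0f}%")  # PCC = 60%, PVC = 100%
```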
[Figure: the five overlapping phases of the developmental phase model in normal development: Pre-lexical, Whole-word, Systematic simplification, Assembly, Metaphonological]
Figure 27.1 The developmental phase model (adapted from Pascoe et al., 2006; Stackhouse & Wells, 2001 with permission from John Wiley and Sons).
The Stackhouse and Wells (2001) developmental phase model describes five phases of typical phonological development, noting that a child does not need to complete a phase before moving on to the next – phases can overlap. The pre-lexical phase occurs in the first year of a typically developing child's life. The elements of the child's speech processing system (input, stored representations, and output) are being integrated, but the child has yet to produce recognizable speech. Characterized by unrecognizable speech (or babbling), this is an important prerequisite phase for developing first words, as there is a continuous link between babbling and first-word vocalizations. In the second year of life, the whole-word phase occurs, in which single words are used to communicate. The systematic simplification phase occurs next, during which processes of phonological simplification are observed. Single words are then combined to form phrases in the assembly phase, described as "bringing it all together," with single words incorporated into connected speech and used for various communicative purposes. At an early school age, the metaphonological phase ensues. This entails understanding and reflecting on language in a more abstract way to link speech and literacy. Figure 27.1 summarizes the developmental phase model.
27.4 Methodological Considerations

To understand the nature of typical phonological development, research is needed that documents the process – what Babatsouli et al. (2018) call "the laborious ant race." Hua (2006) sets out guidelines for designing normative studies of phonology, including participant inclusion criteria and the type of stimuli to use. The first stage in the research cycle investigates and documents the typical developmental patterns of children acquiring a particular language or languages. Stage 2 involves consideration of children acquiring the same languages but doing so in nontypical ways, whether due to known etiologies (e.g., hearing impairment, childhood apraxia of speech, cleft lip and/or palate) or unknown ones. Stages 3 and 4 then involve wider comparison of both typical and nontypical patterns with those observed in other languages, to test hypotheses about language universals or to generate new hypotheses. Thus, researchers focusing on a specific language provide language-specific information that is essential for understanding and evaluating children acquiring that language, but from a theoretical standpoint they also contribute to a global picture of phonological acquisition. General information about the nature of phonological acquisition across languages is based
on a comparison of many studies of different languages (i.e., Hua's Stages 3 and 4); see, for example, Hua and Dodd (2006), Bernhardt and Stemberger (2018), McLeod and Goldstein (2012), and McLeod (2007). Extensive research on children's speech sound acquisition has been conducted, much of it involving assessment of monolingual, English-speaking children in developed contexts. However, there is a need for many more studies of typical phonological acquisition in under-researched languages beyond English. For many languages, Stage 1 and Stage 2 information is minimal or not yet available. For example, in South Africa there are not yet any published studies describing typical phonological acquisition of some of the country's official languages, such as Xitsonga, Tshivenda, and isiNdebele, and only a handful of studies (e.g., Naidoo et al., 2005; Pascoe & Jeggo, 2019) describe typical phonological acquisition of isiZulu, the most frequently spoken home language in the country (Statistics South Africa, 2012). There are no published studies on nontypical phonological development in isiZulu-speaking children. Without this foundation of knowledge about "what to expect when," it is impossible to make an evidence-based diagnosis, and children might be over- or under-diagnosed with speech difficulties, leading to inappropriate interventions or missed opportunities.

Languages with published speech acquisition norms include Jordanian Arabic, Lebanese Arabic, Cantonese, Dutch, Filipino, Finnish, French, German, Greek, Hungarian, Israeli Hebrew, Japanese, Korean, Maltese, Norwegian, Portuguese, Putonghua (Modern Standard Chinese), Sesotho, Spanish, Thai, Turkish, Vietnamese, Welsh, and Zapotec (see McLeod, 2007, forthcoming), and new studies are consistently being published. In a clinical situation, bilingual children need to be compared with typically developing bilingual peers. However, there is a lack of normative data for many languages and language combinations, although this body of literature is growing (De Lamo White & Jin, 2011). Different languages and language pairs influence one another in different ways; that is, data from the study of one language do not necessarily hold for other languages, and data from a language pair spoken by a bilingual child do not necessarily predict findings for other language pairs, even when one language of the pair is the same.

McLeod and Baker (2017) caution that the notion of what is typical is linked to what is judged acceptable or desirable in a given context. Even within one language, there are many acceptable ways for adults to produce the words and sounds of their language/s. Language variation depends on a range of factors such as geographical location, age, gender, and socio-economic status, and means that when evaluating a child's phonology we need to consider what is acceptable or typical for the given speech community. There may be slight differences between varieties of a language: in McLeod's (2007) text, General American English is compared with three other American English dialects (African American English, Appalachian English, and Cajun English) because these varieties are considered distinct from one another and differ phonologically in several ways. The range of what is considered typical may therefore differ depending on the population used to derive normative data.
Norm-referenced standardized tools are, however, culturally biased in that they assume that all children have had similar life experiences and that their knowledge of concepts and vocabulary is the same. To overcome this cultural bias, De Lamo White and Jin (2011) proposed that speech and language therapists working with children from bilingual and/or different cultural backgrounds should familiarize themselves with the child's culture and with how language is used by the child's family, including differences in the dialects spoken. Many factors influence children's phonological acquisition, even for children acquiring the same language. Aspects such as gender and caregivers' educational level have been investigated in several studies, but overall findings remain inconclusive (Clausen & Fox-Boyer, 2017). Group study data are useful for examining larger questions statistically, but information about individual children is also important because it shows what is possible in a language (Bernhardt & Stemberger, 2018).
27.5 Typical Phonological Development

27.5.1 Overview

Phonological acquisition is the process of acquiring the sounds and speech rhythms of one's language/s. It is a process that occurs effortlessly in the first few years of a child's life, based on the language exposure and input around them. Children acquire the sounds particular to their languages, learning how to make the sound segments, combine them in appropriate ways to convey meaning, and recognize these sounds when used in combination by those around them. Although much of this chapter focuses on segmental phonology, suprasegmental aspects of language are also an important part of phonological acquisition, deserving of detailed study. Phonological development occurs alongside other aspects of language development – vocabulary, syntax, and pragmatics – and although much has been written about speech development in isolation from other aspects of language development, researchers are increasingly focusing on the overlap between different language domains (e.g., Byun & Tessier, 2016; Kehoe & Havy, 2019) and the mutual relationship between domains in the process of development.

In the following sections, I describe the typical process of phonological development from birth to the early years of formal schooling. Some aspects of phonological acquisition are general across languages and others are language specific. Rather than describing phonological development for a specific language or languages, phonological development is described in a general way that could be applied to any child acquiring any language or combination of languages. Each age category is then exemplified by describing children acquiring some specific languages.
27.5.2 Birth to Three Years

The period from birth until a child's third birthday is a time of extremely rapid and dramatic developmental change. By the end of this phase, children can walk, talk, and carry out some basic self-care tasks with minimal assistance. In terms of phonology, they move from producing only reflexive vocalizations in the early months of life to producing intelligible speech near the end of this period. This age period sees children moving from the pre-lexical phase of the Stackhouse and Wells (2001) model into the single word phase and then the systematic simplification phase.

In the first year of life, or the pre-lexical phase, infants communicate through vocalizations, crying, and gesture – and while they may not be able to speak intelligibly, they are fully engaged in absorbing and processing the sounds of the language/s around them. Although phonology is typically described through observable characteristics of the child's speech, authors such as Rvachew and Brosseau-Lapré (2016) and Stackhouse and Wells (2001) emphasize that the phonological system involves processing of input, storage of phonological knowledge, and phonological output. Thus, in the first year of life there may be minimal phonological output, but phonological processing and the storing of knowledge are taking place. Babbling may be reduplicated (e.g., /mama/) or variegated (e.g., /maba/), and the production of these forms is evidence of this "behind the scenes" processing. When children combine sounds of their language into nonsensical strings, they may not convey recognizable meaning; still, the strings typically follow the phonotactic rules of their languages and often have the CV syllable/word shape – a universal syllable shape that appears first in many languages. This suggests that children are actively organizing and storing phonological information about their languages from birth, or even before. According to Levitt and Utman (1992), 11 months is the age at which languages begin to show different CV repertoires when compared cross-linguistically.
First words usually occur between one and two years of age, marking the start of Stackhouse and Wells' (2001) single word phase. Vihman and Croft (2007) suggest that children learn to produce these single words using phonological word templates appropriate for their language/s. Other authors suggest that first words reflect adult projections of what adults imagine children might be saying – e.g., the babbling string [dadadada] is interpreted as "daddy" – and that this acknowledgement then promotes more intentional word use. Children's first words normally reflect the words they hear in their environment and follow the phonotactic rules of their language/s. The phonotactic constraints of their native language/s are adhered to after the early babbling stage, and phonemes that are not part of the phonology of their language/s are excluded (Levitt & Utman, 1992).

Phonetic inventories – that is, consonants and vowels produced independently, regardless of the adult target – have been studied in many languages. Researchers suggest that vowels, nasals, and plosives are the earliest sounds produced by young children across languages. For example, Amayreh and Dyson (2000) described the phonetic acquisition of children (aged 1 to 2 years) acquiring Jordanian Arabic and indicated that plosives [b, d, t, ʔ] appeared, along with fricatives [ʃ, ʕ, ħ, h], nasals [m, n], lateral [l], and approximants [w, j]. Maltese children (aged 2;0) have nasals [m, n], plosives [p, b, t, d, k, ʔ], fricative [h], and approximants [w, l, j] in their inventories (Grech, 1998). Tones also appear to be acquired early when used in languages to signal meaning (Hua & Dodd, 2006).

By the end of the second year of life, children will have acquired many of the sounds of their language/s and will be combining these in appropriate ways to produce many intelligible words and word combinations. There is great variation in young children's phonology under two years of age; from that age, however, they appear to shift to a rule-based strategy of phonological learning, with many predictable patterns emerging, even across languages. The exact age at which children master the ability to use and contrast consonants correctly varies widely, necessitating the collection of normative data for individual languages or language combinations even when languages share similar consonants. Percentage of consonants correct (PCC; Shriberg et al., 1997) is a metric that has been used to capture children's ability to use consonants correctly in relation to adult targets; the parallel measure, percentage of vowels correct (PVC), captures vowel accuracy. Studies show that most two-year-olds can produce consonants correctly at least 70% of the time. Intelligibility refers to the understandability of a child's speech – how successful a child is in getting their message across to others. Overall, two-year-olds are thought to be intelligible at least 50% of the time.

In this phase, children steadily acquire the vowels and consonants of their language/s and learn to use these sounds in appropriate word positions and word shapes to make the relevant contrasts. However, child speech typically sounds immature and is not yet like that of the adults around them. When we consider typically developing children acquiring the same language, we find that they tend to make the same "errors" or simplifications.
Phonological processes, errors, or patterns are the ways children simplify their phonology according to a set framework of rules (Hua & Dodd, 2006). These simplifications characterize Stackhouse and Wells' (2001) third developmental phase, the systematic simplification phase. Some processes that are nontypical in one language are typical in another, some are language specific, and some appear common across most if not all languages. For example, where languages have consonant clusters, cluster reduction always seems to occur; simplification of syllable structure is common across languages; and assimilation, weak syllable deletion, stopping, and fronting seem widely used. Hua and Dodd (2006) note: "Language-specific variation is also evident among languages in many ways. In particular, what counts as a typical or an unusual error pattern is language specific" (p. 439). These authors note that the language-specific nature of typical error patterns most likely reflects the interaction between developmental universals and the characteristics of an individual language. For bilingual children, it may
reflect the interaction between the two phonological systems being acquired. Bernhardt and Stemberger (2018) note that most phonological substitutions come from within a language's phonetic inventory. Crosslinguistic comparisons of these data indicate that some phonological processes do occur in many languages, for example, fronting of velars, but that they are not necessarily overcome at the same age (Grech, 2006). Additionally, language-specific phonological processes can also be observed.

Alex and Busi are two children in this age group who are acquiring different languages; here we consider how their phonology is developing.

CHILD A: Alex lives with his English-speaking parents and two older brothers. He celebrated his first birthday just a few weeks ago, and although he cannot walk on his own, he is an enthusiastic explorer who loves crawling and can pull himself up to stand. Alex is in the pre-lexical phase but vocalizes readily, saying /wuwuwu/ when he is excited and /a a a a:/ in what his mother describes as his specific "asking voice" when he wants something. He can use six consonants in initial position [b, d, g, m, h, w] and sometimes uses [m] in final position.

CHILD B: Busi is an 18-month-old child acquiring isiXhosa from her parents, who are both first-language isiXhosa speakers and proficient English speakers. Busi has acquired all the isiXhosa vowels and many of the consonants of her language, which she uses appropriately in single words. She uses most of the nasals in words like "imali" (= money), "inja" (= dog), and "ilanga" (= sun), and stops in words such as "ibhola" (= ball), "iti" (= tea), "pheka" (= cook), "ihagu" (= pig), "ikati" (= cat), and "ipapa" (= porridge). Fricatives and affricates are still emerging, although she can say /tʃ'/ in "iwotshi" (= watch), and she has already acquired some of the clicks of isiXhosa. Busi's parents consider her to be intelligible about 50% of the time, and her PCC is estimated at 80%.
27.5.3 Three to Five Years

Between the ages of three and five years, children become more independent. If they attend a preschool or daycare away from their families, they have the opportunity for exposure to many more conversational partners, languages, and experiences. Three-year-olds generally have spontaneous speech that is at least 50% intelligible to unfamiliar adults and about 75–100% intelligible to familiar adults (Bowen, 2011). There is considerable variation between children, but these figures provide general guidelines indicating how early children master intelligible speech, irrespective of the language/s spoken.

Children in this age band are still in the systematic simplification phase. They are still likely to be using phonological processes, although these start to be eliminated from around the age of three in most languages; indeed, some phonological processes can persist well beyond this phase, until children are six or seven, although the exact details of the processes, and the ages at which they can be expected to disappear, vary from language to language. While children of this age are firmly in the systematic simplification phase, they also move into the assembly phase, where they can now assemble phonemes into more complex words, and words into phrases and sentences. At the same time, this assembly involves using appropriate suprasegmental aspects of their languages, such as tones, stress, and intonation, to convey their meaning.

Preschool children aged 4–5 years are considered intelligible most of the time, even when conversing with strangers, and five-year-olds are estimated to produce consonants correctly 90% of the time (McLeod & Baker, 2017). Children between three and five have increased syllable and word shape inventories and can produce more combinations of
vowels and consonants, and longer words or word combinations. This is the period when children can accurately produce the majority of their consonants. As children grow older, their phonetic inventories expand so that they can produce more sounds in more combinations. Let us now consider three children in this age group. CHILD C: Christian (aged 36 months) is acquiring Danish. His parents report that he is easy to understand most of the time, and confidently communicates with unfamiliar people. He still finds the Danish speech sound [ɕ] difficult to produce and typically uses the phonological process of fronting for this sound. However, /ɕ/ occurs in very few Danish words, and thus it is not usually noticed as a problem for Christian; it was only revealed in a routine speech assessment. Christian has already acquired several initial and final consonant clusters, for example, in "snemand" (= snowman), "prinsesse" (= princess), "blomst" (= flower) and "tandbørste" (= toothbrush). However, clusters consisting of three elements like "stjerner" (= stars) and the cluster /fj/ in "fjer" (= feather) are yet to be acquired. Overall his PCC is estimated at 90% and his PVC at 100%, despite the complex Danish vowel system (based on Clausen & Fox-Boyer, 2017). CHILD D: Dilva (aged 4;4) is acquiring Kurdish and lives with her parents and grandmother, who are first-language speakers of Kurdish in Iran. She has been using all the vowels of her language from before the age of three and has now acquired all the consonants of her language apart from /ʤ, ɣ, ʒ, z, g/. Many of her phonological processes have resolved over the past year, including medial consonant deletion, deaffrication, stopping, weak syllable deletion, affrication, voicing, metathesis, final consonant deletion, and gliding. Fronting and final devoicing persist in her speech, but her family considers her speech intelligible almost all the time. She is well into Stackhouse and Wells' (2001) assembly phase, able to use her phonology in complex ways to convey meaning (based on Syadar et al., 2021). CHILD E: Elisa (age 5;0) is a bilingual Spanish–English speaking child who attends preschool. Her input and ability to use both languages are judged as roughly equal by her parents, and her PCC in both languages approximates 100%. In her Spanish, she makes few substitution errors, occasionally using a flap for a trill and [l] for a flap. Flaps and trills are some of the last acquired sounds in Spanish, and, therefore, this is entirely expected. In English, she occasionally substitutes [j] for /l/. Overall her production of initial clusters is slightly more accurate in Spanish than English, but both exceed 80% accuracy. Like Dilva, Elisa is successfully leaving the systematic simplification phase behind her and is comfortably developing her skills in the assembly phase, and possibly even the metaphonological phase, as she starts to develop her phonological awareness and metalinguistic awareness in the preschool context (based on Goldstein et al., 2005).
27.5.4 Six Years and Beyond From the age of six, most typically developing children will be entering the metaphonological phase. This is the phase in which they become aware of the sound structure of their language/s and start to use this as a basis for developing their literacy, usually through formal schooling. Children in this phase may also show evidence of a few last systematic simplifications. The ages at which these remaining systematic simplifications are expected to be eliminated vary from language to language, and there may be a fairly broad range for children even within a language or language combinations. Children will also continue to master aspects of phonological assembly as they develop their metaphonological skills. The age of phonetic inventory completion varies from language to language – for example, as young as 3;11 for Maltese-speaking children but not until 7;0 for English-speaking
children (Hua & Dodd, 2006). Wells and Stackhouse (2016) go up to age 14 in their developmental phase model of intonation, focusing on suprasegmental aspects related to advanced skills such as reading aloud, dialect-specific phonetic exponents of tones and establishment of a tone "lexicon" for reading, drama and other metalanguage tasks. The perception of suprasegmental aspects continues to develop until 10 or 11 years (Wells et al., 2004). During the school years, children continue to master some of the more complex aspects of language – what McLeod and Baker (2017, p. 176) describe as "the finishing touches": the ability to perceive subtle differences in speech, to produce challenging consonant strings in English words like "statistical" and "Massachusetts," and to use stress and timing in complex words and sentences.
27.6 Nontypical Phonological Development Most children develop their phonology with no problems and do not require any intervention. However, some children experience difficulties with this process – either as part of a general difficulty with language acquisition (for example, syntax and vocabulary may also be challenging for the child) or in isolation, with phonology alone affected. Difficulties may be linked to a specific cause or etiology – for example, the child has a hearing impairment or was born with a cleft lip and/or palate – or there may be no known etiological factors involved. Nontypical phonological development can be complex to explain but usually arises from structural, articulatory, behavioral, biological, and cognitive factors (Bowen, 2015). Here we return to the Stackhouse and Wells (2001) developmental phase model to understand the difficulties that might be experienced. Figure 27.2 shows the phases of the developmental phase model linked to difficulties that might occur in any of these phases.
[Figure 27.2 Developmental phase model indicating difficulties that could occur at each phase (adapted from Pascoe et al., 2006; Stackhouse & Wells, 2001, with permission from John Wiley and Sons). The figure maps the phases of normal development – pre-lexical, whole-word, systematic simplification, assembly, metaphonological – onto corresponding difficulties in speech development: hearing and 'structural' problems; apraxia; phonological impairment/delay; dysfluency, mumbling and prosodic difficulties; and literacy difficulties/dyslexia.]
If the individual components – input, stored representations and output – of a child's phonological system do not come together in the pre-lexical stage of development, difficulties may be noted in this phase, and children may struggle to proceed into the subsequent stages. When hearing impairment or structural difficulties, such as cleft lip and palate, are left unaddressed, children's phonological development may not proceed at the normal rate
to the whole word phase. Some children might have difficulties with proceeding beyond the whole word phase. They may be unable to store and access phonological templates for whole words, possibly because of childhood apraxia of speech. Other children do not move through the systematic simplification phase with the same ease as typically developing peers, experiencing delays when phonological processes are not eliminated as expected. Some children might find themselves stuck in the assembly phase, unable to master this phase because of dysfluent speech, unclear speech or prosodic difficulties. Children with difficulties in the metaphonological phase may struggle to reflect on their speech and may experience difficulties with acquiring literacy. Pascoe et al. (2006) note that some children become arrested at an early stage of development, and then present with successive challenges in other phases in relation to their typically developing peers. Stackhouse and Wells' developmental phase model is one way of accounting for difficulties that may occur in children's speech. Many other models account for nontypical speech. Dodd (2013) used a diagnostic model in which children with phonological difficulties are classified as having delayed or disordered speech. A delay is characterized as development which is typical in nature but occurs at a later chronological age than expected. For example, when a five-year-old uses expected phonological processes for a given language and has consonants and vowels that are part of the repertoire of that language, but does so in a way that is similar to what is expected for a three-year-old, a delay is likely. Knowing a child's age is, thus, important because what may be typical for a three-year-old will be considered nontypical for a six-year-old. A disorder refers to unusual processes that are not expected in a language and the use of consonants and vowels that are not expected for the language. Phonological disorder can then be classified as disorders of a consistent type – systematic use of unexpected phonological rules, that is, error patterns that are atypical – or disorders of an inconsistent type, that is, variable production of the same words or phonological features in the same contexts (Dodd, 2013). Research focusing on nontypical phonology – what Hua (2006) terms stage 4 of the research process for phonological acquisition – is even more limited for many languages than research focusing on typical acquisition. This is logical, considering that identifying a child with nontypical phonological development requires knowledge of what is typical. If there are no clear guidelines or normative data about "what to expect when" in a child's phonological development, identifying the child as having nontypical phonology will be hard – if not impossible. Intervention studies with bilingual isiXhosa-English speaking children in South Africa exemplify this point (Rossouw & Pascoe, 2018). To undertake an intervention project with young children with speech difficulties, tools first had to be developed to evaluate phonological acquisition in isiXhosa and South African English. These tools went through a validation and pilot process, and were then used to obtain information about what is typical for young children acquiring these languages.
Only once that work had been done were the researchers able to use the tools to identify children with speech difficulties and document the nature of their difficulties and the impact of the intervention. Studies of nontypical phonological acquisition in specific languages have been examined and compared to generate information about phonological difficulties common to all languages. These include the findings that vowels are more resistant to phonological disorder (although vowel disorders do occur – see Chapter 28) and that tones appear to be acquired early and to be resistant to phonological impairment (Hua & Dodd, 2006). We also know that phonological delay accounts for the greatest proportion of phonological difficulties (Fox & Dodd, 2001). Now let us consider some children with nontypical phonological development. CHILD F: Feng is a 3;7-year-old male acquiring Cantonese as his main language. He has some consonants missing from his inventory, including /k, kʰ, kʰw/, but the consonants and vowels in his inventory are all ones expected in Cantonese. He uses four error patterns typical in Cantonese but normally not present in a child of this age (cluster reduction,
fronting, final consonant deletion and final glide deletion). Feng appears to have a phonological delay. His speech is similar to that of a younger child. Regarding the developmental phase model, he appears to be stuck in the systematic simplification phase and may need intervention to help him catch up with his age-matched peers. Children with phonological delays usually respond well to intervention (based on So, 2006). CHILD G: Giuseppe (age 4;2 years) has hearing within normal limits and developmental milestones that were normal. His parents are fluent speakers of Italian and English, and Giuseppe has been exposed to both languages since birth. His parents consider that similar amounts of English and Italian are spoken at home. Most people outside the family have difficulty understanding him in either language. Giuseppe's variable productions of words – he pronounces his name as /depi/, /epi/, /bepi/ – are a cause for concern. Intelligibility of single words is fair; however, connected speech is very difficult to understand. In Italian, his phonetic inventory includes 15 of the 23 consonants. Giuseppe produces a bilabial fricative, a non-Italian phoneme, several times. Giuseppe's speech shows a very unpredictable error pattern with inconsistent use of phonological processes such as stopping, voicing, devoicing, assimilation, epenthesis, weak syllable deletion, backing, and fronting. In English, Giuseppe shows 16 of the 24 consonants in his consonant inventory. He inconsistently uses processes such as stopping, fronting, gliding, assimilation, weak syllable deletion, and final consonant deletion. The inconsistency noted in his Italian is thus also evident in his English. Although Giuseppe's speech is like that of a younger child in many regards, the main indicator of disorder is his inconsistency. Inconsistency to the degree described here is not characteristic of typical development (based on Holm & Dodd, 1999).
27.7 Conclusion This chapter provided an overview of children’s remarkable journey from unintelligible babbling to intelligible speech in just a few years. The chapter described typical phonological acquisition based on what is known about many languages and language combinations. There are general trends in phonological acquisition that can be applied to most languages but also some language-specific aspects. Readers interested in typical and nontypical phonological development in specific languages may access the relevant references to obtain more detailed information. Knowledge about typical acquisition is essential for understanding and identifying difficulties that may occur when phonological acquisition is nontypical. A developmental phase model was used as a theoretical framework to explain the trajectory of typical development and the challenges that may arise with nontypical acquisition. While there is a substantial body of knowledge about typical phonological acquisition and what can go wrong, there is a great need for more research into languages not yet well documented, for both practical purposes and contribution to theory.
REFERENCES Amayreh, M. M., & Dyson, A. T. (2000). Phonetic inventories of young Arabic-speaking children. Clinical Linguistics & Phonetics, 14(3), 193–215. Babatsouli, E., Ingram, D., & Müller, N. (Eds.). (2018). Crosslinguistic encounters in language acquisition: Typical and nontypical development. Multilingual Matters.
Barlow, J. A., & Gierut, J. A. (1999). Optimality theory in phonological acquisition. Journal of Speech, Language, and Hearing Research, 42(6), 1482–1498. Bernhardt, B. M., & Stemberger, J. P. (2018). Investigating typical and protracted phonological development across languages.
In E. Babatsouli, D. Ingram, & N. Müller (Eds.), Crosslinguistic encounters in language acquisition: Typical and nontypical development (pp. 71–108). Multilingual Matters. Bowen, C. (2011). Table 1: Intelligibility. Retrieved from http://www.speech-language-therapy.com Bowen, C. (2015). Children's speech sound disorders (2nd ed.). John Wiley. Byun, T. M. A., & Tessier, A. M. (2016). Motor influences on grammar in an emergentist model of phonology. Language and Linguistics Compass, 10(9), 431–452. Clausen, M. C., & Fox-Boyer, A. (2017). Phonological development of Danish-speaking children: A normative cross-sectional study. Clinical Linguistics & Phonetics, 31(6), 440–458. De Lamo White, C., & Jin, L. (2011). Evaluation of speech and language assessment approaches with bilingual children. International Journal of Language & Communication Disorders, 46(6), 613–627. Dodd, B. (2013). Differential diagnosis and treatment of children with speech disorder. Wiley & Sons. Fox, A. V., & Dodd, B. (2001). Phonologically disordered German-speaking children. American Journal of Speech-Language Pathology, 10(3), 291–307. Goldstein, B. A., Fabiano, L., & Washington, P. S. (2005). Phonological skills in predominantly English-speaking, predominantly Spanish-speaking, and Spanish–English bilingual children. Language, Speech and Hearing Services in Schools, 36(3), 201–218. Grech, H. (1998). Phonological development of normal Maltese-speaking children. Unpublished dissertation. University of Manchester (UK). Grech, H. (2006). Phonological development of Maltese-speaking children. In Z. Hua & B. Dodd (Eds.), Phonological development and disorders in children: A multilingual perspective (pp. 135–178). Multilingual Matters. Hambly, H., Wren, Y., McLeod, S., & Roulstone, S. (2013). The influence of bilingualism on speech production: A systematic review. International Journal of Language & Communication Disorders, 48(1), 1–24. Holm, A., & Dodd, B. (1999). Differential diagnosis of phonological disorder in two bilingual children acquiring Italian and English. Clinical Linguistics & Phonetics, 13(2), 113–129. Hua, Z. (2006). The need for comparable criteria in multilingual studies. In Z. Hua & B. Dodd (Eds.), Phonological development and disorders in children: A multilingual perspective (pp. 15–22). Multilingual Matters. Hua, Z., & Dodd, B. (2006). Towards developmental universals. In Z. Hua
& B. Dodd (Eds.), Phonological development and disorders in children: A multilingual perspective (pp. 431–449). Multilingual Matters. Kehoe, M., & Havy, M. (2019). Bilingual phonological acquisition: The influence of language-internal, language-external, and lexical factors. Journal of Child Language, 46(2), 292–333. Levitt, A. G., & Utman, J. G. A. (1992). From babbling towards the sound systems of English and French: A longitudinal two-case study. Journal of Child Language, 19(1), 19–49. McLeod, S. (2007). The international guide to speech acquisition. Thomson Delmar Learning. McLeod, S. (Ed.). (Forthcoming). The Oxford handbook of speech development in languages of the world. Oxford University Press. McLeod, S., & Baker, E. (2017). Children’s speech: An evidence-based approach to assessment and intervention. (Always learning). Pearson. McLeod, S., & Bleile, K. M. (2007). The ICF and ICF-CY as a framework for children’s speech acquisition. In S. McLeod (Ed.), The international guide to speech acquisition (pp. 2–7). Delmar Learning. McLeod, S., & Goldstein, B. (Eds.). (2012). Multilingual aspects of speech sound disorders in children (Vol. 6). Multilingual Matters. Naidoo, Y., Van der Merwe, A., Groenewald, E., & Naudé, E. (2005). Development of speech sounds and syllable structure of words in Zulu-speaking children. Southern African Linguistics and Applied Language Studies, 23(1), 59–79. Pascoe, M., & Jeggo, Z. (2019). Speech acquisition in monolingual children acquiring isiZulu in rural KwaZulu-Natal, South Africa. Journal of Monolingual and Bilingual Speech, 1(1), 94–117. Pascoe, M., Stackhouse, J., & Wells, B. (2006). Persisting speech difficulties in children. (Book 3 in Children’s Speech and Literacy Difficulties Series). Wiley and Sons Ltd. Prince, A., & Smolensky, P. (1993). Optimality theory: Constraint interaction in generative grammar (Tech. Rep. No. 2). Rutgers Center for Cognitive Science, Rutgers University. Rossouw, K., & Pascoe, M. (2018). Intervention for bilingual speech sound disorders: A case study of an isiXhosa-English-speaking child. South African Journal of Communication Disorders, 65(1), 1–10. Rvachew, S., & Brosseau-Lapré, F. (2016). Developmental phonological disorders: Foundations of clinical practice. Plural Publishing. Shriberg, L. D., Austin, D., Lewis, B. A., McSweeny, J. L., & Wilson, D. L. (1997). The percentage of consonants correct (PCC) metric: Extensions and reliability data.
Journal of Speech, Language, and Hearing Research, 40(4), 708–722. https://doi.org/10.1044/jslhr.4004.708 So, L. K. H. (2006). Cantonese phonological development: Normal and disordered. In Z. Hua & B. Dodd (Eds.), Phonological development and disorders in children: A multilingual perspective (pp. 109–134). Multilingual Matters. Stackhouse, J., & Wells, B. (2001). Children's speech and literacy difficulties 2: Identification and intervention. Whurr Publishers. Statistics South Africa. (2012). Census 2011: Census in brief. Report no.: 03-01-41. Statistics South Africa. Syadar, S. F., Zarifian, T., Pascoe, M., & Modarresi, Y. (2021). Phonological acquisition in 3- to 5-year-old Kurdish-speaking children in Iran. Journal of Communication Disorders, 93, 106141.
Tempest, A., & Wells, B. (2012). Alliances and arguments: A case study of a child with persisting speech difficulties in peer play. Child Language Teaching and Therapy, 28(1), 57–72. Van der Merwe, A., & Le Roux, M. (2014). Idiosyncratic sound systems of the South African Bantu languages: Research and clinical implications for speech-language pathologists and audiologists. South African Journal of Communication Disorders, 61(1). https://doi.org/10.4102/sajcd.v61i1.86 Vihman, M., & Croft, W. (2007). Phonological development: Toward a "radical" templatic phonology. Linguistics, 45(4), 683–725. Wells, B., Peppé, S., & Goulandris, N. (2004). Intonation development from five to thirteen. Journal of Child Language, 31(4), 749–778. Wells, B., & Stackhouse, J. (2016). Children's intonation: A framework for practice and research. Wiley Blackwell.
28 Vowel Development and Disorders
KAREN POLLOCK AND CAROL STOEL-GAMMON
28.1 Introduction This chapter provides an overview of vowel development and vowel disorders in children. The focus is limited to investigations of English-speaking children. The body of work on the acquisition of vowels is very small compared with research on consonants. Most clinical studies are based on phonetic transcription and focus on patterns of accuracy and types of errors that occur. Hence, these are the studies that will be reviewed in the next sections. For a review of vowel development and disorders with an emphasis on acoustic studies, see Kent and Rountrey (2020). The chapter begins with a description of the vowel system of American English; this description will serve as a framework for the discussion of vowel acquisition and vowel disorders that follows. In terms of physiology, vowels are characterized as having no obstruction in the vocal tract. In terms of articulation, there is a fundamental difference between place of articulation for vowels and consonants: consonant place distinctions are discrete – a consonant stop is labial or coronal but not somewhere in between – whereas place features for vowels are continuous, varying in small steps from high to low, or front to back. In terms of phonology, vowels differ from consonants regarding their role within a word, phrase, or sentence: they typically form the syllable nucleus and are the elements that carry stress, pitch and basic aspects of rhythm. They are also the elements that tend to differ most across dialects. Regardless of the language, articulatory descriptions of vowels include a basic set of features: (a) vowel height, based on tongue and jaw position; (b) front-back, based on the position of the tongue; and (c) rounding (or not), based on the lips. The acoustic realization of tongue height is the first formant (F1), which is inversely related to tongue height (i.e., high vowels have low F1 values and low vowels have high F1 values), while the acoustic realization of tongue advancement is directly related to the second formant, or F2 (front vowels have high F2 values and back vowels have low F2 values). Tongue height appears to be more important to vowel identity than tongue advancement, and some languages, for example, Kabardian, have vowel systems that differ only along the height dimension. For many vowel systems, additional features are used to distinguish phonemic contrasts; common secondary features include phonemic distinctions based on vowel length, nasalization, rhoticity, or vocal quality (creaky vs. normal phonation).
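The inverse relationship between F1 and tongue height, and the direct relationship between F2 and tongue advancement, can be made concrete with a short sketch. The formant cutoffs below are illustrative assumptions only; real values vary with speaker age, sex, and vocal tract size.

```python
def classify_vowel(f1_hz, f2_hz):
    """Rough vowel classification from the first two formants.

    F1 is inversely related to tongue height (high vowels -> low F1);
    F2 is directly related to advancement (front vowels -> high F2).
    The cutoffs are illustrative assumptions, not normative values.
    """
    if f1_hz < 400:
        height = "high"
    elif f1_hz < 600:
        height = "mid"
    else:
        height = "low"
    advancement = "front" if f2_hz > 1500 else "back"
    return height, advancement

print(classify_vowel(300, 2300))  # ('high', 'front'), e.g. a vowel like [i]
print(classify_vowel(700, 1100))  # ('low', 'back'), e.g. a vowel like [ɑ]
```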
Many languages also have complex vowels, such as diphthongs and triphthongs. Diphthongs have two different targets within the same nucleus, one of which is more prominent. Similarly, triphthongs have three different targets. In English, the first target of a diphthong is always prominent; this type of diphthong is called a falling diphthong. However, other languages have rising diphthongs, in which the second element is more prominent. Languages vary in the number of vowels in their phonological system, from as few as two phonologically contrastive vowels (e.g., Margi) to 16 or more (e.g., Swedish; Urdu). The number of diphthongs and triphthongs also varies considerably, with some languages (e.g., Spanish) having none, and others having more than 10 (e.g., Cantonese). However, the number of diphthongs reported is related to the phonological description of the language in question. Phonetically similar complex vowels may be described in one language as diphthongs and in another as a sequence of vowel + glide or glide + vowel. For example, in English, the pronunciation of the word "why" is generally interpreted as glide + diphthong ([wa͡ɪ]), but a phonetically similar sound might be interpreted as a triphthong ([ua͡ɪ]) in Cantonese. Diphthongs occur in approximately one third of the world's languages, but triphthongs are relatively rare. A majority of languages have between three and nine vowel phonemes (Maddieson, 1984), with the most common vowel system being the triangular five-vowel system: /i, e, a, o, u/.
28.2 The Vowel System of General American English The vowel system of General American English (GAE) can be categorized in terms of three phonetic dimensions: front-central-back, high-mid-low, and tense-lax.1 There are five front vowels: /i, ɪ, e, ɛ, æ/, five back vowels: /u, ʊ, o, ɔ, ɑ/, and the central vowel /ʌ/ ([ə] when unstressed). In terms of vowel height, /i, ɪ, u, ʊ/ are classified as high vowels, /e, ɛ, ʌ~ə, o, ɔ/ as mid vowels, and /æ, ɑ/ as low vowels. In most of the studies discussed in the sections that follow, the vowels /ɪ, ɛ, æ, ʌ, ʊ, ɔ/ are classified as lax, a classification that we will also use. In addition, GAE has three phonemic diphthongs: /a͡ɪ/, /a͡ʊ/, and /ɔ͡ɪ/. The vowels /e/ and /o/ are also typically diphthongized in English (e.g., [e͡ɪ] and [o͡ʊ]), but the diphthongization is not phonemic. Some phoneticians and child phonologists consider syllabic and post-vocalic /ɹ/ to be rhotic vowels. For example, they consider the nucleus of "bird" to be a rhotic monophthong /ɝ/ ([ɚ] when unstressed as in "water") and post-vocalic /ɹ/ to be part of a rhotic diphthong, with [ɚ] as the second vowel target (e.g., /ɑ͡ɚ/, /ɛ͡ɚ/ as in "car," "hair"). Vowels are more affected by dialect differences than consonants in GAE. For example, speakers in the western states do not differentiate between the vowels /ɔ/ and /ɑ/ as in "caught" and "cot," respectively, producing both with /ɑ/. Other dialectal differences include a lack of rhotic vowels in some northeastern dialects (e.g., the unstressed syllable of "father" is produced with [ə] rather than [ɚ]) and the merger of [ɛ] and [ɪ] before nasals in some southern dialects (e.g., "pen" and "pin" are both pronounced [pɪn]). Many dialect features are inherently variable, that is, they occur in some contexts but not others. For example, although monophthongization (e.g., "pie" produced as [pa:]) is a common feature of southern American English dialects, it occurs most frequently with /a͡ɪ/, rarely with /a͡ʊ/, and generally only before /l/ with /ɔ͡ɪ/. Furthermore, in African American Vernacular English (AAVE), monophthongization of /a͡ɪ/ does not occur before voiceless consonants. Thus, knowledge of common dialect variations is especially important when assessing and treating vowel errors in children.
28.3 Development of Vowels in Infancy 28.3.1 Early Vocalizations In the first six months of life babies produce a variety of sounds, some more speech-like than others (Oller et al., 2021). In terms of speech-like productions, low central vowels predominate. In the first few months, these vowel-like sounds (sometimes called vocants or vocoids) are perceived as being very nasal. With changes in the infant vocal structures that occur in the first months, the separation between the velum and the epiglottis grows and vowels become less nasal. Changes in vowel productions in the first year are usually attributed to maturational changes of the oral-motor structures and neurophysiological system (Kent, 1992). Around 6–7 months, babies begin to produce consonant-vowel (CV) syllables such as [mɑ] or [dæ] which exhibit adult-like timing and often resemble real words. The great majority of vowels produced in CV syllables are perceived as being front or central, and mid or low (Lieberman, 1980). According to Kent and Bauer (1985) the most frequent vowel types produced by one-year-old infants were mid central [ʌ, ə]; other vowels occurring frequently in the speech samples they analyzed were lax [ɛ, æ, ʊ]. MacNeilage and Davis (2001) propose that the basic CV syllable type developed from the closing (for consonant production) and opening (for vowels) mandibular movements associated with chewing and sucking. These movements constitute the frame of a syllable and the particular consonants and vowels within the frame form the content, hence the term Frame/Content theory of the evolution of speech (MacNeilage, 1998; MacNeilage & Davis, 2001). According to MacNeilage (1998), the content that occurs within the syllable frame is influenced by movements of the jaw with no independent movements of the tongue, resulting in the likelihood of more frequent occurrences of particular consonant-vowel (CV) sequences. Specifically, it is argued that (1) consonants produced with a constriction in the front of the mouth are more likely to precede front vowels (e.g., [dɪ]); (2) consonants produced with the lips (i.e., with no tongue involvement) will be strongly associated with central vowels (e.g., [bɑ]); and (3) consonants produced with a constriction in the back of the mouth will be associated with back vowels (e.g., [ku]).
28.3.2 From Babble to Speech It is well documented that the consonants occurring frequently in babble, namely stops, nasals and glides, are (a) the same consonants that appear frequently in a child's early word productions and (b) the consonants that tend to be produced accurately. The same pattern does not hold for vowels. Front and central lax vowels predominate in pre-speech utterances (Kent & Bauer, 1985), yet analyses of accuracy of production indicate that it is precisely these vowels that are often in error in meaningful speech. Furthermore, the set of vowels that is frequent in babble is not the same set that is frequent in early words (Davis & MacNeilage, 1990). Jakobson (1968) proposed that children would follow a universal pattern of acquisition of consonants and vowels in their early word productions, beginning with the greatest differentiation between segments (e.g., labial stop and open vowel yielding /pa/) and proceeding to finer and finer distinctions. For vowels, Jakobson claimed that children begin with a three-vowel system, usually the "fundamental" vowel triangle /i/ – /a/ – /u/, and then acquire vowels that fall between the extreme points of the triangle. The minimal three-vowel system that serves as the starting point is, Jakobson notes, the same minimal system found in many of the languages of the world.
28.4 Vowel Acquisition in Meaningful Speech Vowels in the speech of children between 12 and 24 months have been examined using independent and relational analyses. An independent analysis is based on the child’s productions regardless of accuracy; this type of analysis can include glossable forms, that is, those for which a target word can be identified, and non-glossable forms, that is, babble or unintelligible speech; a relational analysis is based on a comparison of a child’s productions with the target form and provides information on accuracy of production and on error patterns (Stoel-Gammon & Dunn, 1985). Independent analyses are used to determine a child’s phonetic inventory – the phones that occur in his/her speech. In a longitudinal study of vowel inventories based on analyses of conversational interactions, Selby et al. (2000) described the inventories of four children aged 15–36 months; a vowel was considered to be part of the group inventory at a particular age if it occurred in the sample of three of the four children. At 15 months the group inventory was limited to four vowels: [ɑ, ɪ, ʊ, ʌ]. Notably lacking are the high tense corner vowels [i] and [u]. By 18 months, the group inventory had expanded to include these corner vowels, and the inventories at 21 and 24 months, taken together, included all target vowels of American English except the rhotic vowels. Thus, the size of vowel inventories across the four children increased rapidly between 15 and 24 months.
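The distinction between independent and relational analyses can be illustrated with a short sketch. The word list and the character-by-character alignment below are hypothetical simplifications; real analyses align segments phonetically rather than orthographically.

```python
# Independent analysis: inventory what the child produces, regardless of
# accuracy. Relational analysis: compare productions against adult targets.
VOWELS = set("iɪeɛæʌəuʊoɔɑ")

data = [  # (adult target, child production) -- hypothetical examples
    ("bɑl", "bɑ"),      # "ball" with the final consonant deleted
    ("kʊki", "kɪkɪ"),   # "cookie" with vowel changes
    ("dɑg", "dɑg"),     # "dog" produced accurately
]

# Independent analysis: the child's vowel inventory.
inventory = {seg for _, produced in data for seg in produced if seg in VOWELS}
print(sorted(inventory))  # ['ɑ', 'ɪ']

# Relational analysis: percent vowels correct (PVC), using a naive
# one-to-one segment alignment (deletions padded with spaces).
pairs = [(t, p)
         for target, produced in data
         for t, p in zip(target, produced.ljust(len(target)))
         if t in VOWELS]
correct = sum(1 for t, p in pairs if t == p)
print(100 * correct / len(pairs))  # 50.0 for this sample
```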
28.4.1 Accuracy of Production Although the study by Selby and colleagues (2000) indicates that the full range of vowels is produced by 21–24 months, relational analyses indicate that accuracy is not uniform across vowel targets. Reports of vowel accuracy vary somewhat from one study to another, in part because some investigations were based on analysis of elicited single-word targets while others analyzed vowels in spontaneous utterances. Generally, the research indicates that accurate production of nearly all vowel targets is achieved by the age of 36 months. However, these studies have focused primarily on the production of vowels in monosyllabic and disyllabic words; vowel accuracy in more complex polysyllabic words is reportedly lower (James et al., 2001). In terms of individual vowel phonemes, Paschall (1983) found that, at 16–18 months of age, the 20 American children in her study produced the targets /i, ɪ, ʊ, ʌ, ɑ/ with accuracy rates of 73–81% in conversational speech. Only the mid-front vowels /e/ and /ɛ/ and the rhotic vowels exhibited accuracy rates that were lower than 50%. Using the same methodology as Paschall, Hare (1983) examined vowel accuracy in productions of children 21–24 months. The findings indicate a substantial increase in accuracy of production: the vowels /i, æ, u, ʊ, o, ɔ, ɑ, ʌ/ were correct in over 91% of target words; accuracy rates for /e, ɛ, ə, ɪ/ were also high, ranging from 84–89%. In this age range, only the rhotic vowels were produced with less than 50% accuracy. Findings from studies of vowels in single-word productions also indicate that, except for the rhotic vowels, production accuracy is quite high. Templin (1957) analyzed word productions of 480 children aged 3–8 years and reported that, across all vowels, accuracy was 93% in the group of 3-year-old children and did not change between ages 3 and 8 years. In an earlier cross-sectional study of single-word productions, Wellman and colleagues (1931) investigated speech development in 204 children aged 2–6 years. Using correct production by 75% of children in a particular age group as the criterion for mastery of a vowel phoneme, Wellman showed that the vowels /i, ɑ, u, o, ʌ, ə/ were mastered by age 2;0, mastery of /e, ɔ/ occurred at age 3;0, and /ɪ, ɛ, æ, ʊ/ were mastered at 4;0. More recently, in a study of 165 children ages 18–81 months, Pollock (2013) reported mean vowel accuracies for non-rhotic vowels exceeding 92% by 24 months. By 36 months, mean accuracy levels exceeded 97%. The notable exception was the rhotic vowels, which were mastered later, typically around four years (Pollock, 2013).
Overall, typical acquisition of English vowels in meaningful words can be described by the following general patterns: (a) corner vowels (except /æ/) are acquired before non-corner vowels; (b) tense vowels are acquired before their lax counterparts; and (c) rhotic vowels are acquired later than non-rhotic vowels.
28.4.2 Suprasegmental Aspects of Vowel Acquisition Descriptions of vowel acquisition should include not only studies focusing on vowel quality, but also on the suprasegmental features of vowels associated with stress, rhythm and timing. In English, these features interact in predictable ways: vowels in stressed syllables tend to be longer, louder and higher pitched than vowels in unstressed syllables. As a result, the rhythm of English is described as stress timed (i.e., having approximately equal intervals between stressed syllables in a word or phrase). Although it has been shown that the intervals are not exactly equal, the rhythmic pattern of stress-timed languages such as English or Dutch is notably different from the pattern of syllable-timed languages such as Spanish or Hindi in which the timing of each syllable is said to be approximately the same regardless of stress placement (see Ramus et al., 1999, for a discussion of the rhythmic properties of languages traditionally classified as syllable-timed or stress-timed). In terms of suprasegmental properties, productions of young children acquiring English are clearly affected by the stress patterns of words and phrases. Syllables that receive primary stress are longer, louder, and higher pitched than unstressed syllables, as noted above. In addition, these syllables tend to have "clearer" vowel articulations than the vowels of unstressed syllables. In general, segmental accuracy, including vowel accuracy, is greater in stressed than in unstressed syllables. In the early stages of word production, unstressed syllables, particularly those in word initial or medial position, may be omitted, leading to the production of [nænə] for "banana" and [ɛfɪnt] for "elephant." Unstressed syllables at the ends of words are omitted much less frequently. Acoustic analyses of American children's disyllabic word productions indicate that syllables that are stressed in the adult form are characterized by greater duration, greater amplitude, and higher pitch – the three features associated with stress in English (Kehoe et al., 1995). Researchers have noted, however, that young children appear to have difficulty with the timing pattern of unstressed syllables. Allen and Hawkins (1980) reported that American children have problems reducing the vowel of unstressed syllables, thereby diminishing the durational differences between stressed and unstressed vowels. This finding was further explored by Kehoe and colleagues (1995; see also Kehoe, 2013) who compared the syllable durations of phonetically controlled word pairs such as "key" – "monkey" where the syllable [ki] was produced as a stressed monosyllable in the first word and as an unstressed syllable in the disyllabic form. They found that the ratio of the duration of the unstressed to the stressed syllable was .51 in adult productions (i.e., the unstressed syllable was half as long as the stressed syllable) and was .75 for children aged 18–30 months; thus, the children distinguished between stressed and unstressed syllables, but the length distinction was less marked in the children as compared to adults. Difficulty reducing the vowel of non-final weak (unstressed) syllables to schwa in polysyllabic words was also reported for 3- to 7-year-old children, suggesting that the durational aspects of vowels continue to develop after 3 years of age (James et al., 2001).
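The unstressed-to-stressed duration ratio reported by Kehoe and colleagues (1995) is straightforward to compute; the syllable durations below are invented solely to show the arithmetic behind the .51 (adult) and .75 (child) figures.

```python
def unstressed_to_stressed_ratio(unstressed_ms, stressed_ms):
    """Ratio of unstressed to stressed syllable duration.

    Kehoe et al. (1995) report ~.51 for adults and ~.75 for children
    aged 18-30 months: children mark the contrast, but less sharply.
    """
    return unstressed_ms / stressed_ms

# Invented durations for [ki] in "monkey" (unstressed) vs. "key" (stressed):
print(round(unstressed_to_stressed_ratio(180, 350), 2))  # 0.51, adult-like
print(round(unstressed_to_stressed_ratio(260, 350), 2))  # 0.74, child-like
```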
28.4.3 Vowel Errors in the Speech of Children with Typical Development Analysis of errors in consonant productions tends to be fairly straightforward as it is easy to identify changes on the basis of place or manner classes. Thus, in the speech of young children, velar consonants are often produced as alveolars, and liquids are often produced as glides. Descriptions of vowel errors are somewhat messier; in some cases, a target vowel is
raised (e.g., /ɪ/ is produced as [i]), in other cases the same vowel may be lowered (/ɪ/ produced as [ɛ]) or centralized (/ɪ/ produced as [ʌ]). Pollock (1991) identified a wide range of possible error patterns for vowels, including feature changes (e.g., backing and fronting, raising and lowering, tensing and laxing, rounding and unrounding), complexity changes (e.g., diphthong reduction and diphthongization), and harmony (i.e., assimilation) patterns. Some error patterns apply to a limited set of target vowels (e.g., unrounding can only apply to round vowels), and some patterns are affected by word stress and complexity of the vowel target (e.g., monophthongs vs. diphthongs). Donegan (2002), using a slightly different framework for identifying vowel errors, distinguished between context-sensitive and context-free error patterns. An example of the former is the allophonic raising and diphthongization of /æ/ or /ɛ/ to [e͡ɪ] in words like "bag," "thank," and "leg," that is, when /æ/ or /ɛ/ precedes a velar consonant. Other context-sensitive changes are likely to occur when a vowel precedes liquid /l/, as in "milk" produced as [mʊk]. Context-free errors, in contrast, are not affected by phonetic environment. As with consonants, vowel errors may change with development. In a longitudinal study focusing on front vowels in a controlled set of words, Otomo and Stoel-Gammon (1992) noted that, at 22 months, children often produced /ɪ/ as [i], accounting for 30% of the incorrect productions, an error which could be classified as raising and tensing of /ɪ/. At 30 months, the most common error for /ɪ/ was lowering to [ɛ]; at this age, the use of [i] as a substitute had declined to 3%. Otomo and Stoel-Gammon also reported that productions of /ɛ/ were often inaccurate at 22 and 26 months, but there was no obvious developmental pattern and no single substitution that was most common. The most frequent substitutions for /ɛ/ at these ages were [e], [æ], and [ʌ]. At 30 months, the most frequent substitutions were [e] (raising and tensing of the target) and [æ] (lowering of the target). At 30 months, the accuracy levels for the lax targets /ɪ/ and /ɛ/ were the lowest of the phonemes studied, at 40% and 49% respectively.
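Feature-change labels of the kind Pollock (1991) describes can be assigned mechanically by comparing target and substitute on height, frontness, and tenseness. The feature table below is a deliberately coarse assumption covering only a few vowels, with tense vowels placed slightly higher than their lax counterparts.

```python
# Hypothetical feature table: vowel -> (height, frontness, tense?).
# Height runs 0 (low) to 4 (high); frontness 0 (back) to 2 (front).
FEATURES = {
    "i": (4, 2, True), "ɪ": (3, 2, False),
    "e": (2, 2, True), "ɛ": (1, 2, False),
    "æ": (0, 2, False), "ʌ": (1.5, 1, False),
    "u": (4, 0, True),
}

def label_error(target, substitute):
    """Name the feature changes from a target vowel to its substitute."""
    th, tf, tt = FEATURES[target]
    sh, sf, st = FEATURES[substitute]
    changes = []
    if sh > th: changes.append("raising")
    if sh < th: changes.append("lowering")
    if sf > tf: changes.append("fronting")
    if sf < tf: changes.append("backing")
    if st and not tt: changes.append("tensing")
    if tt and not st: changes.append("laxing")
    return changes or ["no change"]

print(label_error("ɪ", "i"))  # ['raising', 'tensing'], as at 22 months
print(label_error("ɪ", "ɛ"))  # ['lowering'], as at 30 months
print(label_error("ɪ", "ʌ"))  # ['lowering', 'backing'], i.e. centralization
```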
28.5 Vowel Production in Children with Speech Sound Disorder (SSD) Vowel errors are less common than consonant errors in children with speech sound disorders (SSDs) and thus have received less attention in the literature and in clinical practice. However, when compared to children with typically developing speech, children with SSD produce more vowel errors (Pollock, 2013; Roepke & Brosseau-Lapré, 2021). Children with SSD also produce more vowel errors in polysyllabic words than in monosyllabic and disyllabic words (Masso et al., 2016), although the same is not true for consonant errors.
28.5.1 Incidence of Vowel Errors Pollock and Berni (2003) investigated non-rhotic vowel errors in 149 children (30 to 81 months of age) with SSD, and found that the incidence of non-rhotic vowel errors ranged from 11 to 32%, depending on the criteria used (percent vowels correct (PVC) < 85, 90, or 95). Furthermore, the incidence of vowel errors appeared related to the severity of consonant errors. Children exhibiting severe consonant errors (percent consonants correct (PCC) < 50) were three to four times more likely to also have vowel errors than children who had mild (PCC > 85) or mild-moderate (PCC = 66–84) consonant errors. In a more recent study of 45 children with SSD, Roepke and Brosseau-Lapré (2021) found similar results. The number of vowel errors (excluding rhotic vowels and vowels followed by /l/) was moderately but significantly correlated with severity of SSD. In other words, children with severe SSD (as measured by consonant errors) are at greater risk of having vowel errors.
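The screening logic behind these incidence figures can be sketched as follows. The severity bands mirror the PCC ranges quoted above; the label for the 50–65 gap between the published bands is an assumption, and the function names are our own.

```python
def pcc_severity_band(pcc):
    """Map PCC to severity bands following Pollock and Berni (2003)."""
    if pcc < 50:
        return "severe"
    if pcc <= 65:
        return "moderate-severe"  # assumed label for the unlisted 50-65 gap
    if pcc <= 84:
        return "mild-moderate"
    return "mild"

def has_vowel_errors(pvc, cutoff=90):
    """Flag vowel errors when PVC falls below a chosen cutoff (85/90/95)."""
    return pvc < cutoff

# A hypothetical child: PCC 48, PVC 82.
print(pcc_severity_band(48))     # 'severe'
print(has_vowel_errors(82, 85))  # True even under the most conservative cutoff
```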
28.5.2 Vowel Accuracy and Error Patterns in Children with SSD There appears to be considerable variation in the types of vowel errors produced by children with SSD, based on a review of individual case studies (e.g., Gibbon et al., 1992; Harris et al., 1999; Penney et al., 1994) and small group studies (e.g., Pollock, 2013; Reynolds, 2013; Roepke & Brosseau-Lapré, 2021; Stoel-Gammon & Herrington, 1990). However, some general trends are apparent. For example, the corner vowels /i, u, ɑ/ and the mid back vowel /o/ were rarely in error. Rhotic vowels and diphthongs were most often incorrect. Among the non-rhotic vowels, mid and low front vowels (/e, ɛ, æ/), high lax vowels (/ɪ, ʊ/), and diphthongs (especially /a͡ʊ/ and /ɔ͡ɪ/) were most frequently in error. Some children with SSD also had difficulty with low-back /ɑ/ and mid-central /ʌ, ə/. Backing, lowering, and diphthong reduction were the most common error patterns observed for non-rhotic vowels in studies of children acquiring American English. Backing most often affected the low front vowel (e.g., /æ/ produced as [a] or [ɑ]), and lowering occurred most frequently with the mid front vowels (/e, ɛ/ produced as [æ] or [a]). Diphthong reduction resulted in the loss of the offglide, sometimes (but not always) accompanied by a lengthening of the remaining vowel (e.g., /a͡ʊ/ produced as [a] or [aː]). Loss of contrast between tense/lax vowel pairs was also common, with individual children exhibiting either tensing or laxing patterns. Mid vowels also were highly susceptible to error. Interestingly, an asymmetry has been observed in errors on mid vowels, with mid front vowels /e, ɛ/ often produced incorrectly and mid back vowels /o, ɔ/ generally produced correctly (e.g., Stoel-Gammon & Herrington, 1990). However, this may be specific to American English speakers, as the same asymmetry has not been noted in speakers of other varieties of English (e.g., Reynolds, 2013). Other differences across varieties of English have also been observed. Children learning West Yorkshire English (Reynolds, 2013) did not use backing, as /æ/ is not part of the target vowel system. However, fronting of low back /ɒ/ to [a] was commonly observed. It appears, then, that the American English backing of /æ/ and the West Yorkshire English fronting of /ɒ/ may both serve to reduce the front-back distinction among the low vowels. Errors on mid-central vowels may reflect a preference for peripheral vowels (Reynolds, 2013), although if they occur in unstressed (weak) syllables they may be due to difficulties with timing. Phonetic context may influence vowel error patterns. Although the C-V interactions common in infant speech typically phase out after about 12 months of age, they may persist in children with SSD. Most often, vowel context influences consonant production; however, some vowel errors may be conditioned by consonant context (Bates et al., 2013). The most commonly reported example is the backing and/or lowering of front vowels followed by liquid /l/, as in [maɫt] for "melt," [pɛlo] for "pillow," and [mʊk] for "milk." Such errors represent a natural assimilatory effect, with the vowel pulled back in anticipation of the velarized /l/. Another example is the lowering of high and mid front vowels before nasals, as in [kwɛn] for "queen" or [twan] for "train." Reynolds (2013) suggests that these errors may be related to the perception of nasalized vowels as more open.
Overall, consonant-conditioned vowel errors are infrequent in SSD and have not been reported in typically developing speech. Although the majority of vowel errors produced by children with SSD may follow the common trends noted above, there is ample evidence across many studies that considerable individual variation exists. Stoel-Gammon and Herrington (1990) hypothesized two subgroups of children with vowel errors, based on a review of studies at that time. Children in the first subgroup have large vowel repertoires but produce many errors, most often on mid vowels, high lax vowels, and rhotic vowels. Children in the second group have a restricted vowel inventory consisting of two to three lax vowels; their systems resemble those of infants in the prelinguistic period, with tense vowels missing from the inventory. Beyond differences in which error types are produced and the relative frequency of occurrence of error types, Reynolds (2013) points out the use of idiosyncratic vowel error patterns in a number
of children with SSD in his sample. For example, one child simplified diphthong productions by introducing a schwa offglide, in essence changing the original offglide to a glide and adding another syllable (e.g., "down" produced as [daʊ̯ən]). Another produced the high back vowels (/u, ʊ/) as unrounded back vowels ([ɯ]) or as front rounded vowels ([y]). Pollock (2013) also reported the presence of idiosyncratic vowel error patterns in several children with SSD, including rhotic diphthong coalescence (e.g., [u] for /ɪ͡ɚ/ and [o͡ʊ] for /ɛ͡ɚ/), systematic sound preference (e.g., [ɔ] for all rhotic monophthongs and diphthongs), and vowel assimilation (e.g., [kiki] or [kɪkɪ] for "cookie").
28.5.3 Vowel Errors and Intelligibility Intelligibility is an important consideration in determining the presence of SSD and need for intervention. Numerous studies have documented reduced intelligibility in children with SSD, and measures of intelligibility are frequently recommended to document functional outcomes of intervention. However, the relative contribution of vowel errors to intelligibility has received little attention. Listeners tend to tolerate minor shifts in vowel production, as they are accustomed to hearing such variations in the vowels of speakers of other dialects. However, when errors involve a loss of phonemic contrast, or are unusual/idiosyncratic, they can have a significant negative impact on intelligibility. Vowel errors have been shown to significantly impact intelligibility in speakers who are deaf or have dysarthria (e.g., Levy et al., 2016; Markides, 1970), but few studies have directly investigated the relative impact of vowel errors on the intelligibility of speech in children with SSD. In an experimental study controlling the number and type of errors, speaker, word familiarity, and neighborhood density, Mackie (2015) found that vowel errors affected intelligibility to the same extent as consonant errors, but that words containing vowel errors in addition to consonant errors (as is typically the case for children with SSD who have vowel errors) had the biggest impact on intelligibility. Speake et al. (2012) took a unique approach to the topic, looking at differences in intelligibility before and after intervention targeting vowels for two school-age children with severe SSD. Using peers as listeners, they found that intelligibility (as measured by the number of words understood) increased along with improvement in vowel production accuracy following intervention even though there was limited or no change in consonant accuracy. Although more research is needed, these findings support a link between vowel errors and intelligibility for children with SSD.
28.6 Vowel Errors as a Marker of Childhood Apraxia of Speech (CAS) Childhood apraxia of speech (CAS) is a neurologically based motor speech disorder affecting speech production in young children. The American Speech-Language-Hearing Association (2007) describes CAS as follows: "a neurological childhood (pediatric) speech sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits (e.g., abnormal reflexes, abnormal tone). The core impairment in planning and/or programming spatiotemporal parameters of movement sequences results in errors in speech sound production and prosody." Estimates of the prevalence of CAS vary widely; one to two children per thousand is a conservative estimate (Shriberg et al., 1997a). CAS is even more common among children with complex neurodevelopmental disorders (Chenausky et al., 2022). Although children with CAS may exhibit many of the same symptoms as listed above for children with cognitive-linguistic speech disorders, there are also some important features that can be used for
differential diagnosis. These include inconsistency; prosodic differences, especially stress errors; and difficulty transitioning from one sound to the next, which can result in prolongations and choppy speech (Iuzzini-Seigel et al., 2022).
28.6.1 Vowel Accuracy and Error Patterns in CAS Vowel errors are an oft-cited characteristic of Childhood Apraxia of Speech (CAS), and many consider vowel errors to be a marker useful in the differential diagnosis of CAS (Davis, 2003; Hall et al., 1993; Rosenbek & Wertz, 1972). A few studies have provided detailed descriptions of the vowel errors produced by children with CAS. Pollock and Hall (1991) tested five children aged 8;2–10;9 and found they had difficulty with both rhotic and non-rhotic vowels and diphthongs. The most frequent errors on non-rhotic vowels included diphthong reduction, laxing, backing, tensing, and lowering, patterns also commonly seen in children with SSDs. Children in the study had received speech-language intervention services through the public schools beginning at five years of age, and vowels had been previous remediation targets for three of the children. Thus, all of the children had difficulties with the production of vowels at some point in time. In another study of children with CAS, Davis et al. (2005) investigated the vowel inventories and accuracy patterns of three children (aged 4;6–5;10 at first testing) over a three-year period during which they were receiving intervention services. At each test period, all three children showed relatively complete vowel inventories, with the exception of rhotic vowels. However, accuracy of vowel production was low, ranging from 61 to 85% correct overall. Vowel accuracy was not related to utterance length and only slightly reduced with increased syllable or word complexity. Two of the three children showed consistent improvement in vowel accuracy over the three-year time period. Despite intervention, all three children showed persistent vowel errors over time, with overall vowel accuracy rates remaining below 85% at the third test time (6;5 to 7;7).
28.6.2 Disordered Prosody and Timing as a Marker of CAS The suprasegmental features of timing and prosody are also mentioned as a characteristic of CAS, with the speech patterns described as robotic or monopitch (e.g., Davis, 2003; Shriberg et al., 1997a, 1997b). In a study of lexical stress in disyllabic words, Shriberg and colleagues (2003) examined correlates of stress (frequency, intensity, and duration) in terms of a stressed/unstressed ratio. Findings indicated that those participants who met the authors' criteria for CAS produced the majority of the highest and lowest lexical stress ratios, compared to the control participants, who had other diagnoses of speech disorders of unknown origin (see also work on lexical stress by Munson et al., 2003). Peter and Stoel-Gammon (2005) hypothesized that the control of movement timing, rather than vocal characteristics such as intensity and pitch, may contribute most substantially to the perceivable deficits in lexical stress production in children with CAS. They examined the productions of two children whose speech was consistent with a diagnosis of CAS and those of two age-matched controls during a variety of tasks including sentence imitation, nonword imitation, singing a familiar song, clapped rhythm imitation, and paced repetitive tapping. The timing accuracy in the productions of the controls was found to be higher in all tasks, compared to the participants with the speech disorder. Although more research with larger numbers of children is needed, these studies suggest that children with CAS exhibit prosody and timing patterns that differ from children with typical development and children with other types of SSDs. Thus, this area of speech production may ultimately be useful in the diagnosis of CAS.
28.7 Clinical Assessment of Vowel Errors As has been noted earlier, the bulk of research in speech sound development and disorders has focused on consonants. This lack of attention to vowels is evident in the clinical procedures used to assess children with suspected SSDs. Although several norm-referenced standardized tests include supplemental vowel inventory or accuracy measures (e.g., PVC), these analyses are not factored into the test scores. Thus, standard scores for children with vowel errors in addition to consonant errors may not fully reflect the severity of their SSD or its impact on intelligibility. One notable exception is the Diagnostic Evaluation of Articulation and Phonology (Dodd et al., 2002), which includes vowel errors in the calculation of scores on the Articulation Assessment subtest. A common recommendation is to transcribe the child’s production of all of the segments in the test words and to conduct informal analyses such as a vowel inventory or percent vowel accuracy. However, several studies have pointed out that commonly used tests of articulation or phonology do not provide adequate opportunities for assessing vowels, even when all of the segments are transcribed, noting that none of the commonly used tests include at least two opportunities to produce each vowel in stressed syllables of monosyllabic or disyllabic words in two different post-vocalic phonetic contexts (Eisenberg & Hitchcock, 2010; Pollock, 1991; Roepke & Brosseau-Lapré, 2021). Based on the findings of lower vowel accuracy in polysyllabic words in children with and without SSD, vowels in longer words with non-final weak syllables should also be assessed (Masso et al., 2016). Although the target words from standardized tests may provide a starting point, probes including supplemental words are needed to obtain an acceptable sample for vowel analysis. Words with postvocalic liquids should be avoided or addressed separately, as they are known to influence preceding vowels. Bates et al. (2013) provide a list of words that could be used to probe for consonant-conditioned vowel errors. Researchers investigating vowel errors in children with SSD have developed stimuli and analysis protocols specifically designed to assess vowel productions (e.g., Pollock, 2002; Speake et al., 2012), but these have not been normed or validated psychometrically. They may nevertheless be useful for clinicians wanting to complete more in-depth analyses and identify vowel error patterns. Both independent and relational analyses are included. An inventory of the vowels produced, regardless of whether they are produced correctly or not, provides an overview of the range of vowels produced and points out any gaps in the child’s use of the vowel space. The vowel inventory may also have prognostic significance, as vowels missing from the inventory are less likely to be acquired without direct intervention. Relational analyses typically include vowel accuracy measures, such as the percent of vowels correct (PVC) and an analysis of the frequency and distribution of vowel error patterns. The combined results from these analyses can provide a basis for determining the selection of appropriate vowel targets for intervention, if needed. In the clinical assessment of a child’s phonological system, it is critical to distinguish between vowel errors (incorrect productions) and dialect variation to avoid over- and underidentification of disorder. 
For example, it would be inappropriate for a speech-language pathologist working with children in the western US to identify lack of contrast between /ɑ/ and /ɔ/ as a vowel error. Conversely, across-the-board patterns of diphthong reduction by children speaking African American Vernacular English (AAVE) should not be attributed to dialect; monophthongization of /a͡ɪ/ before both voiced and voiceless consonants may in fact reflect vowel errors (Pollock, 2002). A thorough knowledge of the local dialect and associated vowel patterns is essential to accurate identification of vowel errors in children. A summary of vowel variation in major dialects of American English was provided by Pollock and Berni (2001), but clinicians should always confirm the patterns present in their local dialect(s).
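As a concrete illustration of the independent and relational analyses described above, the following sketch computes a vowel inventory and percent vowels correct (PVC) from aligned target/production vowel lists. The function names and the assumption of pre-aligned, one-symbol-per-vowel transcriptions are ours for illustration, not part of any published protocol.

```python
# A minimal sketch of an independent analysis (vowel inventory) and a
# relational analysis (PVC), assuming target/production vowel lists are
# already segment-aligned; real clinical data require principled alignment.

def vowel_inventory(productions: list) -> set:
    """Independent analysis: every vowel the child produced, correct or not."""
    return {v for word in productions for v in word}

def percent_vowels_correct(targets: list, productions: list) -> float:
    """Relational analysis: PVC = correct vowels / target vowels x 100."""
    correct = total = 0
    for tgt, prod in zip(targets, productions):
        for t, p in zip(tgt, prod):
            total += 1
            correct += (t == p)
    return 100 * correct / total

targets = [["i"], ["ɑ", "i"], ["eɪ"]]      # e.g., "bee", "body", "cake"
productions = [["i"], ["ɑ", "ɪ"], ["ɛ"]]   # child's vowels, aligned to targets
print(vowel_inventory(productions))        # the set of vowels i, ɑ, ɪ, ɛ
print(round(percent_vowels_correct(targets, productions), 1))  # 50.0
```

Together, the inventory (gaps in the vowel space) and PVC (overall accuracy) support the target-selection decisions described in the text.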
28.8 Treatment of Vowels
Most children with vowel errors also have difficulty with consonant production, often leading to an unstated assumption that if the consonants are treated first the vowels may improve without direct intervention. Although this observation appears to hold true for some children (e.g., the child studied by Robb et al., 1999, and one of the children with CAS followed by Pollock & Hall, 1991), there are other reports of children for whom vowel errors did not spontaneously improve following intervention for consonants (e.g., Pollock, 1994; Pollock & Hall, 1991). Bernhardt and colleagues found significant improvement in vowel accuracy following treatment targeting word structure as well as consonants (Bernhardt et al., 2010). Gibbon and Beck (2002) suggest that vowels be targeted as part of a treatment program because their improved accuracy is likely to lead to improved speech intelligibility and acceptability. In addition, they argue that because the vowel system is typically mastered earlier than the consonant system, vowel errors should be targeted before consonant errors in order to restore a normal developmental pattern. Case studies of direct intervention for vowel errors in children with SSD reported the use of techniques similar to those commonly used with consonants, but represented a wide range of therapy approaches and targeted vowel error patterns. Several studies reported the use of minimal pair activities in which the contrast included the target vowel and the child's error production. For example, Pollock (1994) used minimal pair activities with a four-year-old boy with diphthong reduction and lowering/backing of mid front vowels, but only after they were able to establish correct production of the targets through imitation, successive approximation, and drill activities. Gibbon and colleagues (1992) described a therapy program for a four-year-old boy who reduced diphthongs; the first step was establishing a suitable vocabulary for discussing the distinguishing features of monophthongs and diphthongs, in this case a sliding analogy to represent the articulatory movement between the two components of the diphthong. Minimal pair word pictures were then used to encourage the production of a monophthong/diphthong contrast. Other studies reported the use of more traditional articulation approaches to intervention. Hargrove and colleagues (1989) described a program for remediating a pattern of vowel prolongation in four-year-old twins that involved targeting vowels in increasingly demanding contexts (i.e., starting with imitation of vowel targets in isolation and ending with correct spontaneous productions in connected speech). Strategies included the use of contingent reinforcement, verbal feedback, auditory/visual cues, imitation, and modeling. Penney et al. (1994) used a similar progression from isolation to spontaneous speech in intervention for /u/ and /æ/ targets for a four-year-old girl with a reduced vowel inventory and low overall vowel accuracy, but included perceptual strategies (e.g., auditory detection, auditory bombardment, vowel discrimination) as well as production strategies (e.g., phonetic placement). In all of the cases reported, direct intervention resulted in improvement in vowel accuracy, although the extent of improvement and degree of carryover was variable. Baseline measures and monitoring of untreated control sounds provided evidence that the direct vowel treatment, and not maturation, was responsible for the improvement.
However, more research is clearly needed to establish the efficacy of vowel treatment and to compare different approaches to vowel treatment. General principles for vowel treatment include the need for clinicians to have good perceptual skills for identifying and transcribing vowels, knowledge of the sociolinguistic variations expected in the child’s community, and production skills adequate for modeling target vowel qualities. Targets should be selected based on a detailed assessment including phonetic and phonological analyses, and treatment approaches may include those that focus on the development of auditory/perceptual skills (e.g., auditory input
therapy, auditory bombardment), linguistic/phonological abilities (e.g., minimal pair contrasts, Metaphon), and motor/articulatory skills (e.g., phonetic placement, contextual facilitation, and repetitive drill practice). In addition, computer-assisted visual feedback methods also show promise in the treatment of vowel errors (Gibbon, 2013; Gibbon & Beck, 2002). An oft-cited challenge with vowel treatment is difficulty in the establishment of new vowels in the phonetic inventory. Unlike consonants, articulatory placement for vowels (with the exception of high vowels such as /i/) is difficult to teach due to the lack of discrete articulatory anchors and tactile feedback. Jaw position and lip rounding cues are easily illustrated, but relative tongue position within the oral cavity is difficult to describe and demonstrate. Biofeedback technology such as ultrasound imaging has been successful in providing clients with visual information about tongue shape, position, and movement within the oral cavity during vowel production (e.g., Bernhardt et al., 2010; Cleland & Preston, 2021). Such feedback can be directly related to instructional cues for correcting vowel errors; however, as noted by Cleland and Preston, a certain level of cognitive skill is needed "to integrate the visual feedback with their feedforward speech production system" (p. 576). Furthermore, despite the exciting opportunities offered by ultrasound and other technologies, widespread clinical adoption has not occurred, given the financial costs involved and the limited availability of such equipment in pediatric speech-language clinics and schools. Acoustic biofeedback methods utilizing real-time spectrographic or linear predictive coding (LPC) spectral displays have also been successful in vowel treatment and are currently being incorporated into low-cost, accessible, user-friendly apps (e.g., Cleland & Preston, 2021; McAllister et al., 2017). Important considerations in using acoustic biofeedback methods are the timing of the feedback (e.g., it must be quick enough to allow children to associate tactile and kinesthetic cues with the acoustic feedback, but not so transitory as real-time displays, which do not give children sufficient time to interpret the feedback) and the type of visual display (e.g., whether the display is a representation of articulatory information or an abstract display unrelated to speech production that serves to maintain attention and reward success). Acoustic feedback systems are most useful in differentiating gross vowel categories, but may be less accurate in detecting subtle vowel differences. In addition, the utility of such feedback needs to be interpreted cautiously given the signal processing difficulties often associated with the high F0, low intensity, and nasality of children's speech. In summary, a wide range of approaches to the treatment of vowel errors are available, many already familiar to clinicians because they are commonly used to remediate consonant errors. New technologies have also been shown to be effective, although their use has often been restricted to school-age children and adolescents with residual speech errors who have not achieved success with other intervention approaches.
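For readers unfamiliar with what an LPC-based display computes, the sketch below estimates formant frequencies from a single vowel frame using the standard autocorrelation LPC recipe (pre-emphasis, windowing, a Toeplitz solve for the predictor coefficients, polynomial roots). It is a textbook illustration, not the algorithm of any particular biofeedback app, and the frame length, LPC order, and plausibility thresholds are conventional rules of thumb rather than clinical settings.

```python
# A sketch of the acoustic analysis behind vowel biofeedback displays:
# estimating formants (F1, F2, ...) from one vowel frame via LPC.

import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_formants(frame, sr, order=None):
    if order is None:
        order = 2 + sr // 1000                 # rule-of-thumb LPC order
    x = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])  # pre-emphasis
    x = x * np.hamming(len(x))
    r = np.correlate(x, x, "full")[len(x) - 1: len(x) + order]
    a = solve_toeplitz(r[:order], -r[1:order + 1])  # LPC coefficients
    roots = np.roots(np.concatenate(([1.0], a)))
    roots = roots[np.imag(roots) > 0]               # one root per resonance
    freqs = np.angle(roots) * sr / (2 * np.pi)
    bands = -np.log(np.abs(roots)) * sr / np.pi     # bandwidth estimate
    keep = (freqs > 90) & (bands < 400)             # plausible formants only
    return np.sort(freqs[keep])

# Synthetic test: two damped resonances near 500 and 1500 Hz.
sr = 10_000
t = np.arange(int(0.03 * sr)) / sr
frame = (np.exp(-60 * t) * np.sin(2 * np.pi * 500 * t)
         + 0.8 * np.exp(-90 * t) * np.sin(2 * np.pi * 1500 * t))
print(lpc_formants(frame, sr)[:2])  # roughly [500., 1500.]
```

The signal-processing caveat in the text is visible here: children's high F0 spaces the harmonics widely, so the autocorrelation peaks that drive this estimate become less reliable, which is one reason such systems separate gross vowel categories better than subtle ones.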
28.9 Summary and Conclusion
This chapter has provided an overview of vowel development and disorders in children, including clinical assessment and treatment. It is clear that, compared with consonants, our understanding of vowel acquisition is limited. Many of the investigations cited are case studies or studies of relatively few children; several of the larger studies focused on accuracy, with little information on error types and even less on suprasegmental aspects of vowel productions. The picture of vowel development and disorders is further complicated by lack of agreement on a common framework to describe patterns of errors.
Based on the American English data of children with and without SSD, the generalizations listed below appear to be valid.
• Corner vowels tend to be acquired (i.e., produced correctly) before non-corner vowels.
• Monophthongs tend to be acquired before diphthongs.
• Non-rhotic vowels tend to be acquired before rhotic vowels.
• Vowels in stressed syllables tend to be more accurate than vowels in unstressed syllables.
• Adult-like timing of stressed vs. unstressed syllables is achieved later than adult-like accuracy of segments.
Although the patterns of accuracy and errors are similar across children with typical and atypical phonological development, some possible differences have emerged. First, it appears as though vowel errors in children with SSD often affect a larger number of vowel types. Second, children with atypical development, especially those with CAS, may exhibit greater variability in their vowel errors. Lastly, a subset of children with SSD exhibit prosodic patterns not seen in children with typical development; these differences are particularly apparent in the features of stress and timing. As noted above, literature on treatment of vowels is very sparse. Traditionally, clinicians have been inclined to treat consonants before attempting to treat vowels, and many children may never receive intervention on vowels in spite of obvious errors. However, reports of vowel treatment utilizing methods similar to those used with consonants have found significant improvement post-treatment in terms of vowel accuracy and intelligibility, suggesting that vowel treatment can be successful and contribute to better functional outcomes for children with SSD and CAS. It is clear that this is an area that deserves further attention.
NOTE
1 For a description of the vowels of the Southern British Standard accent of English, see Ball and Müller (2005).
REFERENCES
Allen, G., & Hawkins, S. (1980). Phonological rhythm: Definition and development. In G. Yeni-Komshian, J. Kavanagh, & C. Ferguson (Eds.), Child phonology, vol. 1: Production (pp. 227–256). Academic Press. https://doi.org/10.1016/b978-0-12-770601-6.50017-6
American Speech-Language-Hearing Association (ASHA). (2007). Childhood apraxia of speech [Technical report] (TR2007-00278). https://www.asha.org/policy/tr2007-00278
Ball, M. J., & Müller, N. (2005). Phonetics for communication disorders. Lawrence Erlbaum.
Bates, S., Watson, J., & Scobbie, J. (2013). Context conditioned error patterns in disordered systems. In M. J. Ball & F. E. Gibbon (Eds.), Handbook of vowels and vowel disorders (2nd ed., pp. 288–325). Psychology Press. https://doi.org/10.4324/9780203103890.ch11
Bernhardt, B. M., Stemberger, J., & Bacsfalvi, P. (2010). Vowel intervention. In L. Williams, S. McLeod, & R. McCauley (Eds.), Interventions for speech sound disorders in children (pp. 41–72). Brookes.
Chenausky, K. V., Gagne, D., Stipancic, K. L., Shield, A., & Green, J. R. (2022). The relationship between single-word speech severity and intelligibility in childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 65(3), 843–857. https://doi.org/10.1044/2021_jslhr-21-00213
Cleland, J., & Preston, J. (2021). Biofeedback interventions. In A. L. Williams, S. McLeod, & R. J. McCauley (Eds.), Interventions for speech sound disorders in children (2nd ed., pp. 573–600). Paul H. Brookes Publishing.
Davis, B. (2003). Developmental apraxia of speech. In R. Kent (Ed.), Encyclopedia of communication sciences and disorders (pp. 121–124). MIT Press. https://doi.org/10.7551/mitpress/4658.003.0042
Davis, B., Jacks, A., & Marquardt, T. (2005). Vowel patterns in developmental apraxia of speech: Three longitudinal case studies. Clinical Linguistics and Phonetics, 19(4), 249–274. https://doi.org/10.1080/02699200410001695367
Davis, B., & MacNeilage, P. (1990). Acquisition of correct vowel production: A quantitative case study. Journal of Speech and Hearing Research, 33(1), 16–27. https://doi.org/10.1044/jshr.3301.16
Dodd, B., Hua, Z., Crosbie, S., Holm, A., & Ozanne, A. (2002). Diagnostic evaluation of articulation and phonology (DEAP). Psychological Corporation.
Donegan, P. (2002). Normal vowel development. In M. J. Ball & F. E. Gibbon (Eds.), Vowel disorders (pp. 1–35). Butterworth-Heinemann.
Eisenberg, S., & Hitchcock, E. (2010). Using standardized tests to inventory consonant and vowel production: A comparison of 11 tests of articulation and phonology. Language, Speech and Hearing Services in Schools, 41(4), 488–503. https://doi.org/10.1044/0161-1461(2009/08-0125)
Gibbon, F. (2013). Therapy for abnormal vowels in children with speech disorders. In M. J. Ball & F. E. Gibbon (Eds.), Handbook of vowels and vowel disorders (2nd ed., pp. 429–446). Psychology Press. https://doi.org/10.4324/9780203103890.ch17
Gibbon, F., & Beck, M. (2002). Therapy for abnormal vowels in children with phonological impairment. In M. J. Ball & F. E. Gibbon (Eds.), Vowel disorders (pp. 217–248). Butterworth-Heinemann.
Gibbon, F., Shockey, L., & Reid, J. (1992). Description and treatment of abnormal vowels. Child Language Teaching and Therapy, 8(1), 30–59. https://doi.org/10.1177/026565909200800103
Hall, P. K., Jordan, L. S., & Robin, D. A. (1993). Developmental apraxia of speech: Theory and clinical practice. Pro-Ed.
Hare, G. (1983). Development at 2 years. In J. V. Irwin & S. P. Wong (Eds.), Phonological development in children: 18 to 72 months (pp. 55–88). Southern Illinois University Press.
Hargrove, P., Dauer, K., & Montelibano, M. (1989). Reducing vowel and final consonant prolongations in twin brothers. Child Language Teaching and Therapy, 5(1), 49–63. https://doi.org/10.1177/026565908900500104
Harris, J., Watson, J., & Bates, S. (1999). Prosody and melody in vowel disorder. Journal of Linguistics, 35(3), 489–525. https://doi.org/10.1017/s0022226799007902
Iuzzini-Seigel, J., Allison, K. M., & Stoeckel, R. (2022). A tool for differential diagnosis of childhood apraxia of speech and dysarthria in children: A tutorial. Language, Speech and Hearing Services in Schools, 53(4), 926–946. https://doi.org/10.1044/2022_LSHSS-21-00164
Jakobson, R. (1968). Child language, aphasia and phonological universals. Mouton (trans. of Kindersprache, Aphasie und allgemeine Lautgesetze, 1941). https://doi.org/10.1515/9783111353562
James, D., van Doorn, J. A., & McLeod, S. (2001). Vowel production in mono-, di- and polysyllabic words in children aged 3;0 to 7;11 years. In L. Wilson & S. Hewat (Eds.), Evidence and innovation: Proceedings of the 2001 Speech Pathology Australia National Conference (pp. 127–135). Speech Pathology Australia.
Kehoe, M. (2013). The development of prosody and prosodic structure. Nova Science Publishers.
Kehoe, M., Stoel-Gammon, C., & Buder, E. H. (1995). Acoustic correlates of stress in young children's speech. Journal of Speech and Hearing Research, 38(2), 338–350. https://doi.org/10.1044/jshr.3802.338
Kent, R. (1992). The biology of phonological development. In C. Ferguson, L. Menn, & C. Stoel-Gammon (Eds.), Phonological development: Models, research, implications (pp. 65–90). York Press.
Kent, R. D., & Bauer, H. R. (1985). Vocalizations of one year olds. Journal of Child Language, 12(3), 491–526. https://doi.org/10.1017/s0305000900006620
Kent, R. D., & Rountrey, C. (2020). What acoustic studies tell us about vowels in developing and disordered speech. American Journal of Speech-Language Pathology, 29(3), 1749–1778. https://doi.org/10.1044/2020_AJSLP-19-00178
Levy, E., Leone, D., Moya-Gale, G., Hsu, S., Chen, W., & Ramig, L. (2016). Vowel intelligibility in children with and without dysarthria: An exploratory study. Communication Disorders Quarterly, 37(3), 171–179. https://doi.org/10.1177/1525740115618917
Lieberman, P. (1980). On the development of vowel production in young children. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology, vol. 1: Production (pp. 113–142). Academic Press. https://doi.org/10.1016/b978-0-12-770601-6.50012-7
Mackie, K. M. (2015). Vowels and consonants: The relative effect of speech sound errors on intelligibility [Unpublished master's thesis]. University of Alberta. https://doi.org/10.7939/R3WM14024
MacNeilage, P. F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21(4), 499–546. https://doi.org/10.1017/s0140525x98001265
MacNeilage, P. F., & Davis, B. L. (2001). Motor mechanisms in speech ontogeny: Phylogenetic, neurobiological and linguistic implications. Current Opinion in Neurobiology, 11(6), 696–700. https://doi.org/10.1016/s0959-4388(01)00271-9
Maddieson, I. (1984). Patterns of sounds. Cambridge University Press. https://doi.org/10.1017/cbo9780511753459
Markides, A. (1970). The speech of deaf and partially hearing children with special reference to factors affecting intelligibility. International Journal of Language & Communication Disorders, 5(2), 126–139. https://doi.org/10.3109/13682827009011511
Masso, S., McLeod, S., Baker, E., & McCormack, J. (2016). Polysyllable productions in preschool children with speech sound disorders: Error categories and the framework of polysyllable maturity. International Journal of Speech-Language Pathology, 18(3), 272–287. https://doi.org/10.3109/17549507.2016.1168483
McAllister Byun, T., Campbell, H., Carey, H., Liang, W., Park, R. H., & Svirsky, M. (2017). Enhancing intervention for residual rhotic errors via app-delivered biofeedback: A case study. Journal of Speech, Language, and Hearing Research, 60(Suppl.), 1810–1817.
Munson, B., Bjorum, E. M., & Windsor, J. (2003). Acoustic and perceptual correlates of stress in nonwords produced by children with suspected developmental apraxia of speech and children with phonological disorder. Journal of Speech, Language, and Hearing Research, 46(1), 189–202. https://doi.org/10.1044/1092-4388(2003/015)
Oller, D. K., Ramsay, G., Bene, E., Long, H. L., & Griebel, U. (2021). Protophones, the precursors to speech, dominate the human infant vocal landscape. Philosophical Transactions of the Royal Society B, 376(1836), 20200255. https://doi.org/10.1098/rstb.2020.0255
Otomo, K., & Stoel-Gammon, C. (1992). The acquisition of unrounded vowels in English. Journal of Speech and Hearing Research, 35(3), 604–616. https://doi.org/10.1044/jshr.3503.604
Paschall, L. (1983). Development at 18 months. In J. V. Irwin & S. P. Wong (Eds.), Phonological development in children: 18 to 72 months (pp. 27–54). Southern Illinois University Press.
Penney, G., Fee, E. J., & Dowdle, C. (1994). Vowel assessment and remediation: A case study. Child Language Teaching & Therapy, 10(1), 47–66. https://doi.org/10.1177/026565909401000103
Peter, B., & Stoel-Gammon, C. (2005). Timing errors in children with suspected childhood apraxia of speech (sCAS) during speech and music-related tasks. Clinical Linguistics and Phonetics, 19(2), 67–87. https://doi.org/10.1080/02699200410001669843
Pollock, K. E. (1991). Identification of vowel errors using traditional articulation or phonological process test stimuli. Language, Speech, and Hearing Services in Schools, 22(2), 39–50. https://doi.org/10.1044/0161-1461.2202.39
Pollock, K. E. (1994). Assessment and remediation of vowel misarticulations. Clinics in Communication Disorders, 4(1), 23–37. PMID: 8019549.
Pollock, K. E. (2002). Identification of vowel errors: Methodological issues and preliminary data from the Memphis Vowel Project. In M. J. Ball & F. E. Gibbon (Eds.), Vowel disorders (pp. 83–113). Butterworth-Heinemann.
Pollock, K. E. (2013). The Memphis Vowel Project: Vowel errors in children with and without phonological disorders. In M. J. Ball & F. E. Gibbon (Eds.), Handbook of vowels and vowel disorders (2nd ed., pp. 260–287). Psychology Press. https://doi.org/10.4324/9780203103890.ch10
Pollock, K. E., & Berni, M. (2001). Transcription of vowels. Topics in Language Disorders, 21(4), 22–40. https://doi.org/10.1097/00011363-200121040-00005
Pollock, K. E., & Berni, M. C. (2003). Incidence of non-rhotic vowel errors in children: Data from the Memphis Vowel Project. Clinical Linguistics & Phonetics, 17(4–5), 393–401. https://doi.org/10.1080/0269920031000079949
Pollock, K. E., & Hall, P. (1991). An analysis of the vowel misarticulations of five children with developmental apraxia of speech. Clinical Linguistics and Phonetics, 5(3), 207–224. https://doi.org/10.3109/02699209108986112
Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73(3), 265–292. https://doi.org/10.1016/s0010-0277(99)00058-x
Reynolds, J. (2013). Recurring patterns and idiosyncratic systems in some English children with vowel disorders. In M. J. Ball & F. E. Gibbon (Eds.), Handbook of vowels and vowel disorders (2nd ed., pp. 229–259). Psychology Press. https://doi.org/10.4324/9780203103890.ch9
Robb, M., Bleile, K., & Yee, S. S. L. (1999). A phonetic analysis of vowel errors during the course of treatment. Clinical Linguistics and Phonetics, 13(4), 309–321. https://doi.org/10.1080/026992099299103
Roepke, E., & Brosseau-Lapré, F. (2021). Vowel errors produced by preschool-age children on a single-word test of articulation. Clinical Linguistics & Phonetics, 35(12), 1161–1183. https://doi.org/10.1080/02699206.2020.1869834
Rosenbek, J., & Wertz, R. (1972). A review of fifty cases of developmental apraxia of speech. Language, Speech, and Hearing Services in Schools, 3(1), 23–33. https://doi.org/10.1044/0161-1461.0301.23
Selby, J., Robb, M., & Gilbert, H. (2000). Normal vowel articulations between 15 and 36 months of age. Clinical Linguistics and Phonetics, 14(4), 255–265. https://doi.org/10.1080/02699200050023976
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. K. (1997a). Developmental apraxia of speech: II. Toward a diagnostic marker. Journal of Speech, Language, and Hearing Research, 40(2), 286–312. https://doi.org/10.1044/jslhr.4002.286
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. K. (1997b). Developmental apraxia of speech: III. A subtype marked by inappropriate stress. Journal of Speech, Language, and Hearing Research, 40(2), 313–337. https://doi.org/10.1044/jslhr.4002.313
Shriberg, L. D., Campbell, T. F., Karlsson, H. B., Brown, R. L., McSweeny, J. L., & Nadler, C. J. (2003). A diagnostic marker for childhood apraxia of speech: The lexical stress ratio. Clinical Linguistics and Phonetics, 17(7), 549–574. https://doi.org/10.1080/0269920031000138123
Speake, J., Stackhouse, J., & Pascoe, M. (2012). Vowel targeted intervention for children with persisting speech difficulties: Impact on intelligibility. Child Language Teaching and Therapy, 28(3), 277–295. https://doi.org/10.1177/0265659012453463
Stoel-Gammon, C., & Dunn, C. (1985). Normal and disordered phonology in children. Pro-Ed.
Stoel-Gammon, C., & Herrington, P. (1990). Vowel systems of normally developing and phonologically disordered children. Clinical Linguistics & Phonetics, 4(2), 145–160. https://doi.org/10.3109/02699209008985478
Templin, M. (1957). Certain language skills in children: Their development and interrelationships. Institute of Child Welfare Monographs, 26. University of Minnesota Press. https://doi.org/10.5749/j.ctttv2st
Wellman, B., Case, I., Mengert, I., & Bradbury, D. (1931). Speech sounds of young children. University of Iowa Studies in Child Welfare, 5(2), 82.
29 Cross-Linguistic Phonological Acquisition
DAVID INGRAM AND ELENA BABATSOULI
29.1 Introduction
A complete understanding of phonological acquisition in children will not be achieved until in-depth studies are available on how children acquire the wide range of phonological systems that characterize human language. This challenging enterprise is complicated by the large number of languages that exist, the need to determine the full range of ways that languages differ phonologically, and the manner in which phonological properties interact with grammatical properties. Estimates of the number of languages vary, but the numbers are typically in the thousands. Linguistic studies have made great strides in the last one hundred years in understanding the structure of language, but many languages have not been studied in depth, and many aspects of language structure are far from being completely understood. Table 29.1 presents a summary of the major aspects that need to be considered in developing a phonological typology of languages. As shown, languages will differ in their prosodic systems, syllable structure, consonantal and vocalic systems, phonotactics, and interactions with morphology and syntax. Phonologists are actively studying each of these areas to add to our understanding of their universal and language-specific properties, but many gaps exist in our current knowledge. In addition to developing accurate phonological characterizations of languages, research into phonological acquisition deals with both theoretical and practical issues. Theoretical issues involve the assumptions that need to be made about the nature of phonological acquisition in general. At one extreme, research can take a strong theoretical stand, as in viewing phonological data from within a non-linear theoretical framework (e.g., Babatsouli, 2019). At the other end, research can be more descriptively driven, collecting data and determining patterns of acquisition on under-represented languages, as exemplified by recent research on the African Bantu languages of the Niger–Congo family, isiZulu and isiXhosa (Pascoe & Jeggo, 2019; Pascoe et al., 2017). The present chapter adopts basic, yet conservative, theoretical assumptions about how phonological acquisition takes place in order to discuss evidence on early and later-developing systems cross-linguistically. Besides selecting a theory of phonology, research also needs to work within a theory of phonological acquisition. Issues in phonological acquisition concern assumptions about the extent to which children form linguistic systems comparable to those of adult speakers. At one end we have maturational accounts that propose children are not like adults, and that they follow a path of discontinuous development. This would occur, for instance, if the early words produced by children were constrained by the maturation of the articulators, so that less complex syllables
Table 29.1 Properties of a phonological typology of languages.
1. Prosody: systems of stress, tone, and pitch accent; stress vs. syllable timing
2. Syllable structure: nature of onsets, codas, consonant clusters, reduplication, and mora structure
3. Vowels: systems of vowels, e.g., number of vowels, existence of vowel length, nasalization, tense vs. lax vowels, and degrees of frontness; vowel harmony systems
4. Consonants: types of consonants in relation to place, voice and manner; states of the larynx; common vs. rare sounds (e.g., "ř" in Czech)
5. Phonotactics: constraints on consonant and vowel co-occurrences (e.g., final devoicing in German)
6. Grammatical interactions: morphological and syntactic conditioning (e.g., English plural)
and sounds are acquired before more complex ones (Locke, 1983). Such theories have important implications for the study of both typical children, monolingual and bilingual, and children with phonological delay or disorder. If such theories are accurate, then children across languages should look very similar to one another, regardless of their typicality. Alternatively, children could begin phonological acquisition with the basic cognitive and articulatory skills needed to establish phonological systems, albeit simple ones, that are constructed with the same phonological units that characterize adult language. This is a long-standing position (Jakobson, 1968) that still has its proponents today (Ingram, 1989; Ingram et al., 2018). It would be impossible to cover all these aspects of cross-linguistic acquisition in a single chapter, because of both the scope of the enterprise and the lack of research on many of them. The present chapter will approach the topic by concentrating on two descriptive aspects as these relate to two theoretical questions. The descriptive part presents early and later-developing data in the acquisition of consonant segments and whole-word complexity from selected languages. The data to be cited are discussed in relation to two theoretical questions: (1) Is phonological acquisition universal or influenced by the linguistic environment? (2) Is evidence of disorder the result of articulatory or phonological impairment? It will be concluded that although children show noticeable cross-linguistic differences in phonological acquisition, these differences demonstrate phonological organization of the input that is subject to language typology and articulatory skill. Further, it will be documented that atypical productions in children with speech delay or disorder show the same ambient-language characteristics found in their respective typically developing peers, and that such impairment is manifested along a spectrum of interaction between phonology and articulation (Ingram et al., 2018). These combined results provide strong evidence of phonological organization as the guiding force in cross-linguistic acquisition.
29.2 Determinants of Phonological Acquisition
29.2.1 Early Consonantal Inventories
A question fundamental to the understanding of phonological acquisition is: When do children begin phonological organization of the target language? The Russian linguist Roman Jakobson, in his classic work (Jakobson, 1968), proposed that it begins with children's very first words, but that properties of the ambient language emerge soon after. Counterarguments to this have
Table 29.2 Early consonantal system of children across languages (adapted from Ingram, 2011).

Language     Labials      Coronals     Palatals   Velars   Glottal
English      m p b f w    n t d s      j          k g      h
French       m p b f      n t d s l
Spanish      m p b        n t l tʃ     j          k
Cantonese    m p w        n t          j          k        h
Quiché       m p w        n t l tʃ                k x
viewed the limited range of early speech sounds as resulting from articulatory limitations and proposed that phonology begins only after the first fifty words (Locke, 1983). This controversy can be addressed by examining children's early consonantal inventories across different languages. If children are constrained by the maturation of the articulatory system, similar patterns in word production will be seen across languages. If, however, children are more advanced in their articulatory development, and ready for phonological organization of the input, cross-linguistic variation will be evidenced. Ingram (2011) explored early word-initial consonant inventories based on published data in different language families: Germanic (English), Romance (French, Spanish), Sinitic (Cantonese), and Guatemalan Mayan (Quiché). As shown in Table 29.2, the children studied exhibit variability in their early consonant acquisition, though grouped data clearly indicate a general system. Among the languages examined, the early consonant inventories of English, Spanish and Cantonese share most places of articulation (labial, coronal, velar, glottal) and four manners (nasals, stops, fricatives, glides). The early lateral and affricate in French, Spanish, and Quiché indicate more advanced early systems, especially so in the frequent Quiché tokens. Such cross-language similarities are subject to ambient language distinctions, as exemplified by the contrast for voice in French and Spanish plosives, but not in Quiché, where the language-specific /x/ is preferred instead. Interestingly, cross-linguistic variability is more robustly indicated by absence rather than presence of sounds, as for Spanish and Cantonese fricatives /f/, /s/, Spanish /x/, Quiché /s/, English /l/, /tʃ/, Cantonese /l/, and French /k/, /g/. Further, Ingram's (2011) analysis of the first 25 words in the French, English, Japanese, and Swedish data of de Boysson-Bardies and Vihman (1991) showed that French velars had the lowest rates of production (9%) compared to English (20%), Japanese (26%) and Swedish (23%). French also differed in having a higher occurrence of labials (52%) over dentals (39%), compared to reverse patterns in the other languages, similarly indicated in Table 29.2 for Spanish, Cantonese, and Quiché. Inventory variability was documented earlier by Ingram (1999) for typically developing children speaking English, Quiché, Turkish, and Dutch at 20–27 months. Similar investigations focusing on the early word template as a phonological organizing principle further support variable inventories cross-linguistically as, more recently, for Brazilian and European Portuguese (Baia & Correia, 2017) and Czech (Cilibrasi & Dunková, 2022).
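The kind of tally that produces inventories like those in Table 29.2 can be illustrated as follows. The two-word-types criterion for inventory membership is a common heuristic in this literature, not necessarily Ingram's (2011) exact procedure, and the data are invented.

```python
# A sketch of tallying a word-initial consonant inventory from transcribed
# productions. A sound counts as "in the inventory" if it begins at least
# `min_words` distinct word types, whether or not it matches the target.

from collections import defaultdict

def initial_inventory(productions, min_words=2):
    words_per_onset = defaultdict(set)
    for word_type, transcription in productions:
        onset = transcription[0]            # word-initial consonant
        words_per_onset[onset].add(word_type)
    return {c for c, words in words_per_onset.items()
            if len(words) >= min_words}

# Invented (target word, child production) pairs:
data = [("mama", "mama"), ("more", "mo"), ("ball", "ba"),
        ("baby", "bebi"), ("dog", "da"), ("duck", "da"), ("cat", "ta")]
print(sorted(initial_inventory(data)))  # ['b', 'd', 'm']; 't' fails the criterion
```

Note that this is an independent analysis: [t] for "cat" is tallied as a produced onset regardless of the /k/ target, which is exactly why such inventories can reveal phonological organization rather than mere accuracy.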
29.2.2 Consonantal Inventories across the Developmental Path
To determine cross-linguistic acquisition patterns of consonants along the span of development, McLeod and Crowe (2018) reviewed sixty-four studies of consonant acquisition in twenty-seven languages representing fourteen language families: Austronesian (Malay), Germanic (West Germanic: Afrikaans, Dutch, English, German, and North Germanic: Danish, Icelandic), Germanic-West African (Jamaican Creole), Greek (Standard and Cypriot),
410 David Ingram and Elena Babatsouli Japonic (Japanese), Koreanic (Korean), Niger-Kongo (Setswana, Swahili, Xhosa), Romance (French, Haitian Creole, Italian, Portuguese, Spanish), Semitic (Arabic, Hebrew, Maltese), Sinitic (Cantonese, Mandarin), Slovenian (Slovene), Turkic (Turkish), and Uralic (Hungarian). Most studies described monolinguals, though few also included multilingual children. Ages studied range from ca. 2 to 8 years, and data are cumulatively shown for 1;10–2;11 (referenced here as T1), 3;0–3;11 (T2), 4;0–4;11 (T3), 5;0–5;11 (T4), 6;0–6;11 (T5), and 7;0–7;6 (T6). There are two consonant acquisition criteria used: 90–100% (referenced here as fully acquired) and 75–85% (partly acquired), while acquisition levels primarily refer to word-initial contexts. The reviewed data enable the following commentary. Fully acquired consonants at T1 comprise nasals (labial), voiceless plosives (labial, alveolar, palatal), voiceless a ffricates (alveopalatal, retroflex), and the velarized lateral, but neither fricatives nor rhotics are r epresented. Partly acquired consonants indicate the gradual enhancement of segmental repertoire: nasals (all anterior, and /ŋ/), plosives (all contrasts, including pre-nasalized and anterior aspirated), fricatives (voiceless labiodental and retroflex, voiced velar, glottal, pharyngeal, except coronal), and glides (labio-dental, labio-velar, palatal). At T1, inventories also include non-pulmonic productions: ejectives (all), implosives (voiced bilabial), and clicks (anterior). Consonants that become fully acquired at T2 are remaining nasals (posterior placement), plosives (aspirated productions), geminates (anterior plosives), approximants (the glides, clear lateral), and voiceless affricates (labial, post-alveolar, palatal, retroflex). Partly acquired consonants during this time encompass fricatives (most places/manners), laterals (all, including lisp), and rhotics (alveolar tap and approximant, uvular trill). Non-pulmonic productions at T2 also indicate the acquisition of more complex ejectives (dual place, affricates). The fully acquired consonant repertoire is enhanced at T3 across languages to comprise remaining fricatives (sibilants, voiceless palatal, voiced velar), lateral (palatal) and the rhotic (tap). The 75–85% acquisition criterion in the same age bracket is represented by the voiced palatal plosive, the voiced interdental fricative, the alveolar trill, but also two complex nonpulmonic consonants: a click (alveolar aspirated) and an ejective (post-alveolar affricate). Among the very last to be mastered after age 3;11 are fricatives and rhotics. In particular, the last rhotics to be fully acquired at T4 are the alveolar trill and approximant. Also, at T4, outstanding fricatives becoming fully acquired are the voiceless interdental and velar, as well as the voiced post-alveolar sibilant. At T5, the voiced bilabial fricative finally comes into play, as do the retracted /s/ and /ð/ that are shown partly acquired. The last consonant to be shown acquired across languages at T6 is the voiceless labio-velar fricative. It is clearly indicated that very early inventories (T1) cross-linguistically comprise both non-pulmonic and pulmonic consonants. Interestingly, affricates and retroflex consonants do not pose distinct difficulties to children early on, at T1 and T2. 
Dual place consonants are among those lingering in acquisition past T2, though approximants (e.g., /ɥ/, acquired at 3;11) precede fricatives (e.g., /ʍ/, acquired at 7;6) in the development of such manner complexity. By and large, however, all nasals, glides, most obstruents, and liquids are acquired by age 3;11, supporting previous inferences that phonological systems cross-linguistically are largely acquired by age 4;0 (Ingram, 1989). Among the last acquired consonants are pulmonic sounds, specifically, rhotics and fricatives, but not across the board. Thus, the last developing rhotics are the alveolar trill and approximant, while the last developing fricatives are the sibilants /s/ /ʒ/ and the anterior non-stridents /β/ /ð/ /θ/, likely due to their more engaging manner of articulation, among other factors. These well-defined groupings hint at a universal course of development no matter what language is targeted. Implicational hierarchies and markedness considerations, in terms of both perceptual and articulatory ease/effort, also signal the presence of common ground, since distinct preferences for specific place/manner of articulation and laryngeal distinctions are documented but, again, not without exceptions. For instance, plosives precede fricatives (e.g., palatal /c/ /ɟ/ vs. /ç/ /ʝ/), laterals precede rhotics, and secondary posterior placement
Cross-Linguistic Phonological Acquisition 411 (e.g., velarized consonants, /ŋ/ /ɫ/) is not as challenging as primary posterior placement (e.g., palatals /ɲ/ /ʎ/). Similarly, secondary manner is not detrimental to acquisition either, as shown by p renasalized and aspirated obstruents, for example, /mb/, /ph/, /tʃh/, or lateral fricatives, /ɬ/, /ɮ/ whose acquisition is concomitant, if not earlier, with respective primary articulations. Collective tendencies also hold for laryngeal distinctions, as exemplified in parallel acquisition paths across place and manner contrasts, for example, /p/-/b/, /ɸ//β/, /c/-/ɟ/, /ʃ/-/ʒ/, though not overwhelmingly so, for example, /ɣ/ before /x/, or /ð/ before /θ/. A precedence of acquisition of English /ð-/ (at age 3;8) over English /θ-/ (3;11) is also shown longitudinally in a bilingual Greek–English girl (Babatsouli, 2017).
29.2.3 Language Specificity
While such universal patterns are recorded across language families, the influence of individual language typologies and exposure patterns cannot be underestimated in developing phonologies. Research on bilinguals' babbling has shown that infants adjust their babbling to match characteristics of the language used by each parent present at the moment in one-parent-one-language exposure scenarios. This has been documented with longitudinal data for some bilingual infants and, more recently, for a Czech–English infant, Zachary (0;6.13–1;6.21), whose babbling comprised more consonants in English early on, due to primary exposure to maternal output, and showed a gradual increase in the proportion of consonants produced during his Czech babbling, when interacting with his father. This reflects the influence of language-specific properties, since Slavic languages such as Czech have a more robust consonantal inventory than other languages, including English (Cilibrasi & Dunková, 2022). Such language specificity is clearly evidenced in older children's inventories, as well. Zooming in on monolingual English acquisition based on the cross-linguistic data of fifteen studies reported by McLeod and Crowe (2018) demonstrates that individual consonants in English are acquired later than the cumulative cross-linguistic data would suggest: English /p b m d n h t k ɡ w ŋ f j/ are shown acquired by T2, /l ʤ ʧ s v ʃ z/ by T3, and /ɹ ʒ ð θ/ by T5. Similar discrepancies are highlighted in the same report in case studies of Korean, Japanese, and Spanish. A case in point would be the acquisition of Korean /s/ and /l/ by 4;0 and 5;0, respectively, or of Spanish /w/ by 4;0. There are several factors that may account for this variability, one of which draws attention to larger phonological units (e.g., syllable complexity) affecting segmental accuracy. Research endeavors exploring such influences across languages and dialects are increasingly represented in edited volumes, for example, Babatsouli et al. (2017), Macleod and Pollock (2020), Stemberger and Bernhardt (2018), as well as individual studies (e.g., Babatsouli & Geronikou, 2022). These efforts advance further our understanding of cross-linguistic variability in typical and atypical/delayed phonologies to include Indo-Iranian (Persian), Indo-Aryan (Punjabi), Niger-Congo (Akan), and more Austronesian (Tagalog) and Slavic (Bulgarian) languages.
29.2.4 Functional Load and Frequencies
Subject to language typology, cross-linguistic evidence supports a strong case that children do not uniformly begin or end acquisition with the same consonants across language environments. Though English /w/ is an early acquisition for English-speaking monolinguals, this is not true for Spanish-speaking monolinguals, as noted earlier, nor without exceptions for English-speaking bilinguals with compromised input, such as the Greek–English bilingual who acquired /w/ at 3;7 (Babatsouli, 2015). What then is the explanation behind these differences? Pye et al. (1987) looked at the distribution of English and Quiché inventories in children's words and found that functional load (how important a phoneme is within the phonological system) is a catalytic factor in determining acquisition patterns. Specifically,
the study showed significant correlations between the frequency of occurrence of the individual consonants and the number of word types (rather than tokens) they occurred in. Stated differently, the more words a consonant occurred in, the more likely it was that the children would acquire it. English /ð/ has a low functional load, as it occurs in a small number of function words, like "the," "this," "that," but it is high in token frequency, since English function words are used frequently. Due to either low functional load or articulatory difficulty, /ð/ is a late acquisition for English children, acquired at >90% by age 5;0, which is later than, for instance, in Brazilian Portuguese, Spanish, Greek, and Swahili, where it is acquired between ages 2;0 and 3;0 (McLeod & Crowe, 2018). Babatsouli (2017, 2020) showed earlier acquisition of /θ/ /ð/ in a bilingual girl's weaker English, influenced by her stronger Greek, due to the higher frequency of these sounds in Greek nouns and verbs. Edwards et al. (2015) explored phoneme frequency in English, Cantonese, Greek, and Japanese two- and three-year-old children, concluding that there is a language-specific correlation between consonant accuracy and type frequencies. For instance, comparing Greek and Cantonese, they found that while /t/ is more accurate than /ts/, the difference is smaller in Cantonese, where /ts/ token frequency is higher than in Greek. In their Paidologos project, Korean and Mandarin data were also included, and the authors document that children's phonological representations become gradually more segmental (articulatorily complex) subject to phoneme frequency in the linguistic input, in combination with an individual child's increasingly larger lexicon, thus supporting the known relationship between lexical and phonological acquisition with additional cross-linguistic data.
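The type-versus-token distinction at issue here is easy to operationalize: the sketch below counts, for each consonant, the number of distinct word types containing it, so that a consonant confined to a few high-frequency function words (like English /ð/) receives a low type count however often those words are used. The toy lexicon is invented and assumes one-character segments.

```python
# A sketch of type frequency in the sense of Pye et al. (1987): how many
# distinct word types contain each consonant, ignoring token repetitions.

from collections import Counter

def type_frequency(tokens, consonants):
    counts = Counter()
    for word in set(tokens):        # collapse tokens to types
        for c in set(word):         # count each consonant once per word
            if c in consonants:
                counts[c] += 1
    return counts

# "ðə" occurs three times as a token but contributes only one type for ð.
tokens = ["ðə", "ðə", "ðə", "ðɪs", "kæt", "kʌp", "bʊk", "teɪk"]
print(type_frequency(tokens, {"ð", "k", "t", "b", "p", "s"}))
# Counter({'k': 4, 'ð': 2, 't': 2, 'b': 1, 'p': 1, 's': 1})
```

On Pye et al.'s account, it is the type counts, not the inflated token counts, that predict which consonants children acquire first.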
29.2.5 Whole Word Complexity, Proximity, and Accuracy
While the study of phonological acquisition has mostly focused on segmental development, recent research has expanded to quantitative measures of whole-word properties. Ingram (2002) suggested two aspects of whole words to incorporate into phonological analyses: whole-word complexity and proximity. Whole-word complexity refers to the extent to which one word can be said to be more complex than another. A thorough measure of complexity will need to consider syllabic and segmental complexity, and some way to weigh the two. Ingram (2002) proposed a simpler measure, one providing a relatively fast and simple way to establish an initial impression, referred to as the phonological mean length of utterance, or pMLU. The pMLU of a target word is a count of the number of sounds in the word plus the number of consonants: each vowel receives one point, and each consonant two points, under the assumption that consonants, particularly when combined into consonant clusters, add more complexity to a word than vowels. An English word like "bee" will receive a score of 3, while a word like "between" would score 10. The mean of these counts across a child's vocabulary can give some idea of whether or not a child is acquiring simpler or more complex words, and it can be used to assess the complexity of words on articulation tests. The calculation of the pMLU of a child's productions is done slightly differently: consonants are scored with one point, and only receive a second point if they are produced correctly. The child who produces "truck" as [gak] will receive four points: three points for the three segments, and one more point for the correct /k/. The scoring of phonological word proximity, PWP, involves comparing the child's pMLU to that of the adult targets, by dividing the latter into the former. The child who produces all words accurately (technically, all consonants correctly and some effort for each vowel) will have a score of 1.0, or 100%. In the example above for "truck," the proximity would be .57, obtained by dividing 4 (the pMLU of the child's word) by 7 (the pMLU of the target word). Ingram (2002) reported PWP scores of around 64 percent for English children acquiring their first 25 words. The study of whole-word measures is still at a very early stage, and much more needs to be done concerning the specific rules for their calculation and appropriate sample sizes. There has been, however, increasing research applying the measures cross-linguistically and, also, to phonological disorders and bilingualism.
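Since the scoring rules above are fully explicit, they can be stated as a short program. The sketch below reproduces the chapter's examples; treating consonant correctness as a simple multiset match between child and target consonants is our simplification of the alignment step, which Ingram (2002) leaves to the analyst, and the vowel symbol set is illustrative only.

```python
# A minimal sketch of Ingram's (2002) whole-word measures, assuming
# transcriptions are lists of single segments.

from collections import Counter

VOWELS = set("aeiouɪʊɛæʌɔəɑ")  # illustrative vowel symbols only

def is_vowel(segment):
    return segment[0] in VOWELS

def pmlu_target(segments):
    """Target-word pMLU: 1 point per vowel, 2 points per consonant."""
    return sum(1 if is_vowel(s) else 2 for s in segments)

def pmlu_child(child, target):
    """Child pMLU: 1 point per segment produced, plus 1 extra point per
    correctly produced consonant (here, a multiset match against the
    target's consonants; a real analysis needs principled alignment)."""
    child_c = Counter(s for s in child if not is_vowel(s))
    target_c = Counter(s for s in target if not is_vowel(s))
    return len(child) + sum((child_c & target_c).values())

def pwp(child, target):
    """Phonological word proximity: child pMLU over target pMLU."""
    return pmlu_child(child, target) / pmlu_target(target)

# The chapter's example: "truck" /t ɹ ʌ k/ produced as [g a k].
target = ["t", "ɹ", "ʌ", "k"]
child = ["g", "a", "k"]
print(pmlu_target(target))           # 7 (3 consonants x 2 + 1 vowel)
print(pmlu_child(child, target))     # 4 (3 segments + 1 for the correct /k/)
print(round(pwp(child, target), 2))  # 0.57
```

Averaging these word-level values over a vocabulary sample yields the pMLU and PWP figures compared across languages in Table 29.3.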
The measurement of pMLU cross-linguistically has been reported for Cantonese, Dutch, English, French, Finnish, and Spanish (cf. Ingram, 2011), and the present chapter enhances the discussion with data on Greek and Kannada (Dravidian language family, India). Table 29.3 provides a summary of the data for pMLU and/or proximity in each language as available. Comparisons suggest that children differ on these measures cross-linguistically. The Cantonese child showed a relatively low pMLU and high PWP, suggesting that she was having little trouble acquiring phonology. The Finnish children had higher pMLUs and PWPs for the first 25 words than the English children, suggesting that longer words with relatively simple syllable structure do not impede phonological acquisition. The Spanish children were older than the Finnish children but showed a similar pattern with relatively high PWP. The English and Dutch data indicate that the phonologies of these languages are harder to acquire, since both pMLU and PWP were low in the cross-linguistic comparison (at least for the available English data). The French child showed low scores for both measures, which she maintained across the first 250 words; these scores are comparable to the English scores in the first 50 words. The available longitudinal data indicate a gradual increase in both whole-word measure scores (Table 29.3), as documented in Jennika's and Kristen's early monolingual English phonologies. Additional longitudinal data along the entire span of development exist for children with bilingual exposure, as shown for pMLU in Kannada (Kumar & Bhat, 2009) and for PWP scores in English in a bilingual Greek–English girl between ages 2;7 and 3;9 (Babatsouli et al., 2014), which score respectively higher and lower than the cross-linguistic data in that table. The results for Kannada reveal an incremental trend with rapid growth at 3;0–4;0, which is
Table 29.3 A comparison of pMLU and PWP for eight languages.

Language    Ages/samples/children                        pMLU         PWP
English     0;11–1;10/first 25 words/5 children          3.2          .64
            1;3–2;3/longitudinal/Jennika                 3.6 to 5.1   .67 to .74
            1;4–1;8/longitudinal/Kristen                 4.2 to 5.6   .65 to .86
            2;7/317 words/bilingual Maria-Sofia          4.3          .80
            2;6–3;9/longitudinal/bilingual Maria-Sofia   not given    not given
            2;9–3;4/ibid                                 not given    .70
            3;4–3;9/ibid                                 not given    .70 to .90
Cantonese   1;7//Wai                                     4.8          .93
Dutch       1;4–1;10//7 children                         4.4          not given
            1;11–2;4//6 children                         5.4          not given
            2;6–2;11//4 children                         6.0          not given
Finnish     1;5–2;0/first 25 words/17 children           5.1          .78
French      1;5–1;11/first 250 words/Fernande            3.5 to 3.8   .62 to .68
Greek       2;7/540 words/bilingual Maria-Sofia          4.82         .83
Kannada     3;0–3;6/children                             7.5          not given
            3;6–4;0/children                             8.2          not given
            4;0–4;6/children                             9.2          not given
            6;0–6;6/children                             10.0         not given
            6;6–7;0/children                             10.6         not given
Spanish     2;6//5 children                              6.3          .82
            2;7//7 children                              6.4          .84
            3;0//8 children                              7.0          .92
followed by a plateau and then another increase between ages 6;0 and 7;0. These progressive stages are explained by a combination of phonological and lexical advancement that includes progress on the morphosyntactic level. Though no information is provided in the study, the Kannada-speaking children were also exposed to Tulu (South Dravidian) and Konkani (Indo-Aryan), which may have further contributed to the children's advanced phonological skill in Kannada. The Greek–English bilingual Maria-Sofia's PWP growth data in English are based on 25 words of variable complexity that were sampled in her speech at least once a month between 2;6 and 3;9. There is evidence of three distinctive stages in the developmental path: Progressive Stage I (2;6–2;9): 0.32 to 0.70; Cyclic Stage II (2;9–3;4): fluctuating around 0.70; and Progressive Stage III (3;4–3;9): 0.70 to 0.90. Clearly, such PWP scores are low compared to the monolingual Cantonese, English, French, and Spanish data at the respective ages, a result of input and usage constraints in this child's English, the weaker language in her bilingualism. Bunta et al. (2009) compared whole-word complexity and proximity scores in monolingual and bilingual English-speaking and Spanish-speaking children, and found no differences in Spanish, though English monolinguals outperformed bilinguals on both measures. Another study on English–Spanish bilinguals' pMLUs accounted for the lower English scores in terms of syllable complexity rather than word length in syllables (Freedman & Barlow, 2011). Comparing the two languages of an English–Hungarian simultaneous bilingual at 2;0, Bunta et al. (2006) showed that the child's pMLU in Hungarian was significantly higher than in English, but proximity in the child's two languages was similar. This finding is corroborated by the Greek–English bilingual's data on pMLU (Greek 4.82, English 4.3) and PWP (Greek .83, English .80), based on a total vocabulary count of 540 Greek and 317 English word types at age 2;7 (Babatsouli, 2020). Thus, regardless of typological complexity (e.g., Hungarian and Greek have more multisyllabic words, but English has more complex syllables), bilinguals' languages are approximated similarly during acquisition. A study comparing cross-linguistic whole-word match (WWM) scores in typically developing and phonologically delayed children acquiring German, Icelandic and Swedish (Germanic), Canadian French, European Portuguese and Spanish (Romance), and Bulgarian and Slovenian (Slavic) showed that WWM scores in phonologically delayed children were lower than those of typically developing peers, that there was no substantial increase by age in months, and that the measure is suitable for ascertaining whether a child's WWM score is indicative of their designation as typical or delayed (Bernhardt et al., 2020).
29.3 The Nature of Phonological Impairment
Traditionally, children who showed atypical or delayed speech were assumed to have a speech problem, that is, an inability to move the articulators appropriately to produce words correctly. In the late 1960s, researchers conducted phonological analyses on children with impaired speech, and found that their patterns required a phonological, not just an articulatory, explanation (Ingram, 1976). For example, a child who produced an /s/ as a [t] could nonetheless produce an [s] as a substitute for a /ʃ/. This is the classic Jakobsonian argument that speech development is phonological, not just articulatory. Word acquisition requires the child to form phonological representations of words, and to be capable of mapping those representations into speech forms. Cross-linguistic phonological acquisition provides an excellent test case to explore the articulatory versus phonological nature of speech impairments. If, in fact, the primary characteristic of a speech disorder is an inability to make sounds, then children with speech problems should look similar across linguistic environments. On the other hand, if these children are nonetheless making an effort to establish a phonological system along the lines
of their language peers, then their word productions should look more similar to those of typical children in their language environment than to children with speech problems in a different linguistic environment. The determination of consonant inventories, as done earlier, can be used to study the nature of a child's phonological impairment. The methodology is basically to determine inventories for children with typical development and compare them with those of children with disorder/delay. Relatively little cross-linguistic research of this kind has been done, since such research would involve studies that plan such comparisons with careful matching of the children. Nonetheless, some data exist where one set comprises studies on normal phonological acquisition, and another set involves studies on phonological impairment. These data can then be compared on an ad hoc basis to get an initial impression of how the two sets of data compare. If there is a trend in the sets of comparisons, such trends can at least be suggestive of whether children with impairments compare more closely to their same-language peers, or to children with phonological impairments in other languages. Here, data will be presented for comparison in English, Italian, Swedish, Greek, Turkish, and Bulgarian. Ingram (1981) matched 15 typically developing children (1;5 to 2;2, median age 1;9) to 15 children with phonological impairments (3;11 to 8;0, median age 5;3), according to their Articulation Scores (AS). The AS is a measure that weighs the number of consonants a child is using by their frequency of use. The data, except for some small numerical differences, showed that the consonant inventories of the typical children and the peers with disorder are the same. With just one language being compared, however, it is not possible to conclude either that the similarities are due to a language effect or that they are due to articulatory complexity. Evidence for a language effect in phonological impairment increases when other languages are considered. Bortolini et al. (1993) compared nine typical Italian children (2;2 to 2;11) with nine peers with disorder (4;9 to 7;1). The word-initial consonant inventories of the typical children provided more evidence of cross-linguistic differences, such as the early acquisition of /tʃ/ and frequent use of /v/, consonants that are later acquired in English. Though the inventories in Italian disorder show similarities to English disorder, there are more similarities with the Italian typical children, especially with regard to /v/ and /tʃ/ occurrence. Data on the phonological systems of Swedish (cf. Ingram, 2011) in 42 children (between 3;9 and 6;6) that were classified as showing phonological delay, and a typically developing child's longitudinal data (1;8, 2;0, 2;2, and 2;5.8), across three studies (Magnusson, 1983; Saaristo-Helin et al., 2006), showed that the children's phonetic inventories were the same. Of particular interest was the production of /v/ by the groups of phonologically impaired children, despite its later acquisition by typical peers, similarly to the Italian data. Elsewhere, Ingram (1988) has reported the early acquisition of /v/ in languages other than English.
The fact that the Italian and Swedish children with a phonological impairment still produce a /v/, presumably because of its importance in the ambient language, supports the hypothesis that their impairment is not preventing them from developing a phonological system for the more important phonemes of the language. The next sets of data, on Greek, Turkish, and Bulgarian, come from more recent endeavors to compare typical phonological acquisition with phonological impairment. Petinou and Okalidou (2004) investigated phonological impairment in seven Cypriot Greek late talkers, compared with a cohort of seven typical peers at 2;6, 2;8, and 3;0. They found the same established consonant types within each group, but a larger and earlier-acquired repertoire in the typical children. Specifically, the typical group had mastered most singletons by 3;0 in all word positions, with a preference for word-initial accuracy between 2;6 and 3;0, when consonant types increased. While the typical phonetic inventories comprised stops at all places of articulation, fricatives, affricates, nasals, and approximants, the late talkers persistently deleted word-initial consonants, did not produce word-final /s/ (/n/ being the only
other permitted word-final consonant in Greek), and produced only bilabial and alveolar stops, nasals, and [l]. Velar stops appeared at 3;0. Further evidence sustaining phonological delay over phonological deviance is found in the case studies of two girls speaking standard Greek, Theodora at age 4;8 (Babatsouli, 2019) and Dorothea at 5;10 and 6;3 (Babatsouli & Geronikou, 2022). Among Theodora's non-acquired consonants were /l/ (60% accuracy), /ɾ/ (14%), /x/ (21%), [ç] (55%), and /g/ (29%); the affricates /ts/ and /dz/ and the palatal allophones [ʎ] and [ɟ] were not produced at all. These consonants were all behind the Greek acquisition norms (/l/: 3;6–4;0; /x/: 3;0–3;6; [ç]: 3;0–3;6; /g/: 2;6–3;0), with the exception of /ɾ/, whose normative age of acquisition (5;6–6;0) exceeds her age at testing. Also, Theodora had a pMLU of 7.72 and a PWP of .84, a further indication of delay in mastering phonological complexity and proximity when compared to the only available whole-word measure scores for Greek, those of the typically developing bilingual girl Maria-Sofia at age 2;7 (see Table 29.3). Similarly, phonological delay in Dorothea's data surfaces as chronological mismatches, child-specific idiosyncratic forms, and, most notably, disparity between segmental and structural development. For instance, while her predominant mismatch was /ɾ/→[l], as per norms, her word-initial /v/ and /ð/ and word-medial /θ/ and /ð/ showed considerable inaccuracy in multisyllabic words with complex syllables (e.g., CVC). Individual idiosyncrasy is also documented in her production of word-initial sounds, like /v/ in specific words (e.g., /vivlio/→[livio] "book"), which is further evidence of the word-initial constraints noted earlier for Cypriot Greek late talkers. Dorothea's delay is also shown in her whole-word accuracy score of only 50%, with errors arising mostly in syllables with complex onsets and codas (CCV = 50%, CCCV = 0%, VC = 60%, CVC = 70% (7/10), CCVC = 0%). Her percentage of clusters correct (PClC) was 40%; 93% of the cluster errors were reductions, the exception being the one timing-unit match in /xt/→[st]. Greek has an extensive fricative system with 10 consonants. If phonological delay were the result of impaired articulatory ability alone, the interdentals /θ/ and /ð/ would be expected to be among the last-acquired fricatives, especially given analyses of English data and the relatively limited distribution of these sounds cross-linguistically. As discussed earlier, these sounds have a high functional load in Greek, and their occurrence in the delayed speech of these two girls suggests that the environment is playing a role in their use. Consequently, the argument that impairment is phonological rather than purely articulatory is additionally supported by Greek and, as we will also see next, by Turkish and other languages. Topbaş (2006) reported the phonological development of 70 Turkish children classified as phonologically disordered (ages 4;0–8;0) and compared their consonant inventories with those of 665 typical peers (1;3–8;0). The study showed that Turkish consonants are acquired by age 3;0, earlier than in other languages, and that the level of acquisition for some consonants is language specific. For example, Turkish affricates are acquired (and appear in the inventory of children with disorder) before fricatives, which is the reverse of what is known for English. There was no significant qualitative difference between the typical and atypical productions, though impaired performance was comparable to that of typically developing two- to three-year-olds.
When error patterns in Turkish disorder were compared to those in studies investigating German, English, Cantonese, Putonghua, and Puerto Rican Spanish, the study indicated that late-acquired sounds were more problematic for phonologically impaired children cross-linguistically, and that while some error patterns were universal (e.g., stopping, backing, metathesis), language-specific tendencies also emerged, such as stopping of affricates in English but of fricatives in Turkish. By and large, children with disorder produced errors typical of younger children, and more frequently than their typical peers did. Notably, some unusual and inconsistent productions for Turkish were also reported, reflecting greater severity of disorder. Bernhardt et al. (2019) compared consonant accuracy in 60 Bulgarian children (30 with phonological delay and 30 typical) between the ages of 3;0 and 5;0. While the data supported the known order of acquisition, the atypical data showed lower accuracy overall (especially at 3;0 for onsets in unstressed syllables), specific delay in laterals (also corroborated in the atypical data in Greek
and Swedish discussed earlier), and a larger proportion and range of errors, including deletions and feature complexity (e.g., errors in both place and manner). Typical and atypical groups showed a steady increase in accuracy at 3;0, 4;0, and 5;0, but the typical children performed better word-initially at ages 4;0–5;0 (similarly to the Cypriot Greek typical children at 3;0). Interestingly, the Bulgarian atypical data showed marked delay in word-initial position at all ages, reminiscent of the constraints in standard and dialectal Greek phonological delay. In summary, comparisons of typically and atypically developing consonant systems in English, Italian, Swedish, Turkish, Greek, and Bulgarian strongly support the position that phonological impairment exhibits similarity to typical acquisition within a language, as well as some cross-linguistic similarity. Given the apparent ambient-language effect, there is a clear indication that the productions of children with speech impairment are shaped by abstract phonological representations, language specificity, and environmental effects, as well as by articulatory skill. These data provide additional support for the spectrum approach to speech sound disorders recently proposed by Ingram et al. (2018), whereby disorder is viewed within a model in which phonology and articulation interact along a spectrum on which both reside. Children's atypical speech productions within this spectrum may show a tendency toward phonological and/or articulatory patterns, or varying combinations of them, depending on where those productions fall, both on the spectrum and along the developmental trajectory of phonological and articulatory skill.
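The whole-word measures cited above lend themselves to a compact computation. The following is a minimal sketch in Python, assuming the commonly described scoring for Ingram's (2002) pMLU (one point per produced segment plus one point per correct consonant) and taking PWP as the ratio of child pMLU to target pMLU. The position-free consonant matching and the example word pairs are simplifications introduced here for illustration, not part of the published procedure.

from collections import Counter

VOWELS = set("aeiou")  # toy vowel set; real work needs full vowel/IPA handling

def consonants(segs):
    # keep only consonant segments
    return [s for s in segs if s not in VOWELS]

def whole_word_measures(pairs):
    """pairs: list of (child_segments, target_segments), one pair per word."""
    child_points = target_points = 0
    for child, target in pairs:
        # correct consonants counted by position-free multiset match (a simplification)
        match = sum((Counter(consonants(child)) & Counter(consonants(target))).values())
        child_points += len(child) + match  # 1 point per segment + 1 per correct consonant
        target_points += len(target) + len(consonants(target))
    n = len(pairs)
    pmlu_child = child_points / n
    pmlu_target = target_points / n
    return pmlu_child, pmlu_target, pmlu_child / pmlu_target  # third value = PWP

# Hypothetical two-word sample, e.g. /vivlio/ produced as [livio]:
sample = [(list("livio"), list("vivlio")), (list("pata"), list("spata"))]
print(whole_word_measures(sample))

On this toy sample the child pMLU is 6.5, the target pMLU 8.5, and the PWP about .76; real analyses would, of course, require aligned phonetic transcriptions of a representative word sample.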
29.4 Future Directions Cross-linguistic research is critical in the effort to understand phonological acquisition in typically developing children and in children with phonological impairments. It provides strong support for the view that children are capable of phonological organization at the time of the first words, and that difficulties in speech acquisition have a phonological component. Research into these questions, however, is still at a very early stage. Many aspects of the questions identified in Table 29.1, as well as other theoretical questions, have been little researched to date. Virtually no research exists that applies the whole-word measures discussed earlier to typical and atypical populations cross-linguistically. Another area in great need of more research is prosodic development: the development of syllable complexity across languages, and how timing, stress patterns, and intonation are acquired across languages and across children, in monolingual and bilingual contexts, and in populations developing typically or showing impairment. As the rudimentary comparisons of this chapter across languages and language acquisition contexts suggest, motor skill is subject to maturational constraints, but it is phonology, in terms of grammatical complexity and of prominence and frequency in the input and during use, that accounts for cross-linguistic discrepancies. The kinds of results reviewed here, however, show that cross-linguistic studies are vital for the advancement of our knowledge of phonological acquisition.
REFERENCES Babatsouli, E. (2015). Technologies for the study of speech: Review and an application. Themes in Science and Technology Education, 8(1), 17–32. Babatsouli, E. (2017). Bilingual development of theta in a child. Poznań Studies in Contemporary
Linguistics, 53(2), 157–195. https://doi.org/10.1515/psicl-2017-0007 Babatsouli, E. (2019). A phonological assessment test for child Greek. Clinical Linguistics & Phonetics, 33(7), 601–627. https://doi.org/10.1080/02699206.2019.1569164
Babatsouli, E. (2020). Enhanced phonology in a child's weaker language in bilingualism: A portrait. In E. Babatsouli & M. J. Ball (Eds.), An anthology of bilingual child phonology (pp. 117–139). Multilingual Matters. Babatsouli, E., & Geronikou, E. (2022). Phonological delay of segmental sequences in a Greek child's speech. Clinical Linguistics & Phonetics, 36(7), 642–656. https://doi.org/10.1080/02699206.2021.2001574 Babatsouli, E., Ingram, D., & Müller, N. (Eds.). (2017). Crosslinguistic encounters in language acquisition: Typical and atypical development. Multilingual Matters. Babatsouli, E., Ingram, D., & Sotiropoulos, D. (2014). Phonological word proximity in child speech development. Chaotic Modeling and Simulation, 3, 295–313. Baia, M. F., & Correia, S. (2017). Self-organisation in phonological development: Templates in Brazilian and European Portuguese. In E. Babatsouli (Ed.), Proceedings of the International Symposium on Monolingual and Bilingual Speech 2017 (pp. 45–51). ISBN: 978-618-82351-1-3. http://ismbs.eu/publications-2017 Bernhardt, B. M., Ignatova, D., Amoako, W., Aspinall, N., Marinova-Todd, S., Stemberger, J. P., & Yokota, K. (2019). Bulgarian consonant acquisition in preschoolers with typical versus protracted phonological development. Journal of Monolingual and Bilingual Speech, 1(2), 143–181. https://doi.org/10.1558/jmbs.v1i2.11801 Bernhardt, B. M., Stemberger, J. P., Bérubé, D., Ciocca, V., Freitas, M. J., Ignatova, D., Kogovšek, D., Lundeborg Hammarström, I., Másdóttir, T., Ozbič, M., Perez, D., … Ramalho, M. (2020). Identification of protracted phonological development across languages: The whole word match and basic mismatch measures. In E. Babatsouli & M. J. Ball (Eds.), An anthology of bilingual child phonology (pp. 274–308). Multilingual Matters. Bortolini, U., Ingram, D., & Dykstra, K. (1993). The acquisition of the feature [voice] in normal and phonologically delayed Italian children. Paper presented at the Symposium of Research in Child Language Disorders, University of Wisconsin-Madison, May 21–24. Bunta, F., Davidovich, I., & Ingram, D. (2006). The relationship between the phonological complexity of a bilingual child's words and those of the target languages. International Journal of Bilingualism, 10, 71–88. https://doi.org/10.1177/13670069060100010401 Bunta, F., Fabiano-Smith, L., Goldstein, B., & Ingram, D. (2009). Phonological whole-word measures in 3-year-old bilingual children and their age-matched monolingual peers.
Clinical Linguistics and Phonetics, 23(2), 156–175. https://doi.org/10.1080/02699200802603058 Cilibrasi, L., & Dunková, J. (2022). A longitudinal case-study on the development of consonant-vowel distribution in the babbling of a Czech-English infant. Journal of Monolingual and Bilingual Speech, 4(2), 127–144. https://doi.org/10.1558/jmbs.21123 De Boysson-Bardies, B., & Vihman, M. M. (1991). Adaptation to language: Evidence from babbling and first words. Language, 67(2), 297–319. Edwards, J., Beckman, M. E., & Munson, B. (2015). Frequency effects in phonological acquisition. Journal of Child Language, 42, 306–311. https://doi.org/10.1017/S0305000914000634 Freedman, S. E., & Barlow, J. A. (2011). Using whole-word production measures to determine the influence of phonotactic probability and neighborhood density on bilingual speech production. International Journal of Bilingualism, 16(4), 369–387. Ingram, D. (1976). Phonological disability in children. Edward Arnold. Ingram, D. (1981). Procedures for the phonological analysis of children's language. University Park Press. Ingram, D. (1988). The acquisition of word initial [v]. Language and Speech, 31(1), 77–85. Ingram, D. (1989). First language acquisition. Cambridge University Press. Ingram, D. (1999). Phonological acquisition. In M. Barrett (Ed.), The development of language (pp. 73–97). Psychology Press. Ingram, D. (2002). The measurement of whole word productions. Journal of Child Language, 29(4), 713–733. Ingram, D. (2011). Cross-linguistic phonological acquisition. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (2nd ed., pp. 626–640). Wiley-Blackwell. Ingram, D., Williams, L. A., & Scherer, N. (2018). Are speech sound disorders phonological or articulatory? A spectrum approach. In E. Babatsouli & D. Ingram (Eds.), Phonology in protolanguage and interlanguage (pp. 27–48). Equinox Publishing. Jakobson, R. (1968). Child language, aphasia, and phonological universals (A. R. Keiler, Trans.). Mouton. Originally published as Kindersprache, Aphasie, und allgemeine Lautgesetze. Almqvist and Wiksell, 1941. Kumar, R. B., & Bhat, S. J. (2009). Phonological mean length of utterance (pMLU) in Kannada-speaking children. Language in India, 9(8), 489–502.
Locke, J. L. (1983). Phonological acquisition and change. Academic Press. Macleod, A. N., & Pollock, K. E. (2020). Global perspectives in child phonology. International Journal of Speech-Language Pathology, 22(6), 611–613. https://doi.org/10.1080/17549507.2020.1844800 Magnusson, E. (1983). The phonology of language disordered children: Production, perception, and awareness. Liber. McLeod, S., & Crowe, K. (2018). Children's consonant acquisition in 27 languages: A cross-linguistic review. American Journal of Speech-Language Pathology, 27(4), 1546–1571. https://doi.org/10.1044/2018_AJSLP-17-0100 Pascoe, M., & Jeggo, Z. (2019). Speech acquisition in monolingual children acquiring isiZulu in rural KwaZulu-Natal, South Africa. Journal of Monolingual and Bilingual Speech, 1(1), 94–117. https://doi.org/10.1558/jmbs.11082 Pascoe, M., Mahura, O., Le Roux, J., Danvers, E., de Jager, A., Esterhuizen, C., Naidoo, N., Reynders, J., Senior, S., & van der Merwe, A. (2017). Speech development in 3-year-old children acquiring isiXhosa and English in South Africa. In E. Babatsouli, D. Ingram, & N. Müller (Eds.), Crosslinguistic encounters in language acquisition: Typical and atypical development (pp. 3–26). Multilingual Matters.
Petinou, K., & Okalidou, A. (2004). Speech patterns in Cypriot Greek late talkers. Applied Psycholinguistics, 27(3), 335–353. https://doi.org/10.1017/S0142716406060309 Pye, C., Ingram, D., & List, H. (1987). A comparison of initial consonant acquisition in English and Quiché. In K. E. Nelson & A. van Kleeck (Eds.), Children's language (pp. 175–190). Erlbaum. Saaristo-Helin, K., Savinainen-Makkonen, T., & Kunnari, S. (2006). The phonological mean length of utterance: Methodological challenges from a crosslinguistic perspective. Journal of Child Language, 33(1), 179–190. Stemberger, J. P., & Bernhardt, B. M. (2018). Tap and trill clusters in typical and protracted phonological development: Challenging segments in complex phonological environments. Introduction to the special issue. Clinical Linguistics & Phonetics, 32(5–6), 411–423. https://doi.org/10.1080/02699206.2017.1370019 Topbaş, S. (2006). Does the speech of Turkish-speaking phonologically disordered children differ from that of children speaking other languages? Clinical Linguistics & Phonetics, 20(7–8), 509–522. https://doi.org/10.1080/02699200500266331
30 Cross-linguistic Aspects of System and Structure in Clinical Phonology MEHMET YAVAŞ AND MARGARET KEHOE 30.1 Introduction It is a widespread assumption that all languages have a similar level of complexity (Akmajian et al., 2010; McMahon, 1994; O'Grady et al., 2017), and that it is futile to attempt to rank languages in terms of simplicity/complexity. The argument for this assumption is that if a language is simple in its sound system, it will force elaboration in other areas (morphology, syntax, etc.). While this may be valid in general, it should not stop us from studying the same language component (in our case, phonology) cross-linguistically and from discussing the potential outcomes of different degrees of complexity for phonological development and disorders. A simple comparison, even by a non-specialist, of the phonological inventory of !Xu (Snyman, 1969; Khoisan family, spoken in Namibia and Botswana), with 143 phonemes (47 non-clicks, 48 clicks, and 48 vowels and diphthongs), with that of Rotokas (Firchow & Firchow, 1969; East Papuan family, spoken in New Guinea), with 11 phonemes (6 consonants and 5 vowels), may be sufficient to conclude that, given the huge discrepancy in inventory sizes, the former language has greater phonological complexity than the latter. This chapter will focus on cross-linguistic aspects of system and structure in phonology, in both typical acquisition and speech disorders. We will first discuss several metrics for evaluating phonological complexity and then point out the potential effects of different degrees of complexity on phonological acquisition and disordered phonology.
30.2 Metrics of Phonological Complexity The size of the phoneme inventory (the number of distinct phonemes in the language's phonemic inventory) is the most basic metric proposed for measuring phonological complexity (Nettle, 1995). Historically, several studies have counted both the number of vowels and the number of consonants (Maddieson & Disner, 1984). Increases in phonemic inventory size are hypothesized to correlate negatively with word length measured in phonemes (Moran & Blasi, 2014). Although inventory size is relatively easy to compute, it ignores the
importance of syllable structure and the language's phonotactics. Also, studies of the correlation between the size of vowel and consonant inventories have shown contradictory results. An approach with greater appeal than the size of the phoneme inventory is one that takes into account the markedness of individual phonemes. Lindblom and Maddieson (1988) deal with the markedness of a consonant inventory by proposing the following three-category classification, which is based on articulatory complexity and is reminiscent of Trubetzkoy (1931).
Set I: Basic articulations: [p b t d k g ʔ ʧ f s h m n ŋ w l r j].
Set II: Elaborated articulations: any action of the larynx other than simple voicing and voicelessness (e.g. breathy voice, aspirated, ejectives, implosives); any superposition or sequencing of different oro-nasal articulatory configurations (e.g. prenasalized, nasal/lateral release, secondary articulations); articulations which require two sets of oral articulatory gestures for their production (e.g. clicks, double articulations); and articulations which depart from a default mode of phonation (e.g. voiceless sonorants). In terms of place of articulation, configurations representing departures from the near-rest positions of the lips, tongue tip, and tongue body (e.g. labio-dentals, palato-alveolars, retroflexes, uvulars, pharyngeals) are also included.
Set III: Complex articulations: combinations of at least two elaborated articulations.
Thus: /k/ (Basic), /kʷ/ (Elaborated), and /kʷʼ/ (Complex). This classification can be summed up as stating that small systems tend to exhibit Set I (Basic) "unmarked" phonetics whereas large systems have "marked" phonetics. As the systems get "larger and larger we observe that set II elements are brought into play approximately at the point where the set I elements reach their level of saturation. At still further size increases the set III segments come in" (Lindblom & Maddieson, 1988: 70). A somewhat different markedness position, taken by McWhorter (2001) on the basis of cross-linguistic distribution, states that the presence of marked phonemes implies the presence of their unmarked counterparts. Several studies on acquired deficits of speech production show that phonetic complexity, based on markedness criteria, influences both accuracy and the probability of error outcomes (Buchwald, 2009; Romani & Galluzzi, 2005). Considerations beyond individual segments lead us to investigate syllabic structure and to count the number of licit syllables, with the idea that syllabic complexity could give a better picture of phonological complexity. Shosted (2006) attempts such a characterization by looking at the number of consonants, vowels, diphthongs, tones, and syllable types within a language, and by calculating the potential number of syllables, with examples from 32 languages. He finds that while the size of the inventory (the number of consonants, vowels, and diphthongs) contributes to complexity, the impact of syllable types is much greater. This can be demonstrated with the following comparison between Slave(y) (Na-Dene family, spoken in northeast Alberta and northwest British Columbia) and Yurok (Algic family, spoken in Northwestern California).
While the inventory size of the former is greater (34 consonants and 16 vowels) than that of the latter (29 consonants and 11 vowels), Slave(y) has 4 syllable types, resulting in 4,480 potential syllables, while Yurok has 13 syllable types, resulting in 48,840 potential syllables. Although these measures are improvements over the sheer size and/or markedness of the inventory, they do not cover the phonotactic restrictions on the syllable. These include specific combinations in consonant clusters and coda constraints. Typically, clusters comprise only a subset of the possible pairings of consonants, and not all consonants that occur in onset position can also occur in coda position. Thus, the number of potential syllables can be very different from the number of actually occurring syllables.
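To make the combinatorics concrete, here is a toy computation in the spirit of Shosted's (2006) metric, deliberately ignoring phonotactics as noted above. The syllable templates below are invented for illustration, so the totals will not reproduce the published Slave(y) and Yurok figures; only the inventory sizes follow the comparison just given.

def potential_syllables(n_consonants, n_vowels, templates):
    """Count template-based syllable combinations, ignoring phonotactics."""
    total = 0
    for template in templates:
        count = 1
        for slot in template:
            count *= n_consonants if slot == "C" else n_vowels
        total += count
    return total

# 34 C / 16 V with few templates vs. 29 C / 11 V with many templates:
print(potential_syllables(34, 16, ["V", "CV", "CVC", "VC"]))
print(potential_syllables(29, 11, ["V", "CV", "CVC", "VC", "CCV", "CCVC",
                                   "CVCC", "VCC", "CCVCC", "CVV", "CCVV",
                                   "VCCC", "CVCCC"]))

Even with the larger segment inventory in the first call, the second call yields far more potential syllables, illustrating why syllable-type inventory dominates the count.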
Several other issues might contribute differently to the phonological complexity of syllables. One of these is the frequency of open syllables in the lexicon of a language. For this purpose, we can compare Comanche (Uto-Aztecan family, spoken in Oklahoma) with Yupik (Eskimo-Aleut family, spoken in Alaska). While Comanche has a higher number of syllables than Yupik (17,485 vs. 7,973), it also has a higher percentage of open syllables (85.7% vs. 52.7%). Which creates more complexity: the number of syllables or the number of non-open syllables? A syllable with a single consonant followed by a vowel (CV) is considered the simplest (most unmarked) syllable; any deviation from this, with a complex onset (CCV), a coda (CVC), or the lack of an onset (V), can be considered to add a degree of complexity (markedness) to the basic CV template. However, the relative complexity of a syllable goes beyond the above. There are well-established principles emanating from the relative sonority of segments that can distinguish syllables with seemingly similar structures. According to the Sonority Sequencing Principle (henceforth SSP; Jespersen, 1922; Sievers, 1901), the segment constituting the peak (nucleus) of a syllable is preceded and/or followed by a sequence of segments with progressively decreasing sonority values. Thus, a CCVC word that follows the SSP, such as [prɪn], is more frequently found in languages (i.e. less marked) than [rpɪn], which has the same segments but the order in the onset reversed and thus violates the SSP. As such, cluster patterns in which there is an increasing sonority slope toward the nucleus (e.g. /pl/) are preferred to sonority plateaus (e.g. /pk/) or sonority reversals (e.g. /lp/; Parker, 2011).1 Additionally, the Sonority Dispersion Principle (henceforth SDP; Clements, 1990) states that "The sonority profile of the optimal syllable type rises maximally and steadily from the beginning to the peak and falls minimally to the end." This principle assumes that greater sonority differences are preferred between the members of a complex onset, but also between a consonant in the onset and the following vowel. Thus, two complex onset candidates, /pl/ and /pn/, need to be evaluated differently, the former being more unmarked than the latter. Although both clusters obey the SSP with a rise in sonority, the rise is steeper in the former than in the latter, a fact supported by the frequencies of cluster types attested in the world's languages. It is also hypothesized that whereas greater differences in sonority are preferred between segments in the onset and the nucleus, smaller differences in sonority are preferred between the nucleus and the segments in the coda. Thus, a CV such as /pa/ is better (more unmarked) than /ra/ because of the maximization of the sonority slope in the onset. On the other hand, in a VC sequence, /ar/ will be considered better (more unmarked) than /ap/, because the former provides a gentler decline via a smaller difference in sonority. This is also borne out by the fact that if a language allows a single obstruent coda, then it always allows a single sonorant coda, but the reverse is not necessarily the case.
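Both principles lend themselves to a simple operationalization over a numeric sonority scale. The sketch below is illustrative only: the integer scale is a hypothetical assignment (published sonority scales differ in granularity), but the comparisons mirror the /pl/ versus /lp/ and /pl/ versus /pn/ cases just discussed.

# Hypothetical sonority scale: obstruents low, sonorants higher, vowels highest
SONORITY = {"p": 1, "t": 1, "k": 1, "b": 2, "d": 2, "g": 2,
            "f": 3, "s": 3, "m": 4, "n": 4, "l": 5, "r": 6,
            "j": 7, "w": 7, "i": 8, "u": 8, "a": 9}

def obeys_ssp(onset):
    """True if sonority rises strictly through the onset toward the nucleus."""
    values = [SONORITY[c] for c in onset]
    return all(a < b for a, b in zip(values, values[1:]))

def onset_dispersion(onset, vowel="a"):
    """SDP preference: larger minimum sonority rise means a better onset profile."""
    values = [SONORITY[c] for c in onset] + [SONORITY[vowel]]
    return min(b - a for a, b in zip(values, values[1:]))

print(obeys_ssp("pl"), obeys_ssp("lp"))                  # True False (SSP)
print(onset_dispersion("pl") > onset_dispersion("pn"))   # True (SDP: /pl/ preferred)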
Finally, we need to talk about frequency. The possible sounds and sequences are not all equiprobable; some are more frequent than others. Frequency refers to the rate of occurrence of a phonological unit. Two types of frequency are generally discussed: "type" and "token" frequency. Type frequency refers to the incidence of a specific sound (e.g. /ð/) or sequence of sounds (e.g. /tr/) in unique words in the lexicon, whereas token frequency refers to the raw number of exposures to the segment/sequence in a given word position. For example, an initial /st/ sequence has a high type frequency in English, whereas /ʒ/ and the onset cluster /fj/ rank low. Sometimes there is a clash between the two frequencies. For example, word-initial /ð/ has low type frequency (it is not found in many different words) but high token frequency (it is found in frequently used grammatical morphemes such as the, that, this). There are two other frequency-related concepts that may be instrumental in explaining the order of acquisition of phonemes. The first is "input frequency," which refers to the frequency of occurrence of a phonological structure in the input, be that child- or adult-directed speech. One of the frequently cited examples in the literature – the early acquisition of /d/ in English versus its later acquisition in Finnish (Macken, 1995) – is explained by its high
frequency in English as opposed to its lower frequency in Finnish. Sometimes the frequency of sounds and structures differs in child- versus adult-directed input. Tsurutani (2007) accounts for the earlier acquisition of the alveopalatals /ʧ/ and /ʃ/ before the alveolar /s/ in Japanese by the high frequency of alveopalatals in the speech of Japanese mothers to their children. In contrast, /s/ is more frequent than /ʧ/ and /ʃ/ in adult-directed speech. The second frequency-related concept is "functional load," which refers to the relative importance of each phoneme within a specific phonological system. Pye et al. (1987) calculate the functional load of a phoneme by its frequency of occurrence in oppositions or minimal pairs (e.g. they calculate the functional load of word-initial /d/ from minimal pairs such as bait-date, den-then, etc.). Pye et al. (1987) concluded that functional load explained the earlier acquisition of /ʧ/ by Quiché-speaking (Mayan family, spoken in Guatemala) compared to English-speaking children. Supporting evidence for significant positive correlations between functional load and order of acquisition is also found in Stokes and Surendran (2005) and Cataño et al. (2009). The influence of frequency has been observed in many speech production studies: high-frequency sound sequences result in faster picture-naming latencies (Vitevich et al., 2004) and have positive effects on repetition accuracy in impaired (aphasic) speakers (Lallini & Miller, 2011). Further evidence comes from experimentally induced speech errors (Goldrick & Rapp, 2007). As we have seen, phonological complexity is a rather elusive notion that may relate to many different parameters, such as the size of the phoneme inventory, the markedness of individual segments, the number of syllable types, syllable structure complexity, phonotactic restrictions, and sonority. These different parameters, along with frequency, may have consequences for phonological acquisition, leading to differences in age and order of acquisition and in types of error patterns across languages. Needless to say, the patterns found in typical development are of great value in the clinical setting because they may guide clinicians in the treatment of phonological disorders.
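The frequency notions above are straightforward to operationalize. The sketch below computes type frequency, token frequency, and a Pye et al. (1987)-style minimal-pair tally over an invented mini-lexicon; the word list, token counts, and one-symbol-per-segment transcriptions are hypothetical simplifications, and a real analysis would use phonemically transcribed corpus data.

from itertools import combinations

LEXICON = {"the": 5000, "that": 1200, "this": 900, "thumb": 3,
           "den": 40, "ten": 80, "pen": 60}            # word: corpus token count
PHONEMIC = {"the": "ðə", "that": "ðæt", "this": "ðɪs", "thumb": "θʌm",
            "den": "dɛn", "ten": "tɛn", "pen": "pɛn"}  # toy transcriptions

def type_frequency(phoneme):
    """Number of distinct lexicon words containing the phoneme."""
    return sum(1 for w in LEXICON if phoneme in PHONEMIC[w])

def token_frequency(phoneme):
    """Total corpus exposures to words containing the phoneme."""
    return sum(n for w, n in LEXICON.items() if phoneme in PHONEMIC[w])

def minimal_pairs(phoneme):
    """Word pairs differing in exactly one segment, where that segment involves the phoneme."""
    pairs = []
    for w1, w2 in combinations(PHONEMIC, 2):
        f1, f2 = PHONEMIC[w1], PHONEMIC[w2]
        if len(f1) == len(f2):
            diffs = [(a, b) for a, b in zip(f1, f2) if a != b]
            if len(diffs) == 1 and phoneme in diffs[0]:
                pairs.append((w1, w2))
    return pairs

print(type_frequency("ð"), token_frequency("ð"))  # low type, high token frequency
print(minimal_pairs("d"))                         # e.g. den-ten, den-pen

On this toy data, /ð/ shows exactly the clash described above (few word types, many tokens), and the minimal-pair count for /d/ gives a rough functional-load figure of the kind Pye et al. (1987) used.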
30.3 Findings on Cross-linguistic Differences in Systems and Structure In this section, we focus in greater detail on how differences in phonological complexity across languages influence the development of sound systems and syllable structure. We will see that, apart from phonological complexity, other factors come into play, such as frequency, functional load, and prosodic salience.
30.3.1 Phonetic Inventory Size First, we consider phonetic inventory size. Is it harder for children to acquire sounds from large versus small consonant inventories? The World Atlas of Language Structures (WALS) Online (Dryer & Haspelmath, 2013) classifies languages according to consonant inventory size. Arabic has a moderately large consonant inventory whereas English has an average-sized one. Finnish and Japanese have moderately small consonant inventories. Amayreh and Dyson (1998), in a normative study of consonant acquisition in Jordanian Arabic, found that most consonants were acquired at similar ages in Arabic as in English. There were a few consonants acquired earlier in Arabic than in English and a few acquired later. Similarly, Ota and Ueda (2006), summarizing studies on consonant acquisition in Japanese, did not report major differences in the age of acquisition of Japanese consonants in comparison to the age of acquisition of the same consonants in English (see norms provided
by Smit et al., 1990, Tables 5 and 7). McLeod and Crowe (2018) provide the largest analysis of consonant acquisition to date, describing consonant acquisition across 27 languages, some of which contain moderately large (e.g., Arabic, Hungarian), average (e.g., English, Spanish, Korean), and moderately small (e.g., Japanese, Finnish) sized inventories. Despite the varied size of the consonant inventories, many sounds were acquired at similar age levels across languages, although there were some exceptions. For example, [v] had an age of acquisition ranging from 20 to 102 months across 14 different languages. Already some time ago, Ingram (1988) explained the earlier acquisition of /v/ in Swedish and Estonian, as opposed to its later acquisition in English, by its high frequency in the former languages. Overall, this set of findings suggests that a consonant may be acquired at a similar age regardless of whether it belongs to a small or a large inventory. Nevertheless, Kunnari (2003) shows that Finnish children have smaller consonant inventories than those reported for English-speaking children at the same developmental level (Robb & Bleile, 1994; Stoel-Gammon, 1985), indicating that the development of inventory size may be commensurate with the size of the target inventory.
30.3.2 Marked Consonants Rather than simply focusing on inventory size, the markedness of consonants within the phonetic inventory offers another dimension in which to examine cross-linguistic variation. Sounds that have elaborated or complex articulations are more difficult to acquire than sounds with basic articulations. Arabic, for example, has a series of consonants, referred to as emphatic; they are produced with a secondary articulation in which the tongue root is retracted into the pharynx. These sounds are marked from an articulatory point of view; however, other factors, such as their variable surface forms and their low frequency in standard speech, may contribute to their late acquisition. Amayreh (2003) provides data on Jordanian Arabic to show that some may be acquired by children aged 7;4, but others later than 8;4. In this sense, the acquisition of emphatic consonants resembles that of consonants with secondary articulations in other languages, for example, /kʰw/ and /tsʰ/ in Cantonese (So & Dodd, 1995) and /kp/ and /ŋʷ/ in Igbo (Nwokah, 1986), which have also been found to be late acquired. A more recent study on emphatic consonants in Ammani Arabic indicates earlier acquisition: the majority of emphatics were acquired by the age of five years (Mashaqba et al., 2022). Nevertheless, emphatic consonants were still acquired well after non-emphatic consonants. Bader (2009) reports that language-disordered children often de-emphasize emphatic consonants, replacing them with non-emphatic cognates. Emphatic consonants and other consonants with secondary or complex articulations do not occur in all languages. In contrast, liquids (laterals and rhotics) occur frequently in the world's languages and contain segments that differ according to their relative markedness. These different degrees of markedness may lead to varied timelines of acquisition across languages. In the case of rhotics, for example, there is some indication that uvular /r/ is acquired earlier than alveolar taps and trills (Kehoe, 2018; Rūķe-Draviņa, 1965). The alveolar trill is complex from an articulatory point of view, requiring the right balance of tongue position, timing, and aerodynamic force (Solé, 2002). In Table 30.1, we present the percent production of target /r/ in word-initial position in German-, French-, and Spanish-speaking monolinguals. The data presentation includes some of our own published findings (or reanalyses of these data) as well as findings cited in the literature. In French, the /r/ is a uvular fricative, and in German, a uvular approximant. In Spanish, the /r/ is an alveolar trill in word-initial singleton onsets. In comparison 1, the children are aged around 2;4 to 2;6 on average, whereas in comparison 2, they are 3;0 years. In all comparisons, we see a clear difference between the acquisition of uvular and alveolar /r/, with earlier acquisition of uvular /r/ as determined by higher percent accuracy. At 3;0, we
do not have equivalent data for German as we do for French and Spanish; however, Fox (2000) indicates that 90% of German-speaking children aged 2;6–2;11 produce /r/ singletons correctly, suggesting more similar results between German and French than between German and Spanish. Stemberger and Bernhardt (2018), in a special issue of Clinical Linguistics & Phonetics, compare the acquisition of taps and trills in singleton position and in clusters across a variety of languages, including in speech-disordered children. They report that the most prevalent substitutions for taps and trills were [l], [j], other rhotics, [d], and [ð]. Nevertheless, they observed that languages and clinical groups (typically developing vs. disordered) differed according to which was the most common sound substitution. For example, [j] was the most common and [l] the next most common substitution in Hungarian phonologically disordered children, whereas the reverse was true of Hungarian typically developing children. Bernhardt and Stemberger (2018) also note that there is a bias for the substitute sound to be present in the adult inventory. Thus, they observed that the most common substitute for the singleton trill in Icelandic was [ð], a sound also present in the phonemic inventory of Icelandic. It was not a frequent substitute in the other languages (except for Spanish), /ð/ not being in the adult inventories of the other languages. Rose and Penney (2021) make a similar observation for the sound substitutions of uvular /r/. They observe that [h] substitutions are frequent for uvular /r/ in Dutch and German but are hardly present for uvular /r/ in French and Portuguese. Importantly, /h/ exists in the phonemic inventories of Dutch and German but not in the inventories of French and Portuguese. Bernhardt et al. (2015) observed a similar pattern in the fricative production of children with protracted phonological development: German-speaking children substituted fricatives with a higher proportion of dorsal sounds (e.g., [ç]-[x], /ʁ/) than English-speaking children, reflecting the German phonetic inventory.

Table 30.1 Comparison of alveolar and uvular /r/ production in German-, Spanish-, and French-speaking children.

Comparison 1: Production of /r/ onsets in children aged 2;4–2;6
Language | Authors | Participant information | % mean target /r/ production
German | Kehoe (2018)a | 5 children, aged 2;3–2;9 | 79%
Spanish | Kehoe (2018)b | 3 children, aged 2;6 | 11%
French | Kehoe et al. (2008)c | 15 children, mean age of 2;4 | 72%
French | Kehoe and Havy (2019)d | 16 children, aged 2;6 | 67%

Comparison 2: Production of /r/ onsets in children aged 3;0
Language | Authors | Participant information | % mean target /r/ production
Granada Spanish | Perez et al. (2018)e | (approx.) 10 children, aged 3;0 | 44%
Chilean Spanish | Perez et al. (2018)e | the same | 10%
French | Kehoe and Girardier (2020)f | 8 children, aged 2;11–3;10 | 86%

a Data in Supplementary Materials (Appendix A) from Kehoe (2018)
b Data in Supplementary Materials (Appendix C) from Kehoe (2018)
c Reanalysis of data from the corpus of Kehoe et al. (2008)
d Reanalysis of data from the corpus of Kehoe and Havy (2019)
e Data in Table 7 (p. 11) for LProm from Perez et al. (2018)
f Reanalysis of data from the corpus of Kehoe and Girardier (2020)
Turning to /l/, several authors have noted cross-linguistic differences in its acquisition patterns. /l/ is considered a late-acquired sound in English and is often substituted by early-acquired glides (Barlow et al., 2013; Smit, 1993). In contrast, /l/ is an early-acquired sound in Spanish (Cataño et al., 2009) and is often the substitute for late-acquired taps and trills (Barlow et al., 2013). Cataño et al. (2009) argue that /l/ has a greater functional load in Spanish than in English because it occurs in many articles and clitic pronouns (e.g., el, la, los, las, le, lo, les). Barlow et al. (2013) also refer to the allophonic properties of /l/. The English /l/ has two allophones: clear /l/, which occurs in prevocalic position, and dark (or velarized) /l/, which occurs in post-vocalic position. Clear /l/ is produced with the tongue body toward the front of the oral cavity, whereas dark /l/ is produced with the tongue body toward the back. Barlow et al. (2013) note, however, that even the clear /l/ variant in English is darker than in Spanish. Thus, the different phonetic properties of Spanish versus English /l/, as well as the fact that English has allophonic variants, may explain the later acquisition of /l/ in English. In this respect, European or Brazilian Portuguese offers a good test case since it has a velarized /l/; we might also predict late acquisition of /l/ in Portuguese. Table 30.2 compares the age of acquisition (90% criterion) for word-initial singleton /l/ across French, Spanish, German, English, and (Brazilian) Portuguese.

Table 30.2 Comparison of age of acquisition for word-initial /l/ across languages.

Language | Authors | Age at 90% criterion
French | MacLeod et al. (2011)a | 3;6–4;0
Spanish | McLeod and Crowe (2018)b | 3;0–3;11
German | Fox (2000)c | 2;6–2;11
English | McLeod and Crowe (2018)d | 4;0–4;11
English | Smit et al. (1990)e | 5;0–6;0
Brazilian Portuguese | Ceron et al. (2022)f | 3;6

a Data cited in Table 5 (p. 101) from MacLeod et al. (2011)
b Findings from four studies summarized on p. 1560 by McLeod and Crowe (2018)
c Data cited in Table 2.5 (p. 41) from Fox (2000)
d Findings from 15 studies summarized on p. 1558 by McLeod and Crowe (2018)
e Findings in Table 7 of Smit et al. (1990)
f Data cited in Figure 1 from Ceron et al. (2022)

Overall, the findings support the early acquisition of /l/ in all languages except English. The earlier acquisition of /l/ in French compared to English may be related to the greater token frequency of /l/ in French, in association with the fact that /l/ appears in the definite article (e.g., le, la, l', les), consistent with /l/ having a high functional load in French, as has already been proposed for Spanish (Cataño et al., 2009). Interestingly, /l/ was not found to be acquired particularly late in Brazilian Portuguese (see Barlow et al., 2013, for similar observations), although it has been reported to be late in European Portuguese (Bernhardt & Stemberger, 2018; Ramalho & Freitas, 2018). Overall, the late acquisition of /l/ in English may result from multiple factors, including reduced functional load, the phonetic properties of /l/, and the presence of allophonic variants (Barlow et al., 2013). When examining phonologically disordered children, liquid simplification and gliding involving /l/ are frequently reported in English (Stoel-Gammon & Dunn, 1985), whereas only occasional errors in /l/ production have been reported in German (Fox, 2000), consistent with findings in typically developing children. Although in many instances there is an overlap between the unmarked and the frequent, and thus early-learned, phonological units, there are obvious clashes between the two.
For example, while sonorant consonants are less marked in coda position than obstruents, alveolar stops, which are the most frequent coda consonants in English, are the first-acquired codas in English (Stites et al., 2004). This is interpreted as showing robust frequency effects over markedness tendencies. Another example of frequency overriding markedness can be seen in cross-linguistic comparisons of /s/ and /ʃ/ acquisition. Fronting of /ʃ/ to [s] (e.g. "fish" → [fɪs]) is very common in English (Li & Edwards, 2006), whereas we observe the reverse in Japanese: /s/ is produced as [ʃ]. This pattern may be explained by input frequency, whereby the occurrence of /ʃ/ is higher than that of /s/ in child-directed speech in Japanese (Tsurutani, 2007). To further this point, we can cite the tendency of /s/ to be produced as [ʂ], that is, to change from an alveolar fricative to a post-alveolar retroflex, in Putonghua (Baum & McNutt, 1990). While the case of fronting in English (/ʃ/ → [s]) can be accounted for by markedness, because /s/ is the most unmarked fricative, the Putonghua example displays two counter-tendencies: (a) a change to a back articulation (i.e., more marked), and (b) a change from a Set I (basic) articulation /s/ to a more marked Set II (elaborated) articulation [ʂ] (Lindblom & Maddieson, 1988). The rationale offered for this pattern is that in English /s/ occurs in six times as many words as /ʃ/, whereas in Putonghua /s/ occurs in only a third as many words as /ʂ/ (Edwards & Beckman, 2008). However, when there are no such disparities in frequency, /s/ remains a more unmarked sound than /ʂ/ (in Polish, /s/ is acquired earlier than /ʂ/; M. Marecka, personal communication).
30.3.3 Syllable Structure Many studies have compared children's development of codas and clusters across languages, finding different timelines of acquisition that may relate to complexity, frequency, and other factors. We consider findings on codas and on obstruent-liquid and /s/C clusters.
30.3.3.1 Codas Differences in coda development across languages have often been related to the frequency or complexity of codas in the input. For example, syllables with codas are more frequent in German and English in comparison to Spanish (Keffala et al., 2018; Lleó et al., 2003). They are also more complex in German and English, consisting of all manners and places of articulation and including coda clusters. Spanish, in contrast, has only coronal codas and does not have coda clusters. In order to make a cross-linguistic comparison of coda production, we have collected sources that report coda production (i.e., coda presence) in word-final stressed syllables of familiar words by children around 2;0 (comparison 1) and three years and older (comparison 2). As can be seen in Table 30.3, in both age ranges there are differences between German-, English-, and French-speaking children on the one hand and Spanish-speaking children on the other. The findings on German and Spanish are based on small numbers of participants; however, a large-scale study by Fox (2000) confirms that final codas are acquired early by German-speaking children (i.e., before 2;6 years), and several case studies support the slow acquisition of final consonants by Spanish-speaking children (Núñez-Cedeño, 2007; Roark & Demuth, 2000). Given that codas are not frequent in either Spanish or French, yet are acquired more easily by French- than by Spanish-speaking children (Demuth, 2007; Kehoe, 2021), factors apart from frequency, such as complexity, seem to explain the differences between Spanish and French. French codas are more complex than Spanish ones: they consist of all manners and places of articulation, and there are also coda clusters. Another factor is prosody. Word-final codas appear in the accented word-final syllable in French, thus making them particularly salient. Brosseau-Lapré and Rvachew (2014) indicate low rates of word-final consonant deletion in phonologically disordered French-speaking children aged 4;0 to 5;11, suggesting similar findings in typical and disordered acquisition.
Table 30.3 Comparison of final consonant production in German-, English-, Spanish-, and French-speaking children.

Comparison 1: Final consonant production at the age of 2;0
Language | Authors | Participant information | % final consonant production
German | Lleó et al. (2003)a | 3 children, aged 2;0 | 97%
English | Kehoe and Stoel-Gammon (2001)b | 8 children, aged 2;0 | 81%
Spanish | Lleó et al. (2003)c | 3 children, aged 2;0 | 16%
Spanish | Polo (2018)d | 2 children, aged 2;0 | 50%
French | Hilaire-Debove and Kehoe (2004)e | 6 children, aged approximately 1;10–2;1 | 90%

Comparison 2: Final consonant production at the age of 3;0
Language | Authors | Participant information | % final consonant production
English | Keffala et al. (2018)f | 12 children, aged 2;5–4;10 | 92%
Spanish | Keffala et al. (2018)f | 5 children, aged 2;4–4;2 | 63%
French | Kehoe (2021)g | 19 children, aged 2;11–3;11 | 99%

a Data in Table 1b (p. 203) from Lleó et al. (2003)
b Data in Table 3 (p. 413) from Kehoe and Stoel-Gammon (2001)
c Data in Table 1a (p. 202) from Lleó et al. (2003)
d Data in Table 2 (p. 7) from Polo (2018)
e Reanalysis of data from the corpus of Hilaire-Debove and Kehoe (2004)
f Data in Table 5 from Keffala et al. (2018)
g Data in Table 5 (p. 520) from Kehoe (2021)
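The coda-presence measure reported in Table 30.3 is simple to compute. The following sketch assumes a simplified criterion (the produced form counts as having a coda if it ends in a consonant whenever the target form does), without segmental alignment; the vowel set and the word pairs are hypothetical.

VOWELS = set("aeiou")  # toy vowel set for illustration

def ends_in_coda(form):
    return form[-1] not in VOWELS

def percent_final_consonants(pairs):
    """pairs: (child_form, target_form); only targets with final codas are scored."""
    targets = [(c, t) for c, t in pairs if ends_in_coda(t)]
    produced = sum(1 for c, t in targets if ends_in_coda(c))
    return 100 * produced / len(targets)

sample = [("mi", "mis"), ("sol", "sol"), ("pe", "pes")]  # hypothetical Spanish-like data
print(percent_final_consonants(sample))  # 33.3: one of three target codas produced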
30.3.3.2 Clusters Obstruent-liquid clusters. Cross-linguistic findings on obstruent-liquid clusters have revealed different patterns of acquisition, which appear to be tied to the phonetic characteristics of the liquid segment. Thus, for example, Kehoe (2018) reported earlier acquisition of C/r/ clusters in German than in Spanish, possibly due to the greater articulatory complexity of the Spanish compared to the German /r/. As mentioned above, the Spanish /r/ is an alveolar trill whereas the German /r/ is a uvular approximant. The German children produced C/r/ clusters with 90% accuracy between two and three years, whereas the Spanish children realized C/r/ clusters with less than 20% accuracy during the same age period. Stemberger and Bernhardt (2018), in their cross-linguistic comparison of tap and trill clusters, report results for Icelandic, Hungarian, and Bulgarian comparable to those for Spanish, namely, low accuracy rates. Similar trends were observed in typical and disordered acquisition. In terms of preservation patterns, most of the time children reduce obstruent-liquid clusters by preserving the least sonorous element. This pattern has been attested across a variety of languages (Goad & Rose, 2004; Pater & Barlow, 2003). Stemberger and Chávez-Peón (2015), however, observed a different pattern in Valley Zapotec, a language spoken in Mexico.
Valley Zapotec is characterized by a large proportion of equal-sonority (e.g., bd, mn) and falling-sonority (e.g., nd, st, wɾ) clusters and only a small proportion of rising-sonority ones. Stemberger and Chávez-Peón (2015) report that, most of the time, children reduce clusters to the second element. In the specific case of obstruent-liquid clusters, children preserved either the first or the second element, which seems to suggest that the distribution of clusters in the input (i.e., the higher weighting of equal- and falling-sonority clusters in comparison to rising-sonority ones), rather than the single factor of sonority, influenced the children's preservation patterns.
30.3.3.3 /s/C clusters In a series of studies, Yavaş (2013, 2014) examined /s/C cluster production in typically developing and phonologically disordered children across six languages. He grouped the languages into Germanic (English, Dutch, and Norwegian) and non-Germanic (Hebrew, Croatian, and Polish). Yavaş (2013, 2014) noted that, rather than sonority distance, the distinction [+/- continuant] could best explain the accuracy findings for /s/C clusters in the Germanic languages: /s/[+continuant] clusters (e.g., /sl, sw, sʋ/) were produced with higher percent accuracy than /s/[-continuant] clusters (e.g., /sT/, /sN/). In contrast, neither sonority distance nor continuancy explained the findings in the non-Germanic languages: /s/[-continuant] clusters were produced with higher percent accuracy than /s/[+continuant] clusters. In exploring the reasons for these different findings, Yavaş (2014) points out that the non-Germanic languages have richer consonant inventories, characterized by higher proportions of equal- and falling-sonority clusters, as well as rising-sonority clusters with small sonority differences, than the Germanic languages. Children raised in these languages have more experience producing marked clusters, which may influence their patterns with the /s/C clusters. Similar to the findings for obstruent-liquid clusters in Valley Zapotec, the overall distribution of marked and unmarked clusters in the individual languages appears to influence the children's realization of clusters. There are also reports, however, that reveal conflicting results for the same targets in the same language. Studying the acquisition of /s/C clusters in 40 typically developing English-speaking children, Yavaş and Core (2006) report that /st/, which is the most frequent /s/C cluster in English, has the highest correct realization, and /sn/, which is the least frequent, has the lowest percentage correct. While this study highlights the influence of frequency, Yavaş and McLeod (2010), looking at data from 30 English-speaking phonologically disordered children, come to the opposite conclusion, namely, that markedness based on the sonority distance between /s/ and C2 is a better explanation of the patterns than frequency: low-frequency /s/+nasal clusters had greater accuracy than the most frequent but negative-sonority /st/. Thus, it seems that factors such as frequency or markedness may influence typically developing and phonologically disordered children differently. When looking at consonant preservation patterns in /s/C cluster reduction, a rather consistent picture is observed: if C2 is [-continuant] (i.e., a stop or nasal), then the retained consonant is C2; when C2 is [+continuant] (i.e., a fricative or approximant), the tendency is to retain C1 (i.e., /s/). This pattern has been observed in the Germanic languages (English, Dutch, Norwegian) and also in the non-Germanic ones (Hebrew, Croatian, Polish) (Yavaş, 2014). Yavaş (2013) also observed differences across languages. In the case of /sl/ clusters, preference for C1, the least sonorous consonant, was the most common pattern. However, this preference was absolute in Croatian and Norwegian (i.e., 100% vs. 0%), extremely high in English (85% vs. 15%), and lower in Hebrew (i.e., 68% vs. 32%) and Dutch (i.e., 63% vs. 37%) for typically developing children. Yavaş (2013) also noted differences in the patterning of /sm/ and /sn/ in Dutch, with much higher reductions to the nasal for /sm/ (90%) than for /sn/ (52%).
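The retention generalization just described can be expressed as a simple rule. The sketch below is a toy rendering of that generalization, not of any published algorithm; the feature table is a hypothetical fragment covering only the consonants used in the example.

MINUS_CONTINUANT = set("ptkbdgmn")  # stops and nasals, i.e. [-continuant]

def predicted_retention(cluster):
    """For an /s/C cluster given as a two-consonant string, return the
    consonant predicted to survive reduction: C2 if it is [-continuant],
    otherwise C1 (/s/)."""
    c1, c2 = cluster
    return c2 if c2 in MINUS_CONTINUANT else c1

for cl in ("st", "sn", "sl", "sw"):
    print(cl, "->", predicted_retention(cl))
# st -> t and sn -> n (C2 retained); sl -> s and sw -> s (C1 retained)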
Other authors have observed that labials may receive preference in cluster
reduction patterns. This seems to be what is driving the /sN/ cluster reduction patterns in Dutch, although this pattern was less prevalent in the other Germanic and non-Germanic languages that Yavaş (2013) looked at. Stemberger and Chávez-Peón (2015) observed that in the few instances in which C1 was preferred over C2 in Valley Zapotec clusters, the clusters involved initial labial consonants (e.g., /bz/, /bd/, /mn/, /wr/). Similarly, several authors have reported that English-speaking children may select the sonorous rather than the non-sonorous member of a cluster to favor the production of labials (e.g., drink produced as [wɪŋk], where [r] is assumed to have labial features; Pater & Barlow, 2003). More recently, an unexpected pattern emerged in German-speaking children's cluster reduction patterns for /ʃv/ (e.g. Schwein [ʃvaɪn] "pig"). Here, we observe a predominant pattern of C2 retention, despite the fact that C2 is [+continuant] (Yavaş et al., 2018). Hamann and Sennema (2005), examining the acoustic measures of duration (ms), harmonicity median (dB), and center of gravity (kHz), conclude that German /v/ is more like a /ʋ/. If this is the case, then the reduction patterns of German /ʃv/ would be expected to be similar to what is observed in Norwegian and Croatian /sʋ/, which show an unambiguous retention preference for C1 (i.e., /s/). Thus, the rather unexpected and inexplicable German retention of C2 is diametrically opposed to the general picture observed for the [+continuant] /s/ clusters.
30.4 Conclusion We started off the chapter by examining how phonological complexity has been defined in adult languages. We saw that languages vary in phonetic inventory size, the markedness of their consonants, the number of syllable types, and their syllable structure complexity, the latter involving differences in phonotactic restrictions and sonority. In general, research reveals that simple systems are acquired before complex ones (i.e., unmarked consonants are acquired before marked ones; simple syllable structure is acquired before complex structure). However, multiple factors interact with complexity, leading to different patterns of acquisition across languages. In terms of the influence of phonetic inventory size, we did not observe that the size of the inventory plays a significant role in the acquisition of individual consonants across languages. Rather, the markedness of consonants plays a greater role. Sounds that have complex articulations are more difficult to acquire than sounds with basic articulations, regardless of whether the target language is Arabic, Igbo, or Cantonese (Amayreh, 2003; Nwokah, 1986; So & Dodd, 1995). Even within the sound class of liquids (i.e., rhotics and laterals), certain liquids are acquired more easily than others, which may relate to markedness (i.e., articulatory difficulty), but also to functional load and the presence of allophonic variation (Kehoe, 2018; Rūķe-Draviņa, 1965). Markedness may be overridden by frequency, as observed in the earlier acquisition of /ʂ/ versus /s/ in Putonghua. Phonetically similar target sounds may be subject to different substitution patterns across languages depending upon the inventory of sounds in the target language (Bernhardt & Stemberger, 2018; Rose & Penney, 2021). Turning to syllable structure, both frequency and complexity may explain why codas are acquired earlier in German and English than in Spanish (Lleó et al., 2003); however, another factor, prosodic saliency, appears to explain their earlier acquisition in French. Numerous studies have documented cross-linguistic differences in the acquisition of clusters (Kehoe, 2018; Stemberger & Bernhardt, 2018; Yavaş, 2013, 2014). These differences may relate to the markedness of the different consonantal elements of the cluster (e.g., clusters containing uvular /r/ are acquired before clusters containing taps and trills). However, another factor that plays a role is the distribution of marked versus unmarked clusters within the language itself. Having a larger proportion of marked clusters (clusters containing sonority falls and plateaus) may alter the general effects of sonority, leading to different cluster preservation patterns (Stemberger & Chávez-Peón, 2015) and accuracy rates (Yavaş, 2013, 2014). Some
apparent differences in cluster preservation patterns across languages, on closer examination, may actually reflect common articulatory constraints, such as a preference for labials, that are prevalent in child speech. Finally, we observed that some cross-linguistic differences, such as the preservation of C1 in /sʋ/ clusters in Norwegian and Croatian and of C2 in /ʃv/ clusters in German, are not easily explained and may depend on additional factors not yet identified. This chapter also included studies on phonological disorders, although the findings are preliminary since cross-linguistic comparisons of phonologically disordered children are not plentiful. Some of the findings suggest similar cross-linguistic patterns in typical and disordered populations (Bernhardt & Stemberger, 2018; Brosseau-Lapré & Rvachew, 2014); however, different patterns have also been observed, as described by Yavaş and McLeod (2010) for /s/C cluster production. Another possibility is that cross-linguistic differences may be reduced in disordered populations because the phonological or speech-motor impairment dilutes the influence of complexity, frequency, or saliency. Thus, we may document cross-linguistic differences in typically developing children but reduced or non-existent ones in their speech-disordered counterparts. Currently, there is insufficient research to support either of these alternatives, suggesting the need for future research to investigate cross-linguistic aspects of system and structure not only in typically developing children but also in disordered populations.
NOTE

1 Ohala and Kawasaki-Fukumori (1997) reject the validity of sonority, arguing that it is both circular and too broadly defined to account for the linguistic rarity of sequences such as /pw/ and /dl/ and the cross-linguistic prevalence of sequences such as /st/ and /sk/.
REFERENCES

Akmajian, A., Demers, R. A., Farmer, A. K., & Harnish, R. M. (2010). Linguistics: An introduction to language and communication (6th ed.). MIT Press. Amayreh, M. (2003). Completion of the consonant inventory of Arabic. Journal of Speech, Language, and Hearing Research, 46(3), 517–529. Amayreh, M., & Dyson, A. (1998). The acquisition of Arabic consonants. Journal of Speech, Language, and Hearing Research, 41(3), 642–653. Bader, S. (2009). Speech and language impairments of Arabic-speaking Jordanian children within natural phonology and phonology as human behavior. Poznań Studies in Contemporary Linguistics, 45, 191–210. Barlow, J., Branson, P., & Nip, I. (2013). Phonetic equivalence in the acquisition of /l/ by Spanish–English bilingual children. Bilingualism: Language and Cognition, 16(1), 68–85.
Baum, S. R., & McNutt, J. C. (1990). An acoustic analysis of frontal misarticulation of /s/ in children. Journal of Phonetics, 18(1), 51–63. Bernhardt, B. M., Romonath, R., & Stemberger, J. P. (2015). A comparison of fricative acquisition in German and Canadian English-speaking children with protracted phonological development. In M. Yavaş (Ed.), Unusual productions in phonology: Universals and language-specific considerations (pp. 91–116). Psychology Press. Bernhardt, B. M., & Stemberger, J. P. (2018). Tap and trill clusters in typical and protracted phonological development: Conclusion. Clinical Linguistics & Phonetics, 32(5–6), 563–575. Brosseau-Lapré, F., & Rvachew, S. (2014). Cross-linguistic comparison of speech errors produced by English- and French-speaking preschool-age children with developmental phonological disorders. International Journal of Speech-Language Pathology, 16(2), 98–108.
Buchwald, A. (2009). Minimizing and optimizing structure in phonology: Evidence from aphasia. Lingua, 119, 1380–1395. Cataño, L., Barlow, J. A., & Moyna, M. I. (2009). Phonetic inventory complexity in the phonological acquisition of Spanish: A retrospective, typological study. Clinical Linguistics & Phonetics, 23(6), 446–472. Ceron, M., De Simoni, S., & Keske-Soares, M. (2022). Phonological acquisition in Brazilian Portuguese: Ages of customary production, acquisition and mastery. International Journal of Language & Communication Disorders, 57(2), 274–287. Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M. Beckman (Eds.), Papers in laboratory phonology 1 (pp. 283–333). Cambridge University Press. Demuth, K. (2007). The role of frequency in language acquisition. In I. Gülzow & N. Gagarina (Eds.), Frequency effects in language acquisition (pp. 528–538). Mouton de Gruyter. Dryer, M., & Haspelmath, M. (Eds.). (2013). The world atlas of language structures online. Max Planck Institute for Evolutionary Anthropology. Retrieved from http://wals.info Edwards, J., & Beckman, M. E. (2008). Some cross-linguistic evidence for modulation of implicational universals by language-specific frequency effects in phonological development. Language Learning and Development, 4(2), 122–156. Firchow, I., & Firchow, J. (1969). An abbreviated phoneme inventory. Anthropological Linguistics, 11, 271–276. Fox, A. (2000). The acquisition of phonology and the classification of speech disorders in German-speaking children. Unpublished dissertation. The University of Newcastle upon Tyne. Goad, H., & Rose, Y. (2004). Input elaboration, head faithfulness and evidence for representation in the acquisition of left edge clusters in West-Germanic. In R. Kager, J. Pater, & W. Zonneveld (Eds.), Constraints in phonological acquisition (pp. 109–157). Cambridge University Press. Goldrick, M., & Rapp, B. (2007). Lexical and post-lexical phonological representations in spoken production. Cognition, 102(2), 219–260. Hamann, S., & Sennema, A. (2005). Voiced labiodental fricatives or glides – All the same to Germans? In V. Hazan & P. Iverson (Eds.), Proceedings of the conference on plasticity in speech processing (pp. 164–167). University College London.
Hilaire-Debove, G., & Kehoe, M. (2004). Acquisition des consonnes finales (codas) chez les enfants francophones : des universaux aux spécificités de la langue maternelle. In Actes de la 25ème Journée d'Études sur la Parole. Ingram, D. (1988). The acquisition of word-initial [v]. Language and Speech, 31(1), 77–85. Jespersen, O. (1922). Language, its nature and origin. Holt. Keffala, B., Barlow, J., & Rose, S. (2018). Interaction in Spanish–English bilinguals' acquisition of syllable structure. International Journal of Bilingualism, 22(1), 16–37. Kehoe, M. (2018). The development of rhotics: A comparison of monolingual and bilingual children. Bilingualism: Language and Cognition, 21(4), 710–731. Kehoe, M. (2021). Coda consonant production in French-speaking children. Clinical Linguistics & Phonetics, 35(6), 509–533. Kehoe, M., & Girardier, C. (2020). What factors influence phonological production in French-speaking bilingual children, aged three to six years? Journal of Child Language, 47(5), 945–981. Kehoe, M., & Havy, M. (2019). Bilingual phonological acquisition: The influence of language-internal, language-external and lexical factors. Journal of Child Language, 46(2), 292–333. Kehoe, M., Hilaire, G., Demuth, K., & Lleó, C. (2008). The structure of branching onsets and rising diphthongs: Evidence from the acquisition of French and Spanish. Language Acquisition, 15(1), 5–57. Kehoe, M., & Stoel-Gammon, C. (2001). Development of syllable structure in English-speaking children with particular reference to the rhyme. Journal of Child Language, 28(2), 393–432. Kunnari, S. (2003). Consonant inventories: A longitudinal study of Finnish-speaking children. Journal of Multilingual Communication Disorders, 1, 124–131. Lallini, N., & Miller, N. (2011). Do phonological neighborhood density and phonotactic probability influence speech output accuracy in acquired speech impairment? Aphasiology, 25(2), 176–190. Li, F., & Edwards, J. (2006). Contrast and covert contrast in the acquisition of /s/ and /ʃ/ in English and Japanese. Poster presented at the 10th Conference on Laboratory Phonology, Paris, France, June 27–30. Lindblom, B., & Maddieson, I. (1988). Phonetic universals in consonant systems. In C. Li & M. Hyman (Eds.), Language, speech and mind (pp. 62–78). Routledge.
Lleó, C., Kuchenbrandt, I., Kehoe, M., & Trujillo, C. (2003). Syllable final consonants in Spanish and German monolingual and bilingual acquisition. In N. Müller (Ed.), (Non)Vulnerable domains in bilingualism (pp. 191–220). John Benjamins. Macken, M. (1995). Phonological acquisition. In J. Goldsmith (Ed.), The handbook of phonological theory (pp. 671–696). Basil Blackwell. MacLeod, A., Sutton, A., Trudeau, M., & Thordardottir, E. (2011). The acquisition of consonants in Québécois French: A cross-sectional study of pre-school aged children. International Journal of Speech-Language Pathology, 13(2), 93–109. Maddieson, I., & Disner, S. F. (1984). Patterns of sounds. Cambridge University Press. Mashaqba, B., Daoud, A., Zuraiq, W., & Huneety, A. (2022). Acquisition of emphatic consonants by Ammani Arabic-speaking children. Language Acquisition, 29(4), 441–456. https://doi.org/10.1080/10489223.2022.2049600 McLeod, S., & Crowe, K. (2018). Children's consonant acquisition in 27 languages: A cross-linguistic review. American Journal of Speech-Language Pathology, 27(4), 1546–1571. McMahon, A. M. S. (1994). Understanding language change. Cambridge University Press. McWhorter, J. (2001). The world's simplest grammars are creole grammars. Linguistic Typology, 5(2–3), 125–166. Moran, S., & Blasi, D. (2014). Cross-linguistic comparison of complexity measures in phonological systems. In F. J. Newmeyer & L. Preston (Eds.), Measuring grammatical complexity (pp. 217–240). Oxford University Press. Nettle, D. (1995). Segmental inventory size, word length, and communicative efficiency. Linguistics, 33, 359–367. Núñez-Cedeño, R. (2007). The acquisition of Spanish codas: A frequency/sonority approach. Hispania, 90, 147–163. Nwokah, E. (1986). Consonantal substitution patterns in Igbo phonological acquisition. Language and Speech, 29(2), 159–176. O'Grady, W., Archibald, J., Aronoff, M., & Rees-Miller, J. (2017). Contemporary linguistics (7th ed.). Bedford. Ohala, J. J., & Kawasaki-Fukumori, H. (1997). Alternatives to the sonority hierarchy for explaining segmental sequential constraints. In S. Eliasson & E. H. Jahr (Eds.), Language and its ecology: Essays in memory of Einar Haugen (pp. 343–365). Mouton de Gruyter.
Ota, M., & Ueda, I. (2006). Japanese speech acquisition. In S. McLeod (Ed.), The international guide to speech acquisition (pp. 457–471). Thomson Delmar Learning. Parker, S. (2011). Sonority. In M. van Oostendorp, C. J. Ewen, E. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. II, pp. 1160–1184). Blackwell. Pater, J., & Barlow, J. (2003). Constraint conflict in cluster reduction. Journal of Child Language, 30(3), 487–526. Perez, D., Vivar, P., Bernhardt, B. M., Mendoza, E., Ávila, C., Carballo, G., Fresneda, D., Muñoz, J., & Vergara, P. (2018). Word-initial rhotic clusters in Spanish-speaking preschoolers in Chile and Granada, Spain. Clinical Linguistics & Phonetics, 32(5–6), 481–505. Polo, N. (2018). Acquisition of codas in Spanish as a first language: The role of accuracy, markedness and frequency. First Language, 38(1), 3–25. Pye, C., Ingram, D., & List, D. (1987). A comparison of initial consonant acquisition in English and Quiché. In K. E. Nelson & A. van Kleeck (Eds.), Children's language (pp. 175–190). Erlbaum. Ramalho, A. M., & Freitas, M. J. (2018). Word-initial rhotic clusters in typically developing children: European Portuguese. Clinical Linguistics & Phonetics, 32(5–6), 459–480. Roark, B., & Demuth, K. (2000). Prosodic constraints and the learner's environment: A corpus study. In S. Howell, S. Fish, & T. Keith-Lucas (Eds.), Proceedings of the 24th Annual Boston University Conference on Language Development (pp. 597–608). Cascadilla Press. Robb, M., & Bleile, K. (1994). Consonant inventories of young children from 8 to 25 months. Clinical Linguistics & Phonetics, 8(3), 295–320. Romani, C., & Galluzzi, C. (2005). Effects of syllabic complexity in predicting accuracy of repetition and accuracy of errors in patients with articulatory and phonological difficulties. Cognitive Neuropsychology, 22(7), 817–850. Rose, Y., & Penney, N. (2021). Language and learner specific influences on the emergence of consonantal place and manner features. Frontiers in Psychology, 12, 646713. Rūķe-Draviņa, V. (1965). The process of acquisition of apical /r/ and uvular /R/ in the speech of children. Linguistics, 17, 58–68. Shosted, R. K. (2006). Correlating complexity: A typological approach. Linguistic Typology, 10(1), 1–40. Sievers, E. (1901). Grundzüge der Phonetik. Breitkopf und Härtel.
Smit, A., Hand, L., Freilinger, J., Bernthal, J., & Bird, A. (1990). The Iowa articulation norms project and its Nebraska replication. Journal of Speech and Hearing Disorders, 55(4), 779–798. Smit, A. B. (1993). Phonologic error distributions in the Iowa–Nebraska Articulation Norms Project: Consonant singletons. Journal of Speech and Hearing Research, 36(5), 533–547. Snyman, J. W. (1969). An introduction to the !Xũ language. University of Cape Town. So, L., & Dodd, B. (1995). The acquisition of phonology by Cantonese-speaking children. Journal of Child Language, 22(3), 473–495. Solé, M.-J. (2002). Aerodynamic characteristics of trills and phonological patterning. Journal of Phonetics, 30(4), 655–688. Stemberger, J., & Chávez-Peón, M. (2015). Development of word-initial consonant clusters in Valley Zapotec: Universal vs. language-specific effects of sonority. In M. Yavaş (Ed.), Unusual productions in phonology: Universals and language-specific considerations (pp. 49–69). Psychology Press. Stemberger, J. P., & Bernhardt, B. M. (2018). Tap and trill clusters in typical and protracted phonological development: Challenging segments in complex phonological environments. Introduction to the special issue. Clinical Linguistics & Phonetics, 32(5–6), 411–423. Stites, J., Demuth, K., & Kirk, C. (2004). Markedness versus frequency effects in coda position. In A. Brugos, L. Micciulla, & C. E. Smith (Eds.), Proceedings of the 28th Annual Boston University Conference on Language Development (pp. 565–576). Cascadilla. Stoel-Gammon, C. (1985). Phonetic inventories, 15–24 months: A longitudinal study. Journal of Speech and Hearing Research, 28(4), 505–512.
Stoel-Gammon, C., & Dunn, C. (1985). Normal and disordered phonology in children. Pro-Ed. Stokes, S. F., & Surendran, D. (2005). Articulatory complexity, ambient frequency and functional load as predictors of consonant development in children. Journal of Speech, Language, and Hearing Research, 48(3), 577–591. Trubetzkoy, N. (1931). Die phonologischen Systeme. Travaux du Cercle Linguistique de Prague, 4, 96–116. Tsurutani, C. (2007). Early acquisition of palato-alveolar consonants in Japanese: Phoneme frequencies in child-directed speech. Journal of the Phonetic Society of Japan, 11(1), 102–110. Vitevitch, M. S., Armbrüster, J., & Chu, S. (2004). Sublexical and lexical representations in speech production. Journal of Experimental Psychology: Learning, Memory and Cognition, 30(2), 514–529. Yavaş, M. (2013). What explains the reductions in /s/-clusters: Sonority or [continuant]? Clinical Linguistics & Phonetics, 27(6–7), 394–403. Yavaş, M. (2014). What guides children's acquisition of #sC clusters? A cross-linguistic account. In A. Farris-Trimble & J. Barlow (Eds.), Perspectives on phonological theory and development: In honor of Daniel Dinnsen (pp. 115–132). John Benjamins. Yavaş, M., & Core, C. (2006). Acquisition of #sC clusters in English-speaking children. Journal of Multilingual Communication Disorders, 4(3), 169–181. Yavaş, M., Fox-Boyer, A., & Schaefer, B. (2018). Patterns in German /ʃC/ cluster acquisition. Clinical Linguistics & Phonetics, 32(10), 913–931. Yavaş, M., & McLeod, S. (2010). Acquisition of /s/ clusters in English-speaking children with phonological disorders. Clinical Linguistics & Phonetics, 24(3), 177–187.
31 Connected Speech

CAROLINE NEWTON, SARA HOWARD, BILL WELLS, AND JOHN LOCAL

31.1 Introduction

Many people with speech difficulties are unintelligible when using longer strings of speech in everyday, spontaneous communication, even though they may be able to produce single words in isolation quite accurately. Despite this common observation, connected speech may not be routinely assessed and often is not specifically addressed in intervention. Using connected speech places a greater load on the speech processing system than does the production of single words. However, the challenge of connected speech is about more than just extra processing load: it is also qualitatively different from single words, in terms of its phonology and therefore its phonetics. Connected speech is more than just a string of individual target segments joined together in series, since each segment is liable to influence the segments that surround it. The precise form that these influences take is determined by the particular language in question, and so the phonology of connected speech is a part of the phonology of the language that the child has to master, just like its systems of vowels and consonants and its phonotactic structures. As adults we display our mastery of the phonology of the language as much by the ways in which we connect words – our realization of word junctures – as we do by our pronunciation of individual words.
31.2 Connected Speech Processes and Word Junctures

So I told my girlfriend I had a job in a bowling alley. She said "Ten pin?" I said, "No, it's a permanent job."

In spoken-English humor, many jokes rely on the listener's subconscious awareness of the tension between the form a lexical item takes when spoken rather carefully in isolation, and its realization in the company of other words in connected speech (in the case above,¹ this produces an ambiguity between the words "ten pin" and "temping"). Such tensions may be said to reflect a set of simplifying processes which operate on words in larger contexts, and which have been extensively described in the literature (e.g. Brown, 2016; Cruttenden, 2014; Farnetani & Recasens, 2010; Shockey, 2003). Connected speech processes (CSPs) which affect speech production at word
boundaries include assimilation (e.g. red cat /rɛd kæt/ → /rɛɡ kæt/), coalescence (e.g. miss you /mɪs ju/ → /mɪʃu/), elision (e.g. last summer /lɑst ˈsʌmə/ → /lɑs ˈsʌmə/; put him off /ˈpʊt hɪm ˈɒf/ → /ˈpʊt ɪm ˈɒf/), and liaison (e.g., in a non-rhotic accent, far /fɑ/ but far away /fɑr əˈweɪ/). Other processes which contrast words produced in isolation with words in connected speech include the use of weak forms (e.g. from /frəm/ rather than /ˈfrɒm/) and other vowel reductions connected to the stress and rhythm patterns of the language in question. There is a significant body of evidence which suggests that similar phonetic and phonological simplifications in connected speech can be found across languages and language varieties (see, for example, Barry & Andreeva, 2001; Bolozky, 2019; Engstrand & Krull, 2001; Kohler, 2000; Mitterer & Ernestus, 2006; Nicolaidis, 2001; Oladipupo, 2014), driven largely by the fact that, as Barry and Andreeva (2001, p. 51) observe, "all languages allow for variation in the time and effort invested in any given part of an utterance." However, it is important to note that significant differences have also been found between languages. For example, in English, where consonants assimilate at word boundaries, the direction of this assimilation is typically regressive (i.e. the final consonant of the first word is influenced by the initial consonant of the following word). In other languages, however, including German, Portuguese and Swedish, we see instances of progressive assimilation, where the influence operates in the opposite direction. Whilst in English place assimilation is typically alveolar-to-labial or alveolar-to-velar, in Korean labial-to-velar assimilation is also observed (Mitterer et al., 2013). The degree to which CSPs occur cross-linguistically may be governed by grammatical factors. For example, we have mentioned above that word boundary coalescence of /s$j/ to /ʃ/ is an extremely common process in spoken English: Rechziegel (2001) notes that its appearance is quite variable in Dutch, and that in Czech, a strongly inflectional language, grammatical markers at word boundaries appear to exert strong resistance to its occurrence. Bush (2001, p. 256), meanwhile, suggests that the appearance of this process at specific word junctures in spoken English is motivated by lexical collocational frequency: "word-boundary palatalization is more likely between two words if those words occur together in high frequency." The influence of frequency is also picked up by Bybee (2002), who suggests that high-frequency words and phrases are often subject to greater degrees of simplification than low-frequency words, where, presumably, the speaker is maximally facilitating listener comprehension. Indeed, in outlining a usage-based model of language production and change, Bybee (2000, p. 268) cautions that "many cases of what was earlier postulated as structural turn out to be derivable from the way language is used." (See also Sosa and Bybee, this volume.) There appear to be multiple factors affecting the likely occurrence of CSPs. Shockey (2003) provides a clear summary of phonetic, phonological, prosodic, grammatical, and discourse factors which may all contribute to the likelihood of a particular word boundary process taking place.
Thus, for example, in spoken English, a word-final alveolar (particularly /t/) which forms part of a consonant cluster in an unstressed syllable would be extremely vulnerable to simplification or elision, whereas the same segment appearing as a singleton consonant in initial position in a stressed syllable would be extremely unlikely to be affected by word boundary speech behaviors. Discourse function and familiarity can also affect a word's realization in connected speech. Fowler and Housum (1987) have suggested that words functioning to provide new information within an utterance are typically more intelligible than words relating to given information. Other phonologists suggest an interactional function for certain phonetic behaviors. For them, a broad distinction is made between Open Juncture and Close Juncture, following Sprigg (1957). When a speaker of English produces two words in sequence, there may be features that serve to keep the words distinct (e.g. a silence, or audible release of the final stop in the coda of the first word). Adult speakers may deploy such open juncture for the purposes
of emphasis or repair. Close juncture, on the other hand, is characterized by phonetic features that bind adjacent words together. How this can be done depends on the phonological structures that abut at the junction; close juncture types include the connected speech "processes" outlined above, such as assimilation, elision and liaison.
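To make the direction and domain of these processes concrete, the following is a minimal computational sketch (in Python) of two of them: regressive place assimilation of word-final alveolars, and elision of /t, d/ from a coda cluster before a consonant. The phone sets, rule coverage, and function names are simplified assumptions of our own for illustration, not a full phonology of English.

# A sketch of two English word-boundary CSPs, assuming words are
# represented as lists of IPA phone strings. Coverage is deliberately
# partial and illustrative.

ALVEOLAR_TO = {  # regressive place assimilation for final /t, d, n/
    "labial": {"t": "p", "d": "b", "n": "m"},
    "velar": {"t": "k", "d": "g", "n": "ŋ"},
}
LABIALS = {"p", "b", "m"}
VELARS = {"k", "g", "ŋ"}
VOWELS = set("aeiouɑɒʌəɜɪʊɛæ")

def assimilate(word1, word2):
    # The final alveolar of word1 takes the place of the initial
    # consonant of word2 (red cat -> "reg cat").
    final, initial = word1[-1], word2[0]
    if initial in LABIALS and final in ALVEOLAR_TO["labial"]:
        return word1[:-1] + [ALVEOLAR_TO["labial"][final]], word2
    if initial in VELARS and final in ALVEOLAR_TO["velar"]:
        return word1[:-1] + [ALVEOLAR_TO["velar"][final]], word2
    return word1, word2

def elide(word1, word2):
    # Drop word-final /t, d/ from a coda cluster before a
    # consonant-initial word (last summer -> "las summer").
    if (len(word1) >= 2 and word1[-1] in {"t", "d"}
            and word1[-2] not in VOWELS      # crude cluster check
            and word2[0] not in VOWELS):
        return word1[:-1], word2
    return word1, word2

print(assimilate(list("rɛd"), list("kæt")))  # (['r', 'ɛ', 'g'], ['k', 'æ', 't'])
print(elide(list("lɑst"), list("sʌmə")))     # (['l', 'ɑ', 's'], ['s', 'ʌ', 'm', 'ə'])

A fuller model would also condition these rules on the stress, rate, grammatical, and discourse factors discussed above.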
31.3 Typical Development

By far the majority of research on children's phonological development over the past four decades has focused on phonology within the word. This focus is at odds with the fact that by the end of their second year, typically developing children are starting to produce multiword utterances, and from then on, the use of utterances consisting of a single word is only part of their repertoire (as it is for adults). In principle, then, the study of between-word junctures should be germane to our understanding of early phonological development. This impacts not only on our knowledge of phonological development per se, but also on clinical practice with children with speech impairment, for whom between-word junctures are a significant constraint on improved intelligibility. There is a need to understand to what extent the atypical juncture patterns documented in older speech-impaired children reflect normal, early juncture development. Moreover, remedial strategies depend on greater knowledge of how juncture types are deployed and developed. The work that has been published on the development of between-word junctures allows us to begin to address these questions, and sheds light on two key issues:
31.3.1 Developmental Order of Close vs Open Juncture

A cross-sectional study of the use of adult CSPs by 94 typically developing English children aged 3 to 7 years was carried out by Newton and Wells (1999). As part of this study the processes of alveolar-to-bilabial/velar assimilation, final alveolar plosive elision, and /j, w, r/-liaison (e.g. tidy up [taɪdijʌp]; saw a [sɔrə]; show us [ʃəʊwʌs]) were investigated. The incidence of each process (i.e. close juncture) remained fairly consistent across the age range, at between 70 and 90%. This suggests that from a quantitative perspective, there is no developmental progression in the frequency of occurrence of adult CSPs, though it is likely that there were qualitative developments in the phonetic realization of these CSP targets which were unexplored. Qualitative analysis of the same set of juncture types was the focus of a follow-up study by Newton and Wells (2002), who studied the spontaneous speech of a typically developing boy (CW) learning Southern British English between the ages of 2;4 and 3;4. The earliest recordings showed examples of close juncture forms, but no open junctures, across almost all CSP environments. As the year progressed, open junctures appeared, but close junctures were always in the majority. This result suggests that CW did not learn to join the two words together phonologically having first learnt to combine them grammatically, but that junctural phonology and grammar had emerged simultaneously. There was one exception to the general pattern of close-juncture-first: while y- and w-liaison were evident from the onset of multiword utterances, target r-liaison sites were initially realized by CW as open juncture. Thus r-liaison specifically seems to have been learnt as a phonological rule. The apparent late emergence of r-liaison was also observed by Thompson and Howard (2007), in a cross-sectional study investigating the spontaneous speech of six typically developing children from the North of England (three aged 2;0 to 2;6, three aged 3;0 to 3;6). However, other findings appear to contrast markedly with those of Newton and Wells (2002): they observed a clear quantitative shift from a predominance of open junctures for the two-year-olds, to a strong preference for close junctures shown by the three-year-olds.
Apparent differences between these two studies with respect to assimilation and elision processes may be attributable to the productive versus formulaic nature of the word combinations produced, so that adult-like close junctures are evident first in the child's stereotypical or formulaic utterances – those multiword utterances that appear to be learnt from the outset as a gestalt and reproduced by the child in that way. For example, the phrase there they are was most often produced by CW with close juncture, but open juncture was more likely to occur with obviously productive (grammatically incorrect) utterances, such as I upstairs.
31.3.2 Factors Affecting Use of Open vs Close Juncture

As with adults, then, a number of factors may influence children's production of word boundaries. It was observed above that high-frequency words and phrases are often subject to greater degrees of simplification than those with low frequency (Bybee, 2002). Similarly, Howard et al. (2008) observed that one typically developing child, aged 2;3–2;10, produced familiar utterances including higher-frequency verbs with close juncture, and less familiar utterances with open juncture. The occurrence of close juncture in children's productions could also be a function of the frequency of the particular juncture in the speech the child hears. Formulaic utterances such as there they are are likely to be phrases which occur with high frequency in adult speech; in contrast, phrases like I upstairs and he upside down are unlikely to have been heard in the input. The relationship between input frequency and children's productions was also highlighted in Bryan's (2012) case study of Thomas, whose productions of assimilation between the ages of 2 and 4 gradually mirrored those of his parent, supporting a usage-based approach to acquisition. The variability in use of open and close juncture highlighted above may also relate simply to individual differences between children in their approach to the articulatory demands of connected speech, and word boundaries in particular. Indeed, this was reported by Yuen et al. (2017) in their exploration of r-liaison in the speech of 13 Australian children aged between 5;6 and 6;10. Contrary to their hypothesis that close juncture (liaison) and open juncture (glottalization) would occur in complementary distribution relative to the phonological context (either within a foot or at the foot boundary), the children appeared to use either one strategy or the other: eight children used r-liaison, while the remainder tended to use open juncture. Children may also vary between open and close juncture in their realization of the same target sequence of abutting segments, and this may be influenced by phonological and grammatical as well as interactional factors. Given the constraints on young children's articulatory capabilities, it is likely that all children produce some junctures that are non-adult in their phonetic form. In early recording sessions, for example, CW produced tokens such as man come – [mæʔkʌm] (coda consonant elided; glottal stop inserted) for target assimilation junctures, and lost Bertie – [lɒʔbɜti] (glottal stop for the target cluster coda of the first word at the juncture) for target elisions.
31.4 Assessment

As we have observed earlier, while children with atypical speech production are often at their least intelligible in spontaneous, multi-word utterances, their speech output skills are nevertheless most often assessed clinically by single word elicitation tasks such as picture naming. Connected speech can be challenging to assess, not least because of the undoubted problems of knowing what the speaker is attempting to say in spontaneous unintelligible speech. Uncertainty may also arise about how large a sample of connected speech is required to be representative of a child's abilities in this domain, and about how best to elicit and analyze such a sample.
31.4.1 Methods of Elicitation

31.4.1.1 How Much to Elicit

Although a large sample of spontaneous speech clearly has advantages in terms of ecological validity, collecting and analyzing such a sample may not be feasible in a clinical environment. Wren et al. (2021), amongst others, have therefore sought to determine the minimum length of a CS sample which would enable meaningful analysis of the impact of a speech impairment in this domain. They explored different measures of performance (see below) across three different sample sizes in a large group of 776 five-year-old children from the West of England. The authors suggest that for almost all measures a sample size of 75 word tokens is sufficient to provide a reliable measure of connected speech proficiency.
31.4.1.2 How to Elicit

A range of methods of elicitation might be used, from more or less naturalistic speech samples to more structured testing. With children, the former is often collected during free play with parents/caregivers (e.g. Beers et al., 2019; Deveney & Scheffel, 2019); with older children and adults a narrative sample may be collected by asking them to repeat the storyline of a recently viewed film (e.g. Staiger et al., 2010) or to describe the steps required to perform a familiar task (e.g. Wren et al., 2013). Picture description provides a further option for eliciting an extended sample of multi-word speech, and is included in some standardized assessments: for example, the connected speech subtest of the Diagnostic Evaluation of Articulation and Phonology (Dodd et al., 2006) and the Cookie Theft picture in the Boston Diagnostic Aphasia Examination (Goodglass et al., 2001). These more naturalistic speech samples can be said to have the advantage of ecological validity; however, the use of naturalistic samples may seriously limit the occurrence of particular structures and forms of interest. To circumvent this problem, some researchers have devised sets of pictures including objects and activities which can be described using phrases containing the CS phenomena of interest (Howard, 2004; Yuen et al., 2017). Newton (1999) designed a comprehensive set of sentences suitable for repetition by children, targeting the main English CSPs. The test is reproduced in Stackhouse et al. (2007) and normative data from this procedure are presented in Newton and Wells (1999).
31.4.2 Means of Analysis

Within all the approaches to assessment described above, it is possible to carry out quantitative and qualitative analyses. One approach to analysis may be to draw on the tools already at our disposal at the single word level, for example:
i. Phonetic inventory – provides a detailed distribution of the speech sounds produced by the child, usually across different positions in a word, without reference to the phonemes produced in an idealized adult target. Results from Deveney and Scheffel's (2019) exploration of this measure in the connected speech of 11 two-year-old children indicate good test–retest reliability for a consonant inventory across all three word positions (initial, medial, and final) and in clusters.

ii. Word (or syllable) shape analysis – identifies the set of sequences of consonants and vowels which are utilized by the child, though recent evidence suggests this may not be a reliable measure in multi-word speech (Deveney & Scheffel, 2019).

iii. Phonological "simplification" error analysis – this widely used means of analysis describes a child's production in comparison to an adult production of the target for the purpose of categorizing patterns of error that are evident (e.g. Fronting, Cluster Reduction, Final Consonant Deletion).
iv. Whole word analysis – increasingly used in the clinical literature (e.g. Beers et al., 2019; Newbold et al., 2013; Watson & Terrell, 2012). A key measure here is phonological Mean Length of Utterance (pMLU), which is intended to capture both the accuracy and complexity of the child's productions (Ingram & Ingram, 2001), and is determined by calculating the sum of all segments produced plus all correct consonants and dividing by the total number of words. A child's overall accuracy can be evaluated with reference to the related measure Proportion of Whole-word Proximity (PWP), which is determined by dividing the child's pMLU by that of the idealized adult production of the target words (a computational sketch of these measures is given at the end of this section). Whole word measures such as these may be particularly favored with younger children who may not be operating at the level of individual phonemes. They have also been found to be sensitive to change in children with speech sound disorders, at least at the single word level (Newbold et al., 2013).

Other means of analysis are associated more specifically with connected speech and may be used to provide a quantitative measure of intelligibility in this domain, for example:

v. Percentage of Consonants Correct (PCC) – this measure (and variants, including percentage of phonemes correct (PPC) and percentage of clusters correct) is routinely reported in published studies of intelligibility in connected speech (e.g. Barnes et al., 2009; Wren et al., 2013).

While all the measures listed above provide a useful indication of performance, and to a degree take into account some of the articulatory challenges of multi-word utterances, they are limited if they fail to take into account the particular phonological phenomena described in this chapter that are only evident in connected speech. So, for example, if the individual is expected to produce the citation form of target words even in connected speech (e.g. in environments where elision might occur), this may lead to underestimation of the child's phonological ability when using a measure such as the PCC. When viewing a CS sample through the lens of phonological "simplification" errors, it would be important to note that some of these may in part be conditioned by connected speech factors. This includes simplifications such as cluster reduction: as the CSP of consonantal elision has as its domain a coda cluster followed by an onset consonant, one might anticipate that children are more likely to produce radical Cluster Reduction before a word beginning with a consonant. In order to evaluate phenomena specific to connected speech, a full assessment should therefore also include:

vi. Analysis of juncture types – As in clinical phonology generally, when interpreting the connected speech patterns of an impaired speaker, we can focus principally on their realization of target forms, and the frequency of occurrence of a CSP can be calculated relative to the number of target contexts. This depends on an adequate description of juncture types in the adult language, and in the case of English, this is available in a range of sources (Brown, 2016; Cruttenden, 2014; Shockey, 2003). In cases where such CSPs are not found, it is assumed that the citation form is maintained in connected speech. In order to systematize and interpret our observations of juncture behavior, making a broad distinction between Close and Open Juncture, as described earlier, can be a useful initial step (Wells, 1994).
This overarching categorization includes the CSPs, which can be thought of as close junctures, while their non-application in the same environment would be an instance of open juncture. Once junctures have been assigned to Close and Open, it may be possible to detect an overall trend in a particular speaker toward the prevalence of one juncture type over the other. Wells and Stackhouse (1997) suggested that the prevalence of Close Juncture could be termed “hyperelision,” while the prevalence of Open Juncture has been termed “hyperarticulation.” Qualitative
analysis can be used to investigate the unusual juncture behaviors produced by an individual. This may be done using auditory perceptual analysis alone (Speake et al., 2011; Wells, 1994), or supplemented by instrumental analysis such as electropalatography (EPG) (Howard, 2004, 2007; Newton, 2012).
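The computational sketch promised under measures iv and v above follows, in Python. It assumes transcriptions have already been segmented into lists of phone strings and compares child and target positions pairwise, a crude stand-in for proper phone alignment; the consonant set and helper names are illustrative choices of our own rather than part of any published tool.

from itertools import zip_longest

CONSONANTS = set("p b t d k g m n ŋ f v s z ʃ ʒ θ ð h l r w j".split())

def is_consonant(phone):
    return phone in CONSONANTS

def pmlu(child_words, target_words):
    # pMLU (Ingram & Ingram, 2001): one point per segment produced,
    # plus one per correctly produced consonant, averaged over words.
    total = 0
    for child, target in zip(child_words, target_words):
        total += len(child)
        total += sum(1 for c, t in zip_longest(child, target, fillvalue="")
                     if is_consonant(t) and c == t)
    return total / len(child_words)

def pwp(child_words, target_words):
    # Proportion of Whole-word Proximity: child pMLU divided by the
    # pMLU of the idealized adult targets.
    return pmlu(child_words, target_words) / pmlu(target_words, target_words)

def pcc(child_words, target_words):
    # Percentage of Consonants Correct; elided target consonants
    # count as errors.
    correct = attempted = 0
    for child, target in zip(child_words, target_words):
        for c, t in zip_longest(child, target, fillvalue=""):
            if is_consonant(t):
                attempted += 1
                correct += (c == t)
    return 100 * correct / attempted

# e.g. big cat produced as [bɪ kæ] (final consonant deletion):
child = [["b", "ɪ"], ["k", "æ"]]
target = [["b", "ɪ", "g"], ["k", "æ", "t"]]
print(pmlu(child, target))   # 3.0
print(pwp(child, target))    # 0.6
print(pcc(child, target))    # 50.0

As the PCC comment indicates, scoring against citation-form targets in this way will penalize legitimate connected speech elisions, the limitation discussed in the text above.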
31.5 Atypical Connected Speech Behaviors in Impaired Speech

For individuals with impaired speech production, spontaneous connected speech presents a particular challenge: that of integrating the articulatory and prosodic components of an utterance in order to achieve normal segmental realizations simultaneously with normal patterns of stress, pitch, rate, etc. As Wells (1994, p. 14) notes, "There is . . . a tension for the child between the demands of paradigmatic accuracy, i.e. the need to signal meaning in an intelligible way, and the demands of syntagmatic fluency, i.e. the need to realize phrases and sentences as cohesive wholes." A speaker who favored fluency, and the production of perceptually acceptable prosodic patterns, might accomplish this by extensive use of close juncture patterns at word boundaries, but these might be achieved at the expense of articulatory precision and accuracy in these contexts. Conversely, a speaker for whom articulatory accuracy is paramount may produce a greater proportion of open junctures at word boundaries, with the beneficial effect on segmental production being counterbalanced by the detrimental influence on rhythm and stress patterns. Both factors could have significant effects on a speaker's overall intelligibility, and the growing body of studies which have explored the connected speech behaviors of speakers with impaired speech lends some support to these predictions.
31.5.1 Children with Speech Sound Disorders (SSD)

It is now increasingly recognized that connected speech is an important component of a child's output skills, not least because it is more likely to represent their everyday articulatory and phonological performance. When considering the connected speech of children with speech difficulties, we are interested both in establishing the extent to which this group find the phonology of multi-word utterances more challenging than their typically developing (TD) peers (and whether this disparity is greater in CS than in single words), and in determining whether they address the challenges in similar ways (e.g. close vs open juncture; non-adult juncture types).
31.5.1.1 Comparing Children with SSD and TD Children

Marked delays on measures i–v outlined above have been found in groups of children with speech difficulties in connected speech contexts. In some studies, differences between groups have been found to be consistent with observations at the single word level (e.g. Barnes et al., 2009). Other studies suggest differences may be smaller in CS than in single words. Yeh and Liu (2021) explored whether SW or CS best reflected differences between children with and without SSD in a group of 74 pre-school children in Taiwan. Across their three measures of speech accuracy, phonemic inventory and intelligibility, they found larger effect sizes for differences between the groups at the single word level. However, it is possible that not all measures are equally sensitive to differences between populations at different linguistic levels, and this may affect our interpretation of such contrasting findings. Wren et al. (2013) found that accuracy measures of PCC and PVC in particular were important for distinguishing between a group of 409
eight-year-old children with SSD and 50 TD controls across single words. There was also evidence that the type of speech sample elicited affected the proportion of errors. For example, the children with SSD were more likely to produce a greater number of distortions in single word production; in contrast, omissions of consonants and consonant clusters were more prevalent in connected speech. Note also that a direct comparison between SW and CS contexts may be problematic: in spontaneous speech children may elect to avoid phonemes and combinations of sounds which they find particularly challenging, a luxury not available to them at the single word level, for example, in a picture-naming task. The extent to which children do this may also go some way to account for differences between studies.
31.5.1.2 Exploring Word Juncture Behaviors

A range of studies provide compelling evidence not just that children with SSD struggle to manage the demands of multi-word utterances, but that this is seen particularly in their realization of word junctures themselves. Some phenomena may be similar to those observed in typical development, while others have not been reported in the TD literature. Klein and Liu-Shea (2009) examined the production of word pairs providing environments for assimilation and elision in a group of four boys aged 4;0 to 5;5. They found more frequent occurrence of final consonant deletion across these environments than was observed in single word production. Children deleted a much wider range of speech sounds, and with greater frequency, than has been reported in adults or in TD children. Further evidence that the connected speech of children may be radically affected by processes associated with hyperelision is provided by Howard (2004) and Newton (2012). All three 11- to 12-year-old boys reported by Newton (2012) produced some adult-like word boundary behaviors, although adult-like open juncture was scarce in elision environments, which seemed to be harder to manage than assimilation environments. The children's output also showed patterns of what the author termed "hyperlenition." This included the replacement of the whole coda by a glottal stop, similar to the pattern observed in typical development by Newton and Wells (1999), so that last train was realized by Jack (aged 12;2) as [lɑʔ tʃein]. In other instances, the entire coda was elided, so that Jack produced hugged me as [hʌ mi]. Other productions involved articulatory weakening of the whole of the juncture, which has not been reported in the TD literature. In one example, the word pair robbed the was produced by Eric (aged 11;8) as [ʋɒβə], in which features of the target juncture consonants have been coalesced into a single consonant. Newton suggests the motivation for such phenomena is effort minimization in response to the complex phonetic environments of word boundaries. Howard (2004) provides evidence of more pervasive hyperelision in another group of older children with developmental speech impairments. For example, nine-year-old Holly's production of the utterance a man taking a photo of a boy dressed up as a clown shows a combination of lenition and elision of consonants and reduction of vowels in the first half of the utterance, which was produced with a faster speech rate: [{allegro əˈmæ̃n tɛɪ̆⁀kx̆ɪ̆ ə̃ ˌfatə⁀bβɪ allegro} {rallentando ˈdʋɛɬtː ˈʊpːʰ ɪɬ̞ː ə ˈkːl̥aʊn rallentando}]. In contrast, Wells (1994) and Howard (2007) describe children whose unusual preference for open junctures at word boundaries disrupts stress patterns and speech rhythms and results in speech which sounds rather slow, effortful, and disjointed. Zoe, at almost six years old, usually achieved close juncture at syllable boundaries within words, but displayed a strong preference for open junctures at word boundaries (Wells, 1994). This was realized by a number of phonetic devices, including sustained glottal closure (e.g. by the [baɪʔː t̪ə]); audible release of coda consonants; and between-word pauses (e.g. big car [pɪ̥ː c . kɑ]). The perceptual effect of these behaviors was of jerky, staccato speech, whose slow rate may have related to the rhythmic abnormalities, or may have been a compensatory strategy to maximize intelligibility.
Sam (Howard, 2007), at the age of nine, produced speech which, like Zoe's, was characterized by inappropriate use of open juncture at word boundaries. In his case, early speech development had been affected by a cleft palate. Sam used glottal onsets at word boundaries where the second word had a vowel onset (e.g. and all [ˈʔæŋ . ʔɔʊ]) and inserted glottal plosives at word boundaries where r-liaison would normally be expected to occur (e.g. are open [ˈʔɑ . ʔəʊpən]). The distinction between these two approaches is illustrated effectively in the pair of children with speech difficulties associated with a history of cleft palate described by Howard (2013). Both children had difficulties negotiating word junctures, but these manifested in quite different patterns. Eleven-year-old JO produced more close juncture realizations of CSP environments, but these were often marked by segmental omissions, so that although his speech maintained adult-like prosody, it was often unintelligible, resulting in productions like Sam loved to dance as [ɬ̃æ̃ ˈmʊkːxə ˈɰ̃æ̃]. Here, while the elision of the final consonant in loved is expected, elision of the initial consonant of that word and of the final consonant cluster in dance is not. The same sentence was produced by the other child, SB (aged 9;5), as [ˈçæm lʊvd̻ʰ tə ˈdænxçs]. In this version, we can observe not only open juncture in the elision site loved to but also the prolonged "articulatory slide" for the final consonant in dance. SB's preference for open juncture, and other adjustments at word boundaries, made him the more intelligible of the two, but his speech was prosodically unusual. While the above case studies suggest that individual children with connected speech difficulties may be categorized as "hypereliders" or "hyperarticulators," this is an oversimplification. The study reported by Howard (2007) includes two children who combine hyperelision with hyperarticulation, often within a single utterance, a phenomenon also observed in the example from Holly above (Howard, 2004). Speake et al. (2011) present a thorough phonetic and phonological analysis of the speech production of Harry, a seven-year-old boy with a persisting speech disorder, who had specific difficulties with word juncture. His multiword speech was characterized by a markedly lower proportion of appropriate close juncture (involving assimilation, elision or liaison), and hyperelisions in the form of omissions of segments and whole syllables as well as "unusual and weakened articulatory realisations" of segments (p. 142), such as in the phrase shall I tell you what: [ən ˈdæ d̞eːɪ ˈwɒʔʰ]. These were particularly, but not only, found in high-frequency words and phrases. Alongside such utterances were others, like and landed on the boat's top ([æn ˈlændɪʔ ɒn . d̞ə bəʊts . ˈtɒp]), which showed evidence of hyperarticulation, mainly through the use of pauses between words. The authors suggest that interactional factors may go some way to accounting for Harry's use of one strategy over the other, with hyperarticulation more likely when he was establishing a new topic and hyperelision used more where listener knowledge was assumed, thereby finding a solution to the tension between paradigmatic accuracy and syntagmatic fluency mentioned above.
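A hedged sketch of how the Close/Open distinction introduced in the assessment section might be tallied to flag such overall trends is given below. The threshold band is only an assumed reference range loosely based on the 70–90% close-juncture incidence that Newton and Wells (1999) report for typical development; it is not a published diagnostic criterion, and the function and labels are our own.

from collections import Counter

def juncture_profile(codes, lo=0.70, hi=0.90):
    # codes: one "close" or "open" label per hand-coded target juncture.
    counts = Counter(codes)
    close_rate = counts["close"] / len(codes)
    if close_rate > hi:
        trend = "close juncture prevails: possible hyperelision"
    elif close_rate < lo:
        trend = "open juncture prevails: possible hyperarticulation"
    else:
        trend = "within the assumed typical range"
    return close_rate, trend

codes = ["close", "open", "close", "close", "open", "close"]
print(juncture_profile(codes))  # (0.666..., 'open juncture prevails: ...')

As the case studies above show, a single summary rate can mask speakers who mix both strategies, so any such tally should be read alongside qualitative, utterance-level analysis.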
31.5.2 Adults with Speech, Language, and Communication Needs

As in the developmental field, connected speech phonology is sampled and analyzed less frequently in work on acquired disorders, though a small body of research exists on two disorders in particular.
31.5.2.1 Apraxia

Apraxic adults may find connected speech particularly challenging because their difficulties in motor planning are exacerbated in this context (Galluzzi et al., 2015). Indeed, it has been noted that apraxic speakers typically exhibit a laborious, "less automated" speaking style than healthy speakers (Staiger et al., 2010). One of the first studies to explore apraxic errors
in spontaneous speech, conducted by Staiger and Ziegler (2008), found that the proportion of errors produced in the output of three adults with apraxia was affected by frequency, so that more errors were observed in low- than in high-frequency syllables. Beyond this, and especially with respect to the kinds of CSPs mentioned above, the findings are rather difficult to interpret, as the authors compared the apraxic productions to the canonical forms of words, that is, the forms produced in isolation, which suggests that appropriate phonetic reductions in connected speech were analyzed in the same way as actual segmental errors. This was addressed in a follow-up study by Staiger et al. (2010), who examined German phonetic reduction processes in one man with apraxia (RK), before and after a six-month block of speech and language therapy, and compared his output with that of two neurologically healthy adults. RK produced fewer reductions after therapy, but across both testing time points he produced fewer than one third of the number of reductions of the controls, the authors describing his output generally as "hyperspeech." Possible accounts for RK's output presented by the authors include an inability to perform the gestural adaptations required to convey phonetic reduction. It is also possible that RK's "hyperspeech" was the result of intentional listener-focused adaptation in order to avoid any potential intelligibility issues. This echoes observations in other populations mentioned above that the realization of words in connected speech may be affected by discourse and interactional factors. However, a further explanation is possible: no information is provided on the intervention RK received, but it is possible that a focus on perfecting the isolated, "citation" form of words in earlier therapy had been preserved in RK's connected speech production.
31.5.2.2 Dementia

Connected speech has relatively recently emerged as a domain of interest in research into Alzheimer's disease and other forms of dementia, because it is recognized that such a sample provides information about processing at several linguistic levels, including the phonological. One feature often identified in primary progressive aphasia (PPA) is the frequency of phonological errors, and researchers have suggested that connected speech may be useful in characterizing specific production features of different PPA subtypes not evident in single-word production (Boschi et al., 2017), which may be valuable in improving diagnostic sensitivity for early detection and phenotyping. For example, in a study involving 70 speakers each diagnosed with one of the three variants of PPA, Wilson et al. (2010) found that those with the semantic variant made few errors in connected speech. More errors were made by speakers in the other two groups, which, though difficult to distinguish in other contexts, were found to differ here: the logopenic variant group were described as making more phonological errors ("paraphasias"), while the non-fluent group made more motor errors ("distortions"). Dalton et al. (2018) provide a more detailed description of phonological behavior in connected speech in the three groups. Again, the non-fluent variant was distinct from the other variants in having the most errors across all locations within a word and all word classes. It is likely that the prevalence of errors reflects the broad articulatory challenges of spontaneous speech; however, the extent to which word junctures present particularly marked challenges for these speakers remains unknown.
31.6 Intervention

While numerous studies of intervention for phonological and articulatory difficulties have been carried out, very few have incorporated connected speech systematically in the intervention. However, there is evidence to suggest that this may be a worthwhile approach (Pascoe et al., 2005), and there are a number of possible avenues for work in this area:
31.6.1 Phonological Therapy

Pascoe et al. (2005) report an intervention conducted with Katy, aged 6;5, who had severe and persisting speech difficulties. The predominant patterns of phonological simplification found in Katy's connected speech were cluster reduction (100%), final consonant deletion (96%) and pre-vocalic voicing (40%), with all three primarily affecting the boundaries of words. Tailor-made intervention focused on Katy's final consonant deletion pattern, and its aim was for Katy to use final consonants in CVC words embedded in sentences. Intervention which focused on Katy's production of the final consonants of words produced in isolation was effective at that level, but this was not matched by an improvement in her production of the same consonants when the words were embedded in a sentence. A further phase of intervention, focusing on connected speech, was successful in getting her to use the targeted forms in sentences. In the therapy, a graded hierarchy of sentences was devised around target words, moving from a facilitatory context to a more demanding one. For example, in the case of the target word rope the facilitatory sentence used as a starting point was this rope pulled the car, where the onset consonant of the following word pulled is the same as the coda consonant of the target word rope. The rationale was that children using final consonant deletion should be able to produce the initial [p] in pulled even if they omit the final [p] in rope. At the next level the child was required to produce a sentence such as there's rope on the road, with the target rope being followed by a vowel. Finally, sentences such as this rope got frayed were introduced, requiring a change of place of articulation (and voicing) between the final [p] in rope and the following consonant [ɡ]. A similar approach was used by Moreiras (2015), who worked on junctures involving word-final /z/ with Gerrard, a child of nearly nine years. In both cases, change made after intervention was not limited to the words in the treatment lists: it extended to untreated words in matched control lists, suggesting that generalized change had been brought about. Gains made with connected speech were maintained in the long term, after a period of no intervention. Thus, it seems that improvement in connected speech can be brought about by specifically addressing connected speech in a carefully structured way.
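The graded hierarchy described by Pascoe et al. (2005) lends itself to a simple classifier that orders carrier phrases by the demand of the juncture following the target coda. The sketch below encodes the three levels exactly as described above; the level numbering and helper name are our own illustrative labels, not part of the published therapy materials.

VOWELS = set("aeiouɑɒʌəɜɪʊɛæ")

def juncture_level(target_coda, next_word):
    # 1 = facilitatory: the following word begins with the same
    #     consonant as the target coda (rope pulled);
    # 2 = the following word begins with a vowel (rope on);
    # 3 = the following consonant differs in place and/or voicing
    #     (rope got).
    first = next_word[0]
    if first == target_coda:
        return 1
    if first in VOWELS:
        return 2
    return 3

# Grade carrier contexts for the target word rope (coda /p/):
for following in ["pulled", "on", "got"]:
    print("rope", following, "-> level", juncture_level("p", following))

In practice one would work through sentences in ascending level order, as in the therapy described above.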
31.6.2 Frequency-based Approach
As we have described earlier in the chapter, frequency appears to be an important factor in the occurrence of CSPs, with more reduction occurring in words and phrases that are more frequent, in both child and adult speech. The observation that formulaic utterances in particular are characterized by the phonological cohesion afforded by the use of CSPs mirrors observations in aphasic language, where otherwise non-fluent output may include some well-formed, phonologically cohesive fixed expressions, such as I like it and I don't know (van Lancker Sidtis, 2012; see also Wray, this volume). It may therefore be effective to draw on the growing body of research exploring usage-based approaches to language intervention. For example, Bruns et al. (2021) describe an intervention study in which participants with non-fluent aphasia were trained to extend familiar high-frequency fixed constructions (e.g. I like it) by inserting a range of different lexical items into open slots in the sentences (e.g. you like cake) in order to help them generate novel expressions. Treatment of connected speech phonology might involve the grading of levels of frequency, moving from more frequent and familiar word combinations to phrases which occur less frequently.
31.6.3 Explicit Instruction
A further avenue is to draw on work with L2 learners, who have also been shown to find the appropriate use of CSPs challenging, both in struggling to compensate for processes in speech comprehension (Darcy et al., 2007) and in failing to incorporate relevant processes in their own output (Wong et al., 2021), both of which are important obstacles preventing speakers from achieving native-like proficiency. These observations have led practitioners in second language learning to explore ways of heightening speakers' awareness of the need to link word pairs and how this might be achieved. Methods which have shown promising results with L2 speakers include simple explicit instruction on the rules of connected speech (Melenca, 2001), providing opportunities for speakers to "shadow" and repeat relevant environments produced with close juncture (Shiki et al., 2010), and making use of visual analysis of the speech signal to provide audio-visual feedback on productions where CSPs can be easily visualized (e.g., /j, w, r/ liaison). Although not suitable for all speakers, these approaches may prove fruitful in some cases, perhaps especially with adult speakers in whom hyperarticulation has been observed.
31.7 Implications
Although connected speech is increasingly included in analyses of disordered speech, very few studies explicitly consider the particular challenges presented by word boundaries, and there is still work to be done in identifying the optimal assessment and intervention methods. We should beware of limiting our clinical speech assessments to single words on the grounds of ease and efficiency of analysis: careful and detailed analysis of connected speech, although undoubtedly challenging, may reveal more clinically useful and significant information about a speaker's output difficulties.
31.7.1 Implications for Assessment
At the connected speech level, there is a need to establish measures which both accurately capture a child's abilities and difficulties and allow clinicians to observe change over time or after intervention. As with the methods of elicitation, where a combination of informal and more structured tasks may be optimal, it is possible that a blend of quantitative and qualitative measures is most useful (e.g. Speake et al., 2011). The former provide data which can be useful for inclusion in formal reports evaluating the effectiveness of intervention; we have attempted to demonstrate in this chapter why the latter are also necessary. Automated analysis of speech samples may provide some assistance, and is increasingly used in large-scale studies (e.g. Barrett et al., 2020; Wren et al., 2013, 2021). Tools include freely available software such as the PROPH+ analysis provided by Computerized Profiling (Long et al., 2006) and Phon from the CHILDES suite of analysis tools (Hedlund & Rose, 2020). PROPH+, for example, requires the researcher or therapist to submit a phonetic transcription; the program then performs the analysis and returns many of the measures listed above (e.g. PCC, pMLU). Use of these tools in connected speech is limited in the same way as the measures outlined above: the default target is usually the citation form of a word, which is often not appropriate in connected speech (e.g. brown as /braʊn/ regardless of phonological context), and the user is required to provide alternate forms manually. Future work might explore similar systems which systematically allow for the consequences of CSPs. Additionally, at the single word level, assessment targets may be informed by connected speech phonology. As Newton (2012, p. 725) observes, "since the usual context for the production of a word is in connected speech rather than in isolation, the citation form of a word may be an unrealistic – and arguably unnecessary – expectation."
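One way such a system might systematically allow for CSPs is to score each production against a set of context-conditioned target variants rather than against the citation form alone. The following Python sketch is an assumption about how that could look, not a description of PROPH+ or Phon; the word list and variant forms are invented for illustration.

# Each word maps to the target forms acceptable in its connected-speech
# context: the citation form plus phonologically licit CSP variants.
TARGETS = {
    "brown": {"braʊn", "braʊŋ"},  # /n/ may assimilate before a velar ("brown cow")
    "hand":  {"hænd", "hæn"},     # final /d/ may elide before a consonant
}

def is_correct(word, produced):
    """True if the production matches any acceptable variant of the word."""
    return produced in TARGETS.get(word, set())

assert is_correct("brown", "braʊŋ")   # assimilated form scored as correct
assert not is_correct("hand", "ænd")  # initial-consonant deletion remains an error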
31.7.2 Implications for Intervention
For some individuals with speech disorders, improvements made with individual sounds and words may transfer across linguistic levels to multi-word utterances. For others, generalization of gains from words to short phrases to conversation is more challenging without explicit treatment at that level (e.g. Katy, reported in Pascoe et al., 2005). For these individuals, therapy might usefully target connected speech, and we have provided some suggestions of how this might happen, though more work is required in this area to identify optimally effective evidence-based treatments. Even where intervention does not explicitly target connected speech, a knowledge of connected speech phenomena might inform therapeutic goals: as with assessment, the citation form may not be a realistic, or indeed helpful, target in intervention. In this chapter we have aimed to provide an overview of clinical research which addresses connected speech and guidance for how difficulties might be assessed and treated. As we have also observed, however, there is much work still to be done in this area, not least in languages other than English, which remain poorly represented. An appreciation of the phonological demands of connected speech could have important implications for the ways we assess and manage impaired speech in the future.
NOTE
1. As told by comedian Tim Vine.
REFERENCES
Barnes, E., Roberts, J., Long, S. H., Martin, G. E., Berni, M. C., Mandulak, K. C., & Sideris, J. (2009). Phonological accuracy and intelligibility in connected speech of boys with fragile X syndrome or Down syndrome. Journal of Speech, Language, and Hearing Research, 52(4), 1048–1061. Barrett, C., McCabe, P., Masso, S., & Preston, J. (2020). Protocol for the connected speech transcription of children with speech disorders: An example from childhood apraxia of speech. Folia Phoniatrica et Logopaedica, 72(2), 152–166. Barry, W. J., & Andreeva, B. (2001). Cross-language similarities and differences in spontaneous speech patterns. Journal of the International Phonetic Association, 31(1), 51–66. Beers, M., Rodenburg-Van Wee, M., & Gerrits, E. (2019). Whole-word measures and the speech production of typically developing Dutch children. Clinical Linguistics & Phonetics, 33(12), 1149–1164. Bolozky, S. (2019). The phonology of connected speech in Israeli Hebrew. Brill's Journal of Afroasiatic Languages and Linguistics, 11(1), 201–225.
Boschi, V., Catricalà, E., Consonni, M., Chesi, C., Moro, A., & Cappa, S. F. (2017). Connected speech in neurodegenerative language disorders: A review. Frontiers in Psychology, 8, Article 269. https://www.frontiersin.org/articles/10.3389/fpsyg.2017.00269/full Brown, G. (2016). Listening to spoken English. Taylor & Francis Group. Bruns, C., Beeke, S., Zimmerer, V. C., Bruce, C., & Varley, R. A. (2021). Training flexibility in fixed expressions in non‐fluent aphasia: A case series report. International Journal of Language & Communication Disorders, 56(5), 1009–1025. Bryan, S. C. (2012). Language development: A case study [Doctoral dissertation]. University of Sheffield. Bush, N. (2001). Frequency effects and word-boundary palatalization in English. In J. Bybee & P. Hopper (Eds.), Frequency and the emergence of linguistic structure (pp. 255–280). John Benjamins. Bybee, J. (2000). Lexicalization of sound change and alternating environments. In M. Broe & J. Pierrehumbert (Eds.), Laboratory phonology V: Acquisition and the lexicon (pp. 250–268). Cambridge University Press.
Bybee, J. (2002). Phonological evidence for exemplar storage of multiword sequences. Studies in Second Language Acquisition, 24(2), 215–221. Cruttenden, A. (2014). Gimson's pronunciation of English (8th ed.). Routledge. Dalton, S. G. H., Shultz, C., Henry, M. L., Hillis, A. E., & Richardson, J. D. (2018). Describing phonological paraphasias in three variants of primary progressive aphasia. American Journal of Speech-Language Pathology, 27(1S), 336–349. Darcy, I., Peperkamp, S., & Dupoux, E. (2007). Bilinguals play by the rules: Perceptual compensation for assimilation in late L2 learners. In J. Cole & J. I. Hualde (Eds.), Laboratory phonology 9 (pp. 411–442). Mouton de Gruyter. Deveney, S. L., & Scheffel, L. (2019). Connected speech of two-year-olds: Test-retest reliability for assessment of phonetic inventory and word shape analysis. Clinical Archives of Communication Disorders, 4(3), 136–146. Dodd, B., Hua, Z., Crosbie, S., Holm, A., & Ozanne, A. (2006). Diagnostic evaluation of articulation and phonology. Psychological Corporation. Engstrand, O., & Krull, D. (2001). Simplification of phonotactic structures in unscripted Swedish. Journal of the International Phonetic Association, 31(1), 41–50. Farnetani, E., & Recasens, D. (2010). Coarticulation and connected speech processes. In W. J. Hardcastle, J. Laver, & F. Gibbon (Eds.), The handbook of phonetic sciences (2nd ed., pp. 316–352). Wiley-Blackwell. Fowler, C. A., & Housum, J. (1987). Talkers' signaling of "new" and "old" words in speech and listeners' perception and use of the distinction. Journal of Memory and Language, 26(5), 489–504. Galluzzi, C., Bureca, I., Guariglia, C., & Romani, C. (2015). Phonological simplifications, apraxia of speech and the interaction between phonological and phonetic processing. Neuropsychologia, 71, 64–83. https://www.sciencedirect.com/science/article/pii/S0028393215001128 Goodglass, H., Kaplan, E., & Barresi, B. (2001). Boston diagnostic aphasia examination (3rd ed.). Lippincott Williams & Wilkins. Hedlund, G., & Rose, Y. (2020). Phon 3.1. [Computer software]. https://www.phon.ca/phon-manual Howard, S. (2004). Connected speech processes in developmental speech impairment: Observations from an electropalatographic perspective. Clinical Linguistics and Phonetics, 18(6–8), 405–417. Howard, S. (2007). The interplay between articulation and prosody in children with
impaired speech: Observations from electropalatographic and perceptual analysis. Advances in Speech-Language Pathology, 9(1), 20–35. Howard, S. (2013). A phonetic investigation of single word versus connected speech production in children with persisting speech difficulties relating to cleft palate. Cleft Palate-Craniofacial Journal, 50(2), 207–223. Howard, S., Methley, M., & Perkins, M. (2008). Emergence of word juncture in a typically developing child: Evidence of multiple interactions in speech and language development [Poster presentation]. The 12th Annual Symposium of the International Clinical Phonetics and Linguistics Association, Istanbul, Turkey. Ingram, D., & Ingram, K. (2001). A whole-word approach to phonological analysis and intervention. Language, Speech & Hearing Services in Schools, 32(4), 271–283. Klein, H. B., & Liu-Shea, M. (2009). Between-word simplification patterns in the continuous speech of children with speech sound disorders. Language, Speech, and Hearing Services in Schools, 40(1), 17–30. Kohler, K. J. (2000). Investigating unscripted speech: Implications for phonetics and phonology. Phonetica, 57(2–4), 85–94. Long, S. H., Fey, M. E., & Channell, R. W. (2006). Computerized profiling (MS-DOS version 9.7.0). [Computer software]. https://sites.google.com/view/computerizedprofiling Melenca, M. (2001). Teaching connected speech rules to Japanese speakers of English so as to avoid a staccato speech rhythm [Master's dissertation]. Concordia University. Mitterer, H., & Ernestus, M. (2006). Listeners recover /t/s that speakers reduce: Evidence from /t/-lenition in Dutch. Journal of Phonetics, 34(1), 73–103. Mitterer, H., Kim, S., & Cho, T. (2013). Compensation for complete assimilation in speech perception: The case of Korean labial-to-velar assimilation. Journal of Memory and Language, 69(1), 59–83. Moreiras, C. (2015). Morphophonology in intervention for connected speech: A case study of a 9-year-old child with specific speech and language impairment [Master's dissertation]. University of Sheffield. Newbold, E. J., Stackhouse, J., & Wells, B. (2013). Tracking change in children with severe and persisting speech difficulties. Clinical Linguistics & Phonetics, 27(6–7), 521–539. Newton, C. (1999). Connected speech processes in phonological development [Doctoral dissertation]. University College London.
Newton, C. (2012). Between-word processes in children with speech difficulties: Insights from a usage-based approach to phonology. Clinical Linguistics & Phonetics, 26(8), 712–727. Newton, C., & Wells, B. (1999). The development of between-word processes in the connected speech of children aged between 3 and 7. In B. Maassen & P. Groenen (Eds.), Pathologies of speech and language: Advances in clinical linguistics and phonetics (pp. 67–75). Whurr. Newton, C., & Wells, B. (2002). Between-word junctures in early multi-word speech. Journal of Child Language, 29(2), 275–299. Nicolaidis, K. (2001). An electropalatographic study of Greek spontaneous speech. Journal of the International Phonetic Association, 31(1), 67–85. Oladipupo, R. O. (2014). Aspects of connected speech processes in Nigerian English. Sage Open, 4(4), 1–6. Pascoe, M., Stackhouse, J., & Wells, B. (2005). Phonological therapy within a psycholinguistic framework: Promoting change in a child with persisting speech difficulties. International Journal of Language and Communication Disorders, 40(2), 189–220. Rechziegel, A. (2001). Consonants in contact: On assimilation and cross-language contrast. IFA Proceedings, 24, 103–115. Shiki, O., Mori, Y., Kadota, S., & Yoshida, S. (2010). Exploring differences between shadowing and repeating practices: An analysis of reproduction rate and types of reproduced words. Annual Review of English Language Education in Japan, 21, 81–90. https://www.jstage.jst.go.jp/article/arele/21/0/21_KJ00007108612/_article Shockey, L. (2003). Sound patterns of spoken English. Blackwell. Speake, J., Howard, S., & Vance, M. (2011). Intelligibility in children with persisting speech disorders: A case study. Journal of Interactional Research in Communication Disorders, 2(1), 131–151. Sprigg, R. K. (1957). Junction in spoken Burmese. In Studies in linguistic analysis: Special volume of the Philological Society, Oxford (pp. 104–138). Blackwell. Stackhouse, J., Vance, M., Pascoe, M., & Wells, B. (2007). Compendium of auditory and speech tasks: Children's speech and literacy difficulties, book 4. John Wiley & Sons. Staiger, A., Rüttenauer, A., & Ziegler, W. (2010). The economy of fluent speaking: Phrase-level reduction in a patient with pure apraxia of speech. Language and Cognitive Processes, 25(4), 483–507. Staiger, A., & Ziegler, W. (2008). Syllable frequency and syllable structure in the spontaneous
speech production of patients with apraxia of speech. Aphasiology, 22(11), 1201–1215. Thompson, J., & Howard, S. (2007). Word juncture behaviours in young children's spontaneous speech production. Clinical Linguistics & Phonetics, 21(11–12), 895–899. van Lancker Sidtis, D. (2012). Formulaic language and language disorders. Annual Review of Applied Linguistics, 32, 62–80. https://www.cambridge.org/core/journals/annual-review-of-applied-linguistics/article/abs/formulaic-language-and-language-disorders/9475D434F481429F386971377F91B993 Watson, M. M., & Terrell, P. (2012). Longitudinal changes in phonological whole-word measures in 2-year-olds. International Journal of Speech-Language Pathology, 14(4), 351–362. Wells, B. (1994). Junction in developmental speech disorder: A case study. Clinical Linguistics & Phonetics, 8(1), 1–25. Wells, B., & Stackhouse, J. (1997). Connected speech problems and developmental dyslexia [Paper presentation]. 6th ICPLA Conference, Nijmegen, The Netherlands. Wilson, S. M., Henry, M. L., Besbris, M., Ogar, J. M., Dronkers, N. F., Jarrold, W., Miller, B. L., & Gorno-Tempini, M. L. (2010). Connected speech production in three variants of primary progressive aphasia. Brain: A Journal of Neurology, 133(7), 2069–2088. Wong, S. W. L., Dealey, J., Leung, V. W. H., & Mok, P. P. K. (2021). Production of English connected speech processes: An assessment of Cantonese ESL learners' difficulties obtaining native-like speech. The Language Learning Journal, 49(5), 581–596. Wren, Y., McLeod, S., White, P., Miller, L. L., & Roulstone, S. (2013). Speech characteristics of 8-year-old children: Findings from a prospective population study. Journal of Communication Disorders, 46(1), 53–69. Wren, Y., Titterington, J., & White, P. (2021). How many words make a sample? Determining the minimum number of word tokens needed in connected speech samples for child speech assessment. Clinical Linguistics and Phonetics, 35(8), 761–778. Yeh, L. L., & Liu, C. C. (2021). Comparing the informativeness of single-word samples and connected speech samples in assessing speech sound disorders. Journal of Speech, Language, and Hearing Research, 64(11), 4071–4084. Yuen, I., Cox, F., & Demuth, K. (2017). Planning of hiatus-breaking inserted /ɹ/ in the speech of Australian English-speaking children. Journal of Speech, Language, and Hearing Research, 60(4), 826–835.
32 Clinical Phonology and Phonological Assessment
BARBARA DODD, ALISON HOLM AND SHARON CROSBIE
32.1 Introduction
Assessment for identification of speech sound disorders (SSDs) has a descriptive purpose rather than an explanatory role. To identify children with SSD and to diagnose their type of SSD, their phonetic and phonological abilities are compared with normative data. Interpretation of data depends on clinicians' theoretical views of SSD (see Waring & Knight, 2013, for SSD classification by theoretical approach). Speech-language pathologists (SLPs) use the data to hypothesize underlying deficits for a child's linguistic profile, designing intervention (i.e., goals and methods) to improve intelligibility. The ultimate test of this clinical process is response to intervention, measured by changes in functional communication. This chapter outlines the beginnings of clinical phonology before evaluating its role in assessment for identification, differential diagnosis, and intervention for children with suspected SSD. Early texts assumed that speech is a complex motor skill and that any difficulties affected articulation, fluency, or voice (Van Riper, 1963). However, Jakobson (1968, English translation of his 1941 monograph) raised awareness of the role of phonology in speech development and disorders. He distinguished between speech sound articulation as the surface level of a linguistic system, and the phonological system that specifies the distribution and organization of the speech sounds and syllable shapes in each language. Evidence for this clinically crucial distinction between articulation and phonology was provided by typically developing children (Smith, 1973) and children with SSD (Morley, 1972). Several people had a major influence on clinical phonology in the 1970s and 80s:
• Ingram (1976) reconceptualized the nature of pediatric speech impairment. He shifted the focus from individual speech sounds to patterns of errors affecting more than one sound (e.g., cluster reduction, final consonant deletion). He made direct links between theories in developmental psychology and linguistic abilities. He recognized the relevance of psycholinguistics for speech-language pathology and provided examples of how analyses of impaired speech informed theory in linguistics and psychology.
• Weiner (1981) reported evidence that meaningful minimal contrasts intervention based on the principles of clinical phonology was efficient and effective. For example, two four-year-olds received only six intervention sessions focused on error patterns. The therapy
reduced fronting, stopping and final consonant deletion errors and generalized to untaught targets.
• Grunwell (1982) examined how different linguistic analyses of individuals contributed to assessment, diagnosis, and treatment. She developed a linguistic framework as well as a manual for assessment practice.
• Locke (1983) provided a theoretical context for future research on children's speech difficulties. This context recognized the shift to describing children's patterns of speech errors using linguistic analyses rather than identifying individual children's deficits in input, cognitive and motor domains. This shift had led to a change in labeling, with phonological disorder being used instead of articulation disorder. However, Locke (1983, p. 339) lamented the missed opportunity to make a "potentially useful clinical distinction" between the two. He called for a generic label for difficulties affecting the sounds of language that emerge during children's early years that would promote research into differential diagnosis of speech disorders. Speech-sound disorders (SSD) emerged as that term.
However, Locke (1983) also proposed that clinical phonology must explain, not just describe, SSD. He advocated using scientific methodology to better understand phonological processing in both typical and disordered speech development. He argued that clinical phonology should account for why a particular child exhibits specific speech behaviors, to enhance theory and so guide clinical intervention. Today we know more about typical and atypical speech development in diverse language learning contexts. Measures used to describe speech behavior are more comprehensive. There is growing awareness of the heterogeneity of speech behavior of children between and within recognized subgroups of SSD, leading to evaluation of different classification systems for speech disorders (Waring & Knight, 2013). The need for differential diagnosis prompted investigation of the ability profiles associated with distinct types of SSD. The identified strengths and weaknesses of homogeneous groups of children with SSD resulted in better understanding of the need for specific types of intervention. Assessment data underpin SSD theory and inform effective clinical management decisions. Three assessment issues are crucial for research and practice: criteria for identifying children with SSD; differentiation of SSD subgroups; and intervention informed by assessment data.
32.2 Identification of Children with Speech Sound Disorder
Typical phonological development may not be complete until the early school years, making earlier identification of SSD complicated. Accurate case identification is crucial given limited clinical intervention resources. Assessment tools require valid and reliable normative data on various speech measures.
32.2.1 The Speech Sample
The speech sample should evaluate speech sound production in isolation and in syllables to detect impaired articulation. Stimulability testing is required when a phone is not elicited spontaneously or in imitation. Assessment of single words can systematically evaluate the target language's phonemes across the range of syllable and word structures. Describing the effects of one sound on another within syllables (e.g., consonant harmony) and syllable/word structure
constraints (e.g., only nasals occur word finally) requires the analysis of all data, not just specific phonemes in words. Although not usually sampled in standardized tests, consistency of word production is a marker for SSD and must be measured. Despite difficulties quantifying connected speech, a child's intelligibility can be rated informally, or more formally in narrative retell, sentence imitation or picture description tasks.
32.2.2 Normative Data
Normative data for many measures are available for English-speaking children, although few standardized assessments provide comprehensive speech sampling. Normative data are essential because all children make developmental errors from speech onset. These errors change over time until children are aged around seven years, so most speech norms are presented in six-month age bands. Normative data describe what is typical for a specific population at a particular time. While normative data are not explanatory, they allow comparison of the performance of a child with suspected speech disorder with what is typical for a population of the same age and similar language learning context. Normative data are also crucial for research describing speech difficulties associated with pervasive impairments (e.g., sensory, cognitive, motor, genetic syndromes) and for children from impoverished language learning environments. However, the cause of most children's SSD is unknown (McLeod & Masso, 2019). The value of normative data relies on the normative population being representative of the children to be assessed. Normative populations should not exclude children likely to score poorly (e.g., previous intervention for communication disorder), since that leads to increased false-negative identifications of disorder. The population must be large and demographically stratified. Language learning context is critical. Research shows that children learning different languages follow different phonological acquisition trajectories. Likewise, children learning more than one language make different types of errors compared to monolinguals of either language (Zhu Hua & Dodd, 2006). Normative assessment tasks require clear description and theoretical justification. Storkel (2019) cautions against relying on a single standardized assessment measure to identify SSD. She argues for data from varied sources to converge on a diagnosis that guides intervention. Clinicians can interpret case history information and reports from other specialists (education, audiology, medicine, psychology), as well as testing abilities underpinning speech and language, including speech perception and oro-motor abilities. Nevertheless, prevalence studies estimate that 12% of all kindergarten children present with SSD, with or without concomitant language disorder (McLeod & Masso, 2019), making speech the most common reason for caregivers to seek speech pathology assessment. Measures describing children's speech abilities are now evaluated: percent consonants correct, phonetic repertoires, phonemic repertoires, phonological patterns, and consistency of word production. The first three measures focus on speech sounds.
32.2.3 Percent Consonants Correct
Percent consonants correct (PCC) is a common quantitative measure of severity. It is calculated by dividing the number of consonants pronounced correctly in any speech sample by the total number of consonants attempted and multiplying by 100 to gain a percentage. Shriberg and Kwiatkowski (1982) operationalized PCC to determine severity on a continuum: mild > 85%; moderate 50–84%; severe < 50%. PCC is a useful clinical measure because it allows comparison of a child over time to estimate change after intervention or an untreated review period of "watch and wait." PCC can also inform therapy prioritization decisions for children of different ages and SSD types.
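As a worked illustration of the calculation and of the severity bands just given (a minimal Python sketch; the transcription pairs are invented):

def pcc(consonants):
    """Percent consonants correct: (correct / attempted) x 100.
    `consonants` is a sequence of (target, realized) transcription pairs."""
    correct = sum(1 for target, realized in consonants if target == realized)
    return 100 * correct / len(consonants)

def severity_band(score):
    """Shriberg and Kwiatkowski's (1982) continuum as reported above."""
    if score > 85:
        return "mild"
    if score >= 50:
        return "moderate"
    return "severe"

# e.g. 38 of 50 attempted consonants produced correctly:
sample = [("t", "t")] * 38 + [("s", "t")] * 12
print(pcc(sample), severity_band(pcc(sample)))  # 76.0 moderate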
In research studies, PCC provided normative data on consonant errors in speech development from two years of age. Most clinical research studies report PCC scores to establish (multiple) baseline performance before and after intervention. Knowing the severity of participants' speech difficulties also allows comparison of studies examining the nature and treatment of SSD. PCC has several important limitations:
1. Researchers report PCC on a range of different types of speech samples: imitation of words (McLeod & Masso, 2019); picture naming (Dodd et al., 2003); phonological process analysis (Weiner, 1981); connected speech (Farquharson et al., 2020). However, there has been no systematic examination of how the measure is affected by sample size or type of speech sample. Further, the levels of severity are arbitrary and lack theoretical or clinical justification (Garrett & Moran, 1992).
2. Consonant production is the only aspect of speech measured. Accuracy of vowel production is crucial for speech intelligibility in English because vowels are highly salient in marking differences in meaning between words (e.g., pat, part, pit, pot; Zhu Hua & Dodd, 2006). PCC does not identify children making vowel errors (Grunwell, 1997).
3. Up until school age, all children make speech errors that are typical for their age (e.g., cluster reduction until 4;0; gliding until 6;0). PCC does not discriminate between age-appropriate and delayed developmental errors (e.g., fricative stopping after 3;5). Nor does it distinguish between errors of omission, phonetic distortion and substitution of one sound for another.
4. Around 40% of children with SSD (Broomfield & Dodd, 2004) make errors that are atypical of normative data at any age. Their prognosis is less positive than for children with delayed development (Morgan et al., 2017). Although they always make more errors than age-matched TD children, and sometimes more than children with delayed development, severity is an unreliable predictive indicator (Garrett & Moran, 1992).
5. PCC does not provide useful information for making clinical decisions about what therapy approach should be used or what to target in therapy.
6. There are few normative studies of PCC (Dodd et al., 2003).
32.2.4 Phonetic Repertoires
Phonetic repertoires list all the speech sounds a child produces spontaneously in a speech sample, plus those elicited in isolation and imitation. This independent analysis describes the speech sounds they can physically produce, without reference to any adult target. Some assessments elicit consonant-vowel-consonant (CVC) structures to examine the presence of a sound in word-initial and word-final position. A few have extension testing to determine stimulability of any phone not elicited. When an articulation impairment is identified, further assessment of oral structure and oro-motor function is indicated (e.g., to rule out dysarthria). However, the most commonly distorted sounds, /s, z, ɹ/, reflect mislearning of the articulatory motor pattern.
32.2.5 Phonemic Repertoires
Phonemic repertoires list all speech sounds that a child uses correctly in a speech sample. While vowels and consonant clusters are sometimes included, most normative data focus on the age of consonant acquisition (at least one correct production) or mastery (75% or 90% correct productions), calculated from the number of opportunities
(Crowe & McLeod, 2020). Some assessments note the syllable position of production (e.g., Goldman-Fristoe Test of Articulation, GFTA-3; Goldman & Fristoe, 2015). The GFTA-3 uses targeted consonants in words to determine the age at which children begin to pronounce a sound accurately in specific word positions. The GFTA-3 provides normative data, for girls and boys in six-month age bands, on when 50%, 75%, and 90% of children in the normative sample correctly produce specific phonemes, at least once, when naming pictures or retelling a story. Types of errors are not distinguished (e.g., the SODA categories of substitution, omission, distortion and addition; Van Riper, 1963). Vowel errors may be marked on the scoresheet but do not count as errors. Any error affecting a consonant cluster (e.g., slide [aɪd], [laɪd], [saɪd], [swaɪd], [ɸaɪd]) would count as one error. The limitations associated with PCC are also true for phonemic repertoires and other assessments focusing solely on consonant production accuracy.
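The acquisition and mastery criteria described above reduce to simple tallies over opportunities. A minimal Python sketch, with the threshold as a parameter and invented counts:

def phoneme_status(correct, opportunities, mastery_threshold=0.9):
    """Classify a phoneme as acquired (at least one correct production)
    and/or mastered (proportion correct meets a threshold such as 75% or 90%)."""
    acquired = correct >= 1
    mastered = opportunities > 0 and correct / opportunities >= mastery_threshold
    return acquired, mastered

# e.g. /s/ correct in 3 of 10 opportunities: acquired but not mastered
print(phoneme_status(3, 10))  # (True, False)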
32.2.6 Error Patterns
Phonological patterns or processes are used to describe children's speech errors. Stampe (1973, p. 1) defined a phonological process as "a mental operation that applies in speech to substitute for a class of sounds or sound sequences presenting a common difficulty to the speech capacity." Ball (2016) argued that clinical phonology should differentiate innate natural processes and idiosyncratic rules. Making no assumptions about innateness or articulatory ability, "error patterns" seems an appropriate label for children's disordered speech. Error patterns describe a regularity in a child's speech that allows SLPs to describe how, and in what phonetic context, a phoneme or group of phonemes will be in error. For example, in "stopping," continuants are substituted by plosives (e.g., fish [pɪt]). More than 10% of the normative population aged 3;0–3;5 made ≥ 5 stopping errors naming 50 words, indicating that stopping errors are age-appropriate for a child aged 3;3 (Dodd et al., 2003). Data from case studies (e.g., Smith, 1973) and large cohorts (e.g., Dodd et al., 2003) document the age at which developing children typically suppress developmental phonological error patterns. SLPs compare the error patterns used by a child to normative data to identify whether they reflect typical development. For example, stopping errors at four years reflect delayed phonological development. Both Dodd (1995) and Grunwell (1997) argued for differential diagnosis of children with SSD: phonological delay (use of phonological error patterns typical of the speech of younger children) and phonological disorder (the use of atypical error patterns not observed in the speech of 10% of any age band of the normative sample). Two common atypical error patterns are "deletion of initial consonants," as in zebra [ɛbʌ], bridge [ɪd], cake [eɪt]; and "backing" of alveolar sounds, as in rain [weɪŋ], teeth [git]. One consequence of analyzing whole words is that SLPs learned that some children pronounce the same word differently.
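Once target and realized consonants are aligned, an error pattern such as stopping can be tallied mechanically. A minimal Python sketch, assuming pre-aligned (target, realized) pairs and illustrative segment classes rather than a full feature system:

FRICATIVES = set("fvszʃʒθð")  # illustrative continuant class
PLOSIVES = set("pbtdkɡ")

def stopping_count(pairs):
    """Number of target fricatives realized as plosives."""
    return sum(1 for target, realized in pairs
               if target in FRICATIVES and realized in PLOSIVES)

# e.g. fish -> [pɪt]: /f/ -> [p] and /ʃ/ -> [t] are both stopping errors
print(stopping_count([("f", "p"), ("ɪ", "ɪ"), ("ʃ", "t")]))  # 2

Against the normative figure cited above, five or more such errors over a 50-word naming sample would be age-appropriate at 3;0–3;5 but would indicate delay at four years.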
32.2.7 Consistency of Word Production
Consistency of word production is a crucial measure because inconsistency is a symptom of SSD (e.g., childhood apraxia of speech (CAS), Ball, 2016; Down syndrome, Dodd & Thompson, 2001). Another group with phonological disorder comprises children making inconsistent speech errors without other symptoms of CAS (Crosbie et al., 2020; McNeill et al., 2022). For example, when asked to name pictures three times within the same assessment session, these children say the same word differently (e.g., slide [faɪd, haɪd, tɹaɪ]; bridge [dɹɪdz, wɪdz, fɹɪdʒ]; fish [sɪs, tɪs, ʃɪʃ]; helicopter [hɛ:i:kɒjɜ, hɛli:kɔɪjə, hɛkɒjɜ]). These children's speech errors do not reflect predictable error pattern use.
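Combined with the criterion given below for inconsistent phonological disorder (inconsistent production of ≥ 40% of 25 words over three naming trials; Dodd, 1995), consistency reduces to a simple proportion. A minimal Python sketch with invented trial data:

def inconsistency_score(trials):
    """Percentage of words produced in more than one distinct form.
    `trials` maps each word to its three elicited transcriptions."""
    variable = sum(1 for productions in trials.values()
                   if len(set(productions)) > 1)
    return 100 * variable / len(trials)

trials = {
    "slide": ["faɪd", "haɪd", "tɹaɪ"],  # three different forms: variable
    "fish":  ["sɪs", "tɪs", "ʃɪʃ"],     # variable
    "ball":  ["bɔl", "bɔl", "bɔl"],     # stable
}
print(inconsistency_score(trials))  # ~66.7: above the 40% criterion

In practice the full 25-word list would be used; the three-word dictionary here only illustrates the arithmetic.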
32.3 Classification of Distinct Groups of Children with SSD
There is consensus that children with SSD are a heterogeneous population. They differ in severity, etiology, type of errors made, consistency, chance of spontaneous resolution, profile of associated abilities and response to therapy. There is a clinical need for a typology of developmental speech disorders. Waring and Knight's (2013) review evaluated three dominant classification criteria: etiology (e.g., Shriberg et al., 2010); deficits in psycholinguistic processing (e.g., Stackhouse & Wells, 1997); and speech symptomatology (Grunwell, 1997). Dodd's (1995) approach described psycholinguistic profiles for four subgroups with distinct profiles of speech errors: (i) articulation disorder (non-age-appropriate phone repertoire); (ii) delayed phonological development (sole use of non-age-appropriate error patterns, typical of younger children); (iii) consistent phonological disorder (use of at least one atypical error pattern); and (iv) inconsistent phonological disorder (inconsistent production of ≥ 40% of 25 single words in three naming trials). Data from children whose primary difficulty was SSD are used to evaluate the clinical usefulness of the four subgroups discriminated by phonetic and phonological speech errors. Articulation disorder is a much-misunderstood term because it was once the generic label for any child with a speech difficulty. However, only 10–20% of any sample of children referred for assessment of SSD are not stimulable to produce speech sounds appropriate for their age. Further confusion arises from articulation and phonological difficulties sometimes co-occurring. Dodd et al. (2018) examined the speech development of 93 children making non-age-appropriate speech errors over time:
• At four years, 79 children only made phonological errors (i.e., they were stimulable for all age-appropriate speech sounds that they pronounced incorrectly in words) and those errors reflected delayed and/or disordered error patterns. Thirteen children had an interdental lisp and delayed phonological errors. All speech errors made by one child were due to a lateral lisp.
• At seven years, the child with a lateral lisp remained unchanged, with all occurrences of /s, z, ʃ, ts/ being produced laterally. Eight children with an interdental lisp and delayed phonological errors had error-free speech, two only had an interdental lisp, two only had some delayed errors, and one had affrication of clusters as well as an interdental lisp. An odd finding was that eight children who had demonstrated accurate articulation of fricatives at four years had acquired interdental production of ≥ 50% of occurrences of /s, z/ by seven. Flipsen (2015) has also reported acquisition of SSD in older children.
While the child with a lateral lisp fits classic descriptions of articulation disorder, interdental lisps often resolved spontaneously when co-occurring with delayed phonological errors. These findings suggest the need to re-examine what constitutes functional articulation disorder. Two population studies of children with SSD reported the proportion of English-speaking children in each of the four diagnostic categories: articulation disorder 13% and 10%; delay 58% and 55%; consistent disorder 21% and 20%; and inconsistent disorder 9% and 15% (Broomfield & Dodd, 2004, N = 320, UK incidence study; Ttofari Eecen et al., 2019, N = 126, Early Language in Victoria (ELVS) community cohort study, SSD identified at four years).
Similar proportions are reported for children with SSD acquiring other languages (e.g., Zhu Hua & Dodd, 2006; Korean: Pi & Ha, 2020; Danish: Clausen & Fox-Boyer, 2022): most have delayed phonology, many fewer have consistent disorder, while the articulation and inconsistent subgroups are the smallest. These data indicate that a specific language's phonetic and phonological characteristics have a limited effect on the types of errors made.
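Dodd's (1995) four subgroups can be read as a decision procedure over the measures described in Section 32.2. The Python sketch below renders them that way; the ordering of checks and their mutual exclusivity are simplifying assumptions (articulation and phonological difficulties can co-occur, as noted above), and the parameter names are invented:

def classify(phones_age_appropriate, inconsistency_pct,
             atypical_patterns, delayed_patterns):
    """Map assessment measures to one of Dodd's (1995) subgroups."""
    if not phones_age_appropriate:
        return "articulation disorder"
    if inconsistency_pct >= 40:       # >= 40% of 25 words over three trials
        return "inconsistent phonological disorder"
    if atypical_patterns >= 1:
        return "consistent phonological disorder"
    if delayed_patterns >= 1:
        return "delayed phonological development"
    return "age-appropriate speech"

print(classify(True, 56, 0, 2))  # inconsistent phonological disorder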
Morgan et al. (2017) used ELVS data to determine the proportion of each subgroup who had resolved by seven years. While 40% of the whole group made persistent errors, those with disordered errors (consistent and inconsistent) "at age four years were twice as likely to have poor speech outcomes at age seven years compared with those with delayed errors" (p. 200). The literature describing the associated abilities of heterogeneous groups with SSD is vast. Many studies compare children with SSD and typically developing age-matched controls, usually on one type of measure. Most studies find that groups of children with SSD perform more poorly than controls and attribute speech difficulty to the ability domain of their task (e.g., auditory perceptual processing, Brosseau-Lapré & Schumaker, 2020; phonetic planning or programming, Barbier et al., 2020). Bird and Bishop (1992) argued that individual differences in task performance precluded drawing valid conclusions about causes of SSD. However, studies of homogeneous groups of children with phonological SSD provide more discriminatory evidence. Table 32.1 shows that the three phonological subgroups of SSD have very different profiles of associated abilities.
Table 32.1 Phonological subgroups of SSD: Performance on associated abilities, comparing the phonological delay, consistent disorder (CPD), and inconsistent disorder (IPD) subgroups (✓ indicates no statistical difference from a matched typically developing control group; ✘ indicates a statistically significant difference). The abilities assessed, with example sources, are:
Auditory perception and processing: auditory processing task (PSI*; Thyer & Dodd, 1996); understanding own speech** (Dodd et al., 1989); visual speech perception (Dodd et al., 2008)
Other language skills: receptive vocabulary (Dodd & McCormack, 1995); expressive vocabulary (McNeill et al., 2022)
Phonological awareness: syllable and onset-rime PA tasks (Harris et al., 2011; Holm et al., 2008; Leitão et al., 1997; Gillon, 2004); phonological legality (Dodd et al., 1989)
Literacy: reading comprehension (Leitão et al., 2004); non-word reading; spelling (Holm et al., 2008)
Executive function: rule abstraction; cognitive flexibility; short-term verbal memory; PhWM (Crosbie et al., 2009; Dodd, 2011; Waring et al., 2022)
Output processing: oro-motor and fine motor tasks (Bradford & Dodd, 1994, 1996); receptive and expressive non-word learning (Dodd & McCormack, 1995)
Phonetic planning and execution: vowel production# and consonant production (Harris et al., 2011; Dodd, 1995; Holm et al., 2008; Bradford & Dodd, 1996)
*PSI: Pediatric Speech Intelligibility Test (Jerger et al., 1980) assesses physiological function of auditory pathways; NT: not tested; **no control group appropriate: statistical differences between groups (Del: delayed; CPD: consistent disorder; IPD: inconsistent disorder); ^ unpublished data; # phonetic variability (greater than TD and other SSD controls in production of vowels perceived as correct, using acoustic instrumental formant analysis).
Children with phonological delay tend to perform at the bottom of the normal range on standardized assessments, often (but not always) performing no differently, statistically, from matched controls. Children with phonological disorder, who consistently use atypical error patterns, show phonological knowledge deficits that also explain their literacy difficulties. This deficit may reflect impaired executive functioning, affecting the cognitive-linguistic ability to abstract statistical regularities in their native phonological system, leading to poor cognitive flexibility. Children making inconsistent errors show a surprising pattern of
associated abilities. On most tasks they perform within normal limits, despite the severity of their speech disorder. Reading for comprehension did not differ from controls, while spelling was impaired even after their speech disorder was remediated (Holm et al., 2008). Their poor spelling and syllable counting plausibly reflect difficulty in phonological assembly for speech output, identified in non-word learning tasks. McNeill et al. (2022) reported longitudinal data from 39 children with IPD, initially aged 4;6–7;11, who were assessed five times over two years. Those with the highest inconsistency made the least progress. Intervention was not linked to positive changes in speech accuracy, and poor receptive vocabulary was associated with both speech accuracy and inconsistency (see Table 32.1).
32.4 Selection of Intervention
Research exploring the abilities associated with subgroups of SSD provides SLPs with crucial information about clinical management. The cogency of links between characteristics of disordered speech, profiles of strengths and weaknesses in associated abilities, and choice of a specific intervention is controversial. For example, McNeill et al.'s (2022) longitudinal study of children with inconsistent phonological disorder collected data from referring clinicians on the type of intervention and its outcomes. Over a two-year period, no positive outcomes were reported. The study describes the therapy as either consultative or direct, and suggests that "the type of therapy selected … [may not have been appropriate for the] participants' underlying impairment" (p. 2470). However, current systematic reviews of treatment trials for children with speech difficulties are of limited clinical usefulness. Their focus is on identifying the "best intervention" for the generic category of SSD, failing to understand that child-related factors determine response to treatment (Dodd, 2021). Research needs to evaluate the link between assessment, type of SSD, and outcomes of intervention practice. An initial step is to implement case studies evaluating intervention approaches for subgroups of speech disorder, providing a theoretical rationale, detailed treatment methodology, and outcome measures (see Table 32.2).
Table 32.2 Treatment for different types of SSD.
Articulation disorder: attributed to motoric mis-learning/poor oro-motor control
Therapy targets: motor-speech skills, phonetic plan for articulation of sounds
Techniques: cues to teach articulation: touch, ultrasound, EPG
Example sources: Van Riper (1963), Günther and Nieslony (2017)
Consistent phonological disorder, persistent phonological delay: attributed to impaired understanding of contrasts and constraints of the phonological system
Therapy target: phonological error patterns
Techniques: minimal pairs, maximal pairs or multiple oppositions; phonological awareness of speech sound segmentation matched with letter knowledge
Example sources: Weiner (1981), Crosbie and Holm (2017)
Inconsistent phonological disorder: attributed to impaired phonological planning/assembly of word templates (Levelt et al., 1999)
Therapy target: consistent production of a core vocabulary of 50–70 whole words
Techniques: elicit best word production; drill to consistency in isolation/carrier phrases
Example sources: Dodd and Iacono (1989), Crosbie et al. (2020).
The discipline of clinical linguistics rarely includes evidence from intervention to support theoretical argument. Clinical outcomes, however, provide the best evidence for theory. The following section illustrates how intervention research in SSD can inform theory and practice. It briefly describes two case studies that evaluated the effect of targeting an underlying deficit using detailed, theoretically motivated adaptations of a therapy approach.
32.4.1 A Toddler with Consistent Phonological Disorder
The first case study described the intervention for a toddler with consistent phonological disorder (Claessen et al., 2017). Normative studies describe the phonological errors toddlers typically make between 24 and 35 months, identifying children showing early signs of phonological disorder. At 25 months, "Claire" scored within normal limits on standardized tests of receptive and expressive language. In contrast, although within the low average range (Scaled Score (SS) = 8) for number of errors on the Toddler Phonology Test (McIntosh & Dodd, 2011), she made 10 atypical errors (backing and vowel errors). When reassessed at 36 months, Claire's PCC was below the normal range (SS = 6), she made 12 atypical errors, and she performed poorly on rule abstraction and cognitive flexibility measures. Therapy targeted rule abstraction and cognitive flexibility skills. Improvements in these abilities were expected to generalize to Claire's phonology, increasing her PCC and reducing atypical errors. Two 45-minute therapy sessions per week, for eight weeks at childcare, focused on non-linguistic play-based activities involving planning, sorting, sequencing, and shifting attention from one dimension to another (e.g., size, shape, color, types [animals]) and explicit discussion of rule-based language in narratives (e.g., marking tense, gendered pronouns). Speech production or patterns were never mentioned. Assessment immediately post therapy, and at three-month follow-up, indicated that Claire's speech was more accurate (PCC SS = 8). The improvement reflected a decrease in atypical and delayed phonological errors, as the number of age-appropriate errors did not change. Intervention for toddlers with emerging SSD is not often considered feasible, given their limited attention spans and language comprehension. However, the positive outcomes for therapy targeting rule abstraction and cognitive flexibility in play, as opposed to surface speech errors, provide initial evidence for an appropriate approach to intervention for very young children. Research might explore whether similar intervention in the preschool years might prevent persistent SSD and associated literacy difficulties.
32.4.2 A Bilingual Child with Inconsistent Phonological Disorder
The second case study described intervention for a bilingual child with inconsistent phonological disorder (Holm & Dodd, 1999). A bilingual boy, "HK," had unintelligible speech in both languages. HK was exposed to English from 3;0 years at childcare. At 4;6 years his English receptive vocabulary and comprehension scores were age appropriate for a typically developing monolingual speaker. On standardized assessments, HK's PCC was 53% in Punjabi and 46% in English; his inconsistency scores were 45% in Punjabi and 56% in English. His English phone inventory was missing /θ, ð, ʒ/; none were missing in Punjabi. This study investigated the effect of therapy delivered in English on both of HK's languages. The research summarized in Table 32.1 suggests that a post-lexical, phonological planning impairment underlies inconsistent sequencing of speech sounds. Core vocabulary therapy, focusing on phonological planning, should increase consistency and accuracy of speech production in HK's two languages, as research has shown generalization from treated to untreated words
(Crosbie et al., 2020). Targeting whole words, as opposed to individual speech sounds, enhances mapping from internal representation to spoken output. An English-speaking SLP provided 16 individual 30-minute core vocabulary therapy sessions over eight weeks. Sessions were alternately conducted in HK's home and school to allow liaison with his teacher and parents. Before intervention, HK, his parents and teacher contributed to a list of 50 words that were functionally "powerful" (e.g., people's names, favorite food, games, television, school activities). The SLP explained the principles of core vocabulary therapy to HK's parents and teacher, and how to monitor and give feedback on target words at home and school. HK learned to pronounce all core vocabulary items consistently. After a slow start, progress improved each week and consistency of production of untreated probe words emerged. At three months post therapy, HK's inconsistency had declined from 56% to 20% in English, and from 45% to 30% in Punjabi. Although therapy did not specifically target accuracy, HK's PCC scores rose from 46% to 68% in English, and 53% to 70% in Punjabi. This treatment case study indicated that therapy targeting a post-lexical deficit could remediate inconsistent phonological disorder in both of a bilingual child's languages, even when one of those languages was not a treatment target.
32.5 Conclusions
There is no single explanatory account of SSD. This generic term encompasses difficulties with different speech characteristics, etiologies, associated abilities, developmental trajectories, and responses to intervention. Nevertheless, linguistic theory can contribute insights to build research and practice frameworks that provide better clinical outcomes for children at risk because people don't understand what they say. However, as Ball (2016) concluded, there needs to be "a move away from theoretical descriptive linguistics toward a more functional and cognitive linguistics" (p. 205). From a speech-language pathology perspective, there are three theoretical issues that need to be addressed. First, clinical linguistic theories need to acknowledge the heterogeneity of SSD. Despite widespread consensus that SSD is a heterogeneous population, some researchers assert that one ability can account for most cases. For example, Namasivayam et al. (2020) argued that "differences in speech sound errors between the subtypes of SSD may in fact be differences in how these individuals develop strategies for coping with the challenges of being on the lower end of the speech motor skill continuum" (p. 17). Similarly, Brosseau-Lapré and Schumaker (2020) concluded that "most children with phonological SSD present with speech perception difficulties" (p. 3970). Development of theory depends on researchers' descriptions of participants' speech characteristics, as opposed to PCC scores from targeted sounds in a picture naming task. Minimum standards might require information about phone repertoire including phone stimulability, the proportion of non-developmental errors, and consistency of word production. These three measures would allow better evaluation of a study's findings. Second, linguistic theory should acknowledge the role of cognition in speech acquisition. Ball (2016) outlines how linguistic theory adapted concepts from cognitive psychology. For example, van de Weijer (2019) argued that "general cognitive mechanisms, rather than … stipulations of innateness" (p. 131) could account for the acquisition of phonological constraints in Optimality Theory. However, a more salient challenge for clinical linguistics arises from findings that young children's language acquisition depends on their cognitive ability to implicitly derive "patterns," "rules," or "constraints" from their language exposure. One study of 62 typically developing two-year-olds assessed auditory-visual speech perception, motor speech skill, and verbal and non-verbal rule abstraction in relation to their
phonological development (Dodd & McIntosh, 2010). While all three domains correlated with phonological accuracy in two-year-olds, rule abstraction skills explained far more of the variance in regression analyses. Most initial language learners have functionally intact input and output systems that allow the acquisition of at least one language. Exposure to ambient language(s) and a working set of articulators are not sufficient for the acquisition of phonology. Changes to theory in clinical phonology need to reflect the crucial role of children's cognitive abilities in language learning. Third, linguistic theory underestimates the value of data from clinical cases and intervention outcomes in SSD. Clinical implications sections of academic papers often make recommendations for particular types of intervention for SSD. For example, a maximal contrast approach based on markedness theory (Gierut, 2001) is cited to support the clinical relevance of linguistic theory, despite Crockett's (2012) review showing no outcome advantage for maximal over minimally paired targets. Yet clinical data provide a rich source for theory, as examples from bilingual children making speech errors demonstrate. For example, one five-year-old child consistently used the atypical process of backing /t/ to [k] word finally in Cantonese, but not in English, where /k/ was consistently fronted to [t] (Holm et al., 1997). These data indicate that phonological error patterns do not reflect articulatory skills; they are language-specific, allowing different error pattern profiles across the two languages. Clinical findings from studies exploring the literacy abilities of children with SSD also have implications for linguistic theory. Children with consistent phonological disorder have persistent difficulties with reading, spelling, and phonological awareness, while those with inconsistent errors are at risk only for spelling problems. In contrast, children with articulation disorder and delayed phonological development often perform within normal limits. Theory should account for such differing relationships between spoken and written phonology. Phonological assessment of children with SSD is crucial for SLPs for diagnosis, for understanding the nature and type of SSD, and for informing intervention. Phonological assessment data also challenge our understanding and test the validity of clinical phonology theory.
REFERENCES
Ball, M. (2016). Principles of clinical phonology. Routledge.
Barbier, G., Perrier, P., Payan, Y., Tiede, M. K., Gerber, S., Perkell, J. S., & Ménard, L. (2020). What anticipatory coarticulation in children tells us about speech motor control maturity. PLoS ONE, 15(4), e0231484. https://doi.org/10.1371/journal.pone.0231484
Bird, J., & Bishop, D. (1992). Perception and awareness of phonemes in phonologically impaired children. European Journal of Disorders of Communication, 27(4), 289–311. https://doi.org/10.3109/13682829209012042
Bradford, A., & Dodd, B. (1994). The motor planning abilities of phonologically disordered children. European Journal of Disorders of Communication, 29(4), 349–369. https://doi.org/10.3109/13682829409031288
Bradford, A., & Dodd, B. (1996). Do all speech-disordered children have motor deficits?
Clinical Linguistics and Phonetics, 10(2), 77–101. https://doi.org/10.3109/02699209608985164
Broomfield, J., & Dodd, B. (2004). The nature of referred subtypes of primary speech disability. Child Language Teaching and Therapy, 20(2), 135–151. https://doi.org/10.1191/0265659004ct267oa
Brosseau-Lapré, F., & Schumaker, J. (2020). Perception of correctly and incorrectly produced words in children with and without phonological speech disorders. Journal of Speech, Language, and Hearing Research, 63(12), 3961–3973. https://doi.org/10.1044/2020_JSLHR-20-00119
Claessen, M., Leitão, S., & Fraser, C.-J. (2017). Intervention for a young child with atypical phonology. In B. Dodd & A. Morgan (Eds.), Intervention case studies of child speech impairment (pp. 275–291). J&R Press.
Clausen, M. C., & Fox-Boyer, A. V. (2022). Diagnostic validity, accuracy and inter-rater reliability of a phonological assessment for Danish-speaking children. Journal of Communication Disorders, 95(1), 106168. https://doi.org/10.1016/j.jcomdis.2021.106168
Crockett, J. (2012). Minimal contrasts and maximal oppositions: An evidence-based practice brief [Master's thesis]. University of Texas. http://hdl.handle.net/2152/ETD-UT-2012-05-5239
Crosbie, S., & Holm, A. (2017). Phonological contrast therapy for children making consistent errors. In B. Dodd & A. Morgan (Eds.), Intervention case studies of child speech impairment (pp. 201–222). J&R Press.
Crosbie, S., Holm, A., & Dodd, B. (2009). Cognitive flexibility in children with and without speech disorder. Child Language Teaching and Therapy, 25(2), 250–270. https://doi.org/10.1177/0265659009102990
Crosbie, S., Holm, A., & Dodd, B. (2020). Core vocabulary intervention. In A. Williams, S. McLeod, & R. McCauley (Eds.), Intervention for speech sound disorders in children (2nd ed., pp. 225–249). Brookes.
Crowe, K., & McLeod, S. (2020). Children's English consonant acquisition in the United States: A review. American Journal of Speech-Language Pathology, 29(4), 2155–2169. https://doi.org/10.1044/2020_AJSLP-19-00168
Dodd, B. (1995). Differential diagnosis and treatment of children with speech disorders. Whurr.
Dodd, B. (2011). Differentiating speech delay from speech disorder: Does it matter? Topics in Language Disorders, 31(2), 96–111. http://dx.doi.org/10.1097/TLD.0b013e318217b66a
Dodd, B. (2021). Re-evaluating evidence for best practice in pediatric speech-language pathology. Folia Phoniatrica et Logopaedica, 73(2), 63–74. https://doi.org/10.1159/000505265
Dodd, B., Holm, A., Zhu, H., & Crosbie, S. (2003). Phonological development: A normative study of British English-speaking children. Clinical Linguistics and Phonetics, 17(8), 617–643. https://doi.org/10.1080/0269920031000111348
Dodd, B., & Iacono, T. (1989). Phonological disorders in children: Changes in phonological process use during treatment. British Journal of Disorders of Communication, 24(3), 333–352. https://doi.org/10.3109/13682828909019894
Dodd, B., Leahy, J., & Hambly, G. (1989). Phonological disorders in children: Underlying cognitive deficits. British Journal of Developmental Psychology, 7(1), 55–71. https://doi.org/10.1111/j.2044-835X.1989.tb00788.x
Dodd, B., & McCormack, P. (1995). A model of speech processing for differential diagnosis of phonological disorders. In B. Dodd (Ed.), Differential diagnosis and treatment of children with speech disorders (pp. 65–89). Whurr.
Dodd, B., & McIntosh, B. (2010). Two-year-old phonology: Impact of input, motor and cognitive abilities on development. Journal of Child Language, 37(5), 1027–1046. https://doi.org/10.1017/S0305000909990171
Dodd, B., McIntosh, B., Erdener, D., & Burnham, D. (2008). Perception of the auditory-visual illusion in speech perception by children with phonological disorders. Clinical Linguistics and Phonetics, 22(1), 69–82. https://doi.org/10.1080/02699200701660100
Dodd, B., Reilly, S., Ttofari Eecen, K., & Morgan, A. T. (2018). Articulation or phonology? Evidence from longitudinal error data. Clinical Linguistics and Phonetics, 32(11), 1027–1041. https://doi.org/10.1080/02699206.2018.1488994
Dodd, B., & Thompson, L. (2001). Speech disorder in children with Down's syndrome. Journal of Intellectual Disability Research, 45(Pt 4), 308–316. https://doi.org/10.1046/j.1365-2788.2001.00327.x
Farquharson, K., Tambyraja, S. R., & Justice, L. M. (2020). Contributions to gain in speech sound production accuracy for children with speech sound disorders: Exploring child and therapy factors. Language, Speech, and Hearing Services in Schools, 51(2), 457–468. https://doi.org/10.1044/2019_LSHSS-19-00079
Flipsen, P. (2015). Emergence and prevalence of persistent and residual speech errors. Seminars in Speech and Language, 36(4), 217–223. https://doi.org/10.1055/s-0035-1562905
Garrett, K. K., & Moran, M. J. (1992). A comparison of phonological severity measures. Language, Speech, and Hearing Services in Schools, 23(1), 48–51. https://doi.org/10.1044/0161-1461.2301.48
Gierut, J. A. (2001). Complexity in phonological treatment: Clinical factors. Language, Speech, and Hearing Services in Schools, 32(4), 229–241. https://doi.org/10.1044/0161-1461(2001/021)
Gillon, G. (2004). Phonological awareness: From research to practice. Guilford.
Goldman, R., & Fristoe, M. (2015). Goldman-Fristoe Test of Articulation (3rd ed.). Pearson.
Grunwell, P. (1982). Clinical phonology. Croom Helm.
Grunwell, P. (1997). Natural phonology. In M. Ball & R. Kent (Eds.), The new phonologies: Developments in clinical linguistics (pp. 35–75). Singular.
Günther, T., & Nieslony, J. (2017). Traditional articulation therapy. In B. Dodd & A. Morgan (Eds.), Intervention case studies of child speech impairment (pp. 319–336). J&R Press.
Harris, J., Botting, N., Myers, L., & Dodd, B. (2011). The relationship between speech impairment, phonological awareness and early literacy development. Australian Journal of Learning Difficulties, 16(2), 111–125. https://doi.org/10.1080/19404158.2010.515379
Holm, A., & Dodd, B. (1999). An intervention case study of a bilingual child with phonological disorder. Child Language Teaching and Therapy, 15(2), 139–158. https://doi.org/10.1177/026565909901500203
Holm, A., Dodd, B., & Ozanne, A. (1997). Efficacy of intervention for a bilingual child making articulation and phonological errors. International Journal of Bilingualism, 1(1), 55–69. https://doi.org/10.1177/136700699700100105
Holm, A., Farrier, F., & Dodd, B. (2008). Phonological awareness, reading accuracy and spelling ability of children with inconsistent phonological disorder. International Journal of Language and Communication Disorders, 43(3), 300–322. https://doi.org/10.1080/13682820701445032
Hua, Z., & Dodd, B. (2006). Phonological development and disorders in children: A multilingual perspective. Multilingual Matters.
Ingram, D. (1976). Phonological disability in children. Edward Arnold.
Jakobson, R. (1941/1968). Child language, aphasia and phonological universals. Mouton.
Jerger, S., Lewis, S., Hawkins, J., & Jerger, J. (1980). Pediatric speech intelligibility test. I. Generation of test materials. International Journal of Pediatric Otorhinolaryngology, 2(3), 217–230. https://doi.org/10.1016/0165-5876(80)90047-6
Leitão, S., & Fletcher, J. (2004). Literacy outcomes for students with speech impairment: Long-term follow-up. International Journal of Language & Communication Disorders, 39(2), 245–256. https://doi.org/10.1080/13682820310001619478
Leitão, S., Hogben, J., & Fletcher, J. (1997). Phonological processing skills in speech and language impaired children. European Journal of Disorders of Communication, 32(2), 91–111. https://doi.org/10.1111/j.1460-6984.1997.tb01626.x
Levelt, W. J., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. The Behavioral and Brain Sciences, 22(1), 1–75. https://doi.org/10.1017/s0140525x99001776
Locke, J. L. (1983). Clinical phonology: The explanation and treatment of speech sound disorders. Journal of Speech and Hearing Disorders, 48(4), 339–341. https://doi.org/10.1044/jshd.4804.339
McIntosh, B., & Dodd, B. (2011). Toddler Phonology Test. Pearson.
McLeod, S., & Masso, S. (2019). Speech sound disorders in children. In J. Horst & J. von Koss Torkildsen (Eds.), International handbook of language acquisition (pp. 362–386). Routledge.
McNeill, B., McIlraith, A. L., Macrae, T., Gath, M., & Gillon, G. (2022). Predictors of speech severity and inconsistency over time in children with token-to-token inconsistency. Journal of Speech, Language, and Hearing Research, 65(7), 2459–2473. https://doi.org/10.1044/2022_JSLHR-21-00611
Morgan, A., Ttofari Eecen, K., Pezic, A., Brommeyer, K., Mei, C., Eadie, P., Reilly, S., & Dodd, B. (2017). Who to refer for speech therapy at 4 years of age versus who to "watch and wait"? Journal of Pediatrics, 185(June), 200–204.e1. https://doi.org/10.1016/j.jpeds.2017.02.059
Morley, M. (1972). The development and disorders of speech in childhood. Churchill.
Namasivayam, A. K., Coleman, D., O'Dwyer, A., & van Lieshout, P. (2020). Speech sound disorders in children: An articulatory phonology perspective. Frontiers in Psychology, 10, Article 2998. https://doi.org/10.3389/fpsyg.2019.02998
Pi, M., & Ha, S. (2020). Classification of subgroups of children with speech sound disorders: A preliminary study. Communication Sciences and Disorders, 25(1), 113–125. https://doi.org/10.12963/csd.20685
Shriberg, L. D., Fourakis, M., Hall, S. D., Karlsson, H. B., Lohmeier, H. L., McSweeny, J. L., Potter, N. L., Scheer-Cohen, A. R., Strand, E. A., Tilkens, C. M., & Wilson, D. L. (2010). Extensions to the speech disorders classification system (SDCS). Clinical Linguistics and Phonetics, 24(10), 795–824. https://doi.org/10.3109/02699206.2010.503006
Shriberg, L. D., & Kwiatkowski, J. (1982). Phonological disorders III: A procedure for assessing severity of involvement. Journal of Speech and Hearing Disorders, 47(3), 256–270. https://doi.org/10.1044/jshd.4703.256
Smith, N. (1973). The acquisition of phonology. Cambridge University Press.
Stackhouse, J., & Wells, B. (1997). Children's speech and literacy difficulties: A psycholinguistic framework. Whurr.
Stampe, D. (1973). A dissertation on natural phonology [Unpublished doctoral dissertation]. University of Chicago.
Storkel, H. L. (2019). Using developmental norms for speech sounds as a means of determining treatment eligibility in schools. Perspectives of the ASHA Special Interest Groups, 4(1), 67–75. https://doi.org/10.1044/2018_PERS-SIG1-2018-0014
Thyer, N., & Dodd, B. (1996). Auditory processing and phonologic disorder. Audiology, 35(1), 37–44. https://doi.org/10.3109/00206099609071928
Ttofari Eecen, K., Eadie, P., Morgan, A. T., & Reilly, S. (2019). Validation of Dodd's model for differential diagnosis of childhood speech sound disorders: A longitudinal community cohort study. Developmental Medicine and Child Neurology, 61(6), 689–696. https://doi.org/10.1111/dmcn.13993
van de Weijer, J. (2019). Where now with optimality theory? Acta Linguistica Academica, 66(1), 115–136. https://www.jstor.org/stable/26654005
Van Riper, C. (1963). Speech correction: Principles and methods. Prentice Hall.
Waring, R., & Knight, R. (2013). How should children with speech sound disorders be classified? A review and critical evaluation of current classification systems. International Journal of Language and Communication Disorders, 48(1), 25–40. https://doi.org/10.1111/j.1460-6984.2012.00195.x
Waring, R., Rickard Liow, S., Dodd, B., & Eadie, P. (2022). Differentiating phonological delay from phonological disorder: Executive function performance in preschoolers. International Journal of Language and Communication Disorders, 57(2), 288–302. https://doi.org/10.1111/1460-6984.12694
Weiner, F. (1981). Treatment of phonological disability using the method of meaningful minimal contrast: Two case studies. Journal of Speech and Hearing Disorders, 46(1), 97–103. https://doi.org/10.1044/jshd.4601.97
Part 4: Phonetics
33 Phonetic Transcription in Clinical Practice SALLY BATES, JOCELYNNE WATSON, BARRY HESELWOOD, AND SARA HOWARD
33.1 Introduction
Phonetic transcription provides the data currently considered fundamental to the assessment, diagnosis, and treatment of people with atypical speech (e.g., Howard & Heselwood, 2002; Stemberger & Bernhardt, 2020). It delivers a principled approximation of the speaker's output in linear notation form which can identify areas of strength and weakness in the speaker's phonetic and phonological systems. This contributes to an understanding of a speaker's communicative profile and supports clinical decision-making. Transcription is acknowledged to be effortful and time-consuming (White et al., 2022). Providing speech samples to be transcribed can also be demanding for a speaker, especially so if they have concomitant medical and/or cognitive difficulties or if they have an inconsistent, developing, or deteriorating profile requiring more frequent sampling. The speech-language pathologist (SLP) needs to consider the level of detail to include in their transcription. Fortunately, the richness of acoustic information in speech, both in terms of its spectral and temporal structure, affords some discretion with regard to the type and amount of information on which to focus listening and notation. Being able to take a strategic view about which information will be most pertinent to an individual case will contribute to the efficiency and effectiveness that the clinical setting demands. The SLP must also be sensitive to the possibility that, in dealing with individual speakers, there is potential for missing relevant data where collection is constrained by pre-set criteria (Howard & Heselwood, 2002). These issues must be taken into account as the SLP navigates the ongoing cost-benefit analysis which inevitably accompanies the path to providing the speaker with an optimal outcome.
This chapter considers the practice and value of clinical phonetic transcription in the clinic and in research. We start with an overview of the ways in which speech might be impaired and the range of professional issues which the SLP will need to consider for clinical purposes. We follow with a description of the notation systems most widely available, the multiple tiers of information which can be derived from a transcription, and the additional information which can be provided by instrumental analysis techniques. We illustrate the value of transcription with reference to the most prevalent pediatric clinical population, that is, children with unexplained Speech Sound Disorder (SSD), and include consideration of the
critical role of speech sampling in supporting effective data analysis. The chapter concludes with an appraisal of the current challenges in implementing phonetic transcription in the clinical setting.
33.2 Clinical Context
A person arrives for clinical assessment because concern has been expressed about their speech. While the problem will essentially involve speech intelligibility, speech acceptability, or both, the pattern of contributory difficulties for each speaker will be individual. From the clinical perspective the speaker may, for example, have a limited speech sound inventory and so a reduced ability to signal all the phonotactic and phonemic contrasts required in their target language/s. They may demonstrate speech error patterns typical only of earlier speech development. They may use sounds which are not found within their target language/s or which may not even be classified as speech sounds. Their speech may lack clarity and/or consistency. They may have problems with fluency, the appropriate use of intonation, stress, loudness, and/or voice quality. In each case it is the task of the SLP to determine how best to capture those speech characteristics which will be relevant to clinical decision-making, and to judge the value of the information which could be derived from transcription and the analysis of the resulting data.
In determining the optimal level of descriptive detail required, and the frequency and timing of speech sampling, the SLP will also need to consider the implications of any concomitant factors which may be pertinent to the diagnosis and likely progression of the speaker's difficulties. Some factors will be generic, that is, they would be a consideration for any clinical group; others will relate specifically to individuals with speech difficulties. The SLP's role in all these cases is to make professional, evidence-based judgments about the nature, severity, and likely impact of the speech difficulties, including the speaker's ability to provide data which are representative. The SLP is also tasked with keeping careful, contemporaneous notes of contact with the speaker. This information will include raw data from phonetic transcription, analysis of the data, and a clear link to suggestions regarding the benefit of further intervention, the type of intervention, and the timing of discharge. These notes, together with summary reports, are typically made available for independent scrutiny through an audit process and could act as evidence in a legal context. In this environment, therefore, the SLP is conscious of the importance of transcription data as a tool to support their clinical decision-making, as a measure of clinical effectiveness, and as evidence of professional judgment and probity (Rahilly, 2011).
The purpose of phonetic transcription within this complex matrix of decision-making is to contribute sufficient data for the SLP to be confident about identifying relevant aspects of the speaker's ability at any given point in time. This includes those aspects which are functioning typically and those which might benefit from therapeutic input, augmented support, or advocacy. Phonetic transcription can also serve as an explanatory tool to help the speaker gain more insight into their speech, highlighting areas of strength, weakness, and potential development. In deciding what approach to take, the SLP may consult professional guidelines which aim to summarize best practices for specific clinical groups. For example, the UK and Ireland's Child Speech Disorder Research Network (CSDRN) provides guidelines and a decision-making tree to assist SLPs in their transcription of child speech (CSDRN, 2017).
SLPs may also have service provision or research protocols to abide by and/or local conventions which have evolved to capture specific circumstances. The primary tools available, however, are three professionally accepted notation systems which have global reach within the clinical field and are supported and updated by the International Phonetic Association: the
International Phonetic Alphabet (IPA) (revised to 2015), the Extensions to the IPA (extIPA) (IPA, 1999; Ball et al., 2018b), and the Voice Quality Symbols (VoQS) (Ball et al., 2018a). These systems provide a rich set of conventions for clinical transcription, as discussed below, and are included in the Appendices at the end of this chapter.
33.3 Transcription
33.3.1 Tools
The IPA itself aims to provide a symbol for all speech sounds or "segments" which have been found to be contrastive in at least one world language. The sound symbols are presented in four separate sub-charts: Consonants – Pulmonic, Consonants – Non-pulmonic, Other Symbols (for consonants which have a secondary place of articulation such as the labial-velar approximant /w/, affricates, and double articulations), and Vowels. There is also a chart containing Diacritics, markers used as adjuncts to the sound symbols to provide additional phonetic information, and two further sections which describe Suprasegmental Features and Tonal Phenomena. The extIPA chart provides additional sound symbols and diacritics developed specifically to support transcription of atypical speech, both at the segmental and suprasegmental level. This includes sounds produced by speakers with unusual dentition and occlusion, as well as a range of atypical phonatory, resonatory, and airstream behaviors. "Sounds with no available symbol" are denoted by an asterisk and augmented by accompanying notes. The VoQS conventions allow notation of the long-domain features of airstream type, phonation type, and supra-laryngeal setting (Ball et al., 2018a). Taken together, the three symbol systems aim to provide the transcriber with the tools to record any sequence of speech in sufficient detail for it to be read and reproduced from the transcription. For any utterance, the transcriber can note the temporal order of sounds and where, in developing or atypical speech, this may be disrupted through the omission, transposition, or insertion of sounds. In the clinical context, the SLP can use both diacritics and symbols for unusual sounds that are not part of the speaker's "target" inventory. These may reflect a phonetic distortion or mis-articulation, for example, /s/ produced as [s̪] or [ɬ], or a realization indicative of progressive change within the system, for example, use of [ɕ] in place of /s/ where /s/ was previously deleted or substituted by a plosive. Non-system sounds are of course also likely to occur in the output of multi-lingual speakers due to interference patterns or during attempts at code switching (Stemberger & Bernhardt, 2020).
One feature of the IPA charts, as reflected in the descriptive labels above, is that they subscribe to the classification of sounds as a composite of articulatory features. In the case of consonants these are: voicing, place of articulation, and manner of articulation; in the case of vowels: tongue placement (height and front–back co-ordinates) and lip-rounding. It is, though, important to recognize that the use of symbols in transcription is not a direct confirmation of the underlying articulation. Rather, the transcriber is selecting a notation which reflects their perception of the acoustic information in the speech signal. The relationship between acoustic information and the underlying articulatory cause is one of inference. It is based on a developed research consensus regarding the most typical articulatory postures adopted by adult speakers of the world's languages, established through observation, introspection, and instrumental measures (Howard & Heselwood, 2002). There are, however, questions about the degree to which we can make this inference for early infant speech, including babbling (Vihman & Harding-Bell, 2019), particularly given the differences between infants and adults in vocal tract structure and dimensions (Kent & Miolo, 1995).
A similar concern holds in the clinical field, where transcribers must also be sensitive to a potential breakdown of the traditionally proposed correspondences between articulation and acoustics on the one hand, and between acoustics and perception on the other. A speaker may produce a segment which sounds typical by using an unusual or compensatory articulation, for example, following partial glossectomy. In this case it could be argued that an accurate understanding of the nature of the production is not required, as the aims of the speaker, that is, intelligibility and/or acceptability, are being met. However, the SLP will recognize that while in some cases an atypical configuration may be optimal for producing the sound in isolation, it could be effortful to sustain in more extended stretches of speech, where it may also impact the timing and co-ordination of articulatory movements for other sounds, thereby reducing intelligibility. Acoustic information can be supplemented by visual information where available, that is, if face-to-face with the speaker and/or from video recordings. This may include information about labial and tongue tip/blade involvement, and mandibular settings (Howard & Heselwood, 2002). With some sounds, however, much of the articulatory positioning and movement is taking place within the oral cavity and larynx and is therefore less accessible to visual inspection. In this case perceptual analysis can be supplemented with instrumental measures (see also Section 33.4). For any utterance, typical or atypical, the transcriber also has scope to record non-segmental aspects of speech. In practice, the amount of detail recorded in any transcription will vary depending on several factors including, not least, the purpose of the transcription (see discussion below).
33.3.2 Type and Level of Detail
A general distinction is made between "broad" and "narrow" transcription based on the relative amount of phonetic detail represented in a transcription. In practice, this is not a discrete distinction but rather a continuum (Howard & Heselwood, 2002). An approach can be applied across the whole utterance or sample or can vary within a word, depending on the goal of the transcriber. The broadest type of transcription is a phonemic transcription, where the symbols stand for phonemes. In a narrow transcription, phonetic detail is added using diacritics and non-system symbols; for example, in English [ç] is not a phoneme but may occur clinically as a realization of /s/, particularly in a high front vowel context. Level of detail is measured in terms of the number of resonatory, phonatory, or articulatory features recorded when transcribing a word or phrase (or, more properly, the number of acoustic correlates of these features). A more detailed transcription will also include information about non-segmental aspects of speech including, for example, stress and tone, speech rate, pauses, and long-domain resonance and voice quality features, for example, [{lento Ṿ! ə ˈɣəb əːː (..) ̚ ˎɡɣɒ̃ɱĩ Ṿ! lento}] ("a cup of coffee").
The type and level of detail captured in a clinical phonetic transcription will vary depending on the purpose of the transcription (Howard & Heselwood, 2002) and the degree to which the SLP has knowledge of the accent or language they are transcribing. For instance, where the target sound system is known to the transcriber, it is arguably not necessary to note typical allophonic variation since this is predictable from the context. Rather, attention can be focused on capturing those sound patterns which deviate from the established target norm. The SLP may also choose to transcribe more broadly in an initial speech screen to establish general patterns and level of difficulty, and more narrowly in follow-up testing to better inform choice of intervention approach and therapy targets or to monitor progress (Stemberger & Bernhardt, 2020). Phonetic transcription of disordered speech necessarily requires more careful listening and, ideally, full knowledge of the IPA, since the SLP is not able to make any assumptions about how sounds function within the speaker's system, and hence which phonetic detail may or may not be important. There is an increased risk of listener bias when transcribing
disordered speech in an unfamiliar language, that is, a tendency to assimilate unfamiliar sounds to ones within the transcriber's own system (Stemberger & Bernhardt, 2020). Given this potential for error, the SLP might do better to focus initially on gauging the overall accuracy of a speaker's productions following the "whole word match" approach. This is described by Bernhardt et al. (2020), who advise that the transcriber should only attempt detailed transcription once they are more familiar with the speaker's speech and the target system. In the following discussion, we consider the relative merits of broad and narrow transcription, the added value provided by instrumental measures, and the critical role of sampling in ensuring that conclusions drawn from subsequent analysis of the data are valid. We illustrate how the data can inform clinical decision-making with reference to the largest group in the pediatric SLP's caseload, children with Speech Sound Disorder (SSD) (Dodd, 2014).
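The whole-word-match idea lends itself to a simple computation. The sketch below is a minimal illustration, not Bernhardt et al.'s (2020) full procedure (which also defines basic mismatch measures); the function name and data are hypothetical, and transcriptions are compared as plain strings.

```python
# Minimal sketch of the whole-word-match idea (not Bernhardt et al.'s
# full procedure); function name and data are hypothetical, and
# transcriptions are compared as plain strings.

def whole_word_match(sample):
    """Proportion of words whose transcription matches the target exactly."""
    matches = sum(1 for target, actual in sample if target == actual)
    return matches / len(sample)

sample = [("ki", "ti"),     # "key" fronted
          ("bak", "bak"),   # "back" correct
          ("spun", "pun")]  # "spoon" cluster reduced
print(f"WWM = {whole_word_match(sample):.2f}")  # 0.33
```

Because the score ignores which segments mismatch, it is robust to transcriber unfamiliarity with the speaker's system, which is precisely why it is suggested as a first pass.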
33.3.2.1 Broad Transcription
Broad transcription can usefully highlight where, from the listener's perspective, there is a loss of contrast within the speaker's phonological system. With appropriate sampling, analysis of the transcription data can also provide useful information about:
(1) the speaker's phonetic and phonotactic inventories: consonants, vowels, word/syllable structures, and lexical stress patterns.
(2) the operation of any phonological processes and/or atypical patterns which affect natural classes of sound (see the sketch following this list). For example, velar fronting is a phonological process characteristic of early typical development whereby velars are realized as alveolars. Similarly, alveolar backing is an atypical pattern, whereby alveolars are realized as velars. Universal application of either pattern results in a reduced system of sound contrasts relative to the adult target. For speakers who front velar sounds, both /ki/ ("key") and /ti/ ("tea") may be pronounced as [ti], /bad/ ("bad") and /bag/ ("bag") as [bad], and /wɪn/ ("win") and /wɪŋ/ ("wing") as [wɪn]. Conversely, for speakers who back alveolar sounds, /ki/ and /ti/ may be pronounced as [ki], /bad/ and /bag/ as [bag], and /wɪn/ and /wɪŋ/ as [wɪŋ].
(3) any principled variability in the production of individual sounds across different words which may indicate progressive change within the system, and which is therefore a positive prognostic indicator. This may reflect application of:
• positional constraints, for example, 100% application of velar fronting word-initially (e.g., /ki/ → [ti]) but correct tokens produced word-finally (e.g., /bak/ ("back") → [bak]).
• phonetic contextual constraints, for example, /k/ fronted in the context of non-low front vowels as in /ki/ → [ti] or /kɪŋ/ ("king") → [tɪŋ] but produced correctly in the context of low back vowels as in [kɑ] ("car") and [kɔ] ("core").
• lexical constraints, for example, /k/ fronted to [t] in early acquired words but produced correctly in later acquired words.
(4) inconsistent production of the same word across different repetitions, for example, /katəpɪlə/ ("caterpillar") produced as [tapəkɪlə], [kapətɪlə] or [papəpɪlə]. This feature is associated with two sub-types of Speech Sound Disorder: Inconsistent Phonological Disorder (IPD) and Childhood Apraxia of Speech (CAS), and is therefore an important diagnostic indicator (Dodd, 2014).
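As flagged in (2) above, tallies of this kind are mechanical once target–production pairs are available. The following sketch is purely illustrative (hypothetical names and data, and a naive position-by-position alignment): it counts opportunities for and applications of velar fronting separately by word position, the sort of split that reveals the positional constraint described in (3).

```python
# Hypothetical sketch (names, data, and the naive position-by-position
# alignment are all illustrative): tally opportunities for and
# applications of velar fronting by word position.

VELAR_TO_ALVEOLAR = {"k": "t", "g": "d", "ŋ": "n"}

def fronting_by_position(sample):
    """sample: list of (target, actual) broad transcriptions as strings."""
    counts = {"initial": [0, 0], "final": [0, 0]}  # [applied, opportunities]
    for target, actual in sample:
        for pos, i in (("initial", 0), ("final", len(target) - 1)):
            t = target[i]
            a = actual[i] if i < len(actual) else ""
            if t in VELAR_TO_ALVEOLAR:
                counts[pos][1] += 1
                if a == VELAR_TO_ALVEOLAR[t]:
                    counts[pos][0] += 1
    return counts

sample = [("ki", "ti"), ("kɪŋ", "tɪŋ"), ("bak", "bak"), ("bag", "bag")]
for pos, (applied, opps) in fronting_by_position(sample).items():
    print(f"velar fronting {pos}: {applied}/{opps}")
# velar fronting initial: 2/2
# velar fronting final: 0/3   -> the positional constraint in (3)
```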
33.3.2.2 Narrow Transcription Narrow transcription provides a richer profile of an individual’s phonetic and phonological capabilities. At the segmental level, use of non-system sound symbols and diacritics allows the SLP to note specific phonetic detail regarding how the sounds are produced rather than
simply capturing the contrastive categories used. With appropriate sampling, analysis of the transcription data can:
(1) capture the speaker's phonetic inventory where this also includes sounds that do not occur in typical speech. These usually reflect phonetic distortions or compensatory articulations that are caused by anatomical and/or neurological impairment associated with conditions such as cleft lip/palate, hearing impairment, cerebral palsy, or neurodegenerative diseases. For example, speakers with cleft lip/palate have been reported to use palatal stops and nasals, and palatal, velar, uvular, or pharyngeal fricatives in place of more anterior targets. Excessive nasalization and use of ejectives are also common characteristics of cleft palate speech (Howard et al., 2019). Speakers with hearing impairment or wearers of cochlear implants are known to use atypical resonance and airstream mechanisms, producing, for example, implosives (Parker & Rose, 1990). The more detailed information afforded by narrow transcription allows the SLP to make a principled inference about the nature of the underlying auditory, anatomical, or physiological constraints in operation where direct instrumental measures are not available.
(2) capture sub-phonemic contrasts, thereby both preventing potential misidentification of phonological processes and allowing identification of incipient change/emerging contrasts (see the sketch following this list). For example, a child's realization of "spoon" as [p˭un], if transcribed as [bun], would suggest application of pre-vocalic voicing as well as cluster reduction, rather than cluster reduction alone. Conversely, ability to note aspiration in a child's realization of "spoon" as [pʰun] allows the SLP to identify coalescence, whereby the aspiration represents the voicelessness of the omitted /s/, and hence a realization that is closer to target /sp/ than either [p˭] or [b]. Failure to note this level of specific phonetic detail might lead the SLP to underestimate the child's productive phonological knowledge (PPK) of both voiceless obstruents word-initially and /s/ clusters. Similarly, the use of diacritics to indicate phonetic features such as vowel lengthening and nasalization can indicate where a child has underlying knowledge of a final deleted consonant, for example, /kat/ realized as [kaː], /sʌn/ realized as [sʌ᷈ː], and hence evidence of an emerging structural contrast, that is, CV vs. CVC. The SLP is also able to note where there is a nonstandard but subtle phonological contrast between two phonetically similar sounds such as /s/ and /ʃ/ where one or other of the sounds is atypically realized. For example, if the child realizes /s/ consistently as [ç] and /ʃ/ as [ʃ], there is no loss of contrast within the system despite the unusual realization of target /s/. In a broad transcription, the SLP is likely to employ [ʃ] for both, giving the false impression of a loss of contrast (Ball et al., 2009).
(3) allow a finer-grained analysis of where there is variability in the production of individual phonemes across different words, increasing the likelihood of identifying progressive change within the system. To illustrate:
Child 1 has difficulty achieving the post-alveolar affricates /tʃ, dʒ/, which are stopped word-initially to [t, d] but realized word-finally as the alveolar affricates [ts, dz], sounds which are not part of the English consonant system.
Their use of these non-system sounds can be seen to represent an intermediate stage in the resolution of stopping of affricates. While the child is not yet achieving correct production, [ts, dz] are closer to target /tʃ, dʒ/ than are [t, d]; they are achieving a delayed fricative release in addition to a stop closure. Note that this also exemplifies a positional constraint, that is, word-final position is facilitating progressive change.
Child 2, on first assessment, showed consistent velar fronting, pre-vocalic voicing, and stopping of fricatives and affricates word-initially and, except for voiced plosives and nasals,
deleted all consonants word-finally. Four months later, assessment showed that they had a three-way place contrast: bilabial, alveolar, and velar for target plosives and nasals both word-initially and finally, and a voicing contrast was also emerging word-initially. Although still stopping fricatives word-initially, the child was now consistently marking them word-finally, albeit with non-system sounds or dental [s̪, z̪] depending on phonetic context. Voiceless palatal [ç] was produced following a stressed, non-low front vowel, voiceless velar [x] following a stressed back vowel, and dental [s̪, z̪] following [t, d] in final plosive + fricative clusters. Narrow phonetic transcription, coupled with an appropriate speech sample, allowed identification of systematic, phonetically principled patterning in the data, that is, a vowel-to-consonant influence (Bates et al., 2013). The child achieved realizations closest to the target in the final plosive + fricative clusters owing to the blocking effect of the consonant. This facilitative context, potentially useful in intervention, might otherwise have been missed. Variability in fricative production might have also been misinterpreted as non-progressive or lexical inconsistency, leading to misdiagnosis.
Child 3 showed a highly unusual pattern relating to the high front and back vowels /i, u/, which he realized as [ɪɟ] and [ɪb] respectively, thereby maintaining contrastivity. This pattern is also phonetically principled and can be explained according to Element Theory as glide hardening (Harris et al., 1999). Using phonemic transcription, the SLP would have been constrained to note the palatal stop [ɟ], not a phoneme in English, as its nearest category equivalent [g].
Note: points 1–3 illustrate the advantages afforded by a narrow transcription relative to broad transcription when drawing up a phonetic inventory and identifying the processes and patterns in operation and any progressive change within a system. The following points, 4–6, illustrate additional insights provided by a narrow transcription which have bearing on differential diagnosis and intervention planning.
(4) using diacritics, allow the SLP to note motor-phonetic errors such as atypical use of allophones, for example, [spʰun], [ɫif], as well as other motor-phonetic errors such as abnormal vowel lengthening or hyper-nasality (Ball et al., 2009).
(5) at the suprasegmental level, allow the SLP to indicate "long domain" phenomena, for example, information distributed beyond notional segment boundaries (Local, 2003). The introduction of the "labelled braces" convention in the Voice Quality Symbols system (Ball et al., 2018a) provides a means of representing some long-domain features in a segmental transcription. For example, nasalized voice across an utterance can be represented using brackets rather than requiring a nasalization diacritic over each segment: [i ˈsed i {Ṽ ˈadnʔ ̀ɹe dɪʔ Ṽ} ˈpɹɒpli] ("He said he hadn't read it properly"). Narrow transcription can also be used to capture non-English prosodic features in disordered speech (see Rutter et al. (2010) for a detailed account).
(6) facilitate an investigation of how phonetic features may correlate with such conversational behaviors as turn taking, repair, and topic management.
Damico and Nelson (2005), for example, note cases of individuals with autistic spectrum disorders where creaky phonation relates to specific interactional and discourse behaviors, and Tarplee and Barrow (1999) use narrow transcription, including interlinear pitch contours, to capture significant interactional behaviors in their analyses of conversational interaction between mothers and their autistic children.
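The contrast check described in point (2) can be made explicit with a toy computation. In this hypothetical sketch (names and data are illustrative, and transcription symbols are treated as atomic labels), two targets remain contrastive as long as their realization sets do not overlap, even when one realization is atypical:

```python
# Hypothetical sketch of the contrast check in (2): two targets remain
# contrastive if their realization sets do not overlap, even when one
# realization is atypical. Symbols are treated as atomic labels.

from collections import defaultdict

def realization_sets(observations):
    """observations: list of (target_phoneme, realized_phone) pairs."""
    sets = defaultdict(set)
    for target, phone in observations:
        sets[target].add(phone)
    return sets

def contrast_preserved(sets, a, b):
    return sets[a].isdisjoint(sets[b])

obs = [("s", "ç"), ("s", "ç"), ("ʃ", "ʃ"), ("ʃ", "ʃ")]
print(contrast_preserved(realization_sets(obs), "s", "ʃ"))
# True: /s/ -> [ç], /ʃ/ -> [ʃ]; atypical realization, but no merger
```

Note that such a check is only as good as the transcription feeding it: a broad transcription recording both targets as [ʃ] would collapse the two sets and wrongly report a merger.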
33.4 Instrumental Measures
Transcription can be supplemented through the use of technical instruments which measure aspects of articulatory movement or the frequency and temporal characteristics of the acoustic
signal. Articulatory techniques include laryngography for measuring vocal fold action, nasometry for measuring nasal airflow, electropalatography (EPG) for recording patterns of tongue-palate contact, and ultrasound tongue imaging (UTI) for examining mid-sagittal or coronal views of tongue body movement. Speech spectrography enables measurement of frequency, amplitude, and duration of segments, and voice pitch. With appropriate sampling, articulatory instruments can:
• provide direct evidence of the speaker's articulation along a given parameter to support or refute perception-based transcription (from which knowledge of production has been inferred). This includes evidence of variability, either variability which occurs naturally as a function of phonetic context, or variability which reflects abnormal timing errors, which is unlikely to be identified in a perception-based transcription. For example, the use of EPG has shown that what might appear to be a child's inconsistent application of both velar fronting and alveolar backing, and transcribed as such, may be attributed to use of an undifferentiated lingual gesture resulting in increased tongue-palate contact, with a variable release trajectory (Gibbon, 1990).
• when used in conjunction with other articulatory measures, provide direct evidence of difficulties with the timing and co-ordination of articulatory sub-systems in real time.
• provide an empirical measure of articulatory movement. Information about parameters such as range, speed, and force of movement, and how these vary across time, facilitates an accurate understanding of speaker change, that is, improvement or deterioration, as well as comparison with population norms, where these are known.
• provide evidence of covert contrasts, that is, productions which have consistent articulatory differences but which are imperceptible to the transcriber (Munson et al., 2010). Even though not functional from a listener's perspective, covert contrasts indicate that the child has some PPK of the contrast in question. This has important implications for clinical management. For example, traditional target selection criteria recommend prioritizing sounds for which the child has most PPK, while more recent selection criteria recommend targeting areas of least PPK (Gierut, 2005).
33.5 Sampling
As noted above, the extent to which transcription and subsequent data analysis can profile a speaker's phonetic and phonological capabilities is critically dependent on the nature of the sample obtained. Ideally, the sample should be sufficiently representative to identify the sounds, word structures, and lexical stress patterns the speaker has at their disposal (phonetic inventories) and how they use these in speech to convey meaning. The SLP should be able to identify any error patterns in operation and establish the extent to which any variability in the production of individual sounds is conditioned either by linguistic constraints, such as word position and phonetic context, or by lexical factors, such as word length, word frequency, and familiarity (Bates et al., 2021). In addition, the sample should allow consideration of both suprasegmental features and overall intelligibility as revealed in longer stretches of speech (CSDRN, 2017). The sample should also include sufficient tokens to mitigate the impact of any transcription error arising from poor auditory conditions and misperception.
In clinical practice, the starting point when working with children with SSD is most frequently a single word sample elicited through a picture or object naming task using a published assessment tool (McLeod & Baker, 2014). Published tools typically offer scope for the speaker to produce all consonants in the target inventory. However, they are often limited in the extent to which consonants are tested in different word positions, in different phonetic
Phonetic Transcription in Clinical Practice 479 contexts and in words of more than two syllables. Vowel production is rarely targeted explicitly (Eisenberg & Hitchcock, 2010). It is therefore acknowledged that while these tools serve as a useful initial screen, they do not provide a sufficiently representative sample to deliver a detailed understanding of the child’s phonetic and phonological systems. Further informal probing is typically warranted to plan intervention and monitor outcome. Where verbal output allows, the sample should include polysyllabic words and connected speech either elicited (e.g., picture description, sentence completion or repetition) or from spontaneous conversation. Both contexts allow the SLP to gauge the child’s performance with increased task demands and thus can reveal word level errors, for example, sequencing errors, consonant insertions, and distortions as well as a greater application of the processes or patterns evident in the single word sample. They are also associated with an increased incidence of vowel errors and difficulties with lexical stress (Masso et al., 2016). A connected speech sample is also necessary to identify any difficulties with word juncture. These could involve the absence of typical assimilatory and/or elision behaviors (a case of open juncture) or the extreme lenition and over-use of glottal stops (Howard, 2004; Newton, 2012). The former impacts speech acceptability, while the latter decreases speech intelligibility. Inappropriate prosodic behaviors, lengthened coarticulatory transitions and open juncture in connected speech are core characteristics of CAS (ASHA, 2007). Transcription and analysis of connected speech also enable the SLP to record other suprasegmental features, for example, speech rate, intonation, nasality, and voice quality which can provide important diagnostic information, for example, slow rate of speech, low pitch, reduced intonation and abnormal voice quality are associated with Childhood Dysarthria (Haas et al., 2021), and hypernasality associated with cleft palate or velopharyngeal dysfunction (Howard et al., 2019).
33.6 Challenges/Constraints
As outlined in the examples above, a competent transcription of a representative sample of speech can effectively provide information necessary to inform differential diagnosis and, where warranted, ongoing clinical management, including treatment planning and evaluation of effectiveness. However, it is important to recognize that transcription has limitations, both in the extent to which it can deliver a wholly objective record of the speaker's output and in the extent to which it can match the fine level of detail that instrumental measures provide.
The human auditory perceptual system on which data collection depends is inherently, and purposefully, subjective. This subjectivity supports efficient everyday communication between speaker and listener, which is reliant on the listener utilizing top-down processing to support their interpretation of the incoming sensory (acoustic and visual) information. In recovering the speaker's intended meaning, the listener is, to a large extent, guided by their expectations (i.e., both linguistic and non-linguistic contextual information stored in long-term memory) and, importantly, also aided by a learned differential sensitivity to the phonemic contrasts of their own language (Kuhl, 1991). It is therefore not usually necessary for mature interlocutors to process all the information in the acoustic speech signal, and any degraded or missing information can be subconsciously restored (Grossberg, 2003). This facility, however, runs counter to the state of objective listening which underpins accurate transcription. In acknowledgment, Howard and Heselwood (2002) remind us of Ladefoged's (1990) dictum that "most phonetic observations are made in terms of a phonological framework" (p. 341) and advise that "if we cannot totally suspend our phonological preconceptions, we can and should cultivate a critical awareness of them" (p. 383). In practice the most likely effect on transcription will be a tendency to assimilate errored sounds to the transcriber's own target system. This will lead to either an under- or over-estimation of the speaker's ability depending on the sound contrasts in question and similarities and
differences between the transcriber's and the speaker's target system/s. Normalization toward expected forms is particularly likely during live transcription, given the time pressures typically operating in the clinical setting (Howard & Heselwood, 2002). To mitigate this problem, SLPs are trained to optimize their objective listening as part of the curriculum at pre-registration level and can access further training if taking on specialist caseloads, for example, for cleft palate or hearing impairment (White et al., 2022). They also have recourse to a range of online resources to practice ear training and maintain transcription skills. Other strategies to support objective listening (and provide confidence in transcription accuracy) include the use of inter-rater reliability (e.g., Cucchiarini, 1996; Seifert et al., 2020) and/or consensus procedures (e.g., Shriberg et al., 1984). (See Heselwood (2013) for a critical discussion.) Another strategy is to include repetition or reading of pseudo- or non-word material to reduce lexical and phonological priming influences. Transcribers can also recruit objective measures of acoustic characteristics taken from spectrographic analysis, such as voice onset time, formant, or pitch tracking, against which to calibrate their use of symbols, for example, distinctions between [pʰ], [p˭], [b].
Regarding level of detail, even narrow phonetic transcription cannot capture the fine-grained distinctions listeners have been shown to be capable of detecting. For example, Munson and Meyer (2021) demonstrate that when using a perceptual rating scale both trained and untrained listeners were able to discriminate subtle differences in sounds intermediate along a continuum between two endpoints such as, for example, /t/ and /k/ or /s/ and /ʃ/. Given the continuous nature of developing contrasts, they argue that transcription should be supplemented with this technique. Instrumental techniques can also be used to support transcription in providing additional phonetic detail. For example, Sugden and Cleland (2021) found that ultrasound-aided transcription revealed a higher number of errors in the speech of children with unexplained SSD than perceptual-based transcription. These included more instances of increased contact, variability, and abnormal timing errors, indicative of underlying motor deficits. As the authors point out, this information has important implications for clinical management. UTI also had the added value of increased inter-rater reliability.
Unfortunately, it must be acknowledged that currently these strategies are not routinely available for use in clinical settings outside the research environment. This is due to a variety of factors including personnel and time requirements, cost of equipment and maintenance, limited availability of instrumental measures, lack of training opportunities at both pre- and post-registration levels, and service delivery constraints (Titterington & Bates, 2021; White et al., 2022). The clinical setting also presents other challenges to transcription practice. Some are intrinsic to the SLP's role; others are extrinsic, pertaining to the environment. For the SLP there is an inherent tension between the need to maintain the focus and objectivity required for transcription and being receptive to information regarding the speaker's overall communicative competence.
At the same time, the SLP is also tasked with maintaining the success of the communicative interaction, including supporting any medical and/or behavioral needs the client may have, and orchestrating contributions from others in the room, for example, a parent/carer, sibling, partner, or other professional, as well as achieving the session objectives. Ideally, transcription should be carried out under optimal listening conditions, that is, in a quiet environment with good acoustics and lighting and without background noise or distractions. In the clinical context these conditions are not always met. Given these considerations, it will be challenging for even a skilled transcriber to reliably record all relevant features of a child's production in live transcription. This is particularly so in more severe cases, where the speaker is unable to repeat words, and with longer, articulatorily more complex material (i.e., polysyllabic words and connected speech).
For this reason, SLPs should also transcribe from audio- or, ideally, video-recordings (CSDRN, 2017; Stemberger & Bernhardt, 2020). See Stemberger and Bernhardt (2020) for a review of the standard of recording equipment required for clinical transcription.
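Where inter-rater reliability is used as a check (as discussed in this section), the simplest index is point-by-point percent agreement over aligned segments. The sketch below is a hypothetical minimal version; Cucchiarini (1996) discusses methodological refinements, such as crediting partial, feature-level agreement, that this toy index ignores.

```python
# Hypothetical minimal sketch: point-by-point percent agreement between
# two transcribers over segment lists assumed to be already aligned.

def percent_agreement(rater_a, rater_b):
    agree = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * agree / len(rater_a)

a = ["t", "ɪ", "ŋ", "k", "a", "t"]
b = ["t", "ɪ", "n", "k", "a", "t"]  # disagreement on one segment
print(f"agreement = {percent_agreement(a, b):.1f}%")  # 83.3%
```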
33.7 Research-practice Divide
The value of phonetic transcription as a clinical tool is widely acknowledged among both SLP practitioners and students (White et al., 2022). Unfortunately, however, there appears to be an increasing gap between theory and practice. Morgan et al. (2021), in their review of clinical case notes from three UK SLP services, report that recording and analysis of pre- and post-intervention data were inconsistent and did not provide sufficient detail to adequately monitor progress or evidence outcomes for children with SSD. In many cases, no data were provided beyond the initial assessment. White et al. (2022), in their survey of UK pediatric SLP practice, report that existing clinical guidelines are often not followed and that there is widespread reliance on a screening tool which simply requires the SLP to tick sounds produced and circle those in error. This accords with earlier surveys which report over-reliance on screening tools, in contradiction to recommendations in the professional literature (McLeod & Baker, 2014). This lack of evidence-based practice is thought to reflect a lack of confidence on the part of SLPs in their transcription skills, particularly narrow transcription. This, in turn, has been attributed to limited training and lack of continuing professional development opportunities, especially in generalist services (Knight et al., 2018; Nelson et al., 2020; White et al., 2022). Other barriers identified by the White et al. (2022) survey include:
• Lack of consistency across services regarding the implementation of existing professional guidelines.
• Time pressures imposed by service constraints. These include care pathway protocols which impose limits on the number and length of sessions that can be offered.
• Technical challenges relating to the production and storage of transcription data. For example, many Electronic Patient Record systems do not support IPA symbols.
• Lack of policies to facilitate production and storage of high-quality audio- or video-recordings of speech data for subsequent transcription and analysis.
Titterington and Bates (2021) suggest that professional bodies should provide more specific recommendations regarding curriculum guidance on the transcription skills students need upon qualification. This should ensure parity in the content and format of the training that students receive at pre-registration level. They also argue for the development of a set of "core benchmark competencies for entry level SLTs" which will also "map out a path in professional development supporting maintenance and further development of specialist skills as appropriate" (p. 18).
33.8 Conclusion
This chapter summarizes the value of phonetic transcription in the clinical context. It also acknowledges a current paradox. While transcription is regarded professionally as fundamental to providing an evidence base for clinical decision-making, its use is subject to several operational barriers. Given that this skill represents the unique contribution an SLP brings to the management of an individual's speech difficulties, it is of paramount importance that the concerns raised in recent surveys of professional practice are addressed.
REFERENCES
American Speech-Language-Hearing Association (ASHA). (2007). Childhood apraxia of speech (technical report). Accessed December 20, 2022, from https://www.asha.org/policy/tr2007-00278
Ball, M. J., Esling, J. H., & Dickson, B. C. (2018a). Revisions to the VoQS system for the transcription of voice quality. Journal of the International Phonetic Association, 48(2), 165–171. https://doi.org/10.1017/S0025100317000159
Ball, M., Howard, S., & Miller, K. (2018b). Revisions to the extIPA chart. Journal of the International Phonetic Association, 48(2), 155–164. https://doi.org/10.1017/S0025100317000147
Ball, M., Müller, N., Klopfenstein, M., & Rutter, B. (2009). My client is using non-English sounds! A tutorial in advanced phonetic transcription. Part 1: Consonants. Contemporary Issues in Communication Science and Disorders, 36(Fall 2009), 131–141. https://doi.org/10.1044/cicsd_36_F_133
Bates, S., & Titterington, J. with the Child Speech Disorder Research Network. (2021). Good practice guidelines for the analysis of child speech. Published on RCSLT members webpage (www.rcslt.org) and Bristol Speech and Language Therapy Research Unit webpage (www.speech-therapy.org.uk)
Bates, S. A. R., Watson, J. M. M., & Scobbie, J. M. (2013). Context conditioned error patterns in disordered systems. In M. Ball & F. Gibbon (Eds.), Handbook of vowels and vowel disorders (pp. 288–325). Psychology Press.
Bernhardt, B. M., Stemberger, J. P., Bérubé, D., Ciocca, V., Freitas, M. J., & Ignatovas, D. (2020). Identification of protracted phonological development across languages – The Whole Word Match and basic mismatch measures. In E. Babatsouli, M. Ball, & N. Müller (Eds.), An anthology of bilingual child phonology (pp. 274–308). Multilingual Matters.
Child Speech Disorder Research Network. (2017). Good practice guidelines for the transcription of children's speech in clinical practice and research. Published on RCSLT members webpage (www.rcslt.org) and Bristol Speech and Language Therapy Research Unit webpage (www.speech-therapy.org.uk)
Cucchiarini, C. (1996). Assessing transcription agreement: Methodological aspects. Clinical Linguistics & Phonetics, 10(2), 131–156.
Damico, J., & Nelson, R. (2005). Interpreting problematic behaviour: Systematic compensatory adaptations as emergent phenomena in autism. Clinical Linguistics & Phonetics, 19(5), 405–417.
Dodd, B. (2014). Differential diagnosis of pediatric speech sound disorder. Current Developmental Disorders Reports, 1(3), 189–196. https://doi.org/10.1007/s40474-014-0017-3
Eisenberg, S. L., & Hitchcock, E. R. (2010). Using standardized tests to inventory consonant and vowel production: A comparison of 11 tests of articulation and phonology. Language, Speech, and Hearing Services in Schools, 41(4), 488–503. https://doi.org/10.1044/0161-1461(2009/08-0125)
Gibbon, F. (1990). Lingual activity in two speech-disordered children’s attempts to produce velar and alveolar stop consonants: Evidence from electropalatographic (EPG) data. British Journal of Disorders of Communication, 25(3), 329–340. https://doi.org/10.3109/13682829009011981
Gierut, J. A. (2005). Phonological intervention: The how or the what? In A. G. Kamhi & K. E. Pollock (Eds.), Phonological disorders in children: Clinical decision making in assessment and intervention (pp. 201–210). Paul H. Brookes Publishing Co.
Grossberg, S. (2003). Resonant neural dynamics of speech perception. Journal of Phonetics, 31(3–4), 423–445. https://doi.org/10.1016/S0095-4470(03)00051-2
Haas, E., Ziegler, W., & Schölderle, T. (2021). Developmental courses in childhood dysarthria: Longitudinal analyses of auditory-perceptual parameters. Journal of Speech, Language, and Hearing Research, 64(5), 1421–1435. https://doi.org/10.1044/2020_JSLHR-20-00492
Harris, J., Watson, J., & Bates, S. (1999). Prosody and melody in vowel disorder. Journal of Linguistics, 35(3), 489–525. https://doi.org/10.1017/S0022226799007902
Heselwood, B. (2013). Phonetic transcription in theory and practice. Edinburgh University Press.
Howard, S. J. (2004). Connected speech processes in developmental speech impairment: Observations from an electropalatographic perspective. Clinical Linguistics & Phonetics, 18(6–8), 405–417. https://doi.org/10.1080/02699200410001703547
Howard, S. J., & Heselwood, B. C. (2002). Learning and teaching phonetic transcription for clinical purposes. Clinical Linguistics & Phonetics, 16(5), 371–401. https://doi.org/10.1080/02699200210135893
Howard, S. J., Heselwood, B. C., & Harding-Bell, A. (2019). The nature of speech associated with cleft palate. In A. Harding-Bell (Ed.), Case studies in cleft palate speech: Data analysis and principled intervention (pp. 23–47). J & R Press.
IPA. (1999). The handbook of the International Phonetic Association. Cambridge University Press.
IPA Chart. (2015). http://www.internationalphoneticassociation.org/content/ipa-chart, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. © 2015 International Phonetic Association.
Kent, R. D., & Miolo, G. (1995). Phonetic abilities in the first year of life. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 303–334). Blackwell.
Knight, R.-A., Bandali, C., Woodhead, C., & Vansadia, P. (2018). Clinicians’ views of the training, use and maintenance of phonetic transcription in speech and language therapy. International Journal of Language and Communication Disorders, 53(4), 776–787. https://doi.org/10.1111/1460-6984.12381
Kuhl, P. K. (1991). Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Perception and Psychophysics, 50(2), 93–107.
Ladefoged, P. (1990). Some reflections on the IPA. Journal of Phonetics, 18(3), 335–346.
Local, J. (2003). Variable domains and variable relevance: Interpreting phonetic exponents. Journal of Phonetics, 31(3–4), 321–339. https://doi.org/10.1016/S0095-4470(03)00045-7
Masso, S., McLeod, S., Baker, E., & McCormack, J. (2016). Polysyllable productions in preschool children with speech sound disorders: Error categories and the framework of polysyllable maturity. International Journal of Speech-Language Pathology, 18(3), 272–287. https://doi.org/10.3109/17549507.2016.116848
McLeod, S., & Baker, E. (2014). Speech-language pathologists’ practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders. Clinical Linguistics and Phonetics, 28(7–8), 508–531. https://doi.org/10.3109/02699206.2014.926994
Morgan, L., Overton, S., Bates, S., Titterington, J., & Wren, Y. (2021). Making the case for the collection of a minimal dataset for children with speech sound disorder. International Journal of Language and Communication Disorders, 56(5), 1097–1107. https://doi.org/10.1111/1460-6984.12649
Munson, B., Edwards, J., Schellinger, S. K., Beckman, M. E., & Meyer, M. K. (2010). Deconstructing phonetic transcription: Covert contrast, perceptual bias, and an extraterrestrial view of Vox Humana. Clinical Linguistics & Phonetics, 24(4–5), 245–260. https://doi.org/10.3109/02699200903532524
Munson, B., & Meyer, M. K. (2021). Clinical experience and categorical perception of children’s speech. International Journal of Language and Communication Disorders, 56(2), 374–388. https://doi.org/10.1111/1460-6984.12610
Nelson, T. L., Mok, Z., & Eecen, K. T. (2020). Use of transcription when assessing children’s speech: Australian speech-language pathologists’ practices, challenges, and facilitators. Folia Phoniatrica et Logopaedica, 72(2), 131–142. https://doi.org/10.1159/00050313
Newton, C. (2012). Between-word processes in children with speech difficulties: Insights from a usage-based approach to phonology. Clinical Linguistics and Phonetics, 26(8), 712–727. https://doi.org/10.3109/02699206.2012.697973
Parker, A., & Rose, H. (1990). Deaf children’s phonological development. In P. Grunwell (Ed.), Developmental speech disorders (pp. 83–107). Whurr Publishers Ltd.
Rahilly, J. (2011). Transcribing speech: Practicalities, philosophies, and prophesies. Clinical Linguistics & Phonetics, 25(11–12), 934–939. https://doi.org/10.3109/02699206.2011.601391
Rutter, B., Klopfenstein, M., Ball, M., & Müller, N. (2010). My client is using non-English sounds! A tutorial in advanced phonetic transcription. Part 3: Prosody and unattested sounds. Contemporary Issues in Communication Science and Disorders, 37(Fall 2010), 111–122. https://doi.org/10.1044/cicsd_36_F_111
Seifert, M., Morgan, L., Gibbin, S., & Wren, Y. (2020). An alternative approach to measuring reliability of transcription in children’s speech samples: Extending the concept of near functional equivalence. Folia Phoniatrica et Logopaedica, 72(2), 84–91. https://doi.org/10.1159/000502324
Shriberg, L. D., Kwiatkowski, J., & Hoffman, K. (1984). A procedure for phonetic transcription by consensus. Journal of Speech and Hearing Research, 27(3), 456–465. https://doi.org/10.1044/jshr.2703.456
Stemberger, J. P., & Bernhardt, B. M. (2020). Phonetic transcription for speech-language pathology in the 21st century. Folia Phoniatrica et Logopaedica, 72(2), 75–83. https://doi.org/10.1159/000500701
Sugden, E., & Cleland, J. (2021). Using ultrasound tongue imaging to support phonetic transcription of childhood speech sound disorders. Clinical Linguistics and Phonetics, 36(12), 1047–1066. https://doi.org/10.1080/02699206.2021.1966101
Tarplee, C., & Barrow, E. (1999). Delayed echoing as an interactional resource: A case study of a 3-year-old child on the autistic spectrum. Clinical Linguistics & Phonetics, 13(6), 449–482. https://doi.org/10.1080/026992099298988
Titterington, J., & Bates, S. (2021). Teaching and learning clinical phonetic transcription. In M. Ball (Ed.), Manual of clinical phonetics (pp. 175–186). Routledge.
Vihman, M., & Harding-Bell, A. (2019). Infant vocalisations, babble, and early speech. In A. Harding-Bell (Ed.), Case studies in cleft palate speech: Data analysis and principled intervention (pp. 51–89). J & R Press.
White, S., Hurren, A., James, S., & Knight, R.-A. (2022). “I think that’s what I heard? I’m not sure”: Speech and language therapists’ views of, and practices in, phonetic transcription. International Journal of Language and Communication Disorders, 57(5), 1071–1084. https://doi.org/10.1111/1460-6984.12740
APPENDICES
Appendix 33.1 The IPA Chart (IPA Chart, http://www.internationalphoneticassociation.org/content/ipa-chart, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. Copyright © 2015 International Phonetic Association.)
Appendix 33.2 The extIPA Chart (Reproduced by permission of ICPLA.)
Appendix 33.3 The VoQS Chart (Reproduced by permission of Martin J. Ball.)
34 Instrumental Analysis of Speech Production
LUCIE MÉNARD AND MARK TIEDE
34.1 Preamble
Speech production involves precisely timed and controlled movements of various parts of the orofacial system. These movements modulate the airstream entering the various supralaryngeal cavities from the larynx and give rise to an acoustic waveform. Characterizing the articulatory events that take place to produce speech sounds is a challenging endeavor. Sometimes movements can be inferred from perceptual identification (both auditory and visual) by the clinician. Other techniques, such as acoustic analysis, can also help determine the articulatory gestures from which a given speech sequence originates. However, the extraordinary flexibility of the speech production system usually makes this method problematic: different combinations of articulatory positions give rise to similar acoustic-perceptual outputs. This phenomenon, referred to as “motor equivalence,” is a necessary characteristic of speech production, as it allows speakers to produce intelligible speech despite various changes in speech conditions: increased speech rate, loudness, and competing demands induced by coarticulation, among others. People with speech disorders use this property of the motor system to compensate for various deficits and limitations. In-depth investigation of the articulatory strategies they use to produce speech can reveal alternative strategies. Over the years, several instrumental techniques and tools have been designed to track the movements and shape of the orofacial articulators. In this chapter, we focus on the supralaryngeal articulators, namely the jaw, tongue, soft palate, and lips. Instrumental analyses of the laryngeal system are addressed in Chapter 36 (this volume). Depending on the accessibility of the structure being studied (the lips are more easily accessible to the experimenter than the tongue, for instance), different tools will be used, each with a different degree of invasiveness. Furthermore, whether the experimenter’s goal is to assess the speaker’s specific speech production characteristics at the evaluation stage or to compare production strategies at different points in time during therapy will have a substantial impact on the kind of instrument that can be used. Finally, and perhaps most importantly, the specific type of disorder, its severity, and the target population will dictate the pool of instrumental techniques that can be used to investigate speech production mechanisms. For the sake of clarity, we have grouped the techniques into five categories, depending on the targeted process: (i) muscle contraction measurement, (ii) point tracking systems to measure supraglottal articulatory displacement, (iii) image tracking systems, (iv) tongue-to-palate contact tracking, and (v) airflow measurement (aerometry).
34.2 Measurement of Muscle Contractions (Electromyography)
Electromyography (EMG) measures electrical signals related to muscle contractions. The myoelectric signal can be captured by two types of electrodes: surface electrodes (applied to the skin or tissue) and intramuscular electrodes (inserted through wires or needles). Because surface electrodes are less invasive than intramuscular electrodes, they have been more widely used by researchers. In both cases, electrode placement requires precision and preparation of the skin surface. EMG signals can be perturbed by noise, nearby electrical sources, fatigue, and muscle length, among other things. Once the electrodes are positioned, they capture myoelectric activity signals, which must be amplified and filtered before analysis. EMG has been used to study the pathophysiology of motor speech disorders. For instance, individuals who stutter have been found to display abnormal EMG patterns in the form of increased or abnormal signal peaks (Denny & Smith, 1992; McClean et al., 1984; van Lieshout et al., 1993). Another speech disorder that has been investigated using EMG is apraxia of speech (AOS), a disorder characterized by deficits in motor planning and programming (Fromm et al., 1982). Speakers with dysarthria, a condition involving problems with muscle timing and strength, have also been studied with EMG. Dysarthric speech related to Parkinson’s disease results in abnormal EMG patterns of the orofacial muscles (Moore & Scudder, 1989), and the effects of treatment (namely levodopa) can be seen in the muscle activity of clients with Parkinson’s disease. Another application of EMG in speech disorders is biofeedback: EMG measures can be converted into audio or visual signals and made accessible to the speaker, which can help reduce the manifestations of stuttering (e.g., see Block et al., 2004). Muscular biofeedback has also proven useful in managing dysarthria. Despite these promising results, EMG is still rarely used by speech and language pathologists in a clinical setting. Not only does it require specialized expertise, but it is also costly, and data processing and analysis are labor-intensive.
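The amplify-rectify-smooth processing just described can be prototyped in a few lines of signal-processing code. The following is a minimal sketch rather than a validated clinical pipeline; the pass band and smoothing cutoff are common choices in the surface EMG literature, not values prescribed by the studies cited above.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def emg_envelope(raw, fs, band=(20.0, 450.0), smooth_hz=5.0):
    """Return a smoothed amplitude envelope of a raw surface EMG signal.

    raw: 1-D array of EMG samples; fs: sampling rate in Hz.
    band: pass band isolating myoelectric activity (a typical choice).
    smooth_hz: low-pass cutoff applied after rectification.
    """
    # Band-pass to suppress motion artifacts (low f) and noise (high f).
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, raw)
    # Full-wave rectification, then low-pass smoothing to get the envelope.
    rectified = np.abs(filtered)
    b, a = butter(4, smooth_hz / (fs / 2), btype="low")
    return filtfilt(b, a, rectified)
```

An envelope of this kind is also the natural input signal for the audio or visual biofeedback applications mentioned above.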
34.3 Point Tracking Systems
Muscle contractions give rise to articulatory movements that can be monitored using a wide variety of techniques and methodological approaches. Flesh-point tracking methods, involving video tracking or electromagnetic fields, allow specific anatomical locations to be recorded. The orofacial articulators involved in speech production may be internal (e.g., the tongue) or external (e.g., the lips); thus, some of them can be monitored using video tracking. Even though internal articulators are not directly accessible to cameras, most of them create structural deformations of the skin, which can also be tracked and analyzed. Data collected by flesh-point tracking methods are particularly well suited to studying kinematic parameters, such as the displacement, speed, and acceleration of a given location (a minimal example of deriving these parameters is sketched below). In clinical speech research, each tool has its advantages and disadvantages, which we describe in the following sections.
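As a simple illustration, the sketch below derives speed and acceleration from a uniformly sampled sequence of tracked marker positions using first-order central differences; it applies equally to any of the point tracking systems described in this section, and the units are only illustrative.

```python
import numpy as np

def kinematics(positions, fs):
    """Estimate speed and acceleration from tracked marker positions.

    positions: array of shape (n_frames, n_dims), e.g. x/y in mm.
    fs: frame rate in Hz. Returns (speed, acceleration) in mm/s, mm/s^2.
    """
    # Central differences approximate the derivative at each frame.
    velocity = np.gradient(positions, 1.0 / fs, axis=0)
    speed = np.linalg.norm(velocity, axis=1)      # velocity magnitude
    acceleration = np.gradient(speed, 1.0 / fs)   # rate of change of speed
    return speed, acceleration
```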
34.3.1 Video Tracking
The easiest technique for recording and analyzing articulatory motion during speech production in a clinical setting is undoubtedly standard video tracking. This method is already widely used by clinicians to assess fluency disorders, for instance, or to keep track of clients’ progress, and simple adaptations allow video recording to be used as a tool for characterizing articulatory motion. One technique consists of affixing small adhesive dots to locations of interest on the speaker’s face (e.g., the lips or the chin). These dots are often in colors
that contrast with the face for easier detection in the recorded signal. For instance, to track lip position over time, simple blue dots affixed to the midpoints of the upper and lower lips are used, as can be seen in Figure 34.1 (left panel). One camera is positioned to capture a frontal view of the face, and sometimes a second camera is positioned at the speaker’s side to capture the profile view. This view allows for measurement of lip protrusion, usually in reference to a fixed point (a ruler attached to a head mount or glasses). The advantage of using blue is that it contrasts strongly with skin colors and can easily be tracked automatically (Kabakoff et al., 2023). In terms of temporal resolution, standard cameras (or smartphones) provide 29.97 images per second (NTSC format) or 25 per second (PAL/SECAM format). An alternative is to make use of the facial landmark tracking provided by software such as the OpenFace package (Baltrušaitis et al., 2018). This open-source software provides the spatial location of 68 landmarks distributed around the face, including the inner and outer lip boundaries, with an output temporal resolution of 25 frames per second. An example of OpenFace landmark fitting is provided in Figure 34.1 (right panel). Several factors must be taken into account during the recording phase, as they have an impact on the quality of detection in the video file. For instance, lighting needs to be adequate and shadows should be avoided. For consistency across sessions, a calibration factor establishing the number of image pixels per millimeter must be determined. Importantly, head movement has to be considered, as the position of a blue marker on the lips, for instance, might shift due to active movement of the lips or to head movement. To avoid this kind of noise, one option is to limit head movement by constraining the head with a helmet. Provided the camera does not move, this method allows direct extraction of marker position without pre-processing the data. Obviously, though, speakers may experience discomfort, especially very young children and persons with motor tremor. Another option is to track head movement and subtract it from the target markers. Usually, three markers are positioned on the speaker’s forehead (as shown in Figure 34.1, left panel) to extract reference data. When lip position over time is examined, for instance, lip x and y positions are translated and rotated to an initial reference position. This method has the advantage of increasing the speaker’s comfort and minimally interfering with speech production, but it requires additional post-processing. Note that simple measures of lip opening, representing the Euclidean distance between the upper and lower midsagittal or left/right mouth corner landmarks, are independent of head movement and can be computed directly.
Figure 34.1 Experimental setup for video tracking using blue markers (left), and example of OpenFace landmark fitting (right).
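The translate-and-rotate head-movement correction described above amounts to estimating, for each video frame, the rigid transformation that maps the three forehead markers onto their positions in a chosen reference frame, and then applying that transformation to the lip markers. The following sketch shows one possible formulation (the SVD-based Kabsch method); it is an illustration of the general idea, not the specific procedure used in the studies cited above.

```python
import numpy as np

def rigid_align(ref_markers, cur_markers, points):
    """Map `points` from the current frame into the reference frame.

    ref_markers, cur_markers: (3, 2) arrays of forehead marker x/y
    positions in the reference and current frames; points: (n, 2)
    lip marker positions from the current frame.
    """
    # Center both marker sets on their centroids.
    ref_c, cur_c = ref_markers.mean(axis=0), cur_markers.mean(axis=0)
    # Kabsch: SVD of the cross-covariance gives the optimal rotation.
    u, _, vt = np.linalg.svd((cur_markers - cur_c).T @ (ref_markers - ref_c))
    rot = (u @ vt).T
    if np.linalg.det(rot) < 0:     # guard against an accidental reflection
        u[:, -1] *= -1
        rot = (u @ vt).T
    return (points - cur_c) @ rot.T + ref_c
```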
In recent years, research on speech disorders has witnessed the development of computer vision algorithms based on deep learning. Automatic detection of lip and jaw position from video recordings provides kinematic markers (e.g., motion range and speed) that characterize speech produced by speakers with Parkinson’s disease and amyotrophic lateral sclerosis (ALS) versus healthy controls (Bandini et al., 2018; Guarin et al., 2020). An automated segmentation algorithm applied to video recordings of speakers uttering the sentence Buy Bobby a poppy can adequately detect the kinematic features of speech disorders (Naeini et al., 2022). Given a smartphone that incorporates a depth camera, a simple video recording of the face, without any markers, can be made and later analyzed with face motion tracking. For instance, facial landmark detection algorithms such as OpenFace have been developed to detect early signs of dysarthria and are currently being used for clinical assessment (Jafari, 2022).
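As an illustration of how landmark output of this kind can be turned into a kinematic measure, the sketch below computes a frame-by-frame lip aperture from a table of 2D facial landmarks. It assumes the widely used 68-point layout (in which landmarks 62 and 66 are the inner upper- and lower-lip midpoints) and OpenFace-style column names of the form x_62/y_62; both assumptions should be checked against the actual output of the software version used.

```python
import numpy as np
import pandas as pd

def lip_aperture(csv_path, upper=62, lower=66):
    """Lip aperture time series (pixels) from a 68-point landmark CSV.

    Assumes columns named x_<i>/y_<i> for landmark i (OpenFace-style);
    landmarks 62/66 are the inner-lip midpoints in the 68-point scheme.
    """
    df = pd.read_csv(csv_path)
    df.columns = df.columns.str.strip()  # tolerate padded column names
    du = df[f"x_{upper}"] - df[f"x_{lower}"]
    dv = df[f"y_{upper}"] - df[f"y_{lower}"]
    # Euclidean distance between the inner-lip midpoints; as noted above,
    # this measure is insensitive to rigid head movement.
    return np.hypot(du, dv).to_numpy()
```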
34.3.2 Optoelectronic Systems
Some optical systems track the position of infrared light-emitting diodes (active markers, or iREDs) by triangulating across multiple cameras with a known geometry. Optotrak (Northern Digital) systems, for instance, have been used in a number of studies to evaluate the coordinated actions of visible articulators (e.g., Ostry et al., 1997; Yehia et al., 1998). Optotrak couples excellent spatial resolution (<0.1 mm) with higher temporal resolution than standard video tracking (up to 200 Hz, depending on the number of iREDs used) and does not require any calibration. iREDs, connected to their power source by lead wires, are placed on the locations of interest using double-sided tape. A different approach to optical tracking, such as that used by Vicon systems, uses reflective (passive) ball-like markers, also placed on the locations of interest with double-sided tape. This kind of system uses multiple infrared cameras that collate the reflections from the markers to provide highly accurate tracking of marker positions over time, though there is a tradeoff between marker size and accuracy. Because no wires are necessary, such systems are often used to capture large-scale limb movements like those observed in co-speech gesturing and sign language (e.g., Krivokapić et al., 2017). Optoelectronic systems can also provide real-time visual feedback of the lips and jaw: landmark positions can be mapped to virtual three-dimensional avatars, so speakers can see their own speech movements reproduced on an avatar in real time, which generates realistic embodied biofeedback (Vidou et al., 2020).
34.3.3 Electromagnetic Articulography (EMA)
The use of electromagnetic fields to track articulatory movements during speech production was fostered by the development of systems by two main research groups, one American (Perkell et al., 1992) and one German (Schönle et al., 1987). The latter system has been commercialized and distributed by the German company Carstens Medizinelektronik. This method relies on the transduction of current flows in sensor “coils” induced by radio frequency transmitters oscillating at differing kHz rates (a now-discontinued alternative system, the Northern Digital WAVE, used strobed transmitters at the same frequency). Comparing the superimposed signal amplitudes in a sensor from each transmitter against its calibrated response within the field permits recovery of three spatial dimensions and two angular orientations (azimuth and elevation), a so-called 5-D system. Early EMA variants could track 2D positions of sensors only within the midsagittal plane, and consequently this technique is referred to as “electromagnetic midsagittal articulography” (EMMA). These systems required a stabilizing helmet to mount the transmitters and to maintain the speaker’s head in the same position relative to them. As the experiment
proceeded, the helmet sometimes made participants uncomfortable. Current 5-D EMA systems (Carstens AG-500 and AG-501; Northern Digital WAVE and VOX) do not use a helmet and allow free movement of the head. Various assessments of the accuracy of these systems have been published (Savariaux et al., 2017; Sigona et al., 2018; Rebernik et al., 2021b). The AG-501 (displayed in Figure 34.2, left panel) has a spatial accuracy of ~0.3 mm with a maximum sampling rate of 1,250 frames per second (in practice this is usually downsampled to 250 fps). Gating pulses linked to the beginning and end of kinematic data sampling provide precise alignment with co-collected audio data. Although EMA sensors are small (~3 mm longitudinally), they are attached directly to articulator surfaces and connected by lead wires, and thus potentially perturb natural articulation. Sensors are nevertheless well tolerated by most participants, who adapt to them within a few minutes of practice (Dromey et al., 2018). EMA sensors used intraorally are generally dipped in latex to minimize interaction with saliva and affixed to articulator surfaces using dental glue. A typical sensor arrangement might include three sensors placed on the tongue to track midsagittal posture, an additional parasagittal sensor on the blade to track doming, one or two placed on mandible dentition, two on the vermillion border of the upper and lower lips, and one at a mouth corner. Typical sensor placement on the tongue is shown in Figure 34.2 (right panel). Additional reference sensors placed on the left and right mastoid processes, the upper incisors, and/or the nasion are used to correct for head movement and to normalize all trajectories to a subject-based coordinate system relative to the occlusal plane. This is accomplished by collecting a reference trial during which a bite plate (generally a piece of rigid Plexiglas) with three sensors attached to it is clasped between the participant’s teeth. This procedure is discussed in Rebernik et al. (2021a), which provides a review of best practices for recording data with electromagnetic articulography. EMA has been used mainly with adults, although some studies have involved child participants (e.g., Katz & Bharadwaj, 2001; Murdoch et al., 2012; Schötz et al., 2013). In clinical speech research, various conditions have been examined using EMA. For instance, Didirková et al. (2021) investigated the movements of supraglottal articulators and interarticulatory coupling in dysfluencies produced by individuals who stutter. EMA data have also been collected from adults diagnosed with AOS to better understand the mechanisms underlying specific kinematic profiles (Bartle-Meyer et al., 2009). Hypokinetic dysarthria related to Parkinson’s disease has also been investigated, either to gain a better understanding of
Figure 34.2 Carstens AG-501 (left) and typical EMA sensor layout (right).
the speech disorder (Mefferd & Dietrich, 2019) or to assess the effects of drug treatment on speech production outcomes (Thies et al., 2021). ALS has also been the focus of experimental studies using EMA (Shellikeri et al., 2016; Teplansky et al., 2023). This experimental technique has also been useful for examining the different articulatory strategies acquired by congenitally blind adults and sighted adults, revealing the importance of vision in motor control (Trudeau-Fisette et al., 2017). Although manufacturers advise against the use of EMA with individuals wearing cochlear implants (because of the risk that the electromagnetic field may damage the implants), Masapollo et al. (2021) recently reported that the technique is safe for implant users. As with other point tracking methods, data collected with electromagnetic articulography can also be used to provide visual biofeedback of articulatory movements, especially of the tongue, which is not visually accessible. Because dynamic visual displays of points on a screen are difficult to interpret, the data provided by the system are usually embedded in a two-dimensional or three-dimensional head displayed on a screen (Katz et al., 2010; Vick et al., 2017).
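The bite-plate procedure described above can be made concrete with a short sketch: given the positions of the three bite-plate sensors from the reference trial, one can construct an orthonormal coordinate frame whose horizontal plane is the occlusal plane, and then express all head-corrected sensor trajectories in that frame. This is a minimal illustration of the general idea, not the algorithm of any EMA vendor or of the papers cited above; the assumed sensor layout on the plate (one front, two back) and the axis conventions are illustrative.

```python
import numpy as np

def occlusal_frame(front, back_left, back_right):
    """Build a rotation matrix and origin from three bite-plate sensors.

    Axes: x points forward along the plate midline, z is normal to the
    occlusal plane, and y completes a right-handed frame.
    """
    origin = (back_left + back_right) / 2.0
    x = front - origin
    x /= np.linalg.norm(x)
    z = np.cross(x, back_right - back_left)  # normal to the plate (up to sign)
    z /= np.linalg.norm(z)
    y = np.cross(z, x)                       # completes the frame
    return np.stack([x, y, z]), origin

def to_occlusal(traj, rot, origin):
    """Express an (n, 3) sensor trajectory in occlusal-plane coordinates."""
    return (traj - origin) @ rot.T

# rot, origin = occlusal_frame(fr, bl, br)   # sensor positions from the
# tongue_tip_occ = to_occlusal(tt, rot, origin)  # bite-plate reference trial
```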
34.3.4 Analysis of Point Source Trajectories
The point source tracking methods outlined above are characterized by sparse sampling but good spatial and temporal precision and accuracy, and by their association with fixed flesh-point landmarks. They produce a package of synchronized time-varying trajectories, usually aligned with co-recorded audio, that are useful for identifying the kinematic properties of coordinated articulatory movement. Quantification of speech events from such trajectories often relies on identifying inflections in the first-order central difference across all dimensions of sensor movement, which approximates articulator velocity. An effective way of measuring an articulatory “gesture” or event is to find the onset of related movement, the peak velocity toward the target, the velocity minimum associated with the target, and the peak velocity and cessation of movement following target achievement, as illustrated in Figure 34.3 (in practice the onset and completion of movement are noisy, and are usually determined instead by some consistent percentage of local peak velocity).
Figure 34.3 Landmarks for delimiting an apical nasal gesture, showing vertical component of tongue tip movement (TTy) and corresponding absolute velocity (TTvel). Gesture on/offset (GONS/GOFFS), peak velocity (PVEL), and target maximum constriction (MAXC) are determined by inflections on the velocity signal.
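A minimal sketch of this velocity-criterion labeling, assuming a window containing a single closing movement and the 20%-of-peak-velocity criterion that is a common (but not universal) choice, might proceed as follows; it illustrates the logic of Figure 34.3 rather than reproducing the algorithms of the toolboxes mentioned below.

```python
import numpy as np

def delimit_gesture(speed, threshold=0.2):
    """Locate landmarks of a single movement toward a constriction target.

    speed: 1-D absolute velocity of a sensor (e.g., tongue tip) over a
    window containing one closing movement. Returns (GONS, PVEL, MAXC).
    """
    pvel = int(np.argmax(speed))        # peak velocity toward the target
    crit = threshold * speed[pvel]      # consistent percentage of local peak
    below = speed < crit
    # Gesture onset: last sub-criterion frame before the velocity peak.
    gons = int(np.nonzero(below[:pvel])[0][-1]) if below[:pvel].any() else 0
    # Maximum constriction: first sub-criterion frame after the peak,
    # approximating the velocity minimum at target achievement.
    after = np.nonzero(below[pvel:])[0]
    maxc = int(pvel + after[0]) if after.size else len(speed) - 1
    return gons, pvel, maxc
```

The release phase (the second velocity peak and GOFFS in Figure 34.3) can be located by applying the same logic to the window following MAXC.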
Several Matlab toolboxes have been built and shared within the speech research community to assist experimenters with labeling of this type, for example EGUANA (described in Henriques & van Lieshout, 2013) and MVIEW (Tiede, 2005). Prosodic aspects of articulation can be associated with gestural “stiffness” (peak velocity divided by distance traveled), and relative timings between maximum constriction landmarks can provide insight into syllable structure, as in CV vs. CCV coordination (C-center effects; Browman & Goldstein, 1988). Delimited trajectories have also been analyzed to assess variability across multiple repetitions for clinical assessment (e.g., Lucero et al., 1997; Smith et al., 1995), and to identify systematic differences across dialectal groups using generalized additive mixed modeling (Wieling et al., 2016).
34.4 Imaging Methods
Unlike point tracking methods, imaging tools provide the overall shape of one or more articulators; some even allow the entire vocal tract to be visualized.
34.4.1 X-rays
X-ray imaging involves sending ionizing radiation (a form of electromagnetic waves with higher energy than visible light) through the body. X-rays are absorbed by bodily structures to different degrees, depending on the structures’ density and atomic composition. For instance, since bones have a high concentration of calcium, they absorb x-rays very readily; on the resulting image, where shadows of the structures are printed, bones therefore appear with very high contrast. Conversely, less dense tissues, such as muscles or air-filled cavities, do not absorb x-rays as much as bony structures and appear on the image in shades of gray. In speech research, this method has been used to collect still images (usually sagittal views) or dynamic sequences of the entire vocal tract (cineradiography). In a case study of an adult with congenital aglossia, McMicken et al. (2014) examined images collected with cineradiography and discovered compensation strategies involving vertical movement of the hyoid bone, which contributed to changing vocal tract length and formant values, leading to relatively intelligible sounds. Articulatory movements related to velopharyngeal function in individuals with cleft palate have also been investigated using cineradiography (Brooks et al., 1965; Henningsson & Isberg, 1991). This technique has also been used to show that interarticulator coordination in speakers with hearing impairments differs from that of normal-hearing controls (Tye-Murray, 1987). To decrease the amount of x-rays required (and, hence, the health risks linked to exposure to ionizing radiation), a system was developed consisting of a narrow beam of x-rays tracking 2 to 3 mm gold pellets affixed to the tongue, jaw, and lips. The beam was active only in the known neighborhood of the pellets, which were tracked using their shadows cast onto CCD detectors. This system, first used for speech research at the University of Tokyo (Fujimura et al., 1973), has the advantage of allowing a large amount of data to be collected while minimizing exposure to radiation. The system was further developed and used at the University of Wisconsin-Madison, which published a large database of articulatory data, collected with the system, that is still in wide use (Westbury, 1994). Although x-rays are used, this technique is more similar to a point-tracking method, as described in the previous sections. As with traditional x-ray technology, lip and tongue movements in various clinical conditions have been described with the x-ray microbeam system (Tye-Murray, 1991; Weismer et al., 2003). While it is still possible to use x-rays to image orofacial articulatory structures if the participant has to undergo an imaging exam anyway (e.g., in preparation for surgery), in general radiation concerns and the advent of real-time MRI have largely curtailed the use of x-rays in speech
research. Apart from the health hazard of ionizing radiation, another disadvantage of x-ray imaging is that it “passes through” the head and thus superimposes structures on one another: part of the tongue might be obscured by shadows cast by the jaw or the teeth, for example. Moreover, it does not allow given anatomical landmarks on the tongue to be tracked unless markers are used, as in the case of the x-ray microbeam system.
34.4.2 Ultrasound
Ultrasound imaging has become a popular method for studying speech production due to its affordability, portability, and ability to capture real-time images of the tongue. Ultrasound imaging is a non-invasive technique that uses high-frequency sound waves to produce images of internal organs and tissues. In the context of speech disorders, it is mainly used to capture images of the tongue during speech production: the tongue is imaged submentally, with the ultrasound probe placed beneath the chin and directed towards the tongue. In ultrasound imaging, the terms “probe” and “transducer” are often used interchangeably, but technically they refer to different components of the ultrasound system. A transducer is the part of the system that generates and receives sound waves; the probe is the handheld device that is placed on the speaker’s skin. The sound waves are generated by piezoelectric crystals located in the transducer that vibrate when an electric current is applied. As the sound waves travel through the tissue, they encounter structures of different densities, which cause them to be reflected back to the transducer. Coherent reflected waves are converted into electrical signals by the piezoelectric crystals and processed by a computer to create an image reflecting the imaged internal structure and surface of the tongue. An important limitation of ultrasound is that the sound waves lose coherence in air, so no useful structural information (e.g., palatal shape) can be recovered beyond an air/tissue boundary (such as the surface of the tongue). There are different ways of viewing the imaged data. The most common and easiest to interpret is called B-Mode (for brightness), which displays a reconstructed 2D image of the region scanned, updated at a frame rate that interacts with the dimensions of that region. Figure 34.4 (left panel) shows a midsagittal view of the tongue collected with ultrasound imaging in B-Mode. M-Mode (for motion) provides, in strip chart format, the time-varying intensity of echoes along a single angular offset, usually relative to a cursor displayed on the B-Mode display. This can be interpreted as vocal tract constriction degree at that point in the vocal tract and has been used, for example, to provide visual feedback in clinical applications (Dugan et al., 2019).
Figure 34.4 Midsagittal B-Mode ultrasound image with superimposed tongue surface contour (left); probe stabilization helmet (right).
Although 3D systems exist (Lulich et al., 2018), most research and clinical systems image a single plane, typically with midsagittal or coronal orientation. The piezoelectric crystals usually consist of an array of elements arranged in a grid pattern; the size and shape of the array can vary depending on the desired field of view and imaging depth. In speech research, convex and microconvex probes are the two main types used to image the tongue (Cleland, 2021). A convex probe, also known as a curved array probe, has a curved shape and a wide field of view, and is typically used with adults. The convex shape of the probe allows it to capture a larger area of the tissue being imaged, resulting in wider images with better penetration. A microconvex probe, on the other hand, also known as a sector probe or a phased array probe, has a smaller curved shape and a narrower field of view. The smaller size of the microconvex probe allows for greater maneuverability and access to tighter spaces, resulting in higher-resolution images of smaller areas; this type of probe is well suited for imaging children’s tongues. The depth of tissue that can be imaged depends on the frequency of the transducer. In general, convex probes have a lower frequency range, typically between 2 and 5 MHz; this lower frequency allows the sound waves to penetrate deeper into the body to produce images of deep structures. Microconvex probes typically have a higher frequency range, between 5 and 10 MHz; the higher frequency allows for better resolution of superficial structures, but the penetration depth is not as great as with a convex probe. Ultrasound imaging provides non-invasive access to tongue movements, which is clearly useful when examining speech production in clinical populations. However, unlike point tracking methods such as EMA, ultrasound imaging does not provide clear anatomical landmarks in the resulting image. For example, while the extracted tongue contours can be broadly divided into parts (such as posterior and anterior), the same fixed anatomical points cannot be tracked over time. Therefore, caution is advised when analyzing the data (see Section 34.4.4 for an overview of image analysis methods). An important point to consider when collecting data is whether or not the probe is in a fixed position relative to the head. Frequently a helmet system is used to keep the probe in a fixed orientation and in firm contact with the underside of the chin (Derrick et al., 2018). If this system also constrains the jaw, then the tongue images are in the same coordinate space throughout the sequence and can be compared across articulation instances (directly or using extracted contours; see Section 34.4.4); however, constraining the jaw potentially perturbs articulation. Because the probe is coupled to the jaw, a system which does not constrain the jaw results in images that are not consistent with palatal hard structure. While this supports comparison of shape, it does not appropriately reflect potential differences in tongue height. One approach to correcting for jaw displacement is to track relative head and probe position and correct the coordinates of extracted tongue contours so that they are projected onto a common space (Whalen et al., 2005). The right panel of Figure 34.4 shows co-collection of EMA with sensors on the probe used for this purpose.
Over the years, speech in various conditions, such as childhood apraxia of speech, hearing impairment (Turgeon et al., 2015), cleft palate (Bressmann et al., 2009; Dokovova & Cleland, 2022), persistent speech sound disorders and residual speech sound errors (Preston et al., 2017), and Down syndrome, has been investigated. For a comprehensive review of the use of ultrasound imaging in clinical speech practice, readers are referred to Cleland (2021) and Cleland and Allen (2023). Several studies have also investigated the effectiveness of ultrasound as a biofeedback tool. While some researchers have found that visual feedback can be beneficial in improving speech production in children with childhood apraxia of speech (Peter et al., 2020; Preston et al., 2016), Preston et al. (2019) noted in their systematic review that generalization can be challenging due to the significant variations in sample size and study design among existing studies.
34.4.3 Magnetic Resonance Imaging
Early attempts to image the vocal tract using Magnetic Resonance Imaging (MRI) suffered from very long acquisition times (3–4 hours) and consequent image noise and subject fatigue. The first successful demonstration of imaging contrasting vowels using MRI was reported in Baer et al. (1991), which used volume data acquired in the transverse plane to generate the corresponding vocal tract area functions. Numerous subsequent studies have examined vowels (Story et al., 1998), liquids (Zhou et al., 2007), fricatives (Shadle et al., 2008), and other sounds that can be sustained over the ~10 s needed to acquire a complete volume. More recently, the development of new excitation methods has led to “real-time” acquisition in the midsagittal plane (RT-MRI; Ramanarayanan et al., 2018, provide an overview), which provides a full view of the vocal tract from larynx through lips at frame rates of at least 25 fps (Narayanan et al., 2004). While RT-MRI currently provides the most complete view of the interacting speech articulators during production, its drawbacks include the expense of renting scanner time and the need for participants to phonate in a supine posture under acoustically noisy conditions. As with all imaging methods, tracking fixed flesh points on the continuous tongue surface is problematic, but some success has been reported using MRI-visible markers (Badin et al., 2013).
34.4.4 Analysis of Imaging Data
Although some research applications rely on volumetric imaging of the three-dimensional vocal tract, for example as a means of modeling higher modes of acoustic wave propagation through it (e.g., Zhou et al., 2008), most speech researchers focus on the time-varying midsagittal projection through the tract, in particular the lingual air-tissue boundary. A starting point for analysis is to discretize this boundary, often referred to as a tongue “contour,” into a set of coordinates usually normalized to millimeters and a common origin. Software tools for doing this efficiently and systematically fall broadly into two categories: neural network approaches (e.g., Fasel & Berry, 2010) and image featural gradient methods (e.g., Laporte & Ménard, 2018; Li et al., 2005), as well as approaches that combine both (e.g., Chen et al., 2020). Tongue contour sets extracted from cineradiography or RT-MRI are inherently related to vocal tract hard structures and can therefore be compared directly. Ultrasound-derived contours, however, reflect the current position of the imaging probe. Since this is coupled to the mandible, contours obtained during a jaw-lowered interval ([a]) do not reflect that lowering relative to a jaw-raised interval ([i]). If it is important to compare across such height differences, the contour positions must be adjusted to account for probe displacement relative to the palate (e.g., Noiray et al., 2020; Whalen et al., 2005). Statistical approaches to quantifying systematic differences between tongue contours include smoothing-spline ANOVA (SS-ANOVA; e.g., Davidson, 2006) and generalized additive mixed modeling (GAMM; e.g., Coretta, 2019). In clinical settings, indexing tongue complexity using an index of contour curvature (MCI; Dawson et al., 2016) and the number of contour inflections (NINFL; Preston et al., 2019) has proved useful for diagnosing and measuring treatment progress in children with residual speech errors.
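As an illustration of the inflection-counting idea, the sketch below computes the signed curvature of a discretized tongue contour and counts sign changes. It is a simplified stand-in for the NINFL measure of Preston et al. (2019) rather than a reimplementation of it; in practice the contour would be smoothed first so that measurement noise does not inflate the count.

```python
import numpy as np

def count_inflections(x, y):
    """Count curvature sign changes along a 2D tongue contour.

    x, y: coordinates of contour points ordered along the tongue surface
    (assumed already smoothed). Returns the number of inflections.
    """
    # Signed curvature of a planar curve: (x'y'' - y'x'') / (x'^2 + y'^2)^1.5
    # (derivatives taken with respect to point index via central differences).
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    kappa = (dx * ddy - dy * ddx) / np.power(dx**2 + dy**2, 1.5)
    # An inflection is a sign change of curvature between adjacent points.
    signs = np.sign(kappa)
    signs = signs[signs != 0]          # ignore exactly-flat samples
    return int(np.count_nonzero(np.diff(signs) != 0))
```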
34.5 Palatography
Palatographic methods provide data on contact patterns between the tongue and the palate. Static palatography was pioneered by Rousselot in the nineteenth century and is still used in linguistic fieldwork. It uses powder, typically charcoal, applied to the tongue to show where contact with the palate during a test utterance results in transfer of the powder to the palatal surface, assessed using a mirror positioned within the speaker’s mouth. While this approach is inexpensive and straightforward, it is limited to a single production at a time, as the mouth must be rinsed and the powder reapplied for each trial. Electropalatography (EPG) is a more flexible approach that uses electrodes distributed over a palatal prosthesis to track electrical contact with the tongue continuously during running speech (Hardcastle, 1972). Figure 34.5 provides a typical example of EPG contact patterns. EPG has been extensively used in research, particularly for studying coarticulation (Byrd, 1996), and has many clinical applications (Dent et al., 1995; Gibbon & Hardcastle, 1989). Its main disadvantage is that the prosthesis must be custom-fitted to each speaker’s palatal vault and can potentially interfere with normal articulation.
Figure 34.5 Example EPG contact pattern showing co-constriction of /k/ and /t/ in “pact.”
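Contact patterns of the kind shown in Figure 34.5 are usually reduced to summary indices for analysis. The sketch below assumes a frame stored as an 8 x 8 Boolean array (a simplification of common electrode layouts) and computes two frequently reported measures: the percentage of contacted electrodes and a front-back center of gravity. The layout and weighting scheme are illustrative assumptions, not the definitive definitions used in the studies cited above.

```python
import numpy as np

def epg_measures(frame):
    """Summarize one EPG frame given as an 8x8 Boolean array.

    Row 0 is assumed to be the most anterior electrode row. Returns the
    percentage of contacted electrodes and a front-back center of gravity
    (0 = fully anterior contact, 7 = fully posterior contact).
    """
    contacts = frame.astype(float)
    total = contacts.sum()
    percent_contact = 100.0 * total / contacts.size
    if total == 0:
        return percent_contact, np.nan         # no contact: CoG undefined
    rows = np.arange(frame.shape[0])[:, None]  # row indices, front to back
    cog = (contacts * rows).sum() / total      # contact-weighted mean row
    return percent_contact, cog
```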
34.6 Aerometry Methods
Methods for directly or indirectly observing the airflow and pressure associated with speech are also important for understanding how respiration, and its coordination with production, can be degraded or disordered.
34.6.1 Plethysmography
Pulmonary capacity can be assessed either directly, using an enclosed cabinet whose monitored pressure is compared to the volume of air inspired by the participant, or indirectly (inductive plethysmography), using bands placed around the rib cage and abdomen whose extension is proportional to the measured electrical resistance, calibrated against a spirometer (commercialized as Respitrace by Ambulatory Monitoring; Russell & Stathopoulos, 1988). Clinical applications include the diagnosis of motor speech disorders such as the dysarthria associated with Parkinson’s disease (Lowit & Kent, 2010) and guiding the treatment of childhood speech breathing issues (Solomon & Charron, 1998).
34.6.2 Nasalance
Coupling of the nasal tract to the oral airway is contrastive in nearly all known languages. Direct observation of velopharyngeal function using endoscopy is highly intrusive, so preferred alternatives rely instead on either airflow methods (e.g., the Rothenberg mask), which measure oral and nasal airflow directly, or distinct acoustic recordings made at the nares and mouth, with each microphone isolated from the other by an isolation plate (the Nasometer). In both cases nasality is expressed as a proportion of the nasal to the oral signal. The output of a Nasometer is generally converted to the quotient of the nasal sound pressure level to the sum of the nasal and oral SPL values, expressed as a percentage termed “nasalance” (Bressmann, 2021). Several commercial versions (e.g., Kay Pentax) exist and are used in clinical settings to diagnose hypernasality and assess the outcomes of cleft palate surgical interventions (Dalston et al., 1991; Watterson, 2020).
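The nasalance quotient just described is straightforward to compute from the two channels. The following sketch is a simplified illustration rather than the Nasometer's proprietary processing: it band-pass filters each channel around 500 Hz (approximating the filtering applied by commercial devices; the exact band is an assumption), uses RMS amplitude as a stand-in for sound pressure, and reports nasal / (nasal + oral) x 100.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def nasalance(nasal, oral, fs, band=(350.0, 650.0)):
    """Mean nasalance (%) from simultaneous nasal and oral recordings.

    nasal, oral: 1-D arrays from the two microphones; fs: sampling rate.
    band: pass band approximating the ~500 Hz filtering of commercial
    nasometers (an assumption; adjust to match the device being modeled).
    """
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    n = filtfilt(b, a, nasal)
    o = filtfilt(b, a, oral)
    # RMS amplitude stands in for sound pressure level in each channel.
    n_rms, o_rms = np.sqrt(np.mean(n**2)), np.sqrt(np.mean(o**2))
    return 100.0 * n_rms / (n_rms + o_rms)
```

In practice nasalance is reported over a sliding window or per utterance, so that hypernasality can be localized to particular speech materials.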
34.7 Conclusion
This chapter has provided an overview of some of the most commonly used instruments and methods for investigating speech production in clinical populations. The selection of an instrument depends on the specific problem being addressed, the target population, and the goal of data collection (whether assessment, treatment, or research). With recent advances in technology, the use of instrumental techniques in speech-language pathology is expected to expand, thereby increasing our understanding of speech disorders and enabling the development of finely tuned rehabilitation strategies.
REFERENCES
Badin, P., Vargas, J. A. V., Koncki, A., Lamalle, L., & Savariaux, C. (2013). Development and implementation of fiducial markers for vocal tract MRI imaging and speech articulatory modelling. Interspeech, 1321–1325.
Baer, T., Gore, J. C., Gracco, V. C., & Nye, P. W. (1991). Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels. Journal of the Acoustical Society of America, 90(2), 799–828.
Baltrušaitis, T., Zadeh, A., Lim, Y. C., & Morency, L.-P. (2018). OpenFace 2.0: Facial behavior analysis toolkit. 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 59–66.
Bandini, A., Green, J. R., Taati, B., Orlandi, S., Zinman, L., & Yunusova, Y. (2018). Automatic detection of amyotrophic lateral sclerosis (ALS) from video-based analysis of facial movements: Speech and nonspeech tasks. 13th IEEE International Conference on Automatic Face & Gesture Recognition, 150–157.
Bartle-Meyer, C. J., Goozée, J. V., Murdoch, B. E., & Green, J. R. (2009). Kinematic analysis of articulatory coupling in acquired apraxia of speech post-stroke. Brain Injury, 23(2), 133–145. https://doi.org/10.1080/02699050802649654
Block, S., Onslow, M., Roberts, R., & White, S. (2004). Control of stuttering with EMG feedback. Advances in Speech Language Pathology, 6(2), 100–106.
Bressmann, T. (2021). Nasometry. In M. J. Ball (Ed.), Manual of clinical phonetics (1st ed., pp. 322–338). Routledge.
Bressmann, T., Bernhardt, B., & McLeod, S. (2009). Ultrasound imaging of the tongue in individuals with cleft palate: A preliminary study. Cleft Palate-Craniofacial Journal, 46(2), 162–169.
Brooks, A. R., Shelton, R. L., Jr., & Youngstrom, K. A. (1965). Compensatory tongue-palate-posterior pharyngeal wall relationships in cleft palate. Journal of Speech and Hearing Disorders, 30(2), 166–173.
Browman, C. P., & Goldstein, L. (1988). Some notes on syllable structure in articulatory phonology. Phonetica, 45(2–4), 140–155.
Byrd, D. (1996). Influences on articulatory timing in consonant sequences. Journal of Phonetics, 24(2), 209–244.
Chen, W. R., Tiede, M., & Whalen, D. H. (2020). DeepEdge: Automatic ultrasound tongue contouring combining a deep neural network and an edge detection algorithm. International Seminar on Speech Production.
Cleland, J. (2021). Ultrasound tongue imaging. In M. J. Ball (Ed.), Manual of clinical phonetics (1st ed., pp. 399–416). Routledge.
Cleland, J., & Allen, J. (2023). Ultrasound in clinical practice: What, how, why, when and where? Bulletin of the Royal College of Speech and Language Therapists. (In press).
Coretta, S. (2019). Assessing mid-sagittal tongue contours in polar coordinates using generalised additive (mixed) models. The Journal of the Acoustical Society of America, 73(4), 1322–1336.
Dalston, R. M., Warren, D. W., & Dalston, E. T. (1991). Use of nasometry as a diagnostic tool for identifying patients with velopharyngeal impairment. The Cleft Palate-Craniofacial Journal, 28(2), 184–189.
Davidson, L. (2006). Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance. The Journal of the Acoustical Society of America, 120(1), 407–415.
Dawson, K. M., Tiede, M. K., & Whalen, D. H. (2016). Methods for quantifying tongue shape and complexity using ultrasound imaging. Clinical Linguistics & Phonetics, 30(3–5), 328–344.
Denny, M., & Smith, A. (1992). Gradations in a pattern of neuromuscular activity associated with stuttering. Journal of Speech and Hearing Research, 35(6), 1216–1229.
Dent, H., Gibbon, F., & Hardcastle, B. (1995). The application of electropalatography (EPG) to the remediation of speech disorders in school-aged children and young adults. European Journal of Disorders of Communication, 30(2), 264–277.
Derrick, D., Carignan, C., Chen, W. R., Shujau, M., & Best, C. T. (2018). Three-dimensional printable ultrasound transducer stabilization system. The Journal of the Acoustical Society of America, 144(5), EL392–EL398.
Didirková, I., Le Maguer, S., & Hirsch, F. (2021). An articulatory study of differences and similarities between stuttered disfluencies and non-pathological disfluencies. Clinical Linguistics & Phonetics, 35(3), 201–221.
Dokovova, M., & Cleland, J. (2022). Testing the sensitivity of the Dorsum Excursion Index for comparing typically developing speech and cleft speech characteristics. International Cleft Congress, Edinburgh.
Dromey, C., Hunter, E., & Nissen, S. L. (2018). Speech adaptation to kinematic recording sensors: Perceptual and acoustic findings. Journal of Speech, Language, and Hearing Research, 61(3), 593–603.
Dugan, S., Li, S. R., Masterson, J., Woeste, H., Mahalingam, N., Spencer, C., Mast, D., Riley, M., & Boyce, S. E. (2019). Tongue part movement trajectories for /r/ using ultrasound. Perspectives of the ASHA Special Interest Groups, 4(6), 1644–1652.
Fasel, I., & Berry, J. (2010). Deep belief networks for real-time extraction of tongue contours from ultrasound during speech. 2010 20th International Conference on Pattern Recognition, 1493–1496.
Fromm, D., Abbs, J. H., McNeil, M. R., & Rosenbek, J. C. (1982). Simultaneous perceptual-physiological method for studying apraxia of speech. Clinical Aphasiology, 251–262.
Fujimura, O., Kiritani, S., & Ishida, H. (1973). Computer controlled radiography for observation of movements of articulatory and other human organs. Computers in Biology and Medicine, 3(4), 371–384.
Gibbon, F., & Hardcastle, W. (1989). Deviant articulation in a cleft palate child following late repair of the hard palate: A description and remediation procedure using electropalatography (EPG). Clinical Linguistics & Phonetics, 3(1), 93–110.
Guarin, D., Dempster, A., Bandini, A., Yunusova, Y., & Taati, B. (2020). Estimation of orofacial kinematics in Parkinson’s disease: Comparison of 2D and 3D markerless systems for motion tracking. 15th IEEE International Conference on Automatic Face and Gesture Recognition, 705–708.
Hardcastle, W. J. (1972). The use of electropalatography in phonetic research. Phonetica, 25(4), 197–215.
Henningsson, G., & Isberg, A. (1991). A cineradiographic study of velopharyngeal movements for deviant versus nondeviant articulation. The Cleft Palate-Craniofacial Journal, 28(1), 115–118.
Henriques, R. N., & van Lieshout, P. (2013). A comparison of methods for decoupling tongue and lower lip from jaw movements in 3D articulography. Journal of Speech, Language, and Hearing Research, 56(5), 1503–1516.
Jafari, D. (2022). 3D video tracking technology in the assessment of orofacial impairments in neurological disorders [Doctoral dissertation]. University of Toronto (Canada).
Kabakoff, H., Beames, S. P., Tiede, M., Whalen, D. H., Preston, J. L., & McAllister, T. (2023). Comparing metrics for quantification of children’s tongue shape complexity using ultrasound imaging. Clinical Linguistics & Phonetics, 37(2), 169–195.
Katz, W. F., & Bharadwaj, S. (2001). Coarticulation in fricative-vowel syllables produced by children and adults: A preliminary report. Clinical Linguistics & Phonetics, 15(1–2), 139–143.
Katz, W. F., McNeil, M. R., & Garst, D. M. (2010). Treating apraxia of speech (AOS) with EMA-supplied visual augmented feedback. Aphasiology, 24(6–8), 826–837.
Krivokapić, J., Tiede, M. K., & Tyrone, M. E. (2017). A kinematic study of prosodic structure in articulatory and manual gestures: Results from a novel method of data collection. Laboratory Phonology, 8(1). https://doi.org/10.5334/labphon.75
Laporte, C., & Ménard, L. (2018). Multi-hypothesis tracking of the tongue surface in ultrasound video recordings of normal and impaired speech. Medical Image Analysis, 44, 98–114.
Li, M., Kambhamettu, C., & Stone, M. (2005). Automatic contour tracking in ultrasound images. Clinical Linguistics & Phonetics, 19(6–7), 545–554.
Lowit, A., & Kent, R. D. (2010). Assessment of motor speech disorders. Plural Publishing.
Lucero, J. C., Munhall, K. G., Gracco, V. L., & Ramsay, J. O. (1997). On the registration of time and the patterning of speech movements. Journal of Speech, Language, and Hearing Research, 40(5), 1111–1117.
Lulich, S. M., Berkson, K. H., & de Jong, K. (2018). Acquiring and visualizing 3D/4D ultrasound recordings of tongue motion. Journal of Phonetics, 71, 410–424.
Masapollo, M., Nittrouer, S., Goel, J., & Oh, Y. (2021). Electromagnetic articulography appears feasible for assessment of speech motor skills in cochlear-implant users. Journal of the Acoustical Society of America Express Letters, 1(10). https://doi.org/10.1121/10.0006719
McClean, M., Goldsmith, H., & Cerf, A. (1984). Lower-lip EMG and displacement during bilabial disfluencies in adult stutterers. Journal of Speech and Hearing Research, 27(3), 342–349.
McMicken, B., Vento-Wilson, M., Von Berg, S., & Rogers, K. (2014). Cineradiographic examination of articulatory movement of pseudo-tongue, hyoid, and mandible in congenital aglossia. Communication Disorders Quarterly, 36(1), 3–11.
Mefferd, A. S., & Dietrich, M. S. (2019). Tongue- and jaw-specific articulatory underpinnings of reduced and enhanced acoustic vowel contrast in talkers with Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 62(7), 2118–2132.
Moore, C. A., & Scudder, R. R. (1989). Coordination of jaw muscle activity in Parkinsonian movement: Description and response to traditional treatment. In K. M. Yorkston & D. R. Beukelman (Eds.), Recent advances in clinical dysarthria (pp. 147–163). Little, Brown.
Murdoch, B. E., Cheng, H. Y., & Goozée, J. V. (2012). Developmental changes in the variability of tongue and lip movements during speech from childhood to adulthood: An EMA study. Clinical Linguistics & Phonetics, 26(3), 216–231.
Naeini, S. A., Simmatis, L., Jafar, D., Guarin, D. L., Yunusova, Y., & Taati, B. (2022). Automated temporal segmentation of orofacial assessment videos. In 2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI) (pp. 1–6).
Narayanan, S., Nayak, K., Lee, S., Sethy, A., & Byrd, D. (2004). An approach to real-time magnetic resonance imaging for speech production. The Journal of the Acoustical Society of America, 115(4), 1771–1776.
Noiray, A., Ries, J., Tiede, M., Rubertus, E., Laporte, C., & Ménard, L. (2020). Recording and analyzing kinematic data in children and adults with SOLLAR: Sonographic & Optical Linguo-Labial Articulation Recording system. Journal of the Association for Laboratory Phonology, 11(1). https://doi.org/10.5334/labphon.241
Ostry, D. J., Vatikiotis-Bateson, E., & Gribble, P. L. (1997). An examination of the degrees of freedom of human jaw motion in speech and mastication. Journal of Speech, Language, and Hearing Research, 40(6), 1341–1351.
Perkell, J. S., Cohen, M. H., Svirsky, M. A., Matthies, M. L., Garabieta, I., & Jackson, M. T. (1992). Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements. The Journal of the Acoustical Society of America, 92(6), 3078–3096.
Peter, B., Behn, N., & Brumberg, J. (2020). A systematic review of the efficacy of visual biofeedback therapy for childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 63(5), 1315–1333. Preston, J. L., Leece, M. C., & Maas, E. (2016). The role of ultrasound biofeedback in speech sound learning in childhood apraxia of speech: A single case study. Journal of Speech, Language, and Hearing Research, 59(1), 1–15. Preston, J. L., McAllister Byun, T., Boyce, S. E., Hamilton, S., Tiede, M., Phillips, E., Rivera-Campos, A., & Whalen, D. H. (2017). Ultrasound images of the tongue: A tutorial for assessment and remediation of speech sound errors. Journal of Visualized Experiments, 119, e55123. https://doi.org/10.3791/55123 Preston, J. L., McCabe, P., Tiede, M., & Whalen, D. H. (2019). Tongue shapes for rhotics in school-age children with and without residual speech errors. Clinical Linguistics & Phonetics, 33(4), 334–348. Ramanarayanan, V., Tilsen, S., Proctor, M., Töger, J., Goldstein, L., Nayak, K. S., & Narayanan, S. (2018). Analysis of speech production real-time MRI. Computer Speech & Language, 52, 1–22. Rebernik, T., Jacobi, J., Jonkers, R., Noiray, A., & Wieling, M. (2021a). A review of data collection practices using electromagnetic articulography. Laboratory Phonology, 12(1), 6. https://doi.org/10.5334/labphon.237 Rebernik, T., Jacobi, J., Tiede, M., & Wieling, M. (2021b). Accuracy assessment of two electromagnetic articulographs: Northern Digital Inc. Wave and Northern Digital Inc. Vox. Journal of Speech, Language, and Hearing Research, 64(7), 2637–2667. Russell, N. K., & Stathopoulos, E. (1988). Lung volume changes in children and adults during speech production. Journal of Speech, Language, and Hearing Research, 31(2), 146–155. Savariaux, C., Badin, P., Samson, A., & Gerber, S. (2017). A comparative study of the precision of Carstens and Northern Digital Instruments electromagnetic articulographs. Journal of Speech, Language, and Hearing Research, 60(2), 322–340. Schönle, P. W., Gräbe, K., Wenig, P., Höhne, J., Schrader, J., & Conrad, B. (1987). Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract. Brain and Language, 31(1), 26–35. Schötz, S., Frid, J., & Löfqvist, A. (2013). Development of speech motor control: Lip movement variability. Journal of the Acoustical Society of America, 133(6), 4210–4217.
Shadle, C., Proctor, M. I., & Iskarous, K. (2008). An MRI study of the effect of vowel context on English fricatives. Journal of the Acoustical Society of America, 123(5), 3735. Shellikeri, S., Green, J. R., Kulkarni, M., Rong, P., Martino, R., Zinman, L., & Yunusova, Y. (2016). Speech movement measures as markers of bulbar disease in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 59(5), 887–899. Sigona, F., Stella, M., Stella, A., Bernardini, P., Fivela, B. G., & Grimaldi, M. (2018). Assessing the position tracking reliability of Carstens’ AG500 and AG501 electromagnetic articulographs during constrained movements and speech tasks. Speech Communication, 104, 73–88. Smith, A., Goffman, L., Zelaznik, H. N., Ying, G., & McGillem, C. (1995). Spatiotemporal stability and the patterning of speech movement sequences. Experimental Brain Research, 104, 493–501. Solomon, N. P., & Charron, S. (1998). Speech breathing in able-bodied children and children with cerebral palsy: A review of the literature and implications for clinical intervention. American Journal of Speech-Language Pathology, 7(2), 61–78. Story, B. H., Titze, I. R., & Hoffman, E. A. (1998). Vocal tract area functions for an adult female speaker based on volumetric imaging. Journal of the Acoustical Society of America, 104(1), 471–487. Teplansky, K. J., Wisler, A., Green, J. R., Heitzman, D., Austin, S., & Wang, J. (2023). Measuring articulatory patterns in amyotrophic lateral sclerosis using a data-driven articulatory consonant distinctiveness space approach. Journal of Speech, Language, and Hearing Research, 66, 1–13. https://doi.org/10.1044/2022_JSLHR-22-00320 Thies, T., Mücke, D., Dano, R., & Barbe, M. T. (2021). Levodopa-based changes on vocalic speech movements during prosodic prominence marking. Brain Sciences, 11(5), 594. Tiede, M. (2005). MVIEW: Software for visualization and analysis of concurrently recorded movement data. Haskins Laboratories. Trudeau-Fisette, P., Tiede, M., & Ménard, L. (2017). Compensations to auditory feedback perturbations in congenitally blind and sighted speakers: Acoustic and articulatory data. PLoS ONE, 12(7), e0180300. Turgeon, C., Prémont, A., Trudeau-Fisette, P., & Ménard, L. (2015). Exploring production-perception relationships in normal hearing and cochlear implant adults: A lip-tube perturbation study. Clinical Linguistics & Phonetics, 29(5), 378–400.
Tye-Murray, N. (1987). Effects of vowel context on the articulatory closure postures of deaf speakers. Journal of Speech, Language, and Hearing Research, 30(1), 99–104. Tye-Murray, N. (1991). The establishment of open articulatory postures by deaf and hearing talkers. Journal of Speech, Language, and Hearing Research, 34(3), 453–459. van Lieshout, P. H. H. M., Peters, H. F. M., Starkweather, C. W., & Hulstijn, W. (1993). Physiological differences between stutterers and nonstutterers in perceptually fluent speech. Journal of Speech and Hearing Research, 36(1), 55–63. Vick, J., Mental, R., Carey, H., & Lee, G. S. (2017). Seeing is treating: 3D electromagnetic midsagittal articulography (EMA) visual biofeedback for the remediation of residual speech errors. Journal of the Acoustical Society of America, 141(5), 3647–3647. Vidou, C., Uribe, C., Boukhalfi, T., Labbé, D., & Ménard, L. (2020). Compensatory responses to real-time perturbation of visual feedback during vowel production. International Seminar on Speech Production, New Haven, United States, December. Watterson, T. (2020). The use of the nasometer and interpretation of nasalance scores. Perspectives of the ASHA Special Interest Groups, 5(1), 155–163. Weismer, G., Yunusova, Y., & Westbury, J. R. (2003). Interarticulator coordination in
dysarthria. Journal of Speech, Language, and Hearing Research, 46, 1247–1261. Westbury, J. (1994). X-ray microbeam speech production database: User’s handbook, version 1.0. University of Wisconsin. Whalen, D. H., Iskarous, K., Tiede, M. K., Ostry, D. J., Lehnert-LeHouillier, H., Vatikiotis-Bateson, E., & Hailey, D. S. (2005). The Haskins optically corrected ultrasound system (HOCUS). Journal of Speech, Language, and Hearing Research, 48(3), 543–553. Wieling, M., Tomaschek, F., Arnold, D., Tiede, M., Bröker, F., Thiele, S., Wood, S., & Baayen, R. H. (2016). Investigating dialectal differences using articulography. Journal of Phonetics, 59, 122–143. Yehia, H., Rubin, P., & Vatikiotis-Bateson, E. (1998). Quantitative association of vocal-tract and facial behavior. Speech Communication, 26(1–2), 23–43. Zhou, X., Espy-Wilson, C. Y., Boyce, S., Tiede, M., Holland, C., & Choe, A. (2008). A magnetic resonance imaging-based articulatory and acoustic study of “retroflex” and “bunched” American English /r/. Journal of the Acoustical Society of America, 123(6), 4466–4481. Zhou, X., Espy-Wilson, C. Y., Tiede, M., & Boyce, S. (2007). An articulatory and acoustic study of “retroflex” and “bunched” American English rhotic sound based on MRI. Eighth Annual Conference of the International Speech Communication Association, 54–57.
35 Instrumental Analysis of Articulation

YUNJUNG KIM, RAYMOND D. KENT, AND AUSTIN THOMPSON

35.1 Introduction

Speech is a product of the complex orchestration of several subsystems operating independently and collaboratively: respiration, phonation, articulation, and resonance. Accordingly, studies on speech disorders often use this subsystem framework to identify and manage speech problems in an efficient, sophisticated manner. Among these subsystems, this chapter provides a brief discussion of acoustic and kinematic tools for the examination of articulation processes and disorders (see Ludlow, Kent, & Gray, 2018, for a comprehensive description of instrumental analysis across the subsystems). Articulation is typically viewed as highly coordinated gestures of the supraglottal speech organs such as the face, lips, jaw, and tongue. In this chapter, some studies of resonance are also included in the context of discussing the articulation of nasal sounds and their influence on neighboring sounds. Advantages of instrumental analysis of speech include the capability of acoustic signals to bridge the acts of speech production and speech perception, and the capability of kinematic signals to bring us closer to events in the complex peripheral system that underlie the shaping of acoustic signals. Further, both methods provide more objective, precise, and reliable data than perceptual judgment. Note that the kinematic studies discussed in this chapter mostly refer to electromagnetic articulography (EMA) data, excluding other approaches such as brain imaging, ultrasound, and palatography (see Chapter 34), as EMA appears to be the most popular kinematic tool employed in the current literature. EMA is a method of tracking flesh-point movements of the articulators (tongue, lips, jaw) that can be synchronized with acoustic signals. Currently, Carstens Medizinelektronik (Bovenden, Germany) is the only manufacturer offering a commercial EMA device, the AG501 (https://www.articulograph.de), since Northern Digital Inc. (NDI; Waterloo, Canada) discontinued the Wave and the Vox in 2020. EMA has several advantages, including its adaptability to speakers of different ages, its safety for use in repeated studies (e.g., of treatment efficacy), and the availability of tools for data analysis. An important recent development in articulation research is a notable increase in the use of EMA. Figure 35.1 illustrates the number of articles using EMA published annually from 1987 to 2022. A PubMed search using the terms “electromagnetic articulography, EMA, speech” for the last five decades shows an accelerating pattern of EMA research.
Figure 35.1 Number of articles reporting EMA data on speech published annually for the period of 1987–2022. The search was conducted in August 2022. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
Table 35.1 Selected kinematic studies examining the tongue (T), lips (L), and jaw (J) in children and adults with and without communication disorders.

Articulator | Typical speakers | Childhood apraxia of speech | Childhood dysarthria | Adult apraxia of speech | Adult dysarthria
T | Allison et al. (2020), Thompson and Kim (2019) | | | van Lieshout et al. (2007), Bartle-Meyer and Murdoch (2010) | Lee et al. (2017), Mefferd (2015)
L | van Brenk et al. (2013) | Kopera and Grigos (2020) | Allison et al. (2022) | van Lieshout et al. (2007) | Eshghi et al. (2019), Teplansky et al. (2023)
J | Birkholz and Hoole (2012) | Kopera and Grigos (2020) | Allison et al. (2022), Loh et al. (2005) | Bartle et al. (2007) | Berry et al. (2017), Lee et al. (2020)
Table 35.1 includes selected recent EMA literature (within the last two decades) pertaining to the description and analysis of speech kinematics in children and adults with and without speech-language disorders. These studies have advanced our understanding of the relationship between acoustic and kinematic phenomena (Lee et al., 2017; Mefferd, 2015; Thompson & Kim, 2019) and of the effects of age on speech kinematics (van Brenk et al., 2013).
Accelerating progress in speech technology and a notable increase in available EMA data have also ushered in a new era of clinical application. In addition to the exploration of kinematic signatures in various speech-language disorders (Allison et al., 2022; Kopera & Grigos, 2020), studies have demonstrated the use of acoustic or kinematic signals as visual biofeedback for articulation therapy for children and adults (e.g., Haworth et al., 2019; Kearney et al., 2018; Peterson et al., 2022). The scope of the chapter is to offer a condensed and selective overview of the literature on acoustic and kinematic studies of articulation for different sound classes. We begin at the segmental level, with an examination of vowels and consonants, followed by analyses at the level of multisyllabic utterances. Each section concludes with examples of typical clinical applications. Throughout the chapter, we attempt to offer acoustic descriptions of frequent articulation problems in various populations, followed by the corresponding kinematic evidence, together with a summary of progress and a blueprint for future research.
35.2 Vowels

35.2.1 Voiced Non-nasal Vowels

For a typical vowel sound, the source of energy is the vibration of the vocal folds, and this energy is filtered by the combined effect of the vocal tract resonances (formants) and the radiation characteristic. Formant patterns are not the only way of describing vowels. However, formant specification is useful as a low-dimensional description, in that only two or three formants are sufficient to describe the vowels in most languages. Another advantage of formant specification is that the relationships of formants (including their frequencies and amplitudes) to vowel articulation are fairly well understood (Fant, 1970). Let us consider the classic F1-F2 formant plot (Figure 35.2) as an example. This graph is probably the most commonly used display in speech acoustics.
Figure 35.2 The classic chart for vowel acoustics and kinematics. The labeled vowels are the corner vowels of American English. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
It depicts a fundamental articulatory-acoustic relationship in which the F1 and F2 frequencies are related principally to tongue height and advancement, respectively. The relationship can also be expressed this way: the F2-F1 difference can be interpreted as tongue advancement/retraction, and the F1 value can be taken as an index of tongue height. The simple relationships just described are extremely useful, but they understate the challenges in formant descriptions. A major challenge is the variation of formant frequencies with vocal-tract length (that is, with speaker sex and age, the primary determinants of vocal-tract length). The vowel formant patterns for any given vowel produced by a man, a woman, and a child are not identical. Attempts to classify tokens of vowel sounds from different speakers as having the same phonetic identity despite formant-frequency differences are obstructed by the problem of vowel (or speaker) normalization, and the most popular attempts at a solution rely on relational computations such as logarithms or ratios. The dependence of vowel formant frequencies on the age and sex of the speaker is a hindrance to comparisons of formant data from speakers who represent different age-sex combinations, and to longitudinal studies of the same individuals. The acoustic vowel quadrilateral is not only a framework for describing individual vowels, which appear as points on or within the quadrilateral; it can also be used for other purposes such as an index of the vowel working space. The index is the F1-F2 planar area that can be computed with the following formula for the area of an irregular quadrilateral:

Area = 0.5 × {(/i/F2 × /æ/F1 + /æ/F2 × /ɑ/F1 + /ɑ/F2 × /u/F1 + /u/F2 × /i/F1) − (/i/F1 × /æ/F2 + /æ/F1 × /ɑ/F2 + /ɑ/F1 × /u/F2 + /u/F1 × /i/F2)}

where /V/Fn denotes formant n for the vowel shown in the preceding slashes; for example, /i/F2 is the second formant frequency of the vowel /i/. The usual procedure for estimating vowel space area in English has limitations that can affect its usefulness in the study of disordered speech (Karlsson & Doorn, 2012; Kent & Vorperian, 2018; Sandoval et al., 2013). First, it is calculated from data on only the corner (quadrilateral) vowels, rather than all vowels. Second, it is typically derived from single-word productions rather than connected speech such as conversation. Third, because it is based on the means of the formant data, it neglects variability in the production of individual vowels, which can be an important feature of disordered speech. Modern techniques of articulatory kinematics have greatly increased the speed and power of our understanding of articulator-specific motion in creating vocal tract shape. This is because kinematic analysis of vowels (or any speech sound) permits consideration of the individual or collective movements of multiple articulators for speech production, including the tongue, lips, and jaw. The power of simultaneous examinations across multiple articulators was expressed by Fujimura (1990) as follows: In the acoustic signal, the information resulting from such a multidimensional process is often collapsed into a single physical dimension. For example, specific formant frequency values in the acoustic domain reflect simultaneously occurring multidimensional articulatory maneuvers (p. 196).
Owing to early studies using optical-tracking or strain-gauge transducer systems, the history of lip and jaw movement research dates back to the 1970s (e.g., Sussman & Smith, 1970), setting aside the pioneering work of the 1930s by Hudgins and Stetson, who used online techniques to record the jaw and thyroid cartilage during speech (Hudgins, 1934; Hudgins & Stetson, 1935). These initial approaches explored changes across
the lifespan in the anatomy of the speech structures and in motor skill, examining the displacement, velocity, and variability of lip and jaw movements. Further, the potential of kinematic equipment as a treatment tool for developmental and acquired speech disorders was noted early on, because the visual display of speech movements allows online feedback. However, because these systems do not allow a direct examination of tongue movements, articulatory data on vowels were limited with respect to articulators (i.e., lower lip and jaw), type of movement (i.e., vertical), and phonetic context (i.e., low vowels preceded by bilabial stops). Lingual data have become available since the advent of equipment allowing sensor placement on the primary articulator, the tongue, such as the X-ray microbeam (Westbury et al., 1994) and EMA (Schönle et al., 1987). Up to five sensors (two or three being the most popular) have been placed on the tongue to examine its positions and trajectories during vowel production. Similar to acoustic methods, the articulatory distinctiveness among vowels is often described as differences in the positions of operationally defined point coordinates on the tongue. The size of the articulatory vowel quadrilateral is also derived using the aforementioned formula, but replacing F1 and F2 with x- and y-coordinates. Tracking multiple articulator points is also beneficial in allowing the study of the effects of the dynamic interaction between neighboring sounds (i.e., coarticulatory influences) on various articulator points. Inter alia, a greater coarticulation effect in a vowel-consonant-vowel sequence is found at the tongue front than at the tongue back, possibly due to the lower inertia of the tongue tip (Hoole & Gfoerer, 1990). For the nasalized vowel /i/ (e.g., /i/ followed by a nasal sound), EMA data show a tongue-raising gesture, which is interpreted as a compensation for the low-frequency shift in spectral energy that accompanies nasopharyngeal coupling (Carignan et al., 2011).
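To make the area computation concrete, the following minimal Python sketch implements the quadrilateral (shoelace) formula given above. It applies equally to the acoustic quadrilateral, using (F2, F1) pairs in Hz, and to the articulatory quadrilateral, using tongue x- and y-coordinates in mm. The corner-vowel formant values shown are hypothetical placeholders, not data from any study cited here.

```python
# Minimal sketch of the quadrilateral vowel space area (shoelace) formula.
# Corner points must be ordered around the quadrilateral.

def quadrilateral_area(points):
    """points: [(x, y), ...] ordered around the quadrilateral.
    For acoustic VSA use (F2, F1) in Hz; for the articulatory
    quadrilateral use (x, y) tongue coordinates in mm."""
    area = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]   # wrap to close the polygon
        area += x1 * y2 - x2 * y1
    return 0.5 * abs(area)

# Hypothetical corner-vowel formants (F2, F1) in Hz: /i/, /ae/, /a/, /u/
corners = [(2300, 300), (1750, 700), (1100, 750), (900, 350)]
print(f"VSA = {quadrilateral_area(corners):.0f} Hz^2")
```

The absolute value makes the result independent of whether the corners are listed clockwise or counterclockwise, which the printed formula implicitly assumes.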
35.2.2 Clinical Example 1: Vowel Space Area

The size of the acoustic vowel space, as shown in Figure 35.2, has often been reported to be reduced in speech disorders, and it also functions as a predictor of intelligibility in both adults (Y.-J. Kim et al., 2011; Tjaden & Wilding, 2004) and children (Allison et al., 2017; DuHadway & Hustad, 2012). Efforts have been made to develop alternative measures with better sensitivity to the presence and severity of speech disorders than the traditional triangular or quadrilateral vowel space. Some of these measures have also been proposed to minimize manual processing effort (i.e., segmentation). These measures include the degree of overlap or dispersion among vowels (H. Kim et al., 2011; Lansford & Liss, 2014), the vowel articulation index (Roy et al., 2009), and its reciprocal, the formant centralization ratio (Sapir et al., 2010; Skodda et al., 2011). For example, Figure 35.3 shows the effect size across studies reporting different acoustic vowel space metrics for group differences between neurotypical speakers and speakers with Parkinson’s disease. Vowel space areas obtained from utterance- or sentence-level materials, such as vowel space density (Story & Bunton, 2017), vowel space hull area (Whitfield et al., 2018), and the articulatory-acoustic vowel space (Whitfield & Goberman, 2014), are discussed in Section 35.4, Utterance-level Analysis of Articulation. Several studies have reported a similar finding of reduced tongue kinematic vowel space in speech disorders in adults, as well as systematic changes across speaking modes such as clear speech. However, tongue kinematic vowel space may not be as sensitive to a speech disorder or speaking mode as acoustic vowel space is, at least when computed in the traditional manner using vowel-point metrics (Lee et al., 2017; Whitfield et al., 2018).
Figure 35.3 The effect size (Hedges’ g; Hedges & Olkin, 2014) and 95% confidence intervals for acoustic measures capturing the size of the articulatory movement across studies of typical speech and speech produced by individuals with Parkinson’s disease. Group comparisons that were not statistically significant are represented in gray. VSA = vowel space area, tVSA = triangular vowel space area, VSAlax = vowel space area calculated using lax vowels, FCR = formant centralization ratio, VSD10, 90 = vowel space density of the innermost 10 and 90% of the formant density distribution, respectively; AAVS = articulatory-acoustic vowel space; AVD = acoustic vowel distance. The inverse effect size for FCR is reported for ease of interpretation. Speakers with Parkinson’s disease demonstrated higher FCR values compared to neurotypical speakers, indicating more centralized formants and reduced movement. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
35.2.3 Clinical Example 2: Vowel Transition

The second formant (F2) frequency transition during a diphthong such as /ɑɪ/ (often embedded in the famous sentence, “Buy Bobby a puppy”) has been studied extensively, often in the form of a formant-frequency transition slope (Hz/ms). Figure 35.4 shows the example of the word “buy,” in which the transition state of F2 is indicated by the boxed area. F2 slope is one factor that relates to the capacity for intelligible speech in adults and children with speech disorders. Several studies have recommended phonetic contexts that require a greater degree of frequency change to maximize the sensitivity of F2 slope to speech disorders (Kim et al., 2009; Rosen et al., 2008). For example, words such as hail show greater group differences between speakers with and without dysarthria than words such as coat and shoot. The articulatory interpretation of F2 slope is fairly straightforward in that a greater F2 slope presumes a correspondingly greater speed of change in vocal tract configuration, when other factors (e.g., the length of the vocal tract) are held constant. However, the strength of the relationship between F2 slope and articulatory gestures of the tongue and jaw appears somewhat inconsistent across studies, which may be due to different methodologies with respect to the neuropathologies under study and the speech material (Mefferd, 2015; Rong et al., 2012; Thompson & Kim, 2019; Yunusova et al., 2012).
Figure 35.4 An example of F2 slope. The waveform and spectrogram of the word “buy” are shown at the top and bottom of the figure, respectively. The transition state of F2, defined as a time interval including a spectral change greater than 20 Hz over 20 ms, is indicated by a boxed area. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
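To illustrate how the measure is derived, here is a minimal Python sketch that estimates F2 slope (Hz/ms) by linear regression over a transition interval flagged with the 20 Hz per 20 ms criterion from Figure 35.4. The sampled F2 track is a hypothetical stand-in for the word “buy,” not measured data.

```python
import numpy as np

# Hypothetical F2 track (Hz) sampled every 10 ms for the word "buy"
t_ms = np.arange(0, 110, 10)
f2_hz = np.array([1150, 1180, 1260, 1380, 1520, 1680,
                  1820, 1930, 2000, 2040, 2060], dtype=float)

# Flag the transition using the criterion in Figure 35.4:
# a spectral change greater than 20 Hz over 20 ms (two 10-ms frames).
change_20ms = np.abs(f2_hz[2:] - f2_hz[:-2])
idx = np.where(change_20ms > 20)[0]
start, stop = idx[0], idx[-1] + 2      # frame indices spanning the transition

# F2 slope as the linear-regression slope of F2 against time (Hz/ms)
slope = np.polyfit(t_ms[start:stop + 1], f2_hz[start:stop + 1], 1)[0]
print(f"F2 slope = {slope:.1f} Hz/ms")
```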
35.3 Consonants

Consonants are a complex group of sounds, best considered in major classes defined by their articulatory and acoustic characteristics. Therefore, unlike for vowels, a single set of measures (i.e., formant patterns) is not sufficient to capture characteristics across consonants. It is rather a matter of practical necessity to identify a manageable set of measures reasonably describing consonant characteristics. This section offers brief descriptions of articulatory-acoustic properties pertaining to selected sound classes with substantial data.
35.3.1 Sonorant Consonants

Sonorant consonants are defined by a resonant pattern consisting of formants and, in some cases, also antiformants. These sounds are less intense than vowels but have vowel-like properties with respect to resonances. Unlike the obstruents, sonorant consonants are nearly free of noise components, such as bursts or sustained frication.
35.3.1.1 Glides (Semi-vowels)

There are only two glides in American English, /w/ and /j/, which share the acoustic property of a relatively gradual (therefore glide-like) change in formant frequency pattern. This gradual transition in formant frequencies contrasts with the rapid transition in stop consonants, as considered later. The main point is that glides have a well-defined formant pattern characterized by a relatively long duration of formant frequency shift. Consistent with the logic previously discussed in the vowel section, words containing a glide (e.g., wax, whip) are known to be susceptible to errors in various speaker groups with disordered or accented
speech, presumably due to the high articulatory demand of a gradual but large change in vocal tract configuration and, in the case of /w/, coordination of lingual and labial movements (Espy‐Wilson, 1992; Kim et al., 2022).
35.3.1.2 Nasal Consonants

In English, there are three nasals, /m/, /n/, /ŋ/. Nasal consonants are ordinarily voiced, and they can be classified phonetically as sonorants, but their acoustic properties are easily distinguished from those of vowels. Fujimura (1962) noted three common properties of the nasal consonants: (1) all have a first formant of about 300 Hz, (2) the formants tend to be highly damped (i.e., they have large bandwidths), and (3) they have a high density of formants combined with antiformants (which are associated with an obstructed or bifurcated vocal tract). These basic principles certainly help to characterize nasal consonants, but the detailed acoustic properties of these sounds are not so easily described. Because vowels adjacent to nasal consonants are themselves usually nasalized to some degree, nasalization is a frequently encountered property of speech. Further, hypernasality is a frequent feature of speech and hearing disorders in children (Bradford & Brooks, 1964; Kataoka et al., 2001) and adults (Eshghi et al., 2021; Wenke et al., 2010). Consistent with the traditional description of nasals, EMA data indicate a lower position of the velum for nasals (Amelot & Rossato, 2006). Further, similar to the previous discussion regarding nasalized vowels, growing evidence has shown the contribution of the tongue to the production of nasal consonants in several languages (e.g., American English: Carignan et al., 2021; Mandarin: Xue et al., 2018).
35.3.1.3 Liquid Consonants

Liquid is a cover term for the consonant phonemes /l/ and /r/. The lateral /l/ is acoustically similar to the nasal consonants. This similarity is rooted in the shared production factor of a bifurcated vocal tract that introduces antiformants in the resonance patterns. For nasal consonants, the bifurcation relates to the oral and nasal cavities. For laterals, the bifurcation results from the midline obstruction with lateral openings for sound transmission. The rhotic /r/ is one of the most complex and variable sounds in American English. It can be produced in various ways, including a retroflex articulation and a bunched articulation. Acoustically, /r/ is associated with a low F3 frequency or a small F3-F2 difference. Partly due to their complex articulatory demands, liquids are known to be challenging for various speaker groups, including children and adults with speech-language disorders (Bunton & Weismer, 2001; Preston et al., 2015), second-language speakers of English (Kim et al., 2022), and typically developing children (McAllister Byun & Tiede, 2017).
35.3.2 Non-sonorant Consonants

35.3.2.1 Stop Consonants

Stop or plosive consonants (/p, b, t, d, k, g/ in English) are produced with rapidly changing and complex speech events. The events include acoustic phenomena from complete silence to high-amplitude bursts, then to a frication interval (plus an aspiration interval for voiceless stop consonants). To meet these acoustic goals, highly coordinated articulatory motions among various structures are required, including the complete closure of the oral cavity (stop closure), a sudden and brief release of positive intraoral air pressure (burst), and the formation of a tight constriction of the vocal tract (frication interval). The exact appearance of these segments varies with the position of a stop in a syllable.
Stop consonants have been studied fairly extensively in both normal and disordered speech. Several different acoustic features are of interest: the stop gap, the stop burst, and formant transitions in the vicinity of the stop burst. Some measures are known to characterize disordered speech, including voice onset time (VOT) and spectral moment coefficients. As an example of clinical application, VOT is considered in more detail in Section 35.3.2.2 (Clinical Example 3: Voice Onset Time). Relative to vowels, very limited attention has been devoted to the kinematics of, and in the vicinity of, consonants. However, kinematic examination of stop consonants has potential value for studying various speech disorders. First, stops have a high frequency of occurrence, implying that their effects on speech intelligibility may be greater than those of less frequently occurring sounds. Second, the rapidly changing and complex speech events required for stops may reveal articulatory deficits to a greater degree than simple speech events. Lastly, a wide range of speech-language disorders are associated with frequent errors in stop consonants at the perceptual and acoustic levels. Figure 35.5 demonstrates an example of articulatory contrast between lingual and labial positions from passage reading produced by a healthy control (left) and a speaker with Parkinson’s disease (right). The data show the average positions of /p/ (upper and lower lip) and /t/ and /k/ (tongue front and back) at the instant of the stop burst. Due to the sensor locations (central on the vermilion border and at the inferior border for the upper and lower lip, respectively), the two dots for /p/ do not overlap (~21 mm, comparable to Wang et al., 2013). It is clear that the articulatory contrast between /t/ and /k/, measured by the Euclidean distance between the tongue dorsum sensors, is reduced for the speaker with Parkinson’s disease.
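The articulatory contrast just described reduces to a Euclidean distance between average sensor positions at the burst. A minimal sketch, with hypothetical midsagittal positions in mm:

```python
import numpy as np

# Hypothetical mean midsagittal sensor positions (x, y in mm) at the
# instant of the stop burst, averaged over a passage reading.
tongue_at_t = np.array([12.0, 8.5])   # tongue sensor position at /t/ bursts
tongue_at_k = np.array([4.5, 11.0])   # tongue sensor position at /k/ bursts

# Articulatory contrast as the Euclidean distance between the positions
contrast_mm = np.linalg.norm(tongue_at_t - tongue_at_k)
print(f"/t/-/k/ articulatory contrast = {contrast_mm:.1f} mm")
```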
35.3.2.2 Clinical Example 3: Voice Onset Time (VOT)

VOT has been the focus of numerous investigations of both normal and disordered speech (Cho et al., 2019), largely on the assumption that this acoustic interval between the noise burst and the onset of periodic energy corresponds to the physiological interval between the release of the consonantal constriction and the onset of vocal-fold vibration. Therefore, VOT is a possible index of intersystem (i.e., phonation-articulation) coordination or timing (but see Weismer, 2006, for caution in interpreting VOT data). VOT values relate to the phonetic feature of voicing, as shown in Figure 35.6. Simultaneous voicing, prevoicing, and short lag all tend to be judged as voiced consonants in English, whereas long lag tends to be judged as voiceless.
Figure 35.5 An example of articulatory contrast for stop consonants in a neurotypical speaker (left) and a speaker with Parkinson’s disease (right). Each dot represents the average position of lingual (TF: tongue front, TB: tongue back) and labial sensors (UL: upper lip, LL: lower lip) during passage reading for /p/, /t/, and /k/. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
[Figure 35.6 panels: paired noise-burst and voicing traces for simultaneous voicing (VOT = 0 ms), prevoicing (negative VOT), short lag (VOT < 25 ms), and long lag (VOT > 40 ms).]
Figure 35.6 Illustration of voice onset time (VOT) in relation to the categories of simultaneous voicing, prevoicing, short lag, and long lag. The first three of these typically are associated with voiced consonants in English, whereas long lag is associated with voiceless consonants. © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
In addition, the distribution of VOT values has been reported for speakers with various disorders, including apraxia of speech (Auzou et al., 2000; Mauszycki et al., 2007; Whiteside et al., 2012), dysarthria (Auzou et al., 2000; Karlsson et al., 2012; Thomas et al., 2022), and voice disorders (McKenna et al., 2020). Abnormal distributions of VOT in disordered speech are mainly observed as large intraspeaker variability (Hardcastle et al., 1985) or as a shortening of long-lag durations or a lengthening of short-lag durations, which leads to a reduced distinction, or even overlap, in VOT between voiced and voiceless cognates (Morris, 1989). A considerable amount of data has been published on VOT in dysarthria. VOT has been used to identify speech characteristics of certain diseases and to examine the relationship between VOT values and intelligibility. VOT has been reported as not significantly related to intelligibility scores in English speakers with cerebral palsy and Parkinson’s disease (Ansel & Kent, 1992; Kim & Choi, 2017). However, a possibility exists that VOT may serve as a language-specific predictor of speech intelligibility, because other studies of similar methodology but in different languages (Korean, Mandarin) showed that VOT variation was significantly related to intelligibility scores (Kim & Choi, 2017; Liu et al., 2000). More data across languages with different consonant inventories are needed in future studies.
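For illustration, the following sketch computes VOT from burst and voicing onset times and assigns the categories of Figure 35.6. The onset times are hypothetical, and the 25 ms and 40 ms boundaries are taken directly from the figure.

```python
def vot_ms(burst_onset_s, voicing_onset_s):
    """VOT in ms: positive when voicing follows the burst (lag),
    negative when voicing precedes it (prevoicing)."""
    return (voicing_onset_s - burst_onset_s) * 1000.0

def vot_category(vot):
    """VOT categories with the boundaries shown in Figure 35.6."""
    if vot < 0:
        return "prevoicing"
    if vot == 0:
        return "simultaneous voicing"
    if vot < 25:
        return "short lag"
    if vot > 40:
        return "long lag"
    return "between short- and long-lag boundaries"

# Hypothetical onsets (in seconds) for a word-initial voiceless stop
vot = vot_ms(burst_onset_s=0.512, voicing_onset_s=0.578)
print(f"VOT = {vot:.0f} ms ({vot_category(vot)})")   # 66 ms: long lag
```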
35.3.2.3 Fricative Consonants

A fricative (noise) sound is made by a combination of a narrow constriction somewhere in the vocal tract and a flow of air through this constriction that is adequate to produce turbulence. Several fricative classes are defined with respect to the position of the narrow
constriction. English has four classes of supraglottal fricatives: labiodental /f, v/, (inter)dental /θ, ð/, alveolar /s, z/, and palato-alveolar /ʃ, ʒ/. In addition, a glottal fricative /h/ is recognized by many phoneticians, although others consider it to be an approximant. Studies of fricatives have tended to concentrate on the alveolar fricative /s/ in normal and disordered speech, because of its well-defined spectral pattern and high frequency of occurrence in many languages. Since a relatively precise articulation process is required to produce a fricative sound, primarily due to the requirement for a sustained narrow constriction in the oral cavity, it is expected that speakers with diverse speech-language disorders might exhibit errors for fricative consonants. Several studies have reported on the possibility that fricatives might have a role in categorizing disorder types based on the acoustic properties of /s/ in speakers with specific deficits, including speakers with nonfluent aphasia, with or without apraxia of speech (Baum, 1996; Haley, 2002; Kurowski et al., 2003), and speakers with diverse dysarthrias (Chen & Stevens, 2001). Several acoustic analyses can be conducted for fricatives, such as the amplitude of the frication noise, the duration of the noise, and the patterning of glottal excitation. First described for speech by Forrest et al. (1988), moment analysis has recently been a particular focus and has been used to study both typical and disordered speech in children and adults. Compared with alternative methods of fricative analysis, spectral moment analysis is advantageous in that it provides reliable and quantifiable data with respect to the precision of articulatory positioning for the fricatives, especially in distinguishing /s/ and /ʃ/. Several clinical studies of fricative production, for example, reported that speakers with dysarthria exhibit a reduced distinction between /s/ and /ʃ/ in spectral energy peaks compared with typical speakers (Turner & Tjaden, 2000).
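Spectral moment analysis treats the spectrum of the frication noise as a probability distribution over frequency. The following minimal sketch computes the first four moments (centroid, variance, skewness, and kurtosis) from a windowed segment; it is one straightforward reading of the Forrest et al. (1988) approach rather than their exact implementation, and the noise segment is a random stand-in for real data.

```python
import numpy as np

def spectral_moments(x, fs):
    """First four spectral moments of a signal segment, treating the
    power spectrum as a probability distribution over frequency:
    centroid (Hz), variance (Hz^2), skewness, and excess kurtosis."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    p = spec / spec.sum()                           # normalize to sum to 1
    m1 = np.sum(freqs * p)                          # centroid
    m2 = np.sum((freqs - m1) ** 2 * p)              # variance
    sd = np.sqrt(m2)
    m3 = np.sum(((freqs - m1) / sd) ** 3 * p)       # skewness
    m4 = np.sum(((freqs - m1) / sd) ** 4 * p) - 3   # excess kurtosis
    return m1, m2, m3, m4

# Hypothetical use: 40 ms of noise at 22.05 kHz standing in for /s/
fs = 22050
segment = np.random.randn(int(0.04 * fs))
print(spectral_moments(segment, fs))
```

In clinical applications, the first moment (centroid) is the workhorse: /s/ typically shows a higher centroid than /ʃ/, so a reduced centroid difference quantifies the reduced /s/-/ʃ/ distinction described above.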
35.4 Utterance-level Analysis of Articulation

Traditionally, articulation processes and disorders are examined at the individual sound level (i.e., the segment), as described above. This approach allows an examination of sound-specific articulatory demands and the varying degree to which each sound (or paired sound contrast) contributes to listener ratings such as intelligibility. Partly due to methodological benefits, articulation has been increasingly analyzed in speech units larger than a single sound. Such units include phrases, passages, and conversation. Thus, this section describes acoustic and kinematic methods used to examine articulatory processes and disorders at these larger levels. Probably one of the primary benefits of utterance-level analyses, especially when employing automated analyses, is a less labor-intensive measurement process than the extensive manual segmentation required for sound-specific analysis. Further, while local, segment-level analysis captures a snapshot of speech characteristics based on isolated time points, global, utterance-level analysis tracks continuous articulatory changes across the entire speech signal. Several segment measures are simply extended to the utterance or passage level. For example, the F2 interquartile range (F2 IQR), a logical extension of the F2 slope, considers the F2 history of all voiced segments across longer passages of speech and reflects the speaker’s F2 variability (Yunusova et al., 2005). Vowel space density (VSD) is another example, which is used to examine the vowel working space over a longer sequence (Story & Bunton, 2017). By tracking the continuous formant trajectories of voiced speech, VSD characterizes the density of the speaker’s formant space. Denser areas of the formant space reveal where the speaker spends most of their time in formant, and presumably kinematic, space.
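Both measures are simple to compute once per-frame formant values for voiced speech are available (e.g., exported from a formant tracker). The sketch below computes F2 IQR and a basic histogram-based approximation of vowel space density; the random formant values and binning choices are illustrative and are not the implementation of Story and Bunton (2017).

```python
import numpy as np

# Assumed input: F1/F2 (Hz) for all voiced frames of a passage.
# Random values below stand in for real formant-tracker output.
rng = np.random.default_rng(0)
f1 = rng.normal(500, 120, 5000)
f2 = rng.normal(1500, 350, 5000)

# F2 interquartile range (Yunusova et al., 2005): F2 variability
f2_iqr = np.percentile(f2, 75) - np.percentile(f2, 25)

# Histogram-based vowel space density over the F1-F2 plane;
# denser bins show where the speaker spends most time in formant space.
density, f1_edges, f2_edges = np.histogram2d(f1, f2, bins=40, density=True)
peak = np.unravel_index(np.argmax(density), density.shape)

print(f"F2 IQR = {f2_iqr:.0f} Hz; densest region near "
      f"F1 ~ {f1_edges[peak[0]]:.0f} Hz, F2 ~ {f2_edges[peak[1]]:.0f} Hz")
```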
Figure 35.7 Kinematic hull areas of the tongue (gray) and jaw (black), respectively, obtained from a passage reading task. The tongue sensor was placed approximately 5 cm from the tongue tip, and the jaw sensor was adhered to the labial surface of the lower central incisors.
Kinematic investigations tend to prefer utterance-level analyses to segment-level analyses to capture different aspects of speech movements. For example, jerk, the time derivative of acceleration, originally defined to quantify the smoothness of arm and shoulder movements, has recently been used to characterize speech disorders (Berry & Kim, 2023). The kinematic hull area serves as an estimate of articulatory working space. As demonstrated in Figure 35.7, it is calculated as the convex hull area of a single articulator over a target utterance, capturing the global articulatory characteristics of people with speech-language pathology. The hull area is, in general, reduced in people with speech-language pathology, but it also depends on the underlying neuropathology (Kearney et al., 2017; Whitfield et al., 2018). As an example of clinical application, we consider the spatiotemporal index (STI), which is most frequently investigated at the utterance level as an index of speech variability or stability within speakers with and without speech-language pathology across the lifespan (Smith et al., 1995).
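Before turning to the STI, the hull-area and jerk computations can be sketched in a few lines. The sketch below assumes midsagittal (x, y) sensor positions in mm at a known sampling rate, with random positions standing in for real EMA data; note that for two-dimensional points, scipy.spatial.ConvexHull reports the enclosed area in its volume attribute.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Assumed input: midsagittal tongue-sensor positions (mm) over a passage;
# random values stand in for real EMA data.
rng = np.random.default_rng(1)
xy = rng.normal(0, 5, size=(2000, 2))

# Kinematic hull area: convex hull of all positions visited
hull = ConvexHull(xy)
print(f"Kinematic hull area = {hull.volume:.1f} mm^2")  # 'volume' = area in 2D

# Jerk: third time derivative of position (here, the vertical dimension)
fs = 100.0                                   # assumed sampling rate (Hz)
velocity = np.gradient(xy[:, 1]) * fs        # mm/s
acceleration = np.gradient(velocity) * fs    # mm/s^2
jerk = np.gradient(acceleration) * fs        # mm/s^3; lower = smoother
```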
35.4.1 Clinical Example 4: Spatiotemporal Index (STI)

The STI is a measure of temporal and spatial variability and is typically obtained by prompting speakers to produce several repetitions (i.e., 10–20) of the same phrase. The famous sentence is introduced again here, “Buy Bobby a puppy,” which was designed to elicit large lip and jaw movements. Time- and amplitude-normalized movements are used to compute the STI, which is often reported to be increased for several populations, including older adults (Wohlert & Smith, 1998), children (Smith & Goffman, 1998), individuals who stutter (Howell et al., 2009), individuals with dysarthria (Kleinow et al., 2001), and individuals with apraxia of speech (Moss & Grigos, 2012). An STI value closer to 0 indicates less movement variability across repetitions (observed for the control speaker), while an STI value further from 0 indicates more movement variability across repetitions (seen for the speaker with Parkinson’s disease).
Figure 35.8 An example of spatiotemporal index (STI) comparing a neurotypical speaker (top, STI = 14.21) and a speaker with Parkinson’s disease (bottom, STI = 20.93). © Yunjung Kim, Raymond D. Kent, and Austin Thompson, 2023.
Figure 35.8 demonstrates the STI measure of lower lip movement for a speaker with Parkinson’s disease and an age-matched neurotypical speaker, obtained from “Buy Bobby a puppy.” The STI value is greater for the speaker with Parkinson’s disease, indicating more variable lower lip movement across five repetitions compared to the neurotypical speaker.
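A minimal sketch of the STI computation as described above, following the general procedure of Smith et al. (1995): each repetition is amplitude-normalized (z-scored) and time-normalized to a common length, standard deviations across repetitions are taken at 2% intervals, and the 50 standard deviations are summed. The traces below are synthetic stand-ins for lower-lip displacement records.

```python
import numpy as np

def spatiotemporal_index(trials, n_points=1000):
    """STI: z-score each trace, linearly time-normalize to n_points,
    take the SD across trials at 2% intervals (50 points), and sum."""
    normalized = []
    for y in trials:
        z = (y - np.mean(y)) / np.std(y)              # amplitude-normalize
        t_old = np.linspace(0, 1, len(z))
        t_new = np.linspace(0, 1, n_points)
        normalized.append(np.interp(t_new, t_old, z)) # time-normalize
    normalized = np.array(normalized)
    idx = np.linspace(0, n_points - 1, 50).astype(int)  # 2% intervals
    return np.sum(np.std(normalized[:, idx], axis=0))

# Synthetic lower-lip traces: a shared movement pattern plus trial noise
rng = np.random.default_rng(2)
base = np.sin(np.linspace(0, 4 * np.pi, 800))
trials = [base + rng.normal(0, 0.1, 800) for _ in range(10)]
print(f"STI = {spatiotemporal_index(trials):.2f}")
```

Because each trace is z-scored before the variability is summed, the STI reflects inconsistency in movement patterning rather than differences in overall movement amplitude.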
35.5 Conclusion and Implications for Future Research

Although the scope and depth covered here are limited, our overview of the literature confirms the benefits of instrumental analysis in studying articulation processes and disorders. Many of the measures included in the chapter not only complement ear-dependent assessment but also enhance the objectivity and reliability of the data. Owing to advances in speech techniques and technologies that allow data collection and analysis in a cost- and labor-effective way, speech research has now entered the big-data era, like many other modern scientific fields. Further, data-sharing practices increase the possibility of integrating data from multiple sources for large-scale analyses and reproducibility. Several speech databases are currently available for such purposes: the MOCHA-TIMIT multi-channel articulatory database (Wrench, 2000), the USC-TIMIT multimodal speech production database (Narayanan et al., 2014), and the X-ray microbeam speech production database (Westbury et al., 1994), to name a few. With a flood of data, a primary interest is to determine which aspects
and variables are selectively or commonly vulnerable to diverse speech-language deficits. The use of computer techniques such as automatic speech analysis and machine learning has also identified measures that may detect speech disorders, preferably early, but some of these measures are hard to interpret. Especially considering the potential transfer of results into clinical contexts, the identification of a set of measures that holds firm relevance to articulatory behaviors and clinical interpretability will be pivotal to the next advancement.
REFERENCES

Allison, K. M., Annear, L., Policicchio, M., & Hustad, K. C. (2017). Range and precision of formant movement in pediatric dysarthria. Journal of Speech, Language, and Hearing Research, 60(7), 1864–1876. Allison, K. M., Nip, I. S., & Rong, P. (2022). Use of automated kinematic diadochokinesis analysis to identify potential indicators of speech motor involvement in children with cerebral palsy. American Journal of Speech-Language Pathology, 31(6), 2835–2846. Allison, K. M., Salehi, S., & Green, J. R. (2020). Effect of prosodic manipulation on articulatory kinematics and second formant trajectories in children. The Journal of the Acoustical Society of America, 147(2), 769–776. Amelot, A., & Rossato, S. (2006). Velar movements for the feature [±nasal] for two French speakers. In Proceedings of the 7th international seminar on speech production (pp. 459–467). Ubatuba, Brazil. Ansel, B. M., & Kent, R. D. (1992). Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech, Language, and Hearing Research, 35(2), 296–308. Auzou, P., Ozsancak, C., Morris, R. J., Jan, M., Eustache, F., & Hannequin, D. (2000). Voice onset time in aphasia, apraxia of speech and dysarthria: A review. Clinical Linguistics & Phonetics, 14(2), 131–150. Bartle, C. J., Goozée, J. V., & Murdoch, B. E. (2007). An EMA analysis of the effect of increasing word length on consonant production in apraxia of speech: A case study. Clinical Linguistics & Phonetics, 21(3), 189–210. Bartle‐Meyer, C. J., & Murdoch, B. E. (2010). A kinematic investigation of anticipatory lingual movement in acquired apraxia of speech. Aphasiology, 24(5), 623–642. Baum, S. R. (1996). Fricative production in aphasia: Effects of speaking rate. Brain and Language, 52(2), 328–341.
Berry, J., & Kim, Y.-J. (2023). Towards the dissociation of dysarthria and dialect in speech kinematics. Proceedings of Meetings on Acoustics, 30(1). https://asa.scitation.org/doi/abs/10.1121/2.0001700 Berry, J., Kolb, A., Schroeder, J., & Johnson, M. T. (2017). Jaw rotation in dysarthria measured with a single electromagnetic articulography sensor. American Journal of Speech-Language Pathology, 26(2S), 596–610. Birkholz, P., & Hoole, P. (2012). Intrinsic velocity differences of lip and jaw movements: Preliminary results. Thirteenth Annual Conference of the International Speech Communication Association. Bradford, L. J., & Brooks, A. R. (1964). Clinical judgement of hypernasality in cleft palate children. The Cleft Palate Journal, 1(3), 329–335. Bunton, K., & Weismer, G. (2001). The relationship between perception and acoustics for a high-low vowel contrast produced by speakers with dysarthria. Journal of Speech, Language, and Hearing Research, 44(6), 1215–1228. Carignan, C., Coretta, S., Frahm, J., Harrington, J., Hoole, P., Joseph, A., Kunay, E., & Voit, D. (2021). Planting the seed for sound change: Evidence from real-time MRI of velum kinematics in German. Language, 97(2), 333–364. Carignan, C., Shosted, R., Shih, C., & Rong, P. (2011). Compensatory articulation in American English nasalized vowels. Journal of Phonetics, 39(4), 668–682. Chen, H., & Stevens, K. N. (2001). An acoustical study of the fricative /s/ in the speech of individuals with dysarthria. Journal of Speech, Language, and Hearing Research, 44(6), 1300–1314. Cho, T., Whalen, D. H., & Docherty, G. (2019). Voice onset time and beyond: Exploring laryngeal contrast in 19 languages. Journal of Phonetics, 72, 52–65. https://www.sciencedirect.com/science/article/pii/S0095447018302110
DuHadway, C. M., & Hustad, K. C. (2012). Contributors to intelligibility in preschool-aged children with cerebral palsy. Journal of Medical Speech-Language Pathology, 20(4), 11–19. Eshghi, M., Connaghan, K. P., Gutz, S. E., Berry, J. D., Yunusova, Y., & Green, J. R. (2021). Co-occurrence of hypernasality and voice impairment in amyotrophic lateral sclerosis: Acoustic quantification. Journal of Speech, Language, and Hearing Research, 64(12), 4772–4783. Eshghi, M., Stipancic, K. L., Mefferd, A., Rong, P., Berry, J. D., Yunusova, Y., & Green, J. R. (2019). Assessing oromotor capacity in ALS: The effect of a fixed-target task on lip biomechanics. Frontiers in Neurology, 10, 1288. https://www.frontiersin.org/articles/10.3389/fneur.2019.01288/full Espy‐Wilson, C. Y. (1992). Acoustic measures for linguistic features distinguishing the semivowels /w j r l/ in American English. The Journal of the Acoustical Society of America, 92(2), 736–757. Fant, G. (1970). Acoustic theory of speech production (No. 2). Walter de Gruyter. Forrest, K., Weismer, G., Milenkovic, P., & Dougall, R. N. (1988). Statistical analysis of word‐initial voiceless obstruents: Preliminary data. The Journal of the Acoustical Society of America, 84(1), 115–123. Fujimura, O. (1962). Analysis of nasal consonants. The Journal of the Acoustical Society of America, 34(12), 1865–1875. Fujimura, O. (1990). Methods and goals of speech production research. Language and Speech, 33(3), 195–258. Haley, K. L. (2002). Temporal and spectral properties of voiceless fricatives in aphasia and apraxia of speech. Aphasiology, 16(4–6), 595–607. Hardcastle, W. J., Morgan Barry, R. A., & Clark, C. J. (1985). Articulatory and voicing characteristics of adult dysarthric and verbal dyspraxic speakers: An instrumental study. British Journal of Disorders of Communication, 20(3), 249–270. Haworth, B., Kearney, E., Faloutsos, P., Baljko, M., & Yunusova, Y. (2019). Electromagnetic articulography (EMA) for real-time feedback application: Computational techniques. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 7(4), 406–413. Hedges, L. V., & Olkin, I. (2014). Statistical methods for meta-analysis. Academic Press.
Hoole, P., & Gfoerer, S. (1990). Electromagnetic articulography as a tool in the study of lingual coarticulation. The Journal of the Acoustical Society of America, 87(S1), S123–S123. Howell, P., Anderson, A. J., Bartrip, J., & Bailey, E. (2009). Comparison of acoustic and kinematic approaches to measuring utterance-level speech variability. Journal of Speech, Language, and Hearing Research, 52(4), 1088–1096. Hudgins, C. V. (1934). A comparative study of the speech coordinations of deaf and normal subjects. The Pedagogical Seminary and Journal of Genetic Psychology, 44(1), 3–48. Hudgins, C. V., & Stetson, R. H. (1935). Voicing of consonants by depression of larynx. Archives Néerlandaises de Phonétique Expérimentale, 11, 1–28. Karlsson, F., & Doorn, J. V. (2012). Vowel formant dispersion as a measure of articulation proficiency. The Journal of the Acoustical Society of America, 132(4), 2633–2641. Karlsson, F., Unger, E., Wahlgren, S., & van Doorn, J. (2012). Treatment effects in voice onset time of plosives associated with deep brain stimulation of the subthalamic nucleus and the caudal zona incerta. Journal of Medical Speech-Language Pathology, 20(4), 65–70. Kataoka, R., Warren, D. W., Zajac, D. J., Mayo, R., & Lutz, R. W. (2001). The relationship between spectral characteristics and perceived hypernasality in children. The Journal of the Acoustical Society of America, 109(5), 2181–2189. Kearney, E., Giles, R., Haworth, B., Faloutsos, P., Baljko, M., & Yunusova, Y. (2017). Sentence-level movements in Parkinson’s disease: Loud, clear, and slow speech. Journal of Speech, Language, and Hearing Research, 60(12), 3426–3440. Kearney, E., Haworth, B., Scholl, J., Faloutsos, P., Baljko, M., & Yunusova, Y. (2018). Treating speech movement hypokinesia in Parkinson’s disease: Does movement size matter? Journal of Speech, Language, and Hearing Research, 61(11), 2703–2721. Kent, R. D., & Vorperian, H. K. (2018). Static measurements of vowel formant frequencies and bandwidths: A review. Journal of Communication Disorders, 74, 74–97. https://www.sciencedirect.com/science/article/pii/S0021992417302575 Kim, Y.-J., & Choi, Y. (2017). A cross-language study of acoustic predictors of speech
intelligibility in individuals with Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 60(9), 2506–2518. Kim, Y.-J., Chung, H., & Thompson, A. (2022). Acoustic and articulatory characteristics of English semivowels /ɹ, l, w/ produced by adult second-language speakers. Journal of Speech, Language, and Hearing Research, 65(3), 890–905. Kim, Y.-J., Kent, R. D., & Weismer, G. (2011). An acoustic study of the relationships among neurologic disease, dysarthria type, and severity of dysarthria. Journal of Speech, Language, and Hearing Research, 54(2), 417–429. Kim, Y.-J., Weismer, G., Kent, R. D., & Duffy, J. R. (2009). Statistical models of F2 slope in relation to severity of dysarthria. Folia Phoniatrica et Logopaedica, 61(6), 329–335. Kleinow, J., Smith, A., & Ramig, L. O. (2001). Speech motor stability in IPD: Effects of rate and loudness manipulations. Journal of Speech, Language, and Hearing Research, 44(5), 1041–1051. Kopera, H. C., & Grigos, M. I. (2020). Lexical stress in childhood apraxia of speech: Acoustic and kinematic findings. International Journal of Speech-Language Pathology, 22(1), 12–23. Kurowski, K., Hazen, E., & Blumstein, S. E. (2003). The nature of speech production impairments in anterior aphasics: An acoustic analysis of voicing in fricative consonants. Brain and Language, 84(3), 353–371. Lansford, K. L., & Liss, J. M. (2014). Vowel acoustics in dysarthria: Speech disorder diagnosis and classification. Journal of Speech, Language, and Hearing Research, 57(1), 57–67. Lee, J., Littlejohn, M. A., & Simmons, Z. (2017). Acoustic and tongue kinematic vowel space in speakers with and without dysarthria. International Journal of Speech-Language Pathology, 19(2), 195–204. Lee, J., Rodriguez, E., & Mefferd, A. (2020). Direction-specific jaw dysfunction and its impact on tongue movement in individuals with dysarthria secondary to amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 63(2), 499–508. Liu, H. M., Tseng, C. H., & Tsao, F. M. (2000). Perceptual and acoustic analysis of speech intelligibility in Mandarin-speaking young adults with cerebral palsy. Clinical Linguistics & Phonetics, 14(6), 447–464. Loh, E. L., Goozée, J. V., & Murdoch, B. E. (2005). Kinematic analysis of jaw function in children following traumatic brain injury. Brain Injury, 19(7), 529–538.
Ludlow, C. L., Kent, R. D., & Gray, L. C. (2018). Measuring voice, speech, and swallowing in the clinic and laboratory. Plural Publishing. Mauszycki, S. C., Dromey, C., & Wambaugh, J. L. (2007). Variability in apraxia of speech: A perceptual, acoustic, and kinematic analysis of stop consonants. Journal of Medical Speech-Language Pathology, 15, 223–242. McAllister Byun, T., & Tiede, M. (2017). Perception-production relations in later development of American English rhotics. PLoS ONE, 12(2), e0172022. McKenna, V. S., Hylkema, J. A., Tardif, M. C., & Stepp, C. E. (2020). Voice onset time in individuals with hyperfunctional voice disorders: Evidence for disordered vocal motor control. Journal of Speech, Language, and Hearing Research, 63(2), 405–420. Mefferd, A. (2015). Articulatory-to-acoustic relations in talkers with dysarthria: A first analysis. Journal of Speech, Language, and Hearing Research, 58(3), 576–589. Morris, R. J. (1989). VOT and dysarthria: A descriptive study. Journal of Communication Disorders, 22(1), 23–33. Moss, A., & Grigos, M. I. (2012). Interarticulatory coordination of the lips and jaw in childhood apraxia of speech. Journal of Medical Speech-Language Pathology, 20(4), 127–132. Narayanan, S., Toutios, A., Ramanarayanan, V., Lammert, A., Kim, J., Lee, S., Nayak, K., Kim, Y.-C., Zhu, Y., Goldstein, L., Byrd, D., Bresch, E., Ghosh, P., Katsamanis, A., & Proctor, M. (2014). Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC). The Journal of the Acoustical Society of America, 136(3), 1307–1311. Peterson, L., Savarese, C., Campbell, T., Ma, Z., Simpson, K. O., & McAllister, T. (2022). Telepractice treatment of residual rhotic errors using app-based biofeedback: A pilot study. Language, Speech, and Hearing Services in Schools, 53(2), 256–274. Preston, J. L., Irwin, J. R., & Turcios, J. (2015, November). Perception of speech sounds in school-aged children with speech sound disorders. Seminars in Speech and Language, 36(04), 224–233. Thieme Medical Publishers. Rong, P., Loucks, T., Kim, H., & Hasegawa-Johnson, M. (2012). Relationship between kinematics, F2 slope and speech intelligibility in dysarthria due to cerebral palsy. Clinical Linguistics & Phonetics, 26(9), 806–822.
Rosen, K. M., Goozée, J. V., & Murdoch, B. E. (2008). Examining the effects of multiple sclerosis on speech production: Does phonetic structure matter? Journal of Communication Disorders, 41(1), 49–69. Roy, N., Nissen, S. L., Dromey, C., & Sapir, S. (2009). Articulatory changes in muscle tension dysphonia: Evidence of vowel space expansion following manual circumlaryngeal therapy. Journal of Communication Disorders, 42(2), 124–135. Sandoval, S., Berisha, V., Utianski, R. L., Liss, J. M., & Spanias, A. (2013). Automatic assessment of vowel space area. The Journal of the Acoustical Society of America, 134(5), EL477–EL483. Sapir, S., Ramig, L. O., Spielman, J. L., & Fox, C. (2010). Formant centralization ratio: A proposal for a new acoustic measure of dysarthric speech. Journal of Speech, Language, and Hearing Research, 53(1), 114–125. Schönle, P. W., Gräbe, K., Wenig, P., Höhne, J., Schrader, J., & Conrad, B. (1987). Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract. Brain and Language, 31(1), 26–35. Skodda, S., Visser, W., & Schlegel, U. (2011). Vowel articulation in Parkinson’s disease. Journal of Voice, 25(4), 467–472. Smith, A., & Goffman, L. (1998). Stability and patterning of speech movement sequences in children and adults. Journal of Speech, Language, and Hearing Research, 41(1), 18–30. Smith, A., Goffman, L., Zelaznik, H. N., Ying, G., & McGillem, C. (1995). Spatiotemporal stability and patterning of speech movement sequences. Experimental Brain Research, 104(3), 493–501. Story, B. H., & Bunton, K. (2017). Vowel space density as an indicator of speech performance. The Journal of the Acoustical Society of America, 141(5), EL458–EL464. Sussman, H. M., & Smith, K. U. (1970). Transducer for measuring mandibular movements. The Journal of the Acoustical Society of America, 48(4A), 857–858. Teplansky, K. J., Wisler, A., Green, J. R., Campbell, T., Heitzman, D., Austin, S. G., & Wang, J. (2023). Tongue and lip acceleration as a measure of speech decline in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 75(1), 23–34.
Thomas, A., Teplansky, K. J., Wisler, A., Heitzman, D., Austin, S., & Wang, J. (2022). Voice onset time in early-and late-stage amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 65(7), 2586–2593. Thompson, A., & Kim, Y.-J. (2019). Relation of second formant trajectories to tongue kinematics. The Journal of the Acoustical Society of America, 145(4), EL323–EL328. Tjaden, K., & Wilding, G. E. (2004). Rate and loudness manipulations in dysarthria. Journal of Speech, Language, and Hearing Research, 47(4), 766–783. Turner, G. S., & Tjaden, K. (2000). Acoustic differences between content and function words in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 43(3), 769–781. van Brenk, F., Terband, H., Van Lieshout, P., Lowit, A., & Maassen, B. (2013). Rate-related kinematic changes in younger and older adults. Folia Phoniatrica et Logopaedica, 65(5), 239–247. van Lieshout, P. H. H. M., Bose, A., Square, P. A., & Steele, C. M. (2007). Speech motor control in fluent and dysfluent speech production of an individual with apraxia of speech and Broca’s aphasia. Clinical Linguistics & Phonetics, 21, 159–188. Wang, J., Green, J. R., Samal, A., & Yunusova, Y. (2013). Articulatory distinctiveness of vowels and consonants: A data-driven approach. Journal of Speech, Language, and Hearing Research, 56(5), 1539–1551. Weismer, G. (2006). Speech disorders. In M. Gernsbacher & M. Traxler (Eds.), Handbook of psycholinguistics, 93–224. Blackwell. Wenke, R. J., Theodoros, D., & Cornwell, P. (2010). Effectiveness of Lee Silverman Voice Treatment (LSVT)® on hypernasality in non-progressive dysarthria: The need for further research. International Journal of Language & Communication Disorders, 45(1), 31–46. Westbury, J. R., Turner, G., & Dembowski, J. (1994). X-ray microbeam speech production database user’s handbook. University of Wisconsin. Whiteside, S. P., Robson, H., Windsor, F., & Varley, R. (2012). Stability in voice onset time patterns in a case of acquired apraxia of speech. Journal of Medical Speech-Language Pathology, 20(1), 17–29. Whitfield, J. A., Dromey, C., & Palmer, P. (2018). Examining acoustic and kinematic measures of articulatory working space: Effects of speech intensity. Journal of Speech, Language, and Hearing Research, 61(5), 1104–1117.
522 Yunjung Kim, Raymond D. Kent, and Austin Thompson Whitfield, J. A., & Goberman, A. M. (2014). Articulatory–acoustic vowel space: Application to clear speech in individuals with Parkinson’s disease. Journal of Communication Disorders, 51, 19–28. Wohlert, A. B., & Smith, A. (1998). Spatiotemporal stability of lip movements in older adult speakers. Journal of Speech, Language, and Hearing Research, 41(1), 41–50. Wrench, A. A. (2000). A multi-channel/multispeaker articulatory database for continuous speech recognition research. Phonus, 5, 1–13. https://eresearch.qmu.ac.uk/handle/20.500. 12289/2489
Xue, P., Bai, J., Wang, Q., Zhang, X., & Feng, P. (2018). Analysis and classification of the nasal finals in hearing-impaired patients using tongue movement features. Speech Communication, 104, 57–65. Yunusova, Y., Green, J. R., Greenwood, L., Wang, J., Pattee, G. L., & Zinman, L. (2012). Tongue movements and their acoustic consequences in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 64(2), 94–102. Yunusova, Y., Weismer, G., Kent, R. D., & Rusche, N. M. (2005). Breath-group intelligibility in dysarthria. Journal of Speech, Language, and Hearing Research, 62(7), 2082–2098.
36 Instrumental Analysis of Voice

MEIKE BROCKMANN-BAUSER
36.1 Introduction

36.1.1 Human Voice Assessment

Instrumental voice analysis techniques aim to objectively describe characteristics of vocal production and output. For both clinical and research questions, a variety of instrumental methods are available to describe features of the three main subsystems of voice production: phonation, respiration, and resonance. Assessment techniques are mainly classified as direct methods, which visualize the physiology and activity of laryngeal structures (e.g., laryngostroboscopy or videokymography, VKG), and indirect methods, which objectively assess acoustic properties of the human voice sound (i.e., vocal output). In addition, semi-direct methods provide information on vocal fold and respiratory physiology without direct visualization; they include electromyography, subglottal pressure estimates, and electroglottography (Awan, 2008). In a diagnostic setting, clinicians aim to determine the cause and consequences of disordered voice production in order to tailor treatment to the individual patient. Disordered voice production may result from changes in the structure (such as vocal fold nodules or edema), innervation (such as in vocal fold paralysis), or muscular function of the phonatory system. Therefore, as recommended by European and American clinical associations, a comprehensive voice examination usually includes subjective, visual, perceptual, aerodynamic, and instrumental acoustic assessment techniques (Dejonckere et al., 2001; Patel et al., 2018). The scope of the present chapter is to review the advantages, disadvantages, and applications of the most widely used indirect instrumental acoustic and direct visual assessment techniques applied in voice diagnostics and research.
36.2 Indirect Assessment Techniques: Instrumental Acoustic Analysis

Each vocal fold vibration produces a natural tone containing the fundamental frequency (fo) and a set of overtones, the natural partials, which are multiples of fo. According to the source-filter theory, the unformed voice sound emitted from the vocal folds is passively amplified and filtered by the resonance properties of the vocal tract. Articulatory movements of the speech organs in the vocal tract (such as the tongue or lips) actively shape the sound to create speech (Fant, 1980). Thus, the vocal output carries indirect information about
characteristics of sound production and of the speaker, including (general) health status, gender, stress level, personality characteristics, age, and training status (Berg et al., 2017; Brockmann-Bauser et al., 2018; Mahrholz et al., 2018; Reetz et al., 2019). Owing to major technical developments over the last 20 years, researchers and clinicians can choose between a variety of analysis types and techniques, available in both licensed and free software. Instrumental acoustic voice analysis refers to a family of computer-based techniques for objectively measuring defined acoustic signal properties of spoken or sung material. Usually, a participant or patient is recorded while phonating a (prolonged) vowel, speaking, reading a standard text, or singing. Defined acoustic characteristics of the voice signal are then automatically calculated by a computer program. In clinical applications, mostly linear measures based on the source-filter theory are derived, including objective estimates of voice pitch (fundamental frequency, fo) and loudness (amplitude, voice SPL), perturbation or irregularity of fo (jitter) and voice SPL (shimmer), the proportion of aperiodicity, such as the Harmonics-to-Noise Ratio (HNR), and spectral characteristics, such as (Smoothed) Cepstral Peak Prominence (CPP(S)) (Dejonckere et al., 2001; Patel et al., 2018). Further, methods based on nonlinear dynamics, such as the correlation dimension (D2), have been described as a complementary alternative for voice assessment in more irregular voice signals.
36.2.1 Why Acoustic Voice Properties Are Measured

Acoustic analysis in clinical applications relies on the hypothesis that voice pathology results in measurable acoustic characteristics. While small aberrations in the acoustic voice signal have been described as normal variation related to physiologic body functions, including the heartbeat, increased levels are considered an indicator of vocal pathology (Orlikoff & Baken, 1989; Patel et al., 2018). Thus, instrumental acoustic measurements are applied to obtain indirect information about the biomechanical status and vibratory properties of the vocal folds and the pathophysiology of phonation (Jayakumar & Benoy, 2022; Lee et al., 2018; Patel et al., 2018). Instrumental acoustic analysis is considered a non-invasive voice assessment technique sensitive to the severity of voice pathology and has been recommended for initial diagnosis and for the evaluation of changes within a patient. However, acoustic assessment methods vary in their ability to detect or document voice pathology and functional changes (Brockmann-Bauser & Drinnan, 2011; Patel et al., 2018; Ternström et al., 2016). Therefore, the application of the most widely used methods is reviewed in the following sections.
36.2.2 Voice Range Profiles (VRP)

Voice range profiles (VRP), also known as phonetograms, are graphical profiles displaying voice intensity (on the y-axis) against pitch range (x-axis) during a set of defined voice tasks. Depending on the tasks, so-called speaking, calling, and singing voice range profiles are derived, providing information about restrictions to voice function in the frequency and sound pressure level domains (Figure 36.1). Tasks during voice recording typically range from spontaneous speech at comfortable voice intensity levels up to shouting, and include counting tasks, text reading, and the singing of scales (Patel et al., 2018; Reetz et al., 2019; Ternström et al., 2016). This allows a treatment plan to be derived that targets the patient's actual functional restrictions and needs; moreover, an objective and standardized evaluation of treatment effects becomes possible. In voice-disordered adults with so-called muscle tension dysphonia, restricted high-frequency and minimal-intensity areas have been reported, which have been shown to improve after voice therapy. Also, in patients with laryngeal palsy, an improvement of fundamental frequency ranges and voice intensity levels during talking and singing can be observed.
Figure 36.1 Voice range profile (VRP) example of a male adult with a unilateral vocal fold polyp before (a) and after (b) treatment. The example shows a speaking, shouting, and singing VRP. The y-axis displays voice intensity levels from low (bottom) to high, and the x-axis fundamental frequency in Hz and musical notation from low (left) to high. Crosses indicate mean speaking fo and SPL for a set of speaking tasks: counting from 20 to 30 (1) "as soft as possible", (2) "with habitual intensity", (3) "as if talking for four people at a table", and (4) "so that you can be heard in the next room", and (5) shouting a standardized sentence "as loudly as possible". The connected line shows the corresponding results from singing a syllable ("la") with the softest and loudest possible voice. (Source: M. Brockmann-Bauser, own image.)
However, to date, voice range profiles are not considered sufficient on their own for characterizing functional restrictions, because natural variation between individuals is large and the normative database includes only a limited characterization of age, training, and gender effects (Reetz et al., 2019; Ternström et al., 2016).
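To make the VRP concrete, the following sketch builds a phonetogram-style occurrence map from frame-wise fo and SPL tracks. It is a minimal illustration, not the algorithm of any clinical VRP system; the array names, bin ranges, and the 55 Hz semitone reference are assumptions chosen for the example.

import numpy as np

def voice_range_profile(fo_hz, spl_db, fo_bins=None, spl_bins=None):
    """Return a 2D occurrence histogram (the VRP) over fo x SPL.

    fo_hz and spl_db are assumed per-frame arrays from any pitch/level
    tracker, with unvoiced frames marked as NaN.
    """
    voiced = ~np.isnan(fo_hz) & ~np.isnan(spl_db)
    # Semitone spacing (re an assumed 55 Hz reference) gives the musical
    # x-axis conventionally used in phonetograms.
    semitones = 12 * np.log2(fo_hz[voiced] / 55.0)
    if fo_bins is None:
        fo_bins = np.arange(0, 48, 1)        # roughly four octaves above A1
    if spl_bins is None:
        spl_bins = np.arange(40, 121, 1)     # 40-120 dB SPL, 1 dB cells
    vrp, _, _ = np.histogram2d(semitones, spl_db[voiced],
                               bins=[fo_bins, spl_bins])
    return vrp  # cells with counts > 0 mark the phonated fo/SPL area

The outline of the non-zero region then corresponds to the contour that is plotted and compared before and after treatment, as in Figure 36.1.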
36.2.3 Measuring Vocal Quality

Most clinical acoustic analysis types are based on the source-filter model and on an understanding of the voice production system from a linear perspective. These approaches focus on the extraction and analysis of frequency (including the overtone spectrum) and intensity patterns, and are so-called time-based analysis methods because they depend on identifying cycle boundaries or cycle shape on the time axis.
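Because these time-based measures stand or fall with correct cycle identification, it is worth seeing what that step involves. The sketch below estimates fo for a single frame by autocorrelation; it is one common textbook approach, not the recognition strategy of any particular clinical package, and the search limits are illustrative assumptions.

import numpy as np

def estimate_fo(frame, fs, fmin=75.0, fmax=500.0):
    """Crude autocorrelation fo estimate for one quasi-periodic frame."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Autocorrelation at lags 0..N-1; the frame should span > 2 periods.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the peak search to lags of plausible glottal periods.
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return fs / lag  # Hz

In strongly irregular (Type 3/4) signals the autocorrelation peak becomes ambiguous, which is precisely why the cycle-dependent measures discussed next break down there.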
36.2.3.1 Perturbation Parameters Jitter and Shimmer

Commonly used perturbation parameters are jitter and shimmer, which measure the short-term variation or irregularity of fundamental frequency (jitter) and voice intensity (shimmer). Jitter and shimmer are typically computed in the time domain and indicate variations in cycle-to-cycle period duration and amplitude, respectively, across acoustic cycles. Measurement results are strongly influenced by software-based characteristics such as the fo recognition strategy or the parameter calculation method, and therefore cannot simply be compared between different software packages, centers, or clinics (Baken & Orlikoff, 2000; Titze, 1995). Both measures depend on the correct recognition of fo (for jitter) and voice SPL (for shimmer). Measurement of these parameters has therefore been recommended only for mildly to moderately dysphonic-sounding voices, because there is a technical limitation in the correct recognition of glottic cycles (fo) and amplitude (SPL) in signals with strong deviations. In other words, acoustic signals can be too irregular for a reliable estimate of irregularity (Titze, 1995). For this reason, a classification of voice signals according to regularity, with recommendations for suitable acoustic analysis types, was developed (Titze, 1995). Nearly periodic Type 1 signals were classified as highly suitable for perturbation (such as jitter and shimmer) analysis. Type 2 signals, with strong modulations or subharmonics approaching fo in energy, and irregular or aperiodic Type 3 signals were recommended for spectrographic and perceptual analysis methods. As an addition to this model, Type 4 signals have been described as Type 3 signals that are stochastic in nature and therefore suitable for nonlinear analysis (Sprecher et al., 2010).
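As a concrete illustration of these cycle-to-cycle definitions, the sketch below computes jitter and shimmer from already-extracted period and peak-amplitude sequences. The formulas follow the common "local" variants; as noted above, packages differ in detail, so these are not the exact formulas of any specific software.

import numpy as np

def local_jitter_percent(periods_s):
    """Mean absolute cycle-to-cycle period difference, re the mean period."""
    p = np.asarray(periods_s, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def local_shimmer_db(amplitudes):
    """Mean absolute cycle-to-cycle amplitude ratio, expressed in dB."""
    a = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(20.0 * np.log10(a[1:] / a[:-1])))

Note that both functions presuppose a correct list of cycle periods and amplitudes as input, which is exactly the step that fails in Type 3 and Type 4 signals.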
36.2.3.2 Harmonics-to-Noise Ratio and Cepstral Analysis

The Harmonics-to-Noise Ratio (HNR) indicates the ratio of harmonic energy to noise energy in a speech signal and can be computed in the time, spectral, and cepstral domains. HNR thus provides an indication of the proportions of voiced components and noise in an acoustic signal. HNR measurements and similar indices, such as the signal-to-noise ratio (SNR) or normalized noise energy (NNE), have been suggested for more dysphonic or irregular voices (de Sousa, 2009; Ikuma et al., 2022). In a variety of studies, HNR and NNE have been described as quantifying perceived dysphonia, and especially the perceptual characteristic of breathiness (Barsties v. Latoszek et al., 2018). Both HNR and (Smoothed) Cepstral Peak Prominence (CPP(S)) rise with increased stability and clarity of phonation (Heman-Ackah et al., 2003). Current guidelines recommend the spectrally based cepstral parameter CPP(S) for voice assessment (Patel et al., 2018). Cepstral measures estimate the proportion of periodic energy based on a power spectrum and thereby indicate the harmonic organization of an acoustic signal. The computation of Cepstral Peak Prominence (CPP) is derived from an inverse Fourier transformation of the log power spectrum of an acoustic voice signal, called the cepstrum (an anagram of "spectrum"), and visualizes the extent of harmonic sound energy. CPP indicates the difference (in dB) between the first rahmonic peak and the point, at the quefrency of the first rahmonic, of a regression line fitted across the cepstrum. The more harmonic a voice signal is, the more distinct the first cepstral peak, leading to a higher CPP. Smoothed Cepstral Peak Prominence (CPPS) is obtained with an additional processing step that smooths the cepstral curve before CPP is calculated (Sampaio et al., 2021). This analysis method makes it possible to estimate fo and aperiodicity without identifying individual cycle limits, as is necessary for traditional linear measures of perturbation and noise. A further advantage of CPP(S) is that a comparatively reliable voice analysis can be performed with connected speech samples. Thus, cepstral analysis has been described as highly useful in the study and diagnosis of disordered voices, including the evaluation of highly irregular acoustic signals (Ferrer Riesgo & Nöth, 2020; Patel et al., 2018). For CPPS, an
accuracy of up to 82% for predicting the presence of a voice disorder has been described (Sauder et al., 2017). However, as with other acoustic parameters, results should be communicated together with detailed measurement characteristics.
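A simplified illustration of the CPP computation described above follows: log power spectrum, inverse Fourier transform to the cepstrum, peak search in an expected quefrency range, and a regression baseline. Published CPP and CPPS implementations differ in windowing, the regression range, smoothing, and averaging across frames, so this is a sketch of the principle only; the fmin/fmax search limits are assumptions.

import numpy as np

def cpp_db(frame, fs, fmin=60.0, fmax=330.0):
    """Cepstral peak prominence of one windowed frame (simplified)."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    log_spec = 20.0 * np.log10(spec + 1e-12)      # dB magnitude spectrum
    ceps = np.fft.irfft(log_spec)                 # real cepstrum
    quef = np.arange(len(ceps)) / fs              # quefrency axis (seconds)
    lo, hi = int(fs / fmax), int(fs / fmin)       # search 1/fmax .. 1/fmin
    peak_i = lo + np.argmax(ceps[lo:hi])          # first rahmonic peak
    # Regression line over the searched quefrency range (a simplification;
    # published methods fit over a wider range), evaluated at the peak.
    coef = np.polyfit(quef[lo:hi], ceps[lo:hi], 1)
    baseline = np.polyval(coef, quef[peak_i])
    return ceps[peak_i] - baseline                # CPP in dB

CPPS would additionally smooth the cepstrum over time and quefrency before the peak-minus-baseline step, which is what makes it robust enough for connected speech.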
36.2.3.3 Combined Parameters DSI, CSID and AVQI

Combined indices such as the Dysphonia Severity Index (DSI), the Cepstral Spectral Index of Dysphonia (CSID), and the Acoustic Voice Quality Index (AVQI) take the multidimensional nature of voice production into account and incorporate several measurements into a single metric. Generally, these indices are applied to objectively quantify dysphonia (Dejonckere et al., 2001; Patel et al., 2018). The DSI combines the time-based measures highest attainable voice fo, softest attainable voice SPL, and jitter with the aerodynamic measure maximum phonation time (MPT). Values ≤ −1.2 have been associated with severely disordered voice, whereas values > 4.2 have been described as normal voice function. In voice-disordered adults, the DSI was related to dysphonia and to changes in vocal function after treatment (Hakkesteegt et al., 2008, 2010). In contrast, both the CSID and the AVQI are based on perturbation, noise, and cepstral measures. While the CSID is derived from CPP, the low-high spectral ratio (L/H ratio), and gender information through weighted multiple regression, the AVQI includes six parameters: HNR, shimmer percentage, shimmer in dB, the overall slope of the long-term averaged spectrum (LTAS), the slope of the regression line through the LTAS, and CPPS. The sensitivity of the CSID and AVQI for detecting dysphonia has been described as adequate to good, and their specificity as moderate to excellent, with stronger overall performance for the CSID. In turn, the AVQI has been validated in several languages, taking into account the influence of language and culture on instrumental acoustic measures (Barsties v. Latoszek et al., 2021; Jayakumar & Benoy, 2022; Lee et al., 2018).
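For illustration, the DSI can be written as a single weighted sum of its four inputs. The weights below follow the original DSI formulation by Wuyts and colleagues (2000), which this chapter does not cite directly, so treat them as an assumption to be checked against that source; inputs are MPT in seconds, highest fo in Hz, softest intensity in dB, and jitter in percent.

def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """DSI as a weighted sum (coefficients assumed from Wuyts et al., 2000)."""
    return (0.13 * mpt_s + 0.0053 * f0_high_hz
            - 0.26 * i_low_db - 1.18 * jitter_pct + 12.4)

With these weights, a longer MPT and a higher attainable fo raise the index, while a louder softest phonation and higher jitter lower it, matching the cut-offs of ≤ −1.2 (severe) and > 4.2 (normal) given above.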
36.2.3.4 Nonlinear Measures

Methods based on nonlinear models have been described as a complementary and, especially for irregular voice signals, better alternative for voice assessment, as they capture the multiple biomechanical nonlinear interactions in voice production (de Oliveira Florencio et al., 2021). So-called traditional nonlinear measures, such as the correlation dimension (D2), the largest Lyapunov exponent (Lyap), correlation entropy (H2), or relative entropy (ENTR-R), are considered effective for quantifying chaos (randomness) in voices with low dimensionality, that is, mostly Type 1 and Type 2 signals. Dimensionality is defined as the number of characteristics necessary to represent the signal in phase space. For some Type 3 and all Type 4 signals, with severe deviations, glottal noise, and turbulent airflow, recurrence quantification measures (RQMs) have been suggested instead (de Oliveira Florencio et al., 2021; Liu et al., 2019). Both traditional nonlinear measures and RQMs have been described as predictors of vocal pathology, correlating with the overall severity of dysphonia (de Oliveira Florencio et al., 2021).
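To give a flavor of how recurrence quantification works, the sketch below time-delay embeds a signal and reports the recurrence rate, i.e., the share of point pairs in the reconstructed phase space that fall within a radius of one another. The embedding dimension, delay, and radius are illustrative assumptions; RQM studies estimate them from the data, and this brute-force version is only practical for short excerpts.

import numpy as np

def recurrence_rate(x, dim=3, delay=10, radius=0.1):
    """Share of recurrent point pairs after time-delay embedding."""
    x = np.asarray(x, dtype=float)
    x = (x - x.mean()) / x.std()
    n = len(x) - (dim - 1) * delay
    # Embed: each row is (x[t], x[t+delay], ..., x[t+(dim-1)*delay]).
    emb = np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])
    # Pairwise distances (O(n^2) memory: keep excerpts short).
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    return np.mean(d < radius)

Unlike jitter or shimmer, this measure needs no cycle boundaries at all, which is why recurrence-based analysis remains applicable to Type 3 and Type 4 signals.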
36.2.4 Influencing Factors and Limitations

Dysphonic voice signals are characterized by increased irregularity of fundamental frequency and voice intensity and by reduced organization or energy in the overtone spectrum. Both single acoustic measures, including jitter, shimmer, HNR, and CPP(S), and combined indices incorporating variants of these, such as the CSID and AVQI, have been related to breathiness, roughness, and overall dysphonia in a comparatively large body of literature (Barsties v. Latoszek et al., 2021; Jayakumar & Benoy, 2022; Maryn, Roy et al., 2009; Patel et al., 2018). However, in a clinical population there are several pragmatic issues relating to measurement technique and speech characteristics to consider.
36.2.4.1 Recording and Measurement Technique

Acoustic voice measures are objective in nature and provide a measurement of any input signal. From a pragmatic perspective, this means that background noise and the quality of the recording equipment can also influence analysis results. Of the parameters discussed above, fundamental frequency (fo) has been described as the most robust measure even under suboptimal recording conditions, allowing data comparability across different hardware, software, and audio file formats (Fuchs, 2016; Maryn et al., 2017; Oliveira et al., 2017; Zhang et al., 2021). However, voice intensity and quality measurements have been shown to be significantly influenced by recording characteristics external to the patient or participant. These include the microphone, the signal-to-noise ratio (SNR) of the recording system, data processing steps, analysis algorithms, room acoustics, and background noise (Boersma, 2009; Maryn, Corthals et al., 2009; Maryn et al., 2017; Patel et al., 2018; Titze, 1995). Consequently, without reporting of the salient technical characteristics, most acoustic parameters are not comparable to measures from other research studies or clinics.
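One of these recording characteristics, the SNR of the recording chain, can be checked directly within a session by comparing the level of a phonation segment with that of a segment of room sound alone. The sketch below is a rough level-based estimate under the assumption that both segments are available as sample arrays; it is not a standardized procedure.

import numpy as np

def recording_snr_db(voice_segment, noise_segment):
    """Rough recording SNR: RMS of phonation vs. RMS of room-only noise."""
    def rms(seg):
        seg = np.asarray(seg, dtype=float)
        return np.sqrt(np.mean(seg ** 2))
    return 20.0 * np.log10(rms(voice_segment) / rms(noise_segment))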
36.2.4.2 Dependency on Voice Task Type and Speech Characteristics

In clinical measurements, the analysis of vowel phonations has been favored over speech tasks, mostly because of their steady properties and the higher relative reliability of perturbation and spectrally based indices calculated from them. More recently, speech samples or read text have been described as more representative of normal voice production behavior. However, for jitter, shimmer, CPP, and CPPS, the evidence regarding the influence of speech token type is so far inconclusive and contradictory. While several studies have shown differences for jitter, shimmer, HNR, and CPPS in sustained vowel phonation as compared to speech, reading, or counting tasks, others found no significant disagreement. The relation of token type to perceptual dysphonia has likewise been described in contradictory terms (Aghajanzadeh & Saeedi, 2022; Phadke et al., 2020; Sampaio et al., 2020, 2021). Amongst others, differences between sustained vowels and speech have been attributed to the natural prosodic variation of speech, related to shifts in voice fo and SPL, transitions from glides or consonants, and syllable stress and duration (Sampaio et al., 2020). Moreover, natural differences between individuals in speaking fo and SPL have been shown in a number of studies to strongly affect jitter, shimmer, HNR, and CPPS in healthy and dysphonic adults, in both sustained vowel phonation and speech. All of these parameters also improve within both euphonic and disordered speakers, in vowel phonation and speech, when the samples are spoken at a higher mean SPL and fo. Furthermore, sample duration (i.e., the length of the spoken vowel) and vowel context have a significant effect on CPPS in vowels extracted from read sentences (Awan et al., 2012; Brockmann-Bauser et al., 2018, 2021; de Oliveira Florencio et al., 2021; Phadke et al., 2020; Sampaio et al., 2020, 2021). When speaking SPL was taken into account as a confounding variable, there were no significant differences in jitter, shimmer, HNR, and CPPS between vocally healthy women and matched voice-disordered women of the same age and similar profession (Brockmann-Bauser et al., 2018, 2021). Under currently recommended clinical instrumental acoustic voice assessment protocols, and in research studies, voices are usually recorded at "comfortable" loudness and pitch levels (Patel et al., 2018). To date there are no normative values available with correction for SPL and fo, either for single voice quality measures or for combined indices. As a consequence, the speech-related effects described above remain largely unaccounted for, especially since voice-disordered and vocally healthy individuals exhibit natural variability in vocal output across voice assessment sessions, even within the same day (Brown et al., 1996; Park & Stepp,
2019; Pierce et al., 2021). Moreover, voice-disordered individuals may perform well in one voice task and sound more dysphonic in another. Thus, global representativity of a normative value or token type cannot simply be assumed, since there is a lack of baseline data for understanding which of the observed changes in clinical measurements are physiologic and which pathologic. This reduces the accuracy and usefulness of jitter, shimmer, HNR, and CPPS, and of combined indices incorporating these, such as the CSID and AVQI, for reliably detecting dysphonia and voice pathology.
36.2.4.3 Do We Know Enough about Normal Voice Function?

Speaker characteristics such as gender, age, voice training, and voice use level have been shown to influence vocal output characteristics, including voice range profiles (VRP), jitter, shimmer, HNR, and CPPS (Brockmann-Bauser et al., 2018; Brown et al., 2000; Pierce et al., 2021; Schaeffer et al., 2015; Stathopoulos et al., 2011; Walzak et al., 2008; Watts et al., 2015). However, age effects may be attenuated by voice training and appear less distinct in elderly people in good general physical condition. Moreover, gender differences in measurements of jitter, shimmer, HNR, and CPPS are partially related to systematic differences in speaking SPL between women and men (Brockmann-Bauser et al., 2018; Lortie et al., 2017). For most of these characteristics it is unclear to date how they should be considered when interpreting measurement results. This substantially reduces the usefulness of acoustic voice measures for determining the extent of functional restriction, and calls for a far larger evidence base describing the natural variation of voice function in vocally healthy individuals.
36.2.5 Basic Data Comparability in Acoustic Voice Measurements

In view of the influencing factors in objective acoustic measurements discussed above, several measures to ensure basic data comparability have been recommended with reference to international standards. These include reporting of (a) the equipment (including its signal-to-noise ratio), acoustic waveform recognition strategies, and analysis software; (b) the acoustic environment (normal room acoustics, quiet room, soundproof booth); (c) the voice tasks, including the number of repetitions and an indication of the analyzed parts; and (d) the mean (with SD) of speaking SPL and fo for measurements of voice quality such as jitter, HNR, CPPS, or combined indices (Brockmann-Bauser et al., 2021; Patel et al., 2018).
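In practice, reporting items (a)-(d) can be captured as a structured record attached to each measurement. The record below is a hypothetical example; the field names and the values are illustrative, not a published standard.

recording_report = {
    "equipment": {"microphone": "headset condenser",   # item (a)
                  "system_snr_db": 35,
                  "software": "analysis package name + version"},
    "environment": "quiet room",                       # item (b)
    "tasks": {"sustained_a": {"repetitions": 3,        # item (c)
                              "analyzed": "middle 2 s of each token"}},
    "speaking_levels": {"mean_spl_db": 72.4, "sd_spl_db": 2.1,   # item (d)
                        "mean_fo_hz": 201.5, "sd_fo_hz": 14.8},
}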
36.3 Semi-direct Assessment Techniques: Electroglottography

36.3.1 Measurement Technique

Electroglottography (EGG) is increasingly applied to research and clinical questions and is a low-cost technique for examining vocal fold contacting and opening behavior under natural phonatory conditions. For electroglottography, two electrodes are placed anteriorly on both sides of the external neck at glottal level. A small high-frequency, low-amperage alternating current passes between the electrodes. The voltage varies proportionally with the impedance, which is influenced by glottal opening and closing: it drops when the glottis closes and increases when the glottis opens (Figure 36.2). The derived EGG waves thus indirectly reflect vocal fold interaction in a real-time, non-invasive, and objective manner (Herbst, 2020).
Figure 36.2 Schema of EGG and dEGG waves. In the upper part, an EGG signal with open and closed phases is illustrated. At the beginning of the cycle and of the closed phase (marked with a grey vertical line), there is initial contact of the lower vocal fold margins; maximum vocal fold contact is reached at the top of the wave. Thereafter, the decontacting phase is initiated by separation of the lower vocal fold margins, with an open glottis and minimal contact area at the lowest point. The middle trace shows the dEGG signal and the lowest trace a smoothed dEGG signal. (Source: Tools for Electroglottographic Analysis: Software, Documentation and Databases. Reproduced with permission from Henrich & Michaud (2017) / Voice research.)
36.3.2 EGG Analysis Types and Parameters

Usually, several parameters are derived from the EGG signal obtained during vowel phonation or speaking tasks. The most widely applied electroglottographic parameters are the contact quotient (CQ) and open quotient (OQ), respectively, and, to a lesser extent, the dEGG peak (Herbst, 2020; Herbst et al., 2017). As shown in Figure 36.2, the time-differentiated EGG signal (dEGG) reflects the velocity of change in the vocal fold contact area and is based on the first mathematical derivative of the EGG signal (Herbst, 2020; Herbst et al., 2017; Herbst & Ternström, 2006). The dEGG signal exhibits two major peaks, indicating when the vocal fold contact area increases (positive peak) or decreases (negative peak) with the greatest magnitude. These correspond to glottal closing and opening events and divide the EGG cycle into two parts: a closed and an open phase (Figure 36.2). In vocally healthy individuals, the change between the two phases happens rapidly, whereas in dysphonic individuals it is delayed, leading to a lower peak amplitude in the dEGG signal (Henrich et al., 2017). In contrast, the EGG contact quotient (CQ) indicates the relative vocal fold adduction during a vibratory cycle of phonation (Figure 36.2). Complementary to this is the open quotient (OQ), defined as the percentage of time during which the vocal folds are open. With CQ = 1 − OQ, the two parameters can easily be converted into each other. Since the EGG signal does not permit the determination of the exact instants of vocal fold (de)contacting, the EGG CQ only represents an approximation of the closed phase (Baken & Orlikoff, 2000; Herbst, 2020). Various thresholds have been proposed for indicating the moment of vocal fold (de)contacting, which influence CQ results. Three main computation methods have been
employed. The criterion (or threshold level) method first establishes the vibratory cycle duration by locally normalizing the EGG wave (typically between 0 and 1). The (de)contacting instants are then inferred from the EGG signal as the points at which the amplitude rises above or falls below a defined threshold, such as 20%, 25%, 30%, 35%, or 50%. In contrast, the dEGG-based method operates on the first derivative of the EGG signal, similar to the dEGG peak. The zenith (for contacting) and the nadir (for decontacting) of the dEGG wave, corresponding to the moments of fastest increase and decrease in vocal fold contact area, are used to determine (de)contacting. The OQ derived from this method is often referred to as the dEGG OQ. Lastly, the hybrid or Howard's method for determining the hybrid OQ or CQ is a combination of the criterion- and dEGG-based methods. Here, the contacting event is the positive dEGG peak, while decontacting is defined as the EGG amplitude dropping below a threshold typically set at about 0.43 (3/7) of the cycle amplitude. Open quotient (OQ) ranges in healthy individuals may be as large as 0.4 to 0.75 in phonations with increasing voice SPL, and vary with the calculation methods described above (Herbst, 2020; Herbst et al., 2017).
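The three computation strategies can be made concrete on a single, locally normalized EGG cycle. The sketch below is a simplified illustration under the assumption that the cycle is densely sampled and oriented with increasing contact upward; published implementations add peak-picking safeguards (e.g., for double dEGG peaks) that are omitted here.

import numpy as np

def egg_quotients(cycle, threshold=0.25):
    """Contact/open quotients for one EGG cycle via three strategies."""
    cycle = np.asarray(cycle, dtype=float)
    c = (cycle - cycle.min()) / (cycle.max() - cycle.min())  # normalize 0..1
    n = len(c)
    d = np.gradient(c)                      # dEGG: rate of contact change

    # Criterion method: "closed" while amplitude exceeds the threshold.
    cq_criterion = np.mean(c > threshold)

    # dEGG method: closing = steepest rise, opening = steepest fall.
    i_close, i_open = int(np.argmax(d)), int(np.argmin(d))
    cq_degg = ((i_open - i_close) % n) / n

    # Hybrid (Howard): contact at the dEGG peak, decontact when the
    # amplitude first drops below 3/7 of the cycle amplitude.
    below = np.where(c[i_close:] < 3.0 / 7.0)[0]
    i_decontact = i_close + int(below[0]) if below.size else n - 1
    cq_hybrid = (i_decontact - i_close) / n

    return {"CQ_criterion": cq_criterion, "CQ_dEGG": cq_degg,
            "CQ_hybrid": cq_hybrid, "OQ_dEGG": 1.0 - cq_degg}

Running all three on the same cycle makes the method dependence discussed above tangible: the criterion CQ shifts systematically with the chosen threshold, while the dEGG and hybrid variants track the derivative extrema instead.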
36.3.3 Influencing Factors and Outlook

Electroglottographic measurements have been applied in diverse contexts, including basic voice physiology during speaking and singing, swallowing, phonetics, hearing and psychology research, as well as biofeedback (Herbst, 2020). With regard to voice diagnostics, EGG has been described as a potential complement to objective measures of vocal function in the multidisciplinary management of voice disorders, and as a non-invasive alternative to endoscopic imaging for irregular voice signals. Studies in patients with muscle tension dysphonia and structural vocal fold pathology show an improvement in CQ and its standard deviation (Hosokawa et al., 2012) and in OQ after treatment (da Cunha Pereira et al., 2018; Herbst, 2020). However, the EGG wave is also influenced by external factors such as the equipment, electrode placement, parameter calculation method, and speech task type, as well as by individual factors such as head posture, anatomical variation, age, gender, and lung volume (Baken & Orlikoff, 2000; Herbst, 2020). Moreover, recent literature suggests that EGG measures also vary with speech-related features such as speaking SPL and fo, with highly individual patterns over a voice range profile (Patel & Ternström, 2019, 2021; Selamtzis & Ternström, 2017). This calls for further development and standardization of the quantitative measurement technique and its clinical application.
36.4 Direct Assessment Techniques: Visual Examination

Direct laryngeal assessment techniques refer to visual examination techniques suitable for inspecting laryngeal physiology and function. To date, the main techniques in voice research and clinics are laryngoscopy and videolaryngostroboscopy, high-speed video (HSV) techniques, and (video)kymography.
36.4.1 Laryngoscopy and Laryngostroboscopy

36.4.1.1 Laryngoscopy and Videolaryngoscopy

Rigid laryngoscopy with a constant light source, performed through the mouth with special 90° or 70° optics, is usually applied to detect structural changes and to evaluate the opening and closing movements of the vocal folds. For this purpose, the tongue of the examined person is held by the examiner, and the patient performs different voice tasks, such as
phonating a sustained /i/ and performing pitch glides (Patel et al., 2018). Under these conditions, however, natural speech or singing is not possible, leading to functional differences compared with unconstrained speaking situations. Flexible transnasal endoscopy of the larynx is recommended when a largely undisturbed functional assessment of the larynx and its adjacent structures is to be performed during speech and singing; however, this method often yields poorer image quality. For videoendoscopy, video-documenting technology is coupled to an endoscope, usually recording at a rate of 30 frames per second (fps), which is far too slow to resolve the oscillatory behavior of the vocal folds during phonation, whose fundamental frequency is at minimum around 95 Hz for males. Generally, laryngoscopy and videoendoscopy are applied to assess gross laryngeal structures and function during respiration and phonation. The aspects usually inspected are laryngeal physiology, the vocal fold medial edges, vocal fold mobility (e.g., abduction/adduction), supraglottic muscle activity during phonation, and laryngeal maneuvers during a set of voice tasks.
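The sampling problem can be put in numbers. At the roughly 95 Hz lower bound of male phonation, conventional 30 fps video captures less than one frame per vibratory cycle, whereas the high-speed rates discussed below yield tens of frames per cycle; the figures here are illustrative only.

fo = 95.0                        # approximate lower bound of male phonation (Hz)
video_fps, hsv_fps = 30.0, 4000.0

print(video_fps / fo)            # ~0.32 frames per cycle: vibration unresolved
print(hsv_fps / fo)              # ~42 frames per cycle: full cycles captured

Stroboscopy, described next, sidesteps this limit not by sampling faster but by sampling successive cycles at slightly shifted phases.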
36.4.1.2 Videolaryngostroboscopy (VLS)

Currently, videolaryngostroboscopy (VLS) is considered the gold standard for the assessment of laryngeal structures and function in a clinical setting (Patel et al., 2018; Powell et al., 2020). Laryngostroboscopy allows an assessment of vocal fold vibration during phonation, which is considered critical for determining the nature or causes of dysphonia (Mehta & Hillman, 2012). For laryngostroboscopy, minimally time-shifted images of successive vibration cycles are recorded with the aid of light exposure triggered by the fundamental frequency, usually by means of a stroboscopic light source or a shutter technique. The images are then recombined into a moving film. Videostroboscopy is therefore not a real-time film of true vocal fold oscillations during phonation but a synthesized movie composed of images from different successive oscillation cycles. For stroboscopic assessment, it has been recommended to rate the regularity, vibratory amplitude, mucosal wave, phase symmetry, and vertical level of the vocal folds during phonation. Vocal fold adduction and abduction patterns are also observed (Patel et al., 2018). However, in the case of unstable phonation this method reaches technical limitations: the light source, and hence image acquisition, cannot be triggered correctly when the fundamental frequency is moderately to highly irregular, as often occurs in moderately to highly dysphonic or aphonic voices (Powell et al., 2020). Moreover, the evaluation depends strongly on the experience and training of the examiner, leading to large interindividual variability in several assessment characteristics. In a comparison between nine voice experts, limited interrater reliability was described, especially for the characteristics vertical level, glottal closure pattern, phase closure, phase symmetry, and regularity. On this basis, it has been suggested that these variables should not be assessed via laryngostroboscopy alone (Nawka & Konerding, 2012).
36.4.1.3 High-speed Videoendoscopy (HSV)

Since laryngostroboscopy is confined to periodic voice signals, high-speed imaging techniques have increasingly been applied to dysphonic voices. High-speed videoendoscopy (HSV) is based on high-speed endoscopic imaging in a real-time procedure, with a typical temporal resolution of 4,000 frames per second (fps) for clinical applications and rates of up to 15,000 fps for research questions (Schützenberger et al., 2016). In this way, entire natural vibration cycles can be recorded, facilitating an accurate examination of both vibratory and kinematic vocal fold properties even in patients with severe dysphonia, including assessment of phonation onset, irregular and aperiodic vibration patterns, or tremor (Phadke et al., 2017). Even though the technology for capturing, processing, and analyzing HSV
images is constantly improving, the clinical application of these systems is still limited, mainly owing to the comparatively expensive hardware and a lack of analysis guidelines and of normative values for potential clinical parameters and for gender and age groups (Kist et al., 2021; Schützenberger et al., 2016).
36.4.1.4 Videokymography (VKG)

In recent years, videokymography (VKG) has been described as a more economical and simpler alternative to HSV, suitable for immediate feedback thanks to the real-time assessment of kymographic images. Videokymography is based on real-time single-line scanning of the vocal folds by a high-speed imaging technique with high spatial resolution (over 700 pixels per line) and image rate (up to 8,000 line images per second) (Phadke et al., 2017; Woo, 2020). VKG thus delivers images from a single line selected from a laryngoscopic image, independent of the characteristics of the vocal sound. All medial-lateral movements of the selected portion (line) of the vocal folds are registered and displayed on a monitor, with time on the vertical axis. Characteristics that have been assessed with VKG include the presence, irregularity, or absence of vocal fold vibration; the duration of opening and closing; left-right asymmetry; the shape of lateral and medial peaks; mucosal waves; and interference from surrounding structures. In a diagnostic study comparing outcomes with videostroboscopy, VKG led to an adjustment of treatment in 20% of subjects, demonstrating its diagnostic potential (Phadke et al., 2017). However, as with HSV, clinical application is still limited by a lack of uniform guidelines for image acquisition and analysis (Woo, 2020).
36.4.2 Comparison of Visualization Techniques

Videolaryngostroboscopy is to date considered the visual examination technique of first choice in the clinical assessment and diagnosis of several laryngeal pathologies. Laryngostroboscopy is applied to detect pathological vibratory regularity and amplitude, the mucosal wave, phase symmetry, and the vertical level of the vocal folds during phonation, as well as vocal fold adduction and abduction patterns. However, laryngostroboscopy does not show the entire epithelial surface, so lesions extending more medially and caudally may be missed (Patel et al., 2018; Welham et al., 2007). Also, in moderately to severely dysphonic-sounding (i.e., irregular) or aphonic voices the stroboscopic triggering mechanism becomes ineffective, so a description of vocal cycle phases and vibratory characteristics is not possible. Especially in those patients, high-speed imaging and associated analysis systems can complement the established laryngoscopic and stroboscopic techniques. However, these are not yet widely used in clinical practice because of the high equipment costs and the to date limited guidelines and evidence (Patel et al., 2018; Schützenberger et al., 2016).
36.5 Conclusions

Instrumental voice analysis techniques are applied to describe the cause and extent of vocal pathology. While the direct examination technique laryngostroboscopy is considered the gold standard, the potential of high-speed video (HSV) and videokymographic (VKG) techniques has not yet been fully explored and holds promise for better objective quantification of laryngeal and vocal fold pathology and dynamics (Kist et al., 2021). Moreover, indirect acoustic voice analysis techniques are considered important tools in voice clinics, offering an objective estimate of pathological alterations in voice function and output to supplement a diagnosis, tailor a patient-specific treatment plan, and document treatment effects (Patel et al., 2018).
Currently, several challenges remain for the reliable application of instrumental acoustic (indirect), electroglottographic (semi-direct), and the newer direct visual assessment techniques, including VKG and HSV. These include the need for (a) better standardization of recording, analysis strategies, and reporting of results; (b) a far larger evidence base for understanding normal variability within and between individuals with and without dysphonia, related to age, training, and physiologic day-to-day variation; and (c) accounting for covariables such as differences in speaking voice SPL and fo, vowels, and speech tokens in the interpretation of measurements. This calls for further technical developments, including open-source platforms, standardized data analysis pathways, and smartphone-based techniques (Kist et al., 2021). Moreover, nonlinear parameters may be better suited to the inherently nonlinear, multiple biomechanical interactions in voice production. Future applications implementing voice quality and electroglottographic measures into voice range profiles may improve our understanding of how voice function changes with the natural prosodic variation of speech, and of how these patterns are altered in voice-disordered individuals (Cai & Ternström, 2022; Patel & Ternström, 2019; Selamtzis & Ternström, 2017). Novel developments should be tested with dysphonic individuals early on, so that knowledge is transferred into practice as soon as possible.
REFERENCES

Aghajanzadeh, M., & Saeedi, S. (2022). Efficacy of cepstral measures in voice disorder diagnosis: A literature review. Journal of Modern Rehabilitation, 16(2), 120–129.
Awan, S. (2008). Instrumental analysis of phonation. In M. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 344–360). Blackwell Publishing Ltd.
Awan, S. N., Giovinco, A., & Owens, J. (2012). Effects of vocal intensity and vowel type on cepstral analysis of voice. Journal of Voice, 26(5), 670.e615–670.e620. https://doi.org/10.1016/j.jvoice.2011.12.001
Baken, R. J., & Orlikoff, R. F. (2000). Clinical measurements of speech and voice (2nd ed.). Thomson Delmar Learning.
Barsties v. Latoszek, B., Kim, G. H., Delgado Hernández, J., Hosokawa, K., Englert, M., Neumann, K., & Hetjens, S. (2021). The validity of the Acoustic Breathiness Index in the evaluation of breathy voice quality: A meta-analysis. Clinical Otolaryngology, 46(1), 31–40. https://doi.org/10.1111/coa.13629
Barsties v. Latoszek, B., Maryn, Y., Gerrits, E., & De Bodt, M. (2018). A meta-analysis: Acoustic measurement of roughness and breathiness. Journal of Speech, Language, and Hearing Research, 61(2), 298–323. https://doi.org/10.1044/2017_jslhr-s-16-0188
Berg, M., Fuchs, M., Wirkner, K., Loeffler, M., Engel, C., & Berger, T. (2017). The speaking voice in the general population: Normative data and associations to sociodemographic and lifestyle factors. Journal of Voice, 31(2), 257.e213–257.e224. https://doi.org/10.1016/j.jvoice.2016.06.001
Boersma, P. (2009). Should jitter be measured by peak picking or by waveform matching? Folia Phoniatrica et Logopaedica, 61(5), 305–308. https://doi.org/10.1159/000245159
Brockmann-Bauser, M., Bohlender, J. E., & Mehta, D. D. (2018). Acoustic perturbation measures improve with increasing vocal intensity in individuals with and without voice disorders. Journal of Voice, 32(2), 162–168. https://doi.org/10.1016/j.jvoice.2017.04.008
Brockmann-Bauser, M., & Drinnan, M. J. (2011). Routine acoustic voice analysis: Time to think again? Current Opinion in Otolaryngology, Head and Neck Surgery, 19(3), 165–170.
Brockmann-Bauser, M., Van Stan, J. H., Carvalho Sampaio, M., Bohlender, J. E., Hillman, R. E., & Mehta, D. D. (2021). Effects of vocal intensity and fundamental frequency on cepstral peak prominence in patients with voice disorders and vocally healthy controls. Journal of Voice, 35(3), 411–417. https://doi.org/10.1016/j.jvoice.2019.11.015
Brown, W., Rothman, H., & Sapienza, C. (2000). Perceptual and acoustic study of professionally trained versus untrained voices. Journal of Voice, 14(3), 301–309.
Brown, W. S. J., Morris, R. J., & Murry, T. (1996). Comfortable effort level revisited. Journal of Voice, 10(3), 299–305. https://doi.org/10.1016/s0892-1997(96)80011-7
Cai, H., & Ternström, S. (2022). Mapping phonation types by clustering of multiple metrics. Applied Sciences, 12(23), 12092. https://www.mdpi.com/2076-3417/12/23/12092
da Cunha Pereira, G., de Oliveira Lemos, I., Dalbosco Gadenz, C., & Cassol, M. (2018). Effects of voice therapy on muscle tension dysphonia: A systematic literature review. Journal of Voice, 32(5), 546–552. https://doi.org/10.1016/j.jvoice.2017.06.015
de Oliveira Florencio, V., Almeida, A. A., Balata, P., Nascimento, S., Brockmann-Bauser, M., & Lopes, L. W. (2021). Differences and reliability of linear and nonlinear acoustic measures as a function of vocal intensity in individuals with voice disorders. Journal of Voice. https://doi.org/10.1016/j.jvoice.2021.04.011
de Sousa, R. J. T. (2009). A new accurate method of Harmonics-to-Noise Ratio extraction. Proceedings of the International Conference on Bio-inspired Systems and Signal Processing.
Dejonckere, P. H., Bradley, P., Clemente, P., Cornut, G., Crevier-Buchman, L., Friedrich, G., Van De Heyning, P., Remacle, M., & Woisard, V. (2001). A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques. Guideline elaborated by the Committee on Phoniatrics of the European Laryngological Society (ELS). European Archives of Oto-Rhino-Laryngology, 258(2), 77–82. https://doi.org/10.1007/s004050000299
Fant, G. (1980). The relations between area functions and the acoustic signal. Phonetica, 37(1–2), 55–86.
Ferrer Riesgo, C. A., & Nöth, E. (2020). What makes the cepstral peak prominence different to other acoustic correlates of vocal quality? Journal of Voice, 34(5), 806.e801–806.e806. https://doi.org/10.1016/j.jvoice.2019.01.004
Fuchs, R. (2016). The effects of mp3 compression on acoustic measurements of fundamental frequency and pitch range. In O. Maxwell (Ed.), Proceedings of the International Conference on Speech Prosody (Vol. 2016-January, pp. 523–527).
Hakkesteegt, M. M., Brocaar, M. P., & Wieringa, M. H. (2010). The applicability of the dysphonia severity index and the voice handicap index in evaluating effects of voice therapy and phonosurgery. Journal of Voice, 24(2), 199–205. https://doi.org/10.1016/j.jvoice.2008.06.007
Hakkesteegt, M. M., Brocaar, M. P., Wieringa, M. H., & Feenstra, L. (2008). The relationship between perceptual evaluation and objective multiparametric evaluation of dysphonia severity. Journal of Voice, 22(2), 138–145. https://doi.org/10.1016/j.jvoice.2006.09.010
Heman-Ackah, Y. D., Heuer, R. J., Michael, D. D., Ostrowski, R., Horman, M., Baroody, M. M., Hillenbrand, J., & Sataloff, R. T. (2003). Cepstral peak prominence: A more reliable measure of dysphonia. Annals of Otology, Rhinology & Laryngology, 112(4), 324–333. https://doi.org/10.1177/000348940311200406
Henrich, N., Gendrot, C., & Michaud, A. (2017). Electroglottography: Open-source software for analysing the electroglottographic signal. Retrieved January 30, 2021, from http://voiceresearch.free.fr/egg
Henrich, N., & Michaud, A. (2017). Tools for electroglottographic analysis: Software, documentation and databases [Figure]. http://voiceresearch.free.fr/egg/#downloads
Herbst, C., & Ternström, S. (2006). A comparison of different methods to measure the EGG contact quotient. Logopedics Phoniatrics Vocology, 31(3), 126–138. https://doi.org/10.1080/14015430500376580
Herbst, C. T. (2020). Electroglottography – An update. Journal of Voice, 34(4), 503–526. https://doi.org/10.1016/j.jvoice.2018.12.014
Herbst, C. T., Schutte, H. K., Bowling, D. L., & Svec, J. G. (2017). Comparing chalk with cheese – the EGG contact quotient is only a limited surrogate of the closed quotient. Journal of Voice, 31(4), 401–409. https://doi.org/10.1016/j.jvoice.2016.11.007
Hosokawa, K., Yoshida, M., Yoshii, T., Takenaka, Y., Hashimoto, M., Ogawa, M., & Inohara, H. (2012). Effectiveness of the computed analysis of electroglottographic signals in muscle tension dysphonia. Folia Phoniatrica et Logopaedica, 64(3), 145–150. https://doi.org/10.1159/000342146
Ikuma, T., Story, B., McWhorter, A. J., Adkins, L., & Kunduk, M. (2022). Harmonics-to-noise ratio estimation with deterministically time-varying harmonic model for pathological voice signals. The Journal of the Acoustical Society of America, 152(3), 1783–1794. https://doi.org/10.1121/10.0014177
Jayakumar, T., & Benoy, J. J. (2022). Acoustic Voice Quality Index (AVQI) in the measurement of voice quality: A systematic review and meta-analysis. Journal of Voice. https://doi.org/10.1016/j.jvoice.2022.03.018
Kist, A. M., Dürr, S., Schützenberger, A., & Döllinger, M. (2021). OpenHSV: An open platform for laryngeal high-speed videoendoscopy. Scientific Reports, 11(1), 13760. https://doi.org/10.1038/s41598-021-93149-0
Lee, J. M., Roy, N., Peterson, E., & Merrill, R. M. (2018). Comparison of two multiparameter acoustic indices of dysphonia severity: The Acoustic Voice Quality Index and Cepstral Spectral Index of Dysphonia. Journal of Voice, 32(4), 515.e511–515.e513. https://doi.org/10.1016/j.jvoice.2017.06.012
Liu, B., Polce, E., Raj, H., & Jiang, J. (2019). Quantification of voice type components present in human phonation using a modified diffusive chaos technique. Annals of Otology, Rhinology & Laryngology, 128(10), 921–931. https://doi.org/10.1177/0003489419848451
Lortie, C. L., Rivard, J., Thibeault, M., & Tremblay, P. (2017). The moderating effect of frequent singing on voice aging. Journal of Voice, 31(1), 112.e111–112.e112. https://doi.org/10.1016/j.jvoice.2016.02.015
Mahrholz, G., Belin, P., & McAleer, P. (2018). Judgements of a speaker's personality are correlated across differing content and stimulus type. PLoS One, 13(10), e0204991. https://doi.org/10.1371/journal.pone.0204991
Maryn, Y., Corthals, P., De Bodt, M., Van Cauwenberge, P., & Deliyski, D. D. (2009). Perturbation measures of voice: A comparative study between Multi-Dimensional Voice Program and Praat. Folia Phoniatrica et Logopaedica, 61(4), 217–226.
Maryn, Y., Roy, N., De Bodt, M., Van Cauwenberge, P., & Corthals, P. (2009). Acoustic measurement of overall voice quality: A meta-analysis. The Journal of the Acoustical Society of America, 126(5), 2619–2634. https://doi.org/10.1121/1.3224706
Maryn, Y., Ysenbaert, F., Zarowski, A., & Vanspauwen, R. (2017). Mobile communication devices, ambient noise, and acoustic voice measures. Journal of Voice, 31(2), 248.e211–248.e223. https://doi.org/10.1016/j.jvoice.2016.07.023
Mehta, D. D., & Hillman, R. E. (2012). Current role of stroboscopy in laryngeal imaging. Current Opinion in Otolaryngology & Head and Neck Surgery, 20(6), 429–436. https://doi.org/10.1097/MOO.0b013e3283585f04
Nawka, T., & Konerding, U. (2012). The interrater reliability of stroboscopy evaluations. Journal of Voice, 26(6), 812.e811–810. https://doi.org/10.1016/j.jvoice.2011.09.009
Oliveira, G., Fava, G., Baglione, M., & Pimpinella, M. (2017). Mobile digital recording: Adequacy of the iRig and iOS device for acoustic and perceptual analysis of normal voice. Journal of Voice, 31(2), 236–242. https://doi.org/10.1016/j.jvoice.2016.05.023
Orlikoff, R. F., & Baken, R. J. (1989). The effect of the heartbeat on vocal fundamental frequency perturbation. Journal of Speech and Hearing Research, 32(3), 576–582.
Park, Y., & Stepp, C. E. (2019). Test-retest reliability of relative fundamental frequency and conventional acoustic, aerodynamic, and perceptual measures in individuals with healthy voices. Journal of Speech, Language, and Hearing Research, 62(6), 1707–1718. https://doi.org/10.1044/2019_jslhr-s-18-0507
Patel, R. R., Awan, S. N., Barkmeier-Kraemer, J., Courey, M., Deliyski, D., Eadie, T., Paul, D., Svec, J. G., & Hillman, R. (2018). Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association expert panel to develop a protocol for instrumental assessment of vocal function. American Journal of Speech-Language Pathology, 27(3), 887–905. https://doi.org/10.1044/2018_ajslp-17-0009
Patel, R. R., & Ternström, S. (2019). Electroglottographic voice maps of untrained vocally healthy adults with gender differences and gradients. 107–110. https://www.fupress.com/archivio/pdf/3984_20260.pdf
Patel, R. R., & Ternström, S. (2021). Quantitative and qualitative electroglottographic wave shape differences in children and adults using voice map-based analysis. Journal of Speech, Language, and Hearing Research, 64(8), 2977–2995. https://doi.org/10.1044/2021_jslhr-20-00717
Phadke, K. V., Laukkanen, A.-M., Ilomäki, I., Kankare, E., Geneid, A., & Švec, J. G. (2020). Cepstral and perceptual investigations in female teachers with functionally healthy voice. Journal of Voice, 34(3), 485.e433–485.e443. https://doi.org/10.1016/j.jvoice.2018.09.010
Phadke, K. V., Vydrová, J., Domagalská, R., & Švec, J. G. (2017). Evaluation of clinical value of videokymography for diagnosis and treatment of voice disorders. European Archives of Oto-Rhino-Laryngology, 274(11), 3941–3949. https://doi.org/10.1007/s00405-017-4726-1
Pierce, J. L., Tanner, K., Merrill, R. M., Shnowske, L., & Roy, N. (2021). Acoustic variability in the healthy female voice within and across days: How much and why? Journal of Speech, Language, and Hearing Research, 64(8), 3015–3031. https://doi.org/10.1044/2021_jslhr-21-00018
Powell, M. E., Deliyski, D. D., Zeitels, S. M., Burns, J. A., Hillman, R. E., Gerlach, T. T., & Mehta, D. D. (2020). Efficacy of videostroboscopy and high-speed videoendoscopy to obtain functional outcomes from perioperative ratings in patients with vocal fold mass lesions. Journal of Voice, 34(5), 769–782. https://doi.org/10.1016/j.jvoice.2019.03.012
Reetz, S., Bohlender, J. E., & Brockmann-Bauser, M. (2019). Do standard instrumental acoustic, perceptual, and subjective voice outcomes indicate therapy success in patients with functional dysphonia? Journal of Voice, 33(3), 317–324. https://doi.org/10.1016/j.jvoice.2017.11.014
Sampaio, M., Vaz Masson, M. L., de Paula Soares, M. F., Bohlender, J. E., & Brockmann-Bauser, M. (2020). Effects of fundamental frequency, vocal intensity, sample duration, and vowel context in cepstral and spectral measures of dysphonic voices. Journal of Speech, Language, and Hearing Research, 63(5), 1326–1339. https://doi.org/10.1044/2020_jslhr-19-00049
Sampaio, M. C., Bohlender, J. E., & Brockmann-Bauser, M. (2021). Fundamental frequency and intensity effects on cepstral measures in vowels from connected speech of speakers with voice disorders. Journal of Voice, 35(3), 422–431. https://doi.org/10.1016/j.jvoice.2019.11.014
Sauder, C., Bretl, M., & Eadie, T. (2017). Predicting voice disorder status from smoothed measures of cepstral peak prominence using Praat and Analysis of Dysphonia in Speech and Voice (ADSV). Journal of Voice, 31(5), 557–566. https://doi.org/10.1016/j.jvoice.2017.01.006
Schaeffer, N., Knudsen, M., & Small, A. (2015). Multidimensional voice data on participants with perceptually normal voices from ages 60 to 80: A preliminary acoustic reference for the elderly population. Journal of Voice, 29(5), 631–637. https://doi.org/10.1016/j.jvoice.2014.10.003
Schützenberger, A., Kunduk, M., Döllinger, M., Alexiou, C., Dubrovskiy, D., Semmler, M., Seger, A., & Bohr, C. (2016). Laryngeal high-speed videoendoscopy: Sensitivity of objective parameters towards recording frame rate. BioMed Research International, 2016, 4575437. https://doi.org/10.1155/2016/4575437
Selamtzis, A., & Ternström, S. (2017). Investigation of the relationship between electroglottogram waveform, fundamental frequency, and sound pressure level using clustering. Journal of Voice, 31(4), 393–400. https://doi.org/10.1016/j.jvoice.2016.11.003
Sprecher, A., Olszewski, A., Jiang, J. J., & Zhang, Y. (2010). Updating signal typing in voice: Addition of type 4 signals. The Journal of the Acoustical Society of America, 127(6), 3710–3716. https://doi.org/10.1121/1.3397477
Stathopoulos, E., Huber, J., & Sussmann, J. (2011). Changes in acoustic characteristics of the voice across the life-span: Measures from 4–93 year olds. Journal of Speech, Language and Hearing Research, 54(4), 1011–1021.
Ternström, S., Pabon, P., & Södersten, M. (2016). The voice range profile: Its function, applications, pitfalls and potential. Acta Acustica united with Acustica, 102(2), 268–283. https://doi.org/10.3813/AAA.918943
Titze, I. R. (1995). Workshop on acoustic analysis: Summary statement. National Center for Voice and Speech, Iowa City, USA.
Walzak, P., McCabe, P., Madill, C., & Sheard, C. (2008). Acoustic changes in student actors' voices after 12 months of training. Journal of Voice, 22(3), 300–313. https://doi.org/10.1016/j.jvoice.2006.10.006
Watts, C. R., Ronshaugen, R., & Saenz, D. (2015). The effect of age and vocal task on cepstral/spectral measures of vocal function in adult males. Clinical Linguistics & Phonetics, 29(6), 415–423. https://doi.org/10.3109/02699206.2015.1005673
Welham, N. V., Dailey, S. H., Ford, C. N., & Bless, D. M. (2007). Voice handicap evaluation of patients with pathologic sulcus vocalis. Annals of Otology, Rhinology & Laryngology, 116(6), 411–417. https://doi.org/10.1177/000348940711600604
Woo, P. (2020). Objective measures of stroboscopy and high-speed video. Advances in Oto-Rhino-Laryngology, 85, 25–44. https://doi.org/10.1159/000456681
Zhang, C., Jepson, K. M., Lohfink, G., & Arvaniti, A. (2021). Comparing acoustic analyses of speech data collected remotely. The Journal of the Acoustical Society of America, 149(6), 3910–3916. https://doi.org/10.1121/10.0005132
37 Measures of Speech Perception
JAN WOUTERS, ROBIN GRANSIER, AND ASTRID VAN WIERINGEN
37.1 Introduction
Speech perception involves detecting, discriminating, and recognizing a continuous stream of speech sounds. Research in speech perception seeks to understand how human listeners recognize speech sounds and how listening conditions and specific disorders, such as hearing impairment, affect speech perception. The current chapter describes different paradigms to "measure" speech perception. Speech perception measures are used in communication technology, hearing instrument development and fitting, room acoustics, speech coding, hearing screening, diagnostics, auditory rehabilitation, and basic research on speech perception in noise. The process of speech perception starts with sensory processing of the acoustical speech signal to neural activation in the auditory pathway and the brain. The continuous stream of sounds, with partly overlapping or clearly segmented components, obtains meaning in language. Language consists of consonants and vowels, which together form words and sentences. As the waveform and spectrogram in Figure 37.1 show, important acoustical features of speech provide the basic framework for the perception and intelligibility of the speech signal. The most prominent parts of the speech signal are the vowels, which spread their energy across several octaves. Consonants generally have less energy than vowels and can be distinguished based on distinctive spectral and temporal properties. Some categories of phonemes can be clearly differentiated, such as high-frequency energy for unvoiced fricatives (/s/) versus low-frequency energy (< 500 Hz) for nasals (/m/, /n/), and periodicity or voicing for relatively low frequencies. Speech perception depends on several elements: the signal, the listener, the environment, or a combination of all three. Firstly, the speech signal can be defined by its acoustical features, that is, the spectral envelope (the relative frequency content) and the temporal envelope (see panel Speech in Figure 37.1). Secondly, speech perception depends on the listening environment external to the listener (acoustical properties of background noise, stationary or fluctuating, reverberation, interfering speech, etc.; see panel Maskers). Thirdly, the listener uses a multitude of neural and cognitive processes to perceive speech: speech perception relies on audiovisual processing and cognitive processes, and often involves multitasking and attention switching when listening in complex sound environments. Moreover, speech perception varies with age, speaking rate, and many other contextual factors. In this chapter, we will summarize some of the key techniques used to develop and quantify measures of auditory perception of speech signals and their linguistic significance to the
Figure 37.1 Illustration of three elements of the schema of this chapter on speech perception measures: the measures of speech perception of the listener (with the three families of speech perception measures), the speech to be perceived by the listener, and the noise maskers and listening environment (and their impact). The impact of the neural and cognitive processing specific to the listener (in person or model, real or virtual) is listed in Table 37.1.
human listener. Most of these approaches derive measures of speech perception from suprasegmental features such as the spectral or temporal envelope of speech. There are three temporal components (Rosen, 1992): the temporal speech envelope, the periodicity, and the temporal fine structure (TFS), with fluctuations (or modulation frequencies) below 30 Hz, at the fundamental frequency of 100–300 Hz, and from approximately 300 Hz up, respectively. Speech intelligibility relies mainly on the transmission of (features of) the temporal speech envelope and of spectral information. TFS can be defined as a variation of spectrum in time, for example, the first derivative of the spectral components, such as formant transitions. In tonal languages the identity (meaning) of a word changes by changing the tone (the periodicity). In the Speech panel of Figure 37.1, the most important features of the speech temporal envelope can be discerned in the waveform as the fluctuating amplitude of the speech. This amplitude typically fluctuates at modulation rates below 30 Hz, with 3–5 Hz as the most prominent modulations (Ding et al., 2017; Plomp, 1983; Varnet et al., 2017). Whereas the periodicity, or fundamental frequency F0, is distinguished as the (most) intense low-frequency content (the yellow bars), the TFS is apparent in all segments of the spectrogram, reflecting the varying spectral components in time, such as formant transitions (also in yellow). In this contribution we aim to give an overview of methods and procedures to quantify speech perception using behavioral, instrumental, or physiological measures. Although each
of these measures can be used in isolation, they provide insight into different aspects of speech perception and can therefore be considered complementary. The behavioral measures involve subjective responses of human participants. In contrast, instrumental measures are computational methods applied to the speech and other (masking) input signals; auditory models based on the physiology of the auditory pathway are used. The physiological measures are based on the responses to speech stimulation as acquired with EEG/MEG in participants, and many model parameters are obtained from animal research. The instrumental and physiological methodologies yield objective metrics. These measures are based on real speech, speech-like stimuli, or speech-derived metrics that have been shown to correlate with speech perception. For many of these measures we will indicate the spectral, temporal, and dynamical characteristics involved, as well as the impact of context-induced variation (e.g., different languages), of different listening conditions (e.g., a noise background), and of neural and cognitive processing (e.g., attentional effects). A variety of examples of speech perception measures from recent literature will be reviewed, with examples of applications in clinical populations, such as children with hearing impairment, and of speech perception across the lifespan, from infants to the aged population. Table 37.1 shows a list of behavioral, instrumental, and physiological measures. This list is not intended to be exhaustive, but rather an overview of measures prevalent in the research literature that will be addressed in this chapter. In the matrix of Table 37.1 the speech perception measures are divided into two main rows. One row addresses the measures based on the spectral envelope, the temporal envelope, or both, and the listening environment, that is, the acoustical specification external to the listener (e.g., context, audibility, energetic masking). The second row describes or refers to measures that quantify the impact on speech perception of neural and cognitive processing characteristic of the individual listener: hearing loss, informational masking (from speech in speech-spectrum-weighted steady noise, to speech in interfering speaker(s), to binaural unmasking exploiting differences in across-ear correlation between target and masker), cognition, attention and speaker segregation, and test procedures (multimodal, multitasking). Hearing loss and the difficulty of the test procedure can influence cognition (the availability of resources). Many key references are included in this overview to give the reader easy access to the details of the different approaches to measuring speech perception.
37.2 Behavioral Measures
One aim of behavioral speech perception testing is to estimate the level of difficulty persons encounter in everyday listening situations. Speech perception in daily communication entails processing audiovisual speech, different talkers, dialects, or languages, often under adverse conditions. Moreover, speech processing goes beyond auditory processing: it involves cognitive factors such as working memory, inhibitory control, and cognitive flexibility to recall information from lexical memory. Ultimately, behavioral speech perception measures should be a proxy of human perception, take a number of these factors into account, and be ecologically realistic while simultaneously being sensitive enough to index a specific level of performance. However, the more ecological the task, the less control one has over influencing variables. Over the past decades, researchers and clinicians have invested in developing various measures for different purposes. This part of the chapter will present an overview of different behavioral speech perception measures. All measures capture available spectral components, while some, such as sentences, also capture temporal components of speech perception, such as the speech envelope. Precise measures are essential for several purposes (e.g., assessing hearing impairment). These will be discussed in the different sections.
Table 37.1 An overview of different behavioral, instrumental, and physiological speech intelligibility measures prevalent in the research literature.

Characteristics of the listener & listening conditions

Spectral envelope (stationary noise, audibility, energetic masking):
- Behavioral measures: DTT, phonemes, CVC (van den Borre et al., 2021)
- Instrumental measures: SII, Speech Intelligibility Index (ANSI S3.5-1997); CSII, coherence-based SII (Kates & Arehart, 2005)

Spectral & temporal:
- Behavioral measures: sentences, real speech, discourse (Shannon et al., 1995; Drullman et al., 1994)
- Instrumental measures: STI, Speech Transmission Index (Steeneken & Houtgast, 1980); NCM, Normalized Covariance Measure (Goldsworthy & Greenberg, 2004)
- Physiological auditory modeling: CSE, Cochlea-Scaled Entropy (Stilp & Kluender, 2010); sEPSM, speech-based Envelope Power Spectrum Model (Jørgensen & Dau, 2011); NSIM, neurogram similarity (Moncada-Torres et al., 2017); STMI, Spectro-Temporal Modulation Index (Elhilali et al., 2003)

Temporal envelope:
- Physiological measures: FFR, Frequency Following Responses (Coffey et al., 2017); ACC, Acoustic Change Complex (Vonck et al., 2022; Cheek & Cone, 2020); ASSR, Auditory Steady State Responses (Picton et al., 2003); CI Modulation Transmission Index (Gransier et al., 2020); neural tracking (Vanthornhout et al., 2018; Muncke et al., 2022)

Neural & cognitive processing: impact on speech perception (time-varying maskers and gaps in noise; binaural hearing; hearing loss; informational masking; complex listening; attention & speaker segregation; multimodal (AV) and multi-tasking; cognition; effects of age, speaking rate, …)

- Behavioral measures: fluctuating & informational maskers (Francart et al., 2011); complex listening & effort: AVATAR, simulation of real-world listening (Devesse et al., 2020); multimodal (AV) and multi-tasking (Devesse et al., 2020); age (Goossens et al., 2017)
- Instrumental measures: fluctuating maskers: ESII, Extended SII (Rhebergen & Versfeld, 2005); ECSII; binaural: BSIM, Binaural SII (Beutelmann & Brand, 2006; Beutelmann et al., 2010); BSTI, Binaural STI (van Wijngaarden & Drullman, 2008); hearing loss (Beutelmann et al., 2010)
- Physiological measures: speaker segregation (Ding & Simon, 2012); alpha power and listening effort (Dimitrijevic et al., 2017; Wöstmann et al., 2017); comprehension & tracking (Ding et al., 2016)
Overarching aspects are the presentation mode and the response format. These are discussed first.
37.2.1 Presentation Mode
Speech testing yields a percentage correct score at a certain level or signal-to-noise ratio (SNR), or a speech reception threshold (SRT). The SRT corresponds to the speech level or SNR at which a certain performance level (typically 50%) is achieved. The SRT can be determined with an adaptive or a fixed procedure. In an adaptive procedure (e.g., Plomp & Mimpen, 1979), the level or SNR of a trial is determined based on the response to the previous trial. With the fixed method, the level or SNR does not change from trial to trial. The choice of presentation mode depends on the clinical/research question and the type of speech materials. Speech materials with steeper psychometric slopes at the 50% point (like sentence materials) are more suitable for adaptive procedures than those with shallower slopes (e.g., words). The steeper the slope of the speech material (see further), the smaller the difference in performance that can be measured. This matters because performance changes are often only a few dB. The maximal precision of an SRT measurement is typically ~1 dB when recommendations for constructing speech intelligibility tests with appropriate speech and noise materials are followed. Note that this precision is far better than what is commonly obtained in clinical tone audiometry.
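As a minimal sketch of how such an adaptive procedure works, the following Python fragment implements a simple 1-down/1-up staircase that converges on the 50% point. The toy_listener psychometric function and all parameter values (start SNR, step size, number of trials) are illustrative assumptions, not part of any standardized test.

import math
import random

def toy_listener(snr_db, srt_db=-7.0, spread_db=1.0):
    # Logistic psychometric function: 50% correct when snr_db == srt_db;
    # the steep 1 dB spread mimics sentence materials.
    p_correct = 1.0 / (1.0 + math.exp(-(snr_db - srt_db) / spread_db))
    return random.random() < p_correct

def measure_srt(present_trial, start_snr_db=0.0, step_db=2.0, n_trials=24):
    # 1-down/1-up staircase: make the task harder (lower SNR) after a
    # correct response and easier after an error, so the track hovers
    # around the 50% point of the psychometric function.
    snr, track = start_snr_db, []
    for _ in range(n_trials):
        track.append(snr)
        snr += -step_db if present_trial(snr) else step_db
    tail = track[6:]  # discard the initial approach to convergence
    return sum(tail) / len(tail)

print(f"Estimated SRT: {measure_srt(toy_listener):.1f} dB SNR")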
37.2.2 Response Format
Open and closed-set response formats fundamentally differ in the cognitive processing demands required to access the target sound (e.g., Miller et al., 1951). In the open-set response format, the listener compares the target to all possible candidate phonemes and words in lexical memory. Lexical activation and competition are less involved in the closed-set response format, where comparisons are limited to the response alternatives: the listener must hear enough to differentiate the target from the other response alternatives. The difficulty of closed-set response formats can be manipulated by varying the number of alternatives (Miller et al., 1951) and the phonetic similarity to the target word (Buss et al., 2016). In the following paragraphs, we present different behavioral speech perception measures, ranging from those assessing the perception of temporal and spectral characteristics of speech sounds to those capturing real-life communication situations.
37.2.3 Analytical Measures
Phoneme identification, or the nonsense syllable test, is assessed with an n-alternative closed-set response format. The summary score, percentage correct, reflects how well a listener perceives the spectral and temporal properties of vowels and consonants (e.g., Rødvik et al., 2018). Nonsense syllable tests have limited learning effects and allow for detailed error analysis with information transmission analyses (Miller & Nicely, 1955; for an example, see van Wieringen & Wouters, 2022).
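A hedged sketch of such an information transmission analysis, in the spirit of Miller and Nicely (1955): given a stimulus-response confusion matrix for one phonetic feature, it computes the transmitted information and the relative information transfer. The toy voicing matrix is invented for illustration.

import math

def information_transfer(confusions):
    # confusions[i][j]: how often stimulus category i was heard as
    # response category j. Returns (bits transmitted, relative transfer).
    total = sum(sum(row) for row in confusions)
    n_in, n_out = len(confusions), len(confusions[0])
    p_in = [sum(row) / total for row in confusions]
    p_out = [sum(confusions[i][j] for i in range(n_in)) / total for j in range(n_out)]
    t = 0.0
    for i in range(n_in):
        for j in range(n_out):
            if confusions[i][j]:
                p_ij = confusions[i][j] / total
                t += p_ij * math.log2(p_ij / (p_in[i] * p_out[j]))
    h_in = -sum(p * math.log2(p) for p in p_in if p)
    return t, t / h_in

# Invented confusion counts for the voicing feature (voiceless vs. voiced):
voicing = [[180, 20],
           [30, 170]]
bits, relative = information_transfer(voicing)
print(f"{bits:.2f} bits transmitted, {relative:.0%} relative transfer")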
37.2.4 Lexical Tasks, More Redundant Numbers are often used to capture performance, as they are meaningful yet require limited knowledge of language and can be used with a closed-set response format. The digits in noise test (DiN) was initially developed for hearing screening (see van den Borre et al., 2021 for a review) but is increasingly used as an alternative for the sentence in noise (SiN) task (Kaandorp et al., 2016; van Wieringen & Wouters, 2022). The speech reception threshold is determined for digits presented in speech-weighted noise using an adaptive procedure. The
DiN paradigm can be used repeatedly since learning the content is less likely to occur with digits. Open-set word and sentence recognition are standard behavioral measures. Monosyllabic word tests allow for word scoring and phoneme scoring, while bi-syllabic words are scored as words or syllables. Sentence materials can be divided into two broad categories: semantically predictable sentences, such as the HINT, AzBio, and LIST sentences (Soli & Wong, 2008; Spahr et al., 2012; van Wieringen & Wouters, 2008), and less semantically predictable sentences, the matrix sentences (formerly the Hagerman test; Hagerman, 1982). Matrix sentences have a fixed structure (name, verb, numeral, adjective, noun), thereby allowing words of different sentences to be interchanged. The resulting sentences' semantic content is unpredictable, limiting the cues available to memorize a sentence. Both types of sentence materials have been developed for several languages. Due to the sentences' similar structure, the matrix test is suitable for comparison across languages (see Kollmeier et al., 2015 for a review). Moreover, the closed-set response format can be administered by a non-native tester, as the participant responds by choosing from the options of the matrix. Also, meaningful words and sentences cannot be used repeatedly unless an almost infinite number of alternatives can be generated, as with the matrix sentences.
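To make the construction concrete, here is a minimal sketch of how matrix sentences can be generated from a fixed syntactic frame. The word lists are invented placeholders; a real matrix test has ten carefully balanced alternatives per slot.

import random

# Invented five-slot word matrix in the Hagerman style.
MATRIX = [
    ("Peter", "Lucy", "Alan"),        # name
    ("bought", "sees", "wants"),      # verb
    ("two", "nine", "twelve"),        # numeral
    ("big", "old", "green"),          # adjective
    ("chairs", "spoons", "rings"),    # noun
]

def matrix_sentence():
    # The fixed syntactic frame keeps every sentence grammatical, while
    # random word choice keeps the semantic content unpredictable.
    return " ".join(random.choice(slot) for slot in MATRIX)

for _ in range(3):
    print(matrix_sentence())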
37.2.5 Spatial Speech Perception Measures
The abovementioned measures are usually presented monaurally and do not capture aspects of binaural hearing. The LISN-Sentences test (Cameron & Dillon, 2007) was developed to assess spatial and talker conditions involved in speech perception. Using a three-dimensional virtual auditory environment under headphones, performance is measured as two SRT measures and three advantage measures. These represent the benefit in decibels gained when either talker, spatial, or both talker and spatial cues are incorporated in the maskers. A child-friendly version of the LISN can be used to capture listening difficulties in children aged six and older (Dillon & Cameron, 2021). For children and adults who do not master the language well, spatial listening in noise can also be assessed with spatially separated numbers in noise (van Deun et al., 2010). With this paradigm, head shadow, summation, and binaural unmasking effects can be estimated by determining SRTs for different speech and noise configurations with an adaptive paradigm.
37.2.6 Masking Noise
Measuring speech in noise is essential, although the outcome depends on the choice of the masker, which, in turn, depends on the measurement purpose. Often a stationary speech-weighted noise is used as a reference or standard because its psychometric function yields a steep slope at the 50% point. Speech-weighted noise has the same long-term spectrum as the target speech, causing spectral overlap between the noise and the target. However, noise in daily life is not stationary; it varies over time. Normal-hearing listeners can use spectral and temporal gaps in a fluctuating (time-varying) masker to perceive speech. Festen and Plomp (1990) were among the first to report a 6–8 dB improvement in SRT for a competing talker compared to a stationary speech-weighted noise in persons with normal hearing. However, listening in the gaps is notably difficult for persons with hearing impairment. Therefore, maskers with temporal gaps or with temporal fine structure cues should be considered, depending on the clinical or research question. Francart and colleagues (2011) report reference values for a range of maskers with similar speech sentences and demonstrate the need to determine norm values for each combination of speech material and masker, given their differences in masking. To assess speech perception in a variety of environments that
more realistically represent real-world listening situations, Brungart and colleagues (2014) modified speech materials from the QuickSIN to determine the effects of audiovisual cues, binaural cues, room reverberation, and time compression on the intelligibility of speech.
37.2.7 Across the Life Span
For children, perceptual measures must be adapted to their vocabulary, and the response format should be appropriate for the intended purpose. A closed-set response format with predefined response alternatives is often used with children younger than six years of age (Jerger & Jerger, 1982; van Wieringen & Wouters, 2005, 2008). From four years onwards, children are also able to do open-set tasks. Evaluation of the same speech materials (monosyllabic words) shows performance differences for speech in quiet and in speech-weighted noise between four- and five-year-olds and adults (van Wieringen & Wouters, 2022). Children perform worse than adults because of poorer working memory (McCreery et al., 2019), poorer attention skills in noise, and more limited cognitive resources to compare the target with the response alternatives. Recognition of masked speech follows a prolonged time course of development, until about 10–15 years of age (see Erickson & Newman, 2017 for a review), especially when the masker is speech (e.g., Leibold et al., 2016; Leibold & Buss, 2013, 2019).
37.2.8 Across Languages
Nowadays, speech intelligibility measures are used broadly in clinical applications. However, they are usually tied to a certain language. Given the multitude of languages in Europe and the need to compare performance across languages, the European research project Hearcom (Vlaming et al., 2011) initiated the development of procedures and materials to achieve comparable speech intelligibility results across languages. Many tests for hearing screening, diagnostics, and rehabilitation have since been developed following identical protocols (Akeroyd et al., 2015; van den Borre et al., 2021). It has been demonstrated that a precision of ~1 dB on the SRT can typically be achieved across languages as well as across speakers.
37.2.9 Dual-task Measures and Listening Effort
Most speech perception measures entail a single task (with varying stimulus complexity). However, behavioral dual-task paradigms are useful for assessing listening effort. In a dual-task paradigm, individuals are asked to perform the primary task alone and concurrently with a secondary task (see Gagné et al., 2017 for a review). The primary task is usually a speech recognition task, while the secondary task can be a memory task, a visual reaction-time task, etc. The amount of listening effort can be inferred from dual-task performance on the secondary task.
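As a sketch, the dual-task cost is typically expressed as the proportional decline of secondary-task performance under dual-task conditions; the scores below are invented.

def dual_task_cost(secondary_alone, secondary_dual):
    # Proportional decline in secondary-task performance when combined
    # with the speech task; a larger cost is interpreted as greater
    # listening effort.
    return (secondary_alone - secondary_dual) / secondary_alone

# Invented scores: reaction-time accuracy of 95% alone vs. 80% while
# simultaneously repeating sentences in noise.
print(f"Dual-task cost: {dual_task_cost(0.95, 0.80):.0%}")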
37.2.10 Towards More Real-life Ecological Measures Most speech perception measures are an oversimplification of listening in everyday situations (speech materials are controlled, testing conditions are isolated events, conditions are held constant, and they lack auditory spatial complexity and visual speech cues). Over the past years, different paradigms have been developed, often using new technologies, to capture an individual’s communicative competence in real-life situations. An example of a behavioral multitasking paradigm mimicking auditory spatial complexity and visual cues is “Audiovisual True-to-life Assessment of Auditory Rehabilitation” (AVATAR, Devesse et al., 2018, 2020). While speech understanding in noise is the primary task, other measures (visual, spatial) can be assessed simultaneously to research the role of intrinsic and extrinsic cognitive
demands on listening. Other research groups have brought the real world to the lab through the development of standardized measures using visual talking heads (Schreitmüller et al., 2018), through the recording of realistic listening scenarios with a spherical microphone array and their reproduction as virtual sound environments using Ambisonics over a 64-channel, fully spherical loudspeaker array inside an anechoic enclosure (Mansour et al., 2021), and through the use of spontaneously produced speech excised from conversations taking place in realistic background noise (the ECO-SiN sentences; Miles et al., 2022). In summary, different behavioral speech measures capture different aspects of hearing and listening. Performance measures, even the most ecological ones, are not completely predictive of the difficulties faced in real-world environments. Speech understanding takes place in a certain context (a certain speaker, the linguistic context, metalinguistic skills), and new measures are focusing on the use of contextual effects and metalinguistic skills.
37.3 Instrumental Measures
A multitude of instrumental measures of speech intelligibility exist. They are used for the evaluation of speech transmission channels and of signal processing algorithms or components in digital hearing aids and cochlear implant systems, as well as for predictions of speech intelligibility, for example, as an objective means in occupational hearing applications (Soli et al., 2018). The advantage of instrumental measures is that they are objective, fast, and automatic, and parameters can be quickly tuned. However, models have their limitations too. Researchers, audiologists, engineers, acoustics consultants, and other professionals often rely on objective procedures to predict speech intelligibility. Examples of such procedures are the articulation index (AI; Kryter, 1962), the speech intelligibility index (SII; ANSI, 1997), and the speech transmission index (STI; Steeneken & Houtgast, 1980). The SII and the STI represent the two main families of approaches: one is based on intensities, the other on the transmission of modulations (the modulation transfer function), that is, on the spectral envelope and the temporal envelope of the signal, respectively. The development of instrumental measures follows the acoustics and/or the physiological processing of the signals involved, speech and noise. First, widely used phenomenological models based on the spectral and temporal envelope are described, together with some extended models. Second, measures based on physiological auditory modeling are discussed.
37.3.1 The SII Model
A detailed description of the SII (Speech Intelligibility Index) model is given in ANSI S3.5-1997 (ANSI, 1997). The SII model calculates the average amount of speech information available to a listener. It uses the long-term averaged speech spectrum and the long-term averaged noise spectrum as input. Both the speech and the noise spectrum are defined as the spectrum level (in dB/Hz) at the eardrum of the listener. Within the model, an option exists to partition the speech and noise spectrum into octave bands, one-third-octave bands, or critical bands. Within each band, the spectrum level is separately determined for both speech and noise. Next, correction factors are implemented to account for upward spread of masking (low frequencies mask high frequencies better than the reverse), inaudibility due to the auditory threshold for pure tones, and distortion due to excessively high speech or noise levels. Then, within each frequency band, the difference between the speech and noise level (the signal-to-noise ratio, or SNR) is calculated, and this value is multiplied by the so-called band-importance function, which gives the proportion of the information in that band that is available to the listener.
The band-importance function may depend on the type of speech material (e.g., sentences or words) or on level. Finally, these values are added, yielding the Speech Intelligibility Index (SII), or the amount of speech information available to the listener. For normal-hearing listeners, the SII has proven to be closely related to the average intelligibility in a given condition where speech is masked by a stationary noise masker (Pavlovic, 1987).
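The core arithmetic of the SII can be sketched as follows. This fragment implements only the band-audibility and band-importance steps; the ANSI corrections for spread of masking, threshold, and level distortion are omitted, and the band-importance weights and band levels are illustrative placeholders rather than the ANSI table values.

# Illustrative band-importance weights (the real values come from the
# ANSI S3.5-1997 tables and depend on the speech material).
BAND_IMPORTANCE = [0.05, 0.10, 0.15, 0.20, 0.20, 0.15, 0.10, 0.05]

def sii(speech_db, noise_db, importance=BAND_IMPORTANCE):
    total = 0.0
    for s, n, w in zip(speech_db, noise_db, importance):
        snr = max(-15.0, min(15.0, s - n))   # audible speech dynamic range: 30 dB
        audibility = (snr + 15.0) / 30.0     # proportion of band information audible
        total += w * audibility
    return total                             # 0 = nothing audible, 1 = all information

speech = [50, 55, 58, 60, 58, 54, 50, 45]    # invented band levels in dB
noise  = [48, 50, 50, 48, 52, 56, 60, 60]
print(f"SII = {sii(speech, noise):.2f}")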
37.3.2 The STI Model
Traditionally the STI is determined from the modulation transfer function (MTF), which quantifies how faithfully sinusoidal intensity modulations are transmitted as a function of modulation rate (cf. the perceptual modulation transfer function of Viemeister, 1979). For each frequency band, the probe signal consists of speech-shaped noise that has been bandpass filtered and then 100% intensity modulated at a particular modulation frequency. The probe signal is passed through the system to be evaluated, be it the acoustics of a room or the processing in an assistive hearing instrument. The fractional change in modulation depth between the probe and response intensity envelopes is quantified for that modulation frequency, and the process is repeated for other modulation frequencies to determine the complete MTF. In the original STI the MTF is typically characterized using the most important modulation frequencies, ranging from about 0.63 Hz to 12.7 Hz in one-third-octave intervals (Steeneken & Houtgast, 1980). The MTF is thus used as a model for quantifying the impact of the transmission channel on the temporal speech envelope. Correlating the STI and/or SII model values with the behavioral scores obtained with well-developed reference lists of speech tokens (e.g., existing monosyllabic word lists for speech intelligibility) makes these instrumental methods useful for practical applications.
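A minimal sketch of the MTF-to-STI arithmetic for a single frequency band: each modulation transfer ratio m is converted to an apparent SNR, clipped to the 30 dB range, normalized to a transmission index, and averaged. The m-values below are invented to mimic a reverberant channel, and the standardized STI additionally weights and combines seven octave bands, which is omitted here.

import math

def band_sti(m_values):
    # m_values: modulation transfer ratios (0..1) at the 14 standard
    # modulation frequencies (0.63-12.7 Hz) for one frequency band.
    tis = []
    for m in m_values:
        m = min(max(m, 1e-6), 1.0 - 1e-6)
        snr_apparent = 10.0 * math.log10(m / (1.0 - m))   # effective SNR in dB
        snr_apparent = max(-15.0, min(15.0, snr_apparent))
        tis.append((snr_apparent + 15.0) / 30.0)          # transmission index, 0..1
    return sum(tis) / len(tis)

# Invented MTF of a reverberant channel: fast modulations suffer most.
m = [0.95, 0.93, 0.90, 0.85, 0.78, 0.70, 0.60, 0.50,
     0.42, 0.35, 0.30, 0.25, 0.22, 0.20]
print(f"Single-band STI = {band_sti(m):.2f}")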
37.3.3 Extensions of STI and SII
Variants of the STI have been developed that use speech rather than the artificial probe signals of the classical STI. One example is the NCM index (Normalized Covariance Measure), or CSTI (normalized-covariance-based STI procedure) (Chen & Loizou, 2012; Goldsworthy & Greenberg, 2004), which is based on the covariance between the probe and response speech envelopes computed in each band, unlike the classical STI, which quantifies the change in modulation depth using the modulation transfer function. For spectral-envelope-based measures, the Coherence-based Speech Intelligibility Index (CSII) is a modification of the SII wherein, instead of the SNR, the signal-to-distortion ratio computed from the coherence in frequency bands is used (Kates & Arehart, 2014). One of the most challenging listening conditions for persons with hearing impairment is speech reception in noisy situations. The SII and STI are often used to predict how well a person with normal hearing or with hearing impairment can understand speech in a noisy background. However, because both metrics are based on long-term statistics, these models are only valid for continuous jammer sounds, whereas most everyday sounds are non-stationary and fluctuate. The SII and STI methods calculate how much the speech information rises above the noise (or reverberant sound), based on the average frequency spectrum or modulation spectrum, respectively. Both methods use the spectrum of speech and noise averaged over time periods of 15 to 30 seconds, and as such do not take intensity fluctuations of the jammer sources into account, which precludes a good prediction of speech intelligibility in fluctuating background sounds. In the past decade, several extensions have been developed: from monaural to binaural models, from stationary to fluctuating noise, and to applications with hearing loss.
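Returning to the NCM mentioned above, a hedged sketch of an NCM-style computation: band envelopes of the clean (probe) and degraded (response) speech are extracted with the Hilbert transform, and the squared correlation in each band is converted to an apparent SNR and a transmission index. The band edges and the unweighted averaging are simplifying assumptions; published NCM variants use more bands and band-importance weights.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_envelope(x, fs, lo, hi):
    # Band-pass filter, then extract and low-pass smooth the Hilbert envelope.
    b, a = butter(3, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    env = np.abs(hilbert(filtfilt(b, a, x)))
    b, a = butter(3, 30 / (fs / 2))          # keep modulations below ~30 Hz
    return filtfilt(b, a, env)

def ncm(clean, degraded, fs,
        bands=((100, 400), (400, 1000), (1000, 2500), (2500, 6000))):
    tis = []
    for lo, hi in bands:
        r = np.corrcoef(band_envelope(clean, fs, lo, hi),
                        band_envelope(degraded, fs, lo, hi))[0, 1]
        snr = 10 * np.log10(r**2 / (1 - r**2 + 1e-12))   # apparent SNR from covariance
        tis.append((np.clip(snr, -15, 15) + 15) / 30)
    return float(np.mean(tis))               # unweighted average across bands

fs = 16000
t = np.arange(fs) / fs
clean = np.random.randn(fs) * (1 + np.sin(2 * np.pi * 4 * t))  # 4 Hz modulated noise
print(f"NCM = {ncm(clean, clean + 0.5 * np.random.randn(fs), fs):.2f}")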
37.3.4 Extensions for Time-varying Maskers
The SII model (ANSI, 1997) can accurately describe intelligibility for speech in stationary noise but fails to do so for nonstationary noise maskers. Rhebergen and Versfeld (2005) proposed an extension to the SII model with the aim of predicting speech intelligibility in both stationary and fluctuating noise. The basic principle of this approach is that both the speech and the noise signal are partitioned into small time frames, typically 20–30 ms. Within each time frame the conventional SII is determined, yielding the speech information available to the listener in that time frame. Next, the SII values of these time frames are averaged, resulting in the SII for that particular condition. Using speech reception threshold (SRT) data from the literature, this extended SII model gives a good account of SRTs in stationary noise, fluctuating speech noise, interrupted noise, and multiple-talker noise. The predictions for sinusoidally intensity modulated (SIM) noise and real speech or speech-like maskers are better than with the original SII model but are still not accurate; for the latter type of maskers, informational masking may play a role. This extension of the SII makes it possible to predict speech understanding in a realistic background. Although the SII and STI models are generally successful in predicting intelligibility across a wide range of conditions, there are many conditions for which inaccurate results are obtained, or where the model is not adequate, such as for spatial separation of speech and noise sources. The measure STOI (Short-Term Objective Intelligibility) is, like the CSTI, based on the correlation between the temporal envelopes of clean and degraded speech, computed in short-time segments of 386 ms, and has been shown to provide good speech perception predictions (Taal et al., 2011).
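The frame-based principle of the extended SII can be sketched in a few lines: compute a conventional SII in each short time frame and average. The band levels, importance weights, and the interrupted-noise demonstration are invented for illustration.

import numpy as np

IMPORTANCE = [0.25, 0.25, 0.25, 0.25]        # invented band weights

def frame_sii(speech_db, noise_db, importance=IMPORTANCE):
    # Conventional SII for one ~20-30 ms frame (cf. the sketch in 37.3.1).
    snr = np.clip(np.asarray(speech_db) - np.asarray(noise_db), -15, 15)
    return float(np.sum(np.asarray(importance) * (snr + 15) / 30))

def extended_sii(speech_frames, noise_frames):
    # ESII principle: average the per-frame SII values, so frames that
    # fall in masker gaps contribute high momentary audibility.
    return float(np.mean([frame_sii(s, n)
                          for s, n in zip(speech_frames, noise_frames)]))

speech = np.full((100, 4), 60.0)             # invented steady speech levels (dB)
stationary = np.full((100, 4), 62.0)
interrupted = stationary.copy()
interrupted[::2] = 30.0                      # masker gaps in every other frame
print(extended_sii(speech, stationary))      # lower: speech always below the noise
print(extended_sii(speech, interrupted))     # higher: "listening in the gaps"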
37.3.5 Extensions for Binaural Hearing
An important source of prediction errors is that the standardized versions of the STI and SII are monaural models; they are based on single-channel (or single-ear) estimates. By extending the prediction models to cover aspects of binaural hearing, their scope is extended to binaural applications (Beutelmann & Brand, 2006; Beutelmann et al., 2010; van Wijngaarden & Drullman, 2008). Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII) (Beutelmann & Brand, 2006). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with normal-hearing and hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). The overall correlation coefficient between predicted and observed SRTs was 0.95. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses, the release from masking was overestimated. Although the speech transmission index (STI) is a well-accepted and standardized method for objective prediction of speech intelligibility in a wide range of environments and applications, it is essentially a monaural model. In specific conditions, this leads to considerable mismatches between subjective intelligibility and the STI. A binaural version of the STI was developed based on interaural cross-correlograms, which shows a considerably improved correspondence with subjective intelligibility in dichotic listening conditions (van Wijngaarden & Drullman, 2008). The binaural STI is designed to be a relatively simple model, which adds only a few parameters to the original standardized STI and changes none of the existing model parameters. The model was validated on a set of 39 dichotic listening
conditions, featuring anechoic, classroom, listening room, and strongly echoic environments. For these 39 conditions, the speech intelligibility (CVC word score) and the binaural STI correlate highly and give rise to a standardized relation between STI and CVC word score. A revised and extended binaural speech intelligibility model was presented in Beutelmann et al. (2010) and evaluated with normal-hearing and hearing-impaired subjects. It yields accurate predictions of speech reception thresholds (SRTs) in the presence of nonstationary interfering noise sources at arbitrary azimuths and in different rooms by applying the model to short time frames of the input signals and averaging over the predicted SRT. Binaural SRTs from normal-hearing and hearing-impaired subjects, incorporating acoustical combinations of different rooms, sound source setups, and noise types, were measured and correlated with the model's predictions. 70% of the variance of the SRTs of hearing-impaired subjects could be explained by the model, which is based only on the audiogram. A remaining question is which components of speech are most important for intelligibility. Recently it has been argued that most information is transmitted during dynamic portions of the stimulus, leading to the postulate that the speech segments with the most change are also the most important for intelligibility. Stilp and Kluender (2010) tested this hypothesis by developing a metric, Cochlea-Scaled Entropy (CSE). CSE partitions the magnitude spectrum of non-overlapping 16 ms time frames into frequency subbands or channels. They defined entropy as the running average of successive differences between adjacent spectral slices. Stilp and Kluender (2010) concluded that CSE, not vowels or consonants, best predicts speech intelligibility. CSE is a measure of the extent to which successive spectral contents differ from (or cannot be predicted from) preceding spectral contents. These results challenge the traditional distinctions between consonants and vowels, because this research suggests that speech intelligibility is better predicted by nonlinguistic sensory measures of uncertainty (potential information) than by orthodox physical acoustic measures or linguistic constructs. This has recently been investigated further: it has been shown that CSE may capture dynamical properties of the speech signal crucial for intelligibility (Aubanel et al., 2018). The approach is attractive theoretically, because it is consistent with sensory systems responding primarily to change, and practically, because the CSE can be computed automatically, without the need to manually segment speech into categories such as vowels or consonants. To extend the classical concept of the STI to non-linear processing, the speech-based envelope power spectrum model (sEPSM) has been developed (Jørgensen & Dau, 2011) and further extended (Relano-Iborra & Dau, 2022). This phenomenological model involves spectral as well as temporal envelope components. The first stage of the model contains a bandpass filter bank and envelope extraction as a model of the auditory periphery, followed by a modulation filter bank that simulates higher-order neural processing. The model estimates the speech-to-noise envelope power ratio at the output of the modulation filter bank for speech plus noise and for noise alone, and relates this metric to the intelligibility of speech.
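A minimal sketch of a CSE-style computation, under the assumption that a cochlea-scaled band spectrogram of 16 ms slices is already available: it returns the Euclidean distance between successive spectral slices (Stilp and Kluender additionally average these distances over short running windows, which is omitted here), and the toy spectra are invented.

import numpy as np

def cochlea_scaled_entropy(spec_db):
    # spec_db: (n_slices, n_bands) magnitudes of non-overlapping ~16 ms
    # slices in cochlea-scaled frequency bands. Returns the Euclidean
    # distance between each pair of successive slices: large values mean
    # the signal is changing and is therefore less predictable.
    return np.sqrt((np.diff(spec_db, axis=0) ** 2).sum(axis=1))

# Invented spectra: a steady vowel followed by a formant-like transition.
steady = np.tile([60.0, 55.0, 40.0, 30.0], (5, 1))
transition = steady + np.linspace(0, 8, 5)[:, None] * np.array([1.0, -1.0, 1.0, 0.0])
print(np.round(cochlea_scaled_entropy(np.vstack([steady, transition])), 1))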
The idea of considering the SNR in the modulation domain was originally put forward by Dubbelboer and Houtgast (2008). A drawback of the SII- and STI-based approaches is that any nonlinear aspect of processing by the human auditory system, and degradations due to hearing loss, can only be implemented in an ad hoc manner. The instrumental measures described so far mainly rely on phenomenological models of speech intelligibility. A more biologically relevant model would include physiological knowledge. An auditory periphery model is used to transform the input from sound to neural response. The model components correspond to known physiological entities (hair cells, neurons), and the model parameters can be deduced from physiological measurements (e.g., animal studies). Neural speech intelligibility measures create an ideal reference response to a specific speech stimulus, that is, an unprocessed signal presented to a model of the normal auditory periphery at a conversational speech level in a quiet background. This forms a template of
what the central auditory systems of the brain expect the auditory nerve (AN) activity to be for that particular stimulus. Such neural time-frequency representations are referred to as neurograms. A comparison can then be made with the test case of a degraded AN neurogram that differs from the ideal case because of modification of the acoustic stimulus and/or impairment of the auditory periphery. A widely used auditory-periphery model for speech intelligibility prediction is that of Zilany et al. (2014). This model provides a high level of physiological detail of the transformation of sound/speech to a neurogram representation in the auditory neural system (e.g., Moncada-Torres et al., 2017). By comparison, alternative auditory-periphery models employ different degrees of physiological detail and accuracy to simplify the processing and increase computational efficiency (e.g., Elhilali et al., 2003; Jørgensen & Dau, 2011; Kates & Arehart, 2014, 2022; Zaar & Carney, 2022). An interesting evolution is quantifying speech intelligibility using automatic speech recognition. Schädler et al. (2018) have shown that speech recognition of matrix sentences in fluctuating noise conditions can be predicted using the FADE automatic speech recognition system with a universal set of parameters. The predictions with FADE are reference-free, which means that no empirical data are required to perform predictions. The predictions match well with empirical data (Kollmeier et al., 2016). The approach also works with different languages and was successfully used to predict the speech recognition performance of listeners with impaired hearing and the benefit in speech recognition performance due to binaural noise reduction algorithms. In conclusion, it is difficult to select one metric for all speech intelligibility applications. It is also difficult to establish a general preference for the physiological predictors over the traditional acoustics-based metrics; this mainly depends on the application. For reasons of complexity, this field of research has developed by incorporating different dimensions step by step. However, although a range of different metrics has been developed, as reviewed in this chapter, only limited in-depth comparisons of predictors have been performed. An important future area of research is to conduct more rigorous comparisons of the different predictors for multiple sets of speech intelligibility data (e.g., as in Chen & Loizou, 2012; van Kuyk et al., 2018). Furthermore, studies using these predictors have given more insight into the neural coding of speech features. Instrumental measures are important tools for the improvement of communication devices (or components thereof), especially hearing aids and cochlear implants for persons with hearing impairment.
37.4 Physiological Measures
How the auditory pathway processes speech, and how speech features are encoded in the neural signal along the auditory pathway, has gained increasing interest over the past decade. Not only does this information provide fundamental insights into the different neural mechanisms involved in speech perception, it also enables the diagnosis of poor neural encoding that can underlie speech perception deficits. Different imaging techniques can be used to quantify the neural processes involved in speech perception. Here we focus on non-invasive techniques with a high temporal resolution, such as electro- and magnetoencephalography (EEG and MEG), which enable a detailed characterization of the neural activity evoked by speech. Characterizing the neural response to speech or its features is often based on specific measures that originate from generators located in different regions of the auditory pathway. Physiological responses that are used to investigate how the different acoustical features of speech are encoded in the auditory pathway can classically be divided into two categories, namely transient responses and sustained responses (Näätänen & Picton, 1987;
Picton et al., 2003). In addition, measures of higher-order aspects of speech perception, such as comprehension and listening effort, often include measures of phase-locking to particular speech structures (Ding et al., 2016; Teng et al., 2020) or enhanced activity in specific oscillatory bands (Dimitrijevic et al., 2017; Wöstmann et al., 2017). In the following we will discuss how these measures of neural encoding can be used as a proxy of speech perception at an individual level.
37.4.1 Neural Encoding of Spectral Features
Encoding of the fundamental frequency (F0) of the voice results in a sustained phase-locked response to the periodicity of F0, which can be recorded as the frequency following response (FFR) (Coffey et al., 2019). Although the FFR has a predominant brainstem generator (Bidelman, 2018), it also has generators along the whole ascending auditory pathway, including the thalamus and the auditory cortex (Coffey et al., 2016). Because F0 is essential for speech perception, especially in challenging listening conditions, the FFR has been used as a neural proxy for speech perception. Coffey et al. (2017) found that FFR strength across all regions of the auditory pathway was significantly correlated with the ability of individuals to perceive speech in noise, indicating that robust encoding of F0 throughout the auditory pathway is important for speech perception. Another physiological response that is often used to quantify the neural encoding of spectral features is the acoustic change complex (ACC). The ACC is a cortical transient response to a change in stimulus characteristics (Martin et al., 2020) and consists of a negative and a positive peak that occur ~100 and 200 ms after the change, respectively. The stimulus change can be a spectral, temporal, or level variation. One of the assumptions is that if an ACC is evoked, the speech features that differ between stimulus conditions are also properly encoded in the auditory pathway. This directly implies that the stimulus conditions that evoke the ACC need to be chosen carefully, so that the ACC is only evoked by the stimulus characteristic of interest (Gransier, Carlyon et al., 2020). Cheek and Cone (2020) found that the ACC could be elicited by changes in vowel contrasts (vowel changes from /a/ to /i/, /o/, or /u/). Although all vowel contrasts resulted in robust ACC responses, the vowel contrast /a-i/ resulted in significantly larger responses. One can hypothesize that, due to the lower spectral resolution associated with hearing loss or cochlear-implant stimulation, spectral differences are more poorly encoded in the auditory pathway, resulting in lower-magnitude, longer-latency, or absent ACCs. The results of Vonck et al. (2022), who used the ACC to characterize the ability of the auditory pathway of normal-hearing and hearing-impaired listeners to encode frequency changes at four base frequencies and with frequency changes of 12%, are consistent with this hypothesis. They found that the ACC latency could account for 70% of the variance in speech perception in noise. These studies show that specific proxies of neural encoding can give insight into the effect of the encoding of periodicity and spectral features on speech perception, and in particular on speech perception in noise.
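As a sketch of how phase-locked F0 encoding (the FFR discussed above) can be quantified from EEG, the fragment below estimates the spectral SNR at F0 in the averaged response. The epoch averaging, the noise-bin definition, and the simulated data are illustrative assumptions rather than a published FFR pipeline.

import numpy as np

def ffr_snr(epochs, fs, f0, bw=5.0):
    # epochs: (n_trials, n_samples) EEG segments time-locked to the same
    # speech token. Averaging suppresses activity that is not phase-locked;
    # the FFR appears as spectral energy at F0 in the averaged trace.
    avg = epochs.mean(axis=0)
    spectrum = np.abs(np.fft.rfft(avg)) / len(avg)
    freqs = np.fft.rfftfreq(len(avg), 1 / fs)
    signal = spectrum[np.abs(freqs - f0).argmin()]
    noise = spectrum[(np.abs(freqs - f0) > bw) & (np.abs(freqs - f0) < 10 * bw)]
    return signal / noise.mean()             # linear SNR at F0

# Invented data: a weak 100 Hz phase-locked response buried in noise.
fs, f0, n = 2000, 100.0, 512
t = np.arange(n) / fs
epochs = 0.1 * np.sin(2 * np.pi * f0 * t) + np.random.randn(200, n)
print(f"FFR SNR at F0: {ffr_snr(epochs, fs, f0):.1f}")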
37.4.2 Neural Encoding of Temporal Envelope Features
How well the temporal envelope of speech is encoded is important for speech perception, especially since speech can be perceived purely on the basis of the temporal envelope with only a limited number of spectral channels (Drullman et al., 1994; Shannon et al., 1995). This applies especially to cochlear-implant users, who predominantly rely on the temporal envelope to
process speech (Wouters et al., 2015). Neural encoding of the temporal envelope can be assessed with the envelope following response (EFR), often referred to as the auditory steady-state response (ASSR). ASSRs are evoked with single- or multi-frequency amplitude-modulated sound and can easily be detected in the frequency domain, namely at the response frequency (Picton et al., 2003). Of interest is that with increasing modulation frequency the predominant generators of the ASSR shift from the auditory cortex to the brainstem (Gransier et al., 2021; Herdman et al., 2002). Clinically, this metric has great potential to assess temporal envelope processing in persons with supra-threshold hearing deficits, or in cochlear-implant users. ASSRs have, for example, been used to assess the neural encoding ability of the neural ensembles that are stimulated by each stimulation electrode of cochlear-implant users. The ASSR patterns across the electrode array vary greatly across subjects, indicating that envelope processing can be channel- and subject-specific, as is also evident from behavioral studies (see Pfingst et al., 2015 for an overview). Interestingly, how well the temporal modulations are encoded in the auditory pathway can be quantified with an STI-based measure, which is highly correlated with the ability of CI users to perceive speech in noise (Gransier, Luke et al., 2020). Another method that has gained a lot of interest in recent years is envelope tracking, which assesses how well the temporal envelope of real speech is encoded in the auditory cortex. Vanthornhout et al. (2018) and Muncke et al. (2022) found that the ability of normal-hearing listeners to track the envelope is correlated with the ability to perceive speech in noise.
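A hedged sketch of the backward-model approach to envelope tracking, in the spirit of Vanthornhout et al. (2018): a ridge-regression decoder is trained to reconstruct the speech envelope from time-lagged EEG, and the reconstruction accuracy is the correlation between the actual and reconstructed envelopes. The lag range, ridge value, and simulated data are illustrative assumptions.

import numpy as np

def envelope_tracking(eeg, envelope, n_lags=16, ridge=1e3):
    # Backward model: a ridge-regression decoder maps time-lagged,
    # multichannel EEG (n_samples, n_channels) onto the speech envelope;
    # the correlation between actual and reconstructed envelopes
    # quantifies neural envelope tracking.
    X = np.hstack([np.roll(eeg, lag, axis=0) for lag in range(n_lags)])
    X, y = X[n_lags:], envelope[n_lags:]      # drop samples wrapped by roll
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)
    return float(np.corrcoef(y, X @ w)[0, 1])

# Invented data: two of eight "channels" weakly reflect a smooth envelope.
rng = np.random.default_rng(0)
env = np.convolve(rng.random(2000), np.ones(50) / 50, mode="same")
env = (env - env.mean()) / env.std()
eeg = rng.standard_normal((2000, 8))
eeg[:, :2] += 0.7 * env[:, None]
print(f"Envelope reconstruction r = {envelope_tracking(eeg, env):.2f}")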
37.4.3 Neural Encoding of Higher-order Processes
The abovementioned physiological measures are used as a proxy of how the acoustical features of speech are encoded in the auditory pathway. However, being able to perceive speech, especially in adverse listening conditions, also depends on higher-order factors, for example, the amount of effort needed to understand the speech (Pichora-Fuller et al., 2016) or language proficiency (Bsharat-Maalouf & Karawani, 2022). Furthermore, one has to take into account that when speech is comprehended, the linguistic features in the speech utterance will also result in phase-locked responses (Ding et al., 2016; Teng et al., 2020). Different metrics of the neural mechanisms involved in higher-order processing have been used to predict speech perception. Brain oscillations, particularly those in the alpha band, have been postulated to play an important role in sensory processing. Dimitrijevic et al. (2017), for example, showed that the amount of alpha power present during a task was negatively correlated with speech perception in noise in normal-hearing listeners. In addition, the representation of speech in the brain is also related to attention. O'Sullivan et al. (2015) showed, for example, that attention to a specific speech stream, when listening to competing talkers, enhances the cortical representation of the temporal envelope of the attended stream.
37.5 Concluding Remarks In this chapter we focused on behavioral, instrumental, and physiological measures of speech intelligibility. Although we aimed to give a comprehensive overview of the different speech metrics, we also want to show the complementarity of many of these measures. However, the reader should be aware of the following. In practice, different input signals are used, be it real speech or a model of speech (see Figure 37.1). Whereas the effects of audibility and energetic masking by stationary noise are well understood, we are aware that when listening conditions become more complex, the ability to perceive
speech can be significantly affected by hearing loss, binaural listening, and the neural and cognitive processing of the individual: informational masking, complex listening environments, attention, cognition, and age can all have a significant impact on speech perception. Other measures are also often used in research, either to probe higher-order aspects of speech perception, for example the detection of lexical differences, or to indirectly assess different processes associated with speech perception; an example is the mismatch negativity (MMN), a physiological measure derived from differential responses to, for example, different speech segments. Furthermore, we did not review speech quality measures. Ample metrics exist that can be used to assess speech quality, for example the Perceptual Evaluation of Speech Quality (PESQ) or Mean Opinion Scores (MOS, standardized for the assessment of speech quality in telephone systems) (Kates & Arehart, 2022; Rix et al., 2001). However, we did not consider these metrics here, as they do not assess speech intelligibility. One should also consider that the development of different measures follows different approaches. Even so, some measures across the domains are related. As an example, the physiological ASSR, which can be used to probe the neural transmission of AM signals from the periphery to the brain, is associated with the instrumental STI. The research field of speech processing in the brain is developing rapidly and, considering the complexity of listening, it is expected that many more bridges will be constructed between behavioral, instrumental, and physiological measures. Different approaches can be considered complementary. Furthermore, it should be stressed that all measures of speech perception presented in this review are only proxies of speech perception. Therefore, one specific measure cannot be used for all applications, and different measures have to be considered as complementary. A gold standard for the general listener does not exist. However, reference values of speech intelligibility, as obtained for normal hearing with sentence materials and stationary speech-weighted noise, can be quantified with a precision of about 1 dB using appropriate test procedures. As a final note, one must realize that all measures are different, and all have their limitations. Specific measures are highly relevant for specific applications, whereas they might have no added value in other applications. For example: (i) some perceptual measures mimic real-life situations and are ecologically relevant, and provide a good measuring tool for speech perception; (ii) instrumental measures make it possible to study different parameter settings or conditions in a time-efficient way, which is particularly useful for the evaluation and optimization of speech transmission and processing strategies in hearing instruments, public address audio systems, and room acoustics applications; (iii) physiological measures provide basic insight into the underlying neural processes that contribute to speech perception, and deficits in any one of these processes can have a significant impact on speech perception. Individual listener characteristics and listening conditions are listed in Table 37.1.
Almost all of those listed can be measured behaviorally; many instrumental measures have been developed to account for specific aspects of soundscapes (the target speech and the jammer/masking listening environment) and for prediction purposes, but only a few physiological measures associated with speech perception measures are available, and these have yet to be studied in depth.
REFERENCES
Akeroyd, M. A., Arlinger, S., Bentler, R. A., Boothroyd, A., Dillier, N., Dreschler, W. A., Gagné, J. P., Lutman, M., Wouters, J., Wong, L., & Kollmeier, B. (2015). International Collegium of Rehabilitative Audiology (ICRA) recommendations for the construction of multilingual speech tests. International Journal of Audiology, 54, 17–22. https://doi.org/10.3109/14992027.2015.1030513
American National Standards Institute. (1997). Methods for calculation of the speech intelligibility index (ANSI S3.5-1997).
Aubanel, V., Cooke, M., Davis, C., & Kim, J. (2018). Temporal factors in cochlea-scaled entropy and intensity-based intelligibility predictions. The Journal of the Acoustical Society of America, 143(6), EL443–EL448. https://doi.org/10.1121/1.5041468
Relano-Iborra, H., & Dau, T. (2022). Speech intelligibility prediction based on modulation frequency-selective processing. Hearing Research, 426, 108610.
Beutelmann, R., & Brand, T. (2006). Prediction of speech intelligibility in spatial noise and reverberation for normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 120(1), 331–342. https://doi.org/10.1121/1.2202888
Beutelmann, R., Brand, T., & Kollmeier, B. (2010). Revision, extension, and evaluation of a binaural speech intelligibility model. The Journal of the Acoustical Society of America, 127(4), 2479–2497. https://doi.org/10.1121/1.3295575
Bidelman, G. M. (2018). Subcortical sources dominate the neuroelectric auditory frequency-following response to speech. NeuroImage, 175, 56–69.
Brungart, D. S., Sheffield, B. M., & Kubli, L. R. (2014). Development of a test battery for evaluating speech perception in complex listening environments. Journal of the Acoustical Society of America, 136(2), 777–790.
Bsharat-Maalouf, D., & Karawani, H. (2022). Learning and bilingualism in challenging listening conditions: How challenging can it be? Cognition, 222, 105018. https://doi.org/10.1016/j.cognition.2022.105018
Buss, E., Leibold, L. J., & Hall, J. W. (2016). Effect of response context and masker type on word recognition in school-age children and adults. The Journal of the Acoustical Society of America, 140(2), 968–977. https://doi.org/10.1121/1.4960587
Cameron, S., & Dillon, H. (2007). Development of the listening in spatialized noise-sentences test (LISN-S). Ear & Hearing, 28(2), 196–211. Cheek, D., & Cone, B. (2020). Evidence of vowel discrimination provided by the acoustic change complex. Ear and Hearing, 41(4), 855–867. https://doi.org/10.1097/AUD. 0000000000000809 Chen, F., & Loizou, P. C. (2012). Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise. The Journal of the Acoustical Society of America, 131(5), 4104–4113. https://doi.org/10.1121/1.3695401 Coffey, E. B. J., Herholz, S. C., Chepesiuk, A. M. P., Baillet, S., & Zatorre, R. J. (2016). Cortical contributions to the auditory frequencyfollowing response revealed by MEG. Nature Communications, 7, 1–11. https://doi. org/10.1038/ncomms11070 Coffey, E. B. J., Musacchia, G., & Zatorre, R. J. (2017). Cortical correlates of the auditory frequency-following and onset responses : EEG and fMRI evidence. The Journal of Neuroscience, 37(4), 830–838. https://doi. org/10.1523/JNEUROSCI.1265-16.2017 Coffey, E. B. J., Nicol, T., White-Schwoch, T., Chandrasekaran, B., Krizman, J., Skoe, E., Zatorre, R. J., & Kraus, N. (2019). Evolving perspectives on the sources of the frequencyfollowing response. Nature Communications, 10(1), 1–10. https://doi.org/10.1038/ s41467-019-13003-w Devesse, A., Dudek, A., van Wieringen, A., & Wouters, J. (2018). Speech intelligibility of virtual humans. International Journal of Audiology, 57(12), 908–916. https://doi.org/10. 1080/14992027.2018.1511922 Devesse, A., van Wieringen, A., & Wouters, J. (2020). AVATAR assesses speech understanding and multitask costs in ecologically relevant listening situations. Ear and Hearing, 41(3), 521–531. https://doi. org/10.1097/AUD.0000000000000778 Dillon, H., & Cameron, S. (2021). Separating the causes of listening difficulties in children. Ear & Hearing, 42(5), 1097–1108. Ding, N., & Simon, J.Z. (2012). Emergence of neural encoding of auditory objects while listening to competing speakers. Proceedings of the National Academy of Sciences, 109, 11854–11859.
556 Jan Wouters, Robin Gransier, and Astrid van Wieringen Dimitrijevic, A., Smith, M. L., Kadis, D. S., & Moore, D. R. (2017). Cortical alpha oscillations predict speech intelligibility. Frontiers in Human Neuroscience, 11, article 88. 1–10. https://doi.org/10.3389/fnhum.2017.00088 Ding, N., Melloni, L., Zhang, H., Tian, X., & Poeppel, D. (2016). Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience, 19(1), 158–164. https://doi.org/10.1038/nn.4186 Ding, N., Patel, A. D., Chen, L., Butler, H., Luo, C., & Poeppel, D. (2017). Temporal modulations in speech and music. Neuroscience and Biobehavioral Reviews, 81, 181–187. https:// doi.org/10.1016/j.neubiorev.2017.02.011 Drullman, R., Festen, J. M., & Plomp, R. (1994). Effect of reducing slow temporal modulations on speech reception. Journal of the Acoustical Society of America, 95(5), 2670–2680. Dubbelboer, F., & Houtgast, T. (2008). The concept of signal-to-noise ratio in the modulation domain and speech intelligibility. The Journal of the Acoustical Society of America, 124(6), 3937–3946. https://doi.org/10.1121/1.3001713 Elhilali, M., Chi, T., & Shamma, S. A. (2003). A spectro-temporal modulation index (STMI) for assessment of speech intelligibility. Speech Communication, 41(2–3), 331–348. https://doi. org/10.1016/S0167-6393(02)00134-6 Erickson, L. C., & Newman, R. S. (2017). Influences of background noise on infants and children. Current Directions in Psychological Science, 26(5), 451–457. SAGE Publications Inc. https://doi.org/10.1177/0963721417709087. Festen, J. M., & Plomp, R. (1990). Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. Journal of the Acoustical Society of America, 88(4), 1725–1736. https://doi. org/10.1121/1.400247 Francart, T., van Wieringen, A., & Wouters, J. (2011). Comparison of fluctuating maskers for speech recognition tests. International Journal of Audiology, 50(1), 2–13. https://doi.org/10.3109 /14992027.2010.505582 Gagné, J. P., Besser, J., & Lemke, U. (2017). Behavioral assessment of listening effort using a dual-task paradigm: A review. Trends in Hearing, 21, 1–25. https://doi. org/10.1177/2331216516687287 Goldsworthy, R. L., & Greenberg, J. E. (2004). Analysis of speech-based speech transmission index methods with implications for nonlinear operations. The Journal of the Acoustical Society
of America, 116(6), 3679–3689. https://doi. org/10.1121/1.1804628 Goossens, T., Vercammen, C., Wouters, J., & van Wieringen, A. (2017). Masked speech perception across the adult lifespan : Impact of age and hearing impairment. Hearing Research, 344, 109–124. https://doi.org/10.1016/j. heares.2016.11.004 Gransier, R., Carlyon, R. P., & Wouters, J. (2020). Electrophysiological assessment of temporal envelope processing in cochlear implant users. Scientific Reports, 10(1), 15406. https://doi. org/10.1038/s41598-020-72235-9 Gransier, R., Hofmann, M., van Wieringen, A., & Wouters, J. (2021). Stimulus-evoked phaselocked activity along the human auditory pathway strongly varies across individuals. Scientific Reports, 11(1), 143. https://doi. org/10.1038/s41598-020-80229-w Gransier, R., Luke, R., van Wieringen, A., & Wouters, J. (2020). Neural modulation transmission is a marker for speech perception in noise in cochlear implant users. Ear & Hearing, 41(3), 591–602. Hagerman, B. (1982). Sentences for testing speech intelligibility in noise. Scandinavian Audiology, 11(2), 79–87. https://doi. org/10.3109/01050398209076203 Herdman, A. T., Lins, O., van Roon, P., Stapells, D. R., Scherg, M., & Picton, T. W. (2002). Intracerebral sources of human auditory steady-state responses. Brain Topography, 15(2), 69–86. Jerger, S., & Jerger, J. (1982). Pediatric speech intelligibility test: Performance-intensity characteristics. Ear & Hearing, 3(6), 325–334. Jørgensen, S., & Dau, T. (2011). Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulationfrequency selective processing. The Journal of the Acoustical Society of America, 130(3), 1475–1487. https://doi.org/10.1121/1.3621502 Kaandorp, M. W., de Groot, A. M. B., Festen, J. M., Smits, C., & Goverts, S. T. (2016). The influence of lexical-access ability and vocabulary knowledge on measures of speech recognition in noise. International Journal of Audiology, 55(3), 157–167. https://doi.org/10.3 109/14992027.2015.1104735 Kates, J. M., & Arehart, K. H. (2005). Coherence and the speech intelligibility index. The Journal of the Acoustical Society of America, 117(4), 2224–2237. Kates, J. M., & Arehart, K. H. (2014). The hearing-aid speech perception index (HASPI).
Measures of Speech Perception 557 Speech Communication, 65, 75–93. https://doi. org/10.1016/j.specom.2014.06.002 Kates, J. M., & Arehart, K. H. (2022). An overview of the HASPI and HASQI metrics for predicting speech intelligibility and speech quality for normal hearing, hearing loss, and hearing aids. Hearing Research, 40(6), 108608. https://doi.org/10.1016/j.heares.2022.108608 Kollmeier, B., Schädler, M. R., Warzybok, A., Meyer, B. T., & Brand, T. (2016). Sentence recognition prediction for hearing-impaired listeners in stationary and fluctuation noise with FADE. Trends in Hearing, 20. https://doi. org/10.1177/2331216516655795 Kollmeier, B., Warzybok, A., Hochmuth, S., Zokoll, M. A., Uslar, V., Brand, T., & Wagener, K. C. (2015). The multilingual matrix test: Principles, applications, and comparison across languages: A review. International Journal of Audiology, 54(Suppl 2), 3–16. Taylor and Francis Ltd. https://doi.org/10.3109/1499 2027.2015.1020971 Kryter, K. D. (1962). Methods for the calculation and use of the articulation index. The Journal of the Acoustical Society of America, 34(11), 1689–1697. https://doi.org/10.1121/1.1909094 Leibold, L. J., Bonino, A. Y., & Buss, E. (2016). Masked speech perception thresholds in infants, children, and adults. Ear and Hearing, 37(3), 345–353. https://doi.org/10.1097/ AUD.0000000000000270 Leibold, L. J., & Buss, E. (2013). Children’s identification of consonants in a speech-shaped noise or a two-talker masker. Journal of Speech, Language, and Hearing Research, 56(4), 1144–1155. https://doi.org/10.1044/1092-4388(2012/12-0011) Leibold, L. J., & Buss, E. (2019). Masked speech recognition in school-age children. Frontiers in Psychology, 10, 1981. https://doi.org/10.3389/ fpsyg.2019.01981 Mansour, N., Marschall, M., May, T., Westermann, A., & Dau, T. (2021). Speech intelligibility in a realistic virtual sound environment. The Journal of the Acoustical Society of America, 149(4), 2791–2801. https:// doi.org/10.1121/10.0004779 Martin, B. A., Hall, D., York, N., & Boothroyd, A. (2020). Cortical, auditory, evoked potentials in response to changes of spectrum and amplitude. Journal of the Acoustical Society of America, 107(4), 2155–2161. https://doi. org/10.1121/1.428556 McCreery, R. W., Walker, E. A., Spratford, M., Lewis, D., & Brennan, M. (2019). Auditory, cognitive, and linguistic factors predict speech
recognition in adverse listening conditions for children with hearing loss. Frontiers in Neuroscience, 13. https://doi.org/10.3389/ fnins.2019.01093 Miles, K., Beechey, T., Best, V., & Buchholz, J. (2022). Measuring speech intelligibility and hearing-aid benefit using everyday conversational sentences in real-world environments. Frontiers in Neuroscience, 16. https://doi.org/10.3389/fnins.2022.789565 Miller, G. A., Heise, G. A., & Lighten, W. (1951). The intelligibility of speech as a function of the context of the test materials. Journal of Experimental Psychology, 41(5), 329–335. Miller, G. A., & Nicely, P. E. (1955). An analysis of perceptual confusions among some English consonants. The Journal of the Acoustical Society of America, 27(2), 338–352. Moncada-Torres, A., van Wieringen, A., Bruce, I. C., Wouters, J., & Francart, T. (2017). Predicting phoneme and word recognition in noise using a computational model of the auditory periphery. The Journal of the Acoustical Society of America, 141(1), 300–312. https://doi.org/ 10.1121/1.4973569 Muncke, J., Kuruvila, I., & Hoppe, U. (2022). Prediction of speech intelligibility by means of EEG responses to sentences in noise. Frontiers in Neuroscience, 16. https://doi.org/10.3389/ fnins.2022.876421 Näätänen, R., & Picton, T. (1987). The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure. Psychophysiology, 24(4), 375–425. https://doi.org/10.1111/j.1469-8986.1987. tb00311.x O’Sullivan, J. A., Power, A. J., Mesgarani, N., Rajaram, S., Foxe, J. J., Shinn-Cunningham, B. G., Slaney, M., Shamma, S. A., & Lalor, E. C. (2015). Attentional selection in a cocktail party environment can be decoded from single-trial EEG. Cerebral Cortex, 25(7), 1697–1706. https:// doi.org/10.1093/cercor/bht355 Pavlovic, C. V. (1987). Derivation of primary parameters and procedures for use in speech intelligibility predictions. Journal of the Acoustical Society of America, 82(2), 413–422. https://doi.org/10.1121/1.395442 Pfingst, B. E., Zhou, N., Colesa, D. J., Watts, M. M., Strahl, S. B., Garadat, S. N., Schvartzleyzac, K. C., Budenz, C. L., Raphael, Y., & Zwolan, T. A. (2015). Importance of cochlear health for implant function. Hearing Research, 322, 77–88. https://doi.org/10.1016/j.heares. 2014.09.009
558 Jan Wouters, Robin Gransier, and Astrid van Wieringen Pichora-Fuller, M. K., Kramer, S. E., Eckert, M. A., Edwards, B., Hornsby, B. W. Y., Humes, L. E., Lemke, U., Lunner, T., Matthen, M., Mackersie, C. L., Naylor, G., Phillips, N. A., Richter, M., Rudner, M., Sommers, M. S., Tremblay, K. L., & Wingfield, A. (2016). Hearing impairment and cognitive energy: The framework for understanding effortful listening (FUEL). Ear and Hearing, 37, 5S–27S. https://doi. org/10.1097/AUD.0000000000000312 Picton, T. W., John, M. S., Dimitrijevic, A., & Purcell, D. (2003). Human auditory steadystate responses. International Journal of Audiology, 42(4), 177–219. https://doi. org/10.3109/14992020309101316 Plomp, R. (1983). The role of modulation in hearing. In R. Klinke & R. Hartmann (Eds.), HEARING — Physiological bases and psychophysics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-69257-4_39 Plomp, R., & Mimpen, A. M. (1979). Improving the reliability of testing the speech reception threshold for sentences. International Journal of Audiology, 18(1), 43–52. https://doi. org/10.3109/00206097909072618 Rhebergen, K. S., & Versfeld, N. J. (2005). A Speech Intelligibility Index-based approach to predict the speech reception threshold for sentences in fluctuating noise for normalhearing listeners. The Journal of the Acoustical Society of America, 117(4), 2181–2192. https:// doi.org/10.1121/1.1861713 Rix, A. W., Beerends, J. G., Hollier, M. P., & Hekstra, A. P. (2001). Perceptual evaluation of speech quality (PESQ) – A new method for speech quality assessment of telephone networks and codecs. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2, 749–752. https://doi.org/10.1109/icassp.2001.941023 Rødvik, A. K., von Koss Torkildsen, J., Wie, O. B., Storaker, M. A., & Silvola, J. T. (2018). Consonant and vowel identification in cochlear implant users measured by nonsense words: A systematic review and meta-analysis. Journal of Speech, Language, and Hearing Research, 61(4), 1023–1050. https://doi. org/10.1044/2018_JSLHR-H-16-0463 Rosen, S. (1992). Temporal information in speech: Acoustic, auditory and linguistic aspects. Philosophical Transactions of the Royal Society of London, 336, 367–373. Schädler, M. R., Warzybok, A., & Kollmeier, B. (2018). Objective prediction of hearing aid benefit across listener groups using machine
learning: Speech recognition performance with binaural noise-reduction algorithms. Trends in Hearing, 22. https://doi.org/10.1177/ 2331216518768954 Schreitmüller, S., Frenken, M., Bentz, L., Ortmann, M., Walger, M., & Meister, H. (2018). Validating a method to assess lipreading, audiovisual gain, and integration during speech reception with cochlear-implanted and normal-hearing subjects using a talking head. Ear and Hearing, 39(3), 503–516. https://doi. org/10.1097/AUD.0000000000000502 Shannon, R. V., Zeng, F., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270(5234), 303–304. Soli, S. D., Amano-Kusumoto, A., Clavier, O., Wilbur, J., Casto, K., Freed, D., Laroche, C., Vaillancourt, V., Giguère, C., Dreschler, W. A., & Rhebergen, K. S. (2018). Evidence-based occupational hearing screening II: Validation of a screening methodology using measures of functional hearing ability. International Journal of Audiology, 57(5), 323–334. https://doi.org/1 0.1080/14992027.2017.1411623 Soli, S. D., & Wong, L. L. N. (2008). Assessment of speech intelligibility in noise with the hearing in noise test. International Journal of Audiology, 47(6), 356–361. https://doi. org/10.1080/14992020801895136 Spahr, A. J., Dorman, M. F., Litvak, L. M., van Wie, S., Gifford, R. H., Loizou, P. C., Loiselle, L. M., Oakes, T., & Cook, S. (2012). Development and validation of the azbio sentence lists. Ear and Hearing, 33(1), 112–117. https://doi. org/10.1097/AUD.0b013e31822c2549 Steeneken, H. J. M., & Houtgast, T. (1980). A physical method for measuring speechtransmission quality. Journal of the Acoustical Society of America, 67(1), 318–326. https://doi. org/10.1121/1.384464 Stilp, C. E., & Kluender, K. R. (2010). Cochleascaled entropy, not consonants, vowels, or time, best predicts speech intelligibility. Proceedings of the National Academy of Sciences of the United States of America, 107(27), 12387–12392. https:// doi.org/10.1073/pnas.0913625107 Taal, C. H., Hendriks, R. C., Heusdens, R., & Jensen, J. (2011). An algorithm for intelligibility prediction of time-frequency weighted noisy speech. IEEE Transactions on Audio, Speech and Language Processing, 19(7), 2125–2136. https:// doi.org/10.1109/TASL.2011.2114881 Teng, X., Ma, M., Yang, J., Blohm, S., Cai, Q., & Tian, X. (2020). Constrained structure of
Measures of Speech Perception 559 ancient Chinese poetry facilitates speech content grouping. Current Biology, 30(7), 1299–1305.e7. https://doi.org/10.1016/j.cub. 2020.01.059 van den Borre, E., Denys, S., van Wieringen, A., & Wouters, J. (2021). The digit triplet test: A scoping review. International Journal of Audiology, 60(12), 946–963. https://doi.org/10. 1080/14992027.2021.1902579 van Deun, L., van Wieringen, A., & Wouters, J. (2010). Spatial speech perception benefits in young children with normal hearing and cochlear implants. Ear & Hearing, 31(5), 702–713. van Kuyk, S., Bastiaan Kleijn, W., & Hendriks, R. C. (2018). An evaluation of intrusive instrumental intelligibility metrics. IEEE/ACM Transactions on Audio Speech and Language Processing, 26(11), 2153–2166. https://doi. org/10.1109/TASLP.2018.2856374 van Wieringen, A., & Wouters, J. (2005). Normalization and feasibility of speech understanding tests for Dutch speaking toddlers. Speech Communication, 47(1–2), 169–181. https://doi.org/10.1016/j. specom.2005.03.013 van Wieringen, A., & Wouters, J. (2008). LIST and LINT: Sentences and numbers for quantifying speech understanding in severely impaired listeners for Flanders and the Netherlands. International Journal of Audiology, 47(6), 348–355. https://doi.org/10.1080/ 14992020801895144 van Wieringen, A., & Wouters, J. (2022). Lilliput: Speech perception in speech-weighted noise and in quiet in young children. International Journal of Audiology, 62(8). https://doi.org/10. 1080/14992027.2022.2086491 van Wijngaarden, S. J., & Drullman, R. (2008). Binaural intelligibility prediction based on the speech transmission index. The Journal of the Acoustical Society of America, 123(6), 4514–4523. https://doi.org/10.1121/1.2905245 Vanthornhout, J., Decruy, L., Wouters, J., Simon, J. Z., & Francart, T. (2018). Speech intelligibility predicted from neural entrainment of the speech envelope. Journal of the Association for Research in Otolaryngology, 19(2), 181–191. https://doi.org/10.1007/s10162-018-0654-z
Varnet, L., Ortiz-Barajas, M. C., Erra, R. G., Gervain, J., & Lorenzi, C. (2017). A crosslinguistic study of speech modulation spectra. The Journal of the Acoustical Society of America, 142(4), 1976–1989. https://doi.org/10.1121/ 1.5006179 Viemeister, N. F. (1979). Temporal modulation transfer functions based upon modulation thresholds. Journal of the Acoustical Society of America, 66(5), 1364–1380. https://doi. org/10.1121/1.383531 Vlaming, M. S. M. G., Kollmeier, B., Dreschler, W. A., Martin, R., Wouters, J., Grover, B., Mohammadh, Y., & Mohammadh, T. (2011). Hearcom: Hearing in the communication society. Acta Acustica united with Acustica, 97(2), 175–192. https://doi.org/10.3813/ AAA.918397 Vonck, B. M. D., van Heteren, J. A. A., Lammers, M. J. W., de Jel, D. V. C., Schaake, W. A. A., van Zanten, G. A., Stokroos, R. J., & Versnel, H. (2022). Cortical potentials evoked by tone frequency changes can predict speech perception in noise. Hearing Research, 420, 108508. https://doi.org/10.1016/j. heares.2022.108508 Wöstmann, M., Lim, S., & Obleser, J. (2017). The human neural alpha response to speech is a proxy of attentional control. Cerebral Cortex, 27(6), 3307–3317. https://doi.org/10.1093/ cercor/bhx074 Wouters, J., McDermott, H. J., & Francart, T. (2015). Sound coding in cochlear implants: From electric pulses to hearing. IEEE Signal Processing Magazine, 32(2), 67–80. https://doi. org/10.1109/MSP.2014.2371671 Zaar, J., & Carney, L. H. (2022). Predicting speech intelligibility in hearing-impaired listeners using a physiologically inspired auditory model. Hearing Research, 426(1), 108553. https://doi.org/10.1016/j.heares. 2022.108553 Zilany, M. S. A., Bruce, I. C., & Carney, L. H. (2014). Updated parameters and expanded simulation options for a model of the auditory periphery. The Journal of the Acoustical Society of America, 135(1), 283–286. https://doi. org/10.1121/1.4837815
38 Neurophonetics

WOLFRAM ZIEGLER, INGRID AICHERT, THERESA SCHÖLDERLE, AND ANJA STAIGER

38.1 Preamble

Neurophonetics is a research field that applies knowledge and methodological tools developed in phonetics to study the neurological aspects of speaking and speech perception. It covers investigations of the impact of brain dysfunctions on speech production and perception in persons with neurologic conditions. The work reviewed here deals exclusively with impaired speech production in dysarthria and apraxia of speech in children and adults.

The major share of neurophonetic research addresses applications of phonetic knowledge and methodologies in the service of clinical neurology or neurologic rehabilitation. The speech patterns of individuals with neurologic disorders, such as stroke, Parkinson's disease, or neurologic conditions acquired during childhood, are analyzed using auditory-perceptual, acoustic, or kinematic speech parameters, with the aim of understanding the impact of neural dysfunctions on the afflicted persons' speech characteristics. Applications of this research seek to establish diagnostic parameters that are sensitive and specific to neurologic speech impairments, with the ultimate goal of developing and improving phonetically based assessment tools and treatment approaches. Furthermore, neurophonetic research can also provide deeper insight into the neural mechanisms of speech motor control by investigating its failure due to dysfunction of relevant brain networks. In this respect, it also contributes to the development and refinement of speech production models.
38.2 Motor Speech Disorders in Adults

Motor speech disorders is an umbrella term for speech production impairments resulting from neurological disease. The term includes the broad class of the dysarthrias, apraxia of speech, and special forms (e.g., neurogenic stuttering, akinetic mutism). This chapter discusses dysarthria and apraxia of speech in more detail. Common to these disorders is that the impairment of speech motor functions can severely affect the intelligibility and naturalness of speech, thereby negatively impacting a person's ability to communicate and participate in a social environment (Klopfenstein et al., 2020; Yorkston et al., 1992). Careful examination of the impaired speech motor skills and their communicative consequences, as well as their appropriate treatment, are therefore crucial in the rehabilitation of the affected individuals.
38.2.1 Dysarthria

38.2.1.1 Definition and Prevalence

The term dysarthria refers to a class of neurogenic speech disorders that are caused by damage to the motor execution network for speech movements. Dysarthria is to be distinguished from disorders of neurogenic origin that affect the linguistic processes of speech production (aphasia), the planning stage of speech movements (apraxia of speech), or the execution of nonspeech movements of the speech apparatus (e.g., sticking out the tongue or pursing the lips; see also 38.4). With an estimated prevalence of more than 400 per 100,000, dysarthria represents the most common form of neurologically caused communication disorders (Ziegler & Staiger, 2019).
38.2.1.2 Dysarthria Types: Etiology, Neuroanatomy, and Clinical Characteristics

Dysarthria can be caused by a variety of neurodegenerative or non-degenerative diseases. The dysarthrias comprise various syndromes whose taxonomic classification is derived from neurological disorders of the limb motor system, such as flaccid or spastic paresis, ataxia, akinesia, rigidity, dyskinesia and dystonia, or tremor (Duffy, 2019). It should be noted, however, that the application of this neuromotor taxonomy is based more on analogies than on neurophysiological data specific to the speech motor system. The syndromes described in the following are characterized by relatively typical clusters of symptoms, the severity of which can vary widely from one individual to another. Not in all cases, however, can dysarthria be clearly assigned to one of the respective types, and mixed forms may occur as well.

• Flaccid dysarthrias are caused by lesions of the lower motor neuron system, whose fibers project from cranial (and also several spinal) motor nuclei to the effector organs involved in speaking (e.g., the facial nerve VII innervating the muscles of the lips). Lesions to these structures prevent the afflicted muscles from receiving sufficient innervation. This results in decreased muscular tension with significant muscular weakness and a flaccid appearance of the respective muscles. Depending on which motor subsystem is affected by the neural damage, deficits of respiration, phonation, resonance, and/or articulation can arise. In individuals with diseases afflicting the region of the motor nuclei in the brainstem (e.g., following traumatic brain injury or stroke affecting the bulbar region, or degenerative diseases involving the lower motor neuron system), a generalized flaccid syndrome is observed. This syndrome typically manifests in slowed articulation rate, hypernasality, imprecise articulation, weak and breathy voice, and increased inspiration rate. Since the lesion affects the peripheral nervous system, muscular contraction is impeded under all movement conditions, that is, reflexive, involuntary, or volitional.

• Spastic dysarthria results from lesions of the upper motor neuron system, that is, the inferior part of the primary motor cortex and the fiber tracts descending from there to the brainstem/spinal motor nuclei. Since most of the speech muscles receive input from both brain hemispheres, the consequences of unilateral lesions can usually be compensated for within several days or weeks, and severe persisting impairments are confined to individuals with bilateral lesions (Urban et al., 2001). The spastic dysarthria syndrome differs from its flaccid counterpart by increased muscular tension and a spastic appearance of the respective muscles. However, since functional weakness may result from spastic co-contractions of agonist and antagonist muscles, many features of spastic dysarthria (e.g., slow rate, hypernasality, imprecise consonants) can resemble those of the flaccid type, with the exception of a typically strained-strangled voice quality resulting from hyper-adduction of the vocal folds in spastic dysarthria.
In lesions of the primary motor cortex and its descending fibers, the brainstem motor nuclei may still receive input from other motor cortical areas, such as mesial regions of the frontal lobes. These regions are particularly related to emotional and motivational aspects of human motor control. Hence, emotional vocal or facial expression (e.g., voiced laughing or crying) can be preserved even in individuals with severely impaired speech (Ackermann & Ziegler, 2010).

• Ataxic dysarthria is a syndrome resulting from cerebellar disease or from lesions to the afferent or efferent pathways of the cerebellum. The pathomechanism of ataxia interferes with movement coordination and with the temporal and spatial precision of motor execution. Dyscoordination may be seen, for instance, in the speech breathing pattern or in the interplay between laryngeal and articulatory movements. Impaired timing and an inability to precisely measure the distance, speed, and range of motion of voluntary movements (dysmetria) result in articulatory breakdowns or in intermittent disturbances of the nasal-oral distinction. Cerebellar tremor of 2–3 Hz may be present in the laryngeal or the supralaryngeal system. As in limb ataxia, persons with ataxic speech may tend to compensate for their deficits by increasing their muscular tension, for example, in order to suppress tremor or dysmetric aberrations. Attempts to control ataxic symptoms at the laryngeal level may, for instance, result in a strained-strangled voice quality.

• Rigid-hypokinetic dysarthria: Akinesia and rigidity are two mutually independent pathomechanisms, which typically co-occur in individuals with basal ganglia disease, such as Parkinson's disease. Akinesia denotes a condition characterized by impoverished motor activity, with impaired movement initiation, reduced movement amplitudes (hypokinesia), and slowed movements (bradykinesia). Rigidity describes an increased stiffness of the musculature, that is, an increased resistance to passive movement, which is caused by an increase of agonist and antagonist muscular tone. In Parkinson's disease, speech movements are considered to be impaired by hypokinesia and rigidity, while the presence of bradykinesia is controversial. Voice is typically soft and breathy, intonation is flat, and articulation is undershooting. Unlike those with most other syndromes, persons with rigid-hypokinetic dysarthria may often speak at a normal rate or even sound hasty.

• Hyperkinetic dysarthria and focal dystonias: Dyskinesia and dystonia are collective terms denoting conditions of involuntary muscle contractions leading to uncontrolled movements (hyperkinesias, tics) or abnormal postures (dystonia). Choreatic hyperkinesias, for instance, are present in Huntington's disease, where they can interfere with the control of speech movements (e.g., disruptions of respiratory activity, uncontrolled vocalizations, excessive pitch and loudness variations, over- and undershooting of articulatory movements). As in Parkinson's disease, the condition is associated with basal ganglia dysfunction; however, the underlying pathomechanisms differ considerably. Focal dystonias represent a highly heterogeneous group of diseases (often idiopathic in origin), of which two forms also affect speech. In spasmodic dysphonia, dystonia affects the laryngeal motor control system during speech (Ludlow, 2011).
In the adductor type, the vocal fold closing muscles are subjected to involuntary spasms, leading to a strained-strangled voice and even complete cessation of voicing. In the rarer abductor type, spasms involve the glottis-opening muscles, resulting in a breathy voice and breathy bursts. Mixed forms can also occur (adductor-abductor type). Spasmodic dysphonia is often associated with vocal tremor. In oromandibular-lingual dystonia, spasms involve the supralaryngeal muscles. Spasms may, for example, cause excessive opening or closing of the jaw and involuntary tongue protrusion. In extreme cases, the condition can lead to a complete loss of intelligibility. The neuroanatomical basis of dystonia is only partly understood.
Evidence suggests that the disorder is associated with pathophysiological changes in multiple brain regions, including the basal ganglia. For comprehensive clinical descriptions of the dyskinetic syndromes, readers are referred to Duffy (2019).
38.2.2 Apraxia of Speech

38.2.2.1 Definition, Etiologies, and Neuroanatomy

A common characterization of apraxia of speech (AOS) is that the affected individuals apparently know what they want to say and how it should sound. In AOS, errors are assumed to arise at the phonetic planning stage of spoken language production, where the more abstract phonological codes are transformed into speech motor programs specifying the instructions (i.e., spatio-temporal parameters) for the movements of the speech organs (e.g., Basilakos, 2018). In psycholinguistic terms, the concept of "knowing how it should sound" refers to the assumption that the phonological representations of words are preserved in AOS, which distinguishes the impairment from a language disorder (i.e., aphasic-phonological impairment). Since the error pattern of AOS cannot be explained by elementary motor pathomechanisms such as paresis, ataxia, or akinesia, the disorder is also distinguished from the dysarthrias. Pure forms of AOS are very rare, as most individuals show aphasic impairments in addition to AOS (Duffy, 2019).

In most cases, AOS is caused by a left-hemisphere stroke. Furthermore, brain damage due to traumatic brain injury, brain tumor, or infectious disease can also cause AOS. In recent years, it has also been recognized that AOS is associated with neurodegenerative diseases, in which persons experience a gradual progression of their speech apraxic symptoms (e.g., Duffy et al., 2014). Like AOS following stroke, progressive AOS (PAOS) most often occurs in combination with aphasic impairment. AOS is among the core symptoms of the nonfluent variant of primary progressive aphasia (nfvPPA). In rare cases, AOS can also be the only or clearly predominant sign of a neurodegenerative disease; this is referred to as primary progressive AOS (PPAOS; e.g., Duffy et al., 2021). Estimates of the prevalence of apraxia of speech across etiologies are not available. A recent study, which estimated the prevalence of AOS in a representative sample of 156 people with chronic aphasia after stroke, revealed that 44% of the affected individuals also had speech apraxic symptoms (Ziegler et al., 2022).

There is general agreement that AOS is a syndrome of the left cerebral hemisphere, occurring almost exclusively after lesions to the anterior language zone. However, there is a lasting debate about exactly which regions of the left frontal cortex are involved. According to a long tradition, the posterior part of the left inferior frontal gyrus is engaged in speech motor planning (Broca's area; e.g., Hillis et al., 2004). However, this area may be more responsible for the accompanying aphasic impairment than for the apraxic impairment itself. More recent studies including people with pure AOS indicate that the neighboring left ventral motor and premotor cortices are implicated in the origin of apraxic speech (e.g., Itabashi et al., 2016). Individuals with PPAOS are assumed to suffer from degeneration of a network of regions that includes the superior lateral premotor cortex and the supplementary motor area (Josephs et al., 2021).
38.2.2.2 Clinical Characteristics

The impairment underlying apraxia of speech primarily leads to articulatory and prosodic deficits, while voice and speech breathing are largely unaffected. In particular, the speech production of individuals with AOS is characterized by dysfluent, groping, and effortful speech with phonetic distortions and perceived phonemic errors, and a frequent occurrence of false starts and restarts.
Perceived errors are mostly described as inconsistent and as varying in nature across repeated productions of the same word (Staiger et al., 2012). Effortful and phonetically distorted speech is suggestive of the motor nature of the disorder; the presence of groping movements and of self-initiated corrections indicates that the person struggles to realize some internalized, more or less stable phonological target. Errors perceived as phonemic, that is, as changes in phonemic categories, are not necessarily indicative of an abstract-phonological error mechanism, but may also result from the phonetic encoding deficit underlying AOS (Ziegler et al., 2012). For example, perceived changes of phonemic category may be caused by even small deviations from the proper timing of individual movement components (e.g., in the specification of voice onset time).

The interruptions of the flow of speech due to frequent speech repairs are a prominent prosodic feature of apraxic speech production. Disfluencies can also be caused by prosodically inadequate pauses (e.g., between-syllable pauses). Furthermore, speech sounds or the transitions between them can be affected by prolongations. These prosodic abnormalities lead to local changes in the temporal and rhythmic structure of the affected words and phrases. When interruptions and prolongations accumulate, these local phenomena also have a global effect on the rhythm and tempo of spoken utterances, for example, a slow overall speech rate or a completely disintegrated rhythmic structure.

Speech characteristics associated with progressive AOS are described as largely consistent with the core features of post-stroke AOS (Duffy et al., 2021). People with progressive AOS have been classified into two subtypes of impairment based on the relative predominance of prosodic symptoms ("phonetic" vs. "prosodic" subtype; see Utianski et al., 2018). Recently, this distinction has also been suggested to be useful for the description of nondegenerative AOS (Mailend & Maas, 2020). In most cases, however, the primary impairments in segmental articulation will also affect prosody, as evidenced by dysfluent and/or syllabic speech. Therefore, a predominantly prosodic pattern of AOS is more likely to reflect conditions in which a milder speech apraxia is successfully compensated for by slowing and fragmenting the flow of articulation.
38.2.2.3 AOS in Models of Speech Motor Planning

The symptoms observed in individuals with AOS are ascribed to a disruption of the speech planning system that translates abstract phonological representations into speech. However, what exactly this pathomechanism implies is little more than a guess, as the question of which processes are actually involved in "speech motor planning" has not been resolved (Laganaro, 2019). Moreover, there remains uncertainty about the exact nature of the motor representations involved in these processes and about how these units are integrated into increasingly complex speech patterns.

Within the framework of Levelt's influential model of spoken language production (Levelt et al., 1999), the pathomechanism of AOS can be linked to the phonetic encoding component. At the core of this model is a store of syllable-sized gestural scores, the syllabary. The entries of the syllabary are considered holistic motor programs prescribing the articulations for a syllable. In the production of multisyllabic words and of words in phrases, the retrieved syllables are assembled in a linear fashion. More recently, the neurocomputational model "Directions into Velocities of Articulators" (DIVA) and its extension "Gradient Order DIVA" have been applied to AOS (Miller & Guenther, 2020). Within this framework, AOS is associated with damage to the speech sound map, which contains the motor programs for the production of single syllables as well as information about the sensory consequences of a speech motor program. Consistent with the assumption of phonetic plans for syllables, studies on error production in AOS show a robust syllable frequency effect in this population, leading to the consideration of AOS as an impairment affecting syllables as motor planning units (e.g., Aichert & Ziegler, 2004).
The perspective that AOS corrupts learned speech motor abilities is not necessarily linked to the assumption of acquired holistic motor plans as the units of apraxic failure. A more dynamic account is the nonlinear gestural (NLG) model of speech motor planning, along lines suggested by articulatory phonology (Goldstein & Fowler, 2003). The NLG model conceives of speech motor plans as hierarchically organized structures, ranging from the level of articulatory gestures to the level of metrical feet (e.g., Ziegler et al., 2021). Gestural arrangement is nonlinear in the sense that gestures (and the segments they compose) are not planned in a strictly left-to-right manner; rather, the probability of correct production of each gesture is a function of its position within the gestural score of a syllable and the overarching metrical frame of the phonological word. This assumption is in line with recent studies of German and English speakers with AOS, which have shown that the occurrence of consonant and vowel errors is indeed influenced by the stress patterns of phonological words. In particular, the regular stressed-unstressed (trochaic) pattern apparently protects against apraxic failure (e.g., Aichert et al., 2016); a toy numerical illustration of these two effects is sketched below.
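The following sketch is purely illustrative and is not the NLG model itself: all probabilities and frequency values are invented, and the two functions merely encode the qualitative claims that higher syllable frequency and a trochaic metrical frame both lower the rate of apraxic error.

```python
# Toy simulation (not the authors' model): per-syllable error probability
# decreases with log syllable frequency and is further reduced by a trochaic
# metrical frame. All parameter values are invented for illustration.
import math
import random

random.seed(0)

def p_error(syllable_freq, trochaic):
    """Hypothetical per-syllable apraxic error probability."""
    p = 0.5 - 0.08 * math.log10(syllable_freq)
    if trochaic:
        p -= 0.10  # assumed protection by the stressed-unstressed pattern
    return min(max(p, 0.02), 0.95)

def word_error_rate(syllable_freqs, trochaic, n=10000):
    """Proportion of simulated productions with at least one syllable error."""
    errors = sum(
        any(random.random() < p_error(f, trochaic) for f in syllable_freqs)
        for _ in range(n)
    )
    return errors / n

freqs = [1200, 300]  # hypothetical frequencies of a word's two syllables
print("trochaic word:", word_error_rate(freqs, trochaic=True))
print("iambic word:  ", word_error_rate(freqs, trochaic=False))
```

Under these invented settings, the same two syllables yield a visibly lower word-level error rate in the trochaic condition, which is the qualitative pattern reported for German and English speakers with AOS.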
38.3 Pediatric Motor Speech Disorders

The acquisition of speech motor functions is a prolonged process lasting at least through the first decade of life. Disorders occurring throughout this developmental process are classified as pediatric motor speech disorders (PMSD). As in adults, the two major clinical entities are childhood dysarthria (CD) and childhood apraxia of speech (CAS). Both conditions are often associated with limitations in everyday communication and social participation (e.g., Mei et al., 2014), making early diagnosis and intervention essential.
38.3.1 Childhood Dysarthria

The neural networks controlling the execution of speech movements can be damaged, or restricted in their normal maturation, already in childhood, leading to CD. While there is some overlap in etiologies (e.g., traumatic brain injury, tumors, and cerebrovascular accidents may occur throughout the entire life span), the most relevant neurologic conditions underlying CD are specifically characterized by an onset in early infancy. The most common cause is cerebral palsy, a condition defined by motor dysfunction due to damage to the infant brain. Up to 90% of persons with cerebral palsy show symptoms of dysarthria (Mei et al., 2014). Other etiologies are genetic syndromes affecting the development of speech-relevant neural structures (e.g., Down syndrome, Worster-Drought syndrome) and congenital malformations of the brain (e.g., Wilson et al., 2019).

It is assumed that the symptoms of CD, which may involve all speech subsystems (Allison & Hustad, 2018), result from motor pathomechanisms similar to those in adult dysarthria (e.g., paresis, hyperkinesia, ataxia; cf. Section 38.2.1.2). However, while children present differentiable patterns of dysarthric symptoms, the standard dysarthria syndromes are less clear-cut in children than in adults (Schölderle et al., 2021). This finding has been discussed in the context of developmental processes, which are effective throughout childhood and have a major impact on the clinical manifestation of CD. Studies of typically developing children have demonstrated a large perceptual overlap between developmental speech features and symptoms of dysarthria. For instance, typically developing children show features such as a breathy voice, a slow articulation rate, and pauses as correlates of their anatomically and physiologically immature speech apparatus (Schölderle et al., 2020). Accounting for these developmental influences through age-normalization, a recent investigation showed that respiration, articulation, and prosodic modulation were most severely disordered in a group of children with CD due to cerebral palsy, while voice parameters were unimpaired in the majority of children compared to their age norm.
Four symptoms were identified as specific markers of CD: conspicuous rhythm, hypernasality, strained-strangled voice, and imprecise articulation (Schölderle et al., 2022).
38.3.2 Childhood Apraxia of Speech

CAS is defined as a pediatric speech sound disorder that is not attributable to motor execution deficits such as paresis or ataxia. It affects the planning and/or programming of the spatiotemporal parameters of speech motor patterns. However, the hierarchical architecture of phonological encoding, phonetic planning, and motor execution processes, as assumed for the adult model, must be considered immature and still developing in children. Hence, a conceptual separation between impairments of phonological or phonetic encoding, or of motor execution, is even more intricate in children than it is in adults. The American Speech-Language-Hearing Association (ASHA) identified three speech features as core symptoms of the condition: inconsistent errors on consonants and vowels in repeated productions of syllables or words; lengthened and disrupted coarticulatory transitions between sounds and syllables; and inappropriate prosody, especially in the realization of lexical or phrasal stress (ASHA, 2007). The clinical spectrum of CAS, however, is still under debate, especially in terms of its differentiation from other pediatric speech disorders, such as CD (Iuzzini-Seigel et al., 2022; Murray et al., 2021).

CAS often occurs as an idiopathic condition (and thus represents the exception among the motor speech disorders, which are otherwise due to a distinct neurological condition), but it can also manifest in the context of complex neurobehavioral disorders (e.g., autism, fragile X syndrome) or result from known neurological events. A genetic origin has been hypothesized since the discovery of a mutation of the FOXP2 gene in members of the "KE family," who showed symptoms of CAS over four generations. Recent genetic research suggests a more complex interplay between many different genes and the development of the planning and execution of speech motor sequences as well as of higher linguistic abilities (Kaspi et al., 2022; Mountford et al., 2022).
38.4 Neurophonetic Methods

The most widely used methods in neurophonetic research and in clinical applications are acoustic and auditory-perceptual speech analyses. In contrast, kinematic analyses of vocal tract movements during speaking, such as electromagnetic articulography (EMA), are technically demanding and can be burdensome for individuals with neurologic diseases, which is why they are limited to single-case or small-sample studies of dysarthria and AOS (e.g., Hagedorn et al., 2017).

In clinical assessment, the gold standard is auditory-perceptual analysis of speech characteristics, usually by means of rating scales. The major advantages of this approach are that it directly targets the behaviorally relevant parameter, that is, the audible speech output of the affected person, and that it provides a complete picture of speech characteristics through a unified scaling method (e.g., Enderby & Palmer, 2012; Ziegler et al., 2017). A notorious problem with this method is its often limited reliability, especially when a fine-grained profile of highly specific diagnostic variables is assessed, as in the influential Mayo Clinic dysarthria rating system (Bunton et al., 2007; Duffy, 2019). In contrast, acoustic parameters are considered more reliable, at least under the condition that comparable standards are met with regard to the applied algorithms, parameter settings, and diagnostic materials.
However, acoustic measures cannot cover the whole range of clinically relevant speech characteristics, and most of them require reference norms specified for age, gender, context, and dialect. Moreover, common phonetic parameters such as voice onset time (VOT) or vowel formant frequencies are not applicable across a broader severity range or in all types of motor speech disorders, for example, in the absence of complete stop closures (VOT) or when vowel spectra are contaminated by strong nasal resonance in persons with severe paretic dysarthria (vowel formants). A sketch of such acoustic parameter extraction is given at the end of this section.

In clinical applications it is important to distinguish between diagnostic variables describing the immediate dysfunctions of the speech subsystems (i.e., respiration, voice production, articulation, resonance), on the one hand, and variables describing the impact of these dysfunctions on a person's communicative limitations, such as reduced intelligibility or unnatural-sounding speech, on the other (Klopfenstein et al., 2020). Assessment of the parameters related to subsystem impairments requires the expertise and experience of trained clinicians, for example, to discern different types of voice quality impairment, to recognize hypernasality or articulatory undershoot as sources of blurred speech, or to distinguish between phonetic and phonemic errors. In this way, a speech profile can be created that allows differential diagnostic conclusions and the development of tailored treatments. In contrast, communication-related parameters reflect the limitations individuals with motor speech disorders may experience in their daily environment, making them ecologically valid and particularly suitable for measuring therapeutic effects or the progression of the speech disorder. One problem with these parameters is that, because of adaptation effects, they should not be assessed by speech-language therapists, especially not by those responsible for the diagnosed individuals. Therefore, ways must be found to make lay judgments available for the clinical assessment of communication parameters (Lehner et al., 2022).

It is widespread practice to include nonspeech (or paraspeech) tasks such as maximally fast repetitions of syllables (e.g., /papapa…/ or /pataka…/) or maximally long phonation of a vowel (e.g., /aaaa:/) in clinical assessments of motor speech disorders (e.g., Kent, 2015). They are considered valuable because they provide an opportunity to test each vocal tract organ selectively ("effector-specificity") and to impose motor demands that target basic motor functions such as speed, strength, or coordination, thus helping to understand how these supposedly domain-unspecific functions are disrupted in speech disorders ("function-specificity"; cf. Folkins et al., 1995). In contrast, critics of the diagnostic use of nonspeech tasks in speech assessment emphasize that, although these tasks involve movements of the same effector organs, they differ from speech in a variety of dimensions, such as the absence of an auditory reference space in tasks requiring silent mouth movements, or the absence of feedback mechanisms based on aerodynamic information in tasks without respiratory activity. Arguments based on learning-dependent neural plasticity mechanisms and on task-dynamics theory support the view of speech as a highly integrated and specialized motor skill that is particularly adapted for language communication.
Empirical data directly comparing speech with nonspeech diagnostic parameters are scarce (e.g., Staiger et al., 2017). For a recent overview of the controversy and new data arguing against the use of nonspeech measures as clinical markers of speech impairment, see Ziegler et al. (2023).
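To illustrate the acoustic approach referred to above, here is a minimal sketch of automated parameter extraction, assuming the Praat-based Python library parselmouth and a hypothetical recording file; the file name and the choice of measurement time are illustrative, and any clinical interpretation would additionally require the age- and gender-specific reference norms mentioned earlier.

```python
# Minimal sketch of acoustic parameter extraction with the Praat-based
# `parselmouth` library (pip install praat-parselmouth). The file name is
# hypothetical; no normative interpretation is attempted here.
import parselmouth

snd = parselmouth.Sound("sustained_vowel.wav")  # hypothetical recording

# Vowel formants (Burg method), read at the temporal midpoint. Note that
# strong nasal resonance can bias these values, which is one reason formant
# measures are not applicable in severe paretic dysarthria (see above).
formants = snd.to_formant_burg()
t_mid = snd.duration / 2
f1 = formants.get_value_at_time(1, t_mid)
f2 = formants.get_value_at_time(2, t_mid)

# Mean fundamental frequency as a simple voice parameter.
pitch = snd.to_pitch()
f0_mean = parselmouth.praat.call(pitch, "Get mean", 0, 0, "Hertz")

print(f"F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz, mean f0 = {f0_mean:.0f} Hz")
```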
38.5 Conclusions and Outlook

Neurophonetic research is grounded in phonetic knowledge and methodology and interfaces with a number of research areas such as clinical neurology, neuropsychology, and neuro-/psycholinguistics. It contributes, from a theoretical perspective, to the refinement of existing models of speech production and perception, and, from an applied perspective, to the development and improvement of diagnostic and ultimately therapeutic methods for motor speech disorders in children and adults.
Among the topical theoretical issues are, for example, the question of a domain-general versus domain-specific organization of speech motor control and of how specificity may emerge in speech development, or the interplay of production and perception in the communicative interactions of persons with motor speech impairments. Current clinical issues include the development of reliable, sensitive, and valid outcome measures for clinical trials, with a stronger future focus on communication-relevant speech parameters such as intelligibility or speech naturalness. Technological advancements such as the availability of machine learning methods are increasingly offering new perspectives for neurophonetic research. The challenge for the future is to make such methods usable for valid and effective diagnostics of motor speech disorders.
REFERENCES

Ackermann, H., & Ziegler, W. (2010). Brain mechanisms underlying speech motor control. In W. J. Hardcastle, J. Laver, & F. E. Gibbon (Eds.), The handbook of phonetic sciences (2nd ed., pp. 202–250). Wiley-Blackwell.
Aichert, I., Späth, M., & Ziegler, W. (2016). The role of metrical information in apraxia of speech: Perceptual and acoustic analyses of word stress. Neuropsychologia, 82, 171–178.
Aichert, I., & Ziegler, W. (2004). Syllable frequency and syllable structure in apraxia of speech. Brain and Language, 88(1), 148–159.
Allison, K. M., & Hustad, K. C. (2018). Acoustic predictors of pediatric dysarthria in cerebral palsy. Journal of Speech, Language, and Hearing Research, 61(3), 462–478.
American Speech-Language-Hearing Association. (2007). Childhood apraxia of speech [Technical report].
Basilakos, A. (2018). Contemporary approaches to the management of post-stroke apraxia of speech. Seminars in Speech and Language, 39(1), 25.
Bunton, K., Kent, R. D., Duffy, J. R., Rosenbek, J. C., & Kent, J. F. (2007). Listener agreement for auditory-perceptual ratings of dysarthria. Journal of Speech, Language, and Hearing Research, 50(6), 1481–1495.
Duffy, J. R. (2019). Motor speech disorders: Substrates, differential diagnosis, and management (4th ed.). Elsevier.
Duffy, J. R., Strand, E. A., & Josephs, K. A. (2014). Motor speech disorders associated with primary progressive aphasia. Aphasiology, 28(8–9), 1004–1017.
Duffy, J. R., Utianski, R. L., & Josephs, K. A. (2021). Primary progressive apraxia of speech: From recognition to diagnosis and care. Aphasiology, 35(4), 560–591.
Enderby, P., & Palmer, R. R. (2012). FDA-2: Frenchay dysarthria assessment. Pro-Ed.
Folkins, J. W., Moon, J. B., Luschei, E. S., Robin, D. A., Tye-Murray, N., & Moll, K. L. (1995). What can nonspeech tasks tell us about speech motor disabilities? Journal of Phonetics, 23(1–2), 139–147.
Goldstein, L., & Fowler, C. (2003). Articulatory phonology: A phonology for public language use. In A. Meyer & N. O. Schiller (Eds.), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 159–207). Mouton de Gruyter.
Hagedorn, C., Proctor, M., Goldstein, L., Wilson, S. M., Miller, B., Gorno-Tempini, M. L., & Narayanan, S. S. (2017). Characterizing articulation in apraxic speech using real-time magnetic resonance imaging. Journal of Speech, Language, and Hearing Research, 60(4), 877–891.
Hillis, A. E., Work, M., Barker, P. B., Jacobs, M. A., Breese, E. L., & Maurer, K. (2004). Re-examining the brain regions crucial for orchestrating speech articulation. Brain, 127(7), 1479–1487.
Itabashi, R., Nishio, Y., Kataoka, Y., Yazawa, Y., Furui, E., Matsuda, M., & Mori, E. (2016). Damage to the left precentral gyrus is associated with apraxia of speech in acute stroke. Stroke, 47(1), 31–36.
Iuzzini-Seigel, J., Allison, K. M., & Stoeckel, R. (2022). A tool for differential diagnosis of childhood apraxia of speech and dysarthria in children: A tutorial. Language, Speech, and Hearing Services in Schools, 53(4), 926–946.
Josephs, K. A., Duffy, J. R., Clark, H. M., Utianski, R. L., Strand, E. A., Machulda, M. M., Botha, H., Martin, P. R., Pham, N. T. T., Stierwalt, J., Ali, F., Buciuc, M., Baker, M., Fernandez De Castro, C. H., Spychalla, A. J., Schwarz, C. G., Reid, R. I., Senjem, M. L., Jack, C. R., Jr., … Whitwell, J. L. (2021). A molecular pathology, neurobiology, biochemical, genetic and neuroimaging study of progressive apraxia of speech. Nature Communications, 12(1), 3452.
Kaspi, A., Hildebrand, M. S., Jackson, V. E., Braden, R., Van Reyk, O., Howell, T., Debono, S., Lauretta, M., Morison, L., Coleman, M., Webster, R., Coman, D., Goel, H., Wallis, M., Dabscheck, G., Downie, L., Baker, E. K., Parry-Fielder, B., Ballard, K., … Morgan, A. T. (2022). Genetic aetiologies for childhood speech disorder: Novel pathways co-expressed during brain development. Molecular Psychiatry, 28(4), 1647–1663.
Kent, R. D. (2015). Nonspeech oral movements and oral motor disorders: A narrative review. American Journal of Speech-Language Pathology, 24(4), 763–789.
Klopfenstein, M., Bernard, K., & Heyman, C. (2020). The study of speech naturalness in communication disorders: A systematic review of the literature. Clinical Linguistics & Phonetics, 34(4), 327–338.
Laganaro, M. (2019). Phonetic encoding in utterance production: A review of open issues from 1989 to 2018. Language, Cognition and Neuroscience, 34(9), 1193–1201.
Lehner, K., Ziegler, W., & the KommPaS Study Group. (2022). Indicators of communication limitation in dysarthria and their relation to auditory-perceptual speech symptoms: Construct validity of the KommPaS web app. Journal of Speech, Language, and Hearing Research, 65(1), 22–42.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1–38.
Ludlow, C. L. (2011). Spasmodic dysphonia: A laryngeal control disorder specific to speech. Journal of Neuroscience, 31(3), 793–797.
Mailend, M.-L., & Maas, E. (2020). To lump or to split? Possible subtypes of apraxia of speech. Aphasiology, 35(4), 592–613. https://doi.org/10.1080/02687038.2020.1836319
Mei, C., Reilly, S., Reddihough, D., Mensah, F., & Morgan, A. (2014). Motor speech impairment, activity, and participation in children with cerebral palsy. International Journal of Speech-Language Pathology, 16(4), 427–435. https://doi.org/10.3109/17549507.2014.917439
Miller, H. E., & Guenther, F. H. (2020). Modelling speech motor programming and apraxia of speech in the DIVA/GODIVA neurocomputational framework. Aphasiology, 35(4), 424–441.
Mountford, H. S., Braden, R., Newbury, D. F., & Morgan, A. T. (2022). The genetic and molecular basis of developmental language disorder: A review. Children, 9(5), 586.
Murray, E., Iuzzini-Seigel, J., Maas, E., Terband, H., & Ballard, K. J. (2021). Differential diagnosis of childhood apraxia of speech compared to other speech sound disorders: A systematic review. American Journal of Speech-Language Pathology, 30(1), 279–300.
Schölderle, T., Haas, E., & Ziegler, W. (2020). Age norms for auditory-perceptual neurophonetic parameters: A prerequisite for the assessment of childhood dysarthria. Journal of Speech, Language, and Hearing Research, 63(4), 1071–1082.
Schölderle, T., Haas, E., & Ziegler, W. (2021). Dysarthria syndromes in children with cerebral palsy. Developmental Medicine & Child Neurology, 63(4), 444–449.
Schölderle, T., Haas, E., & Ziegler, W. (2022). Childhood dysarthria: Auditory-perceptual profiles against the background of typical speech motor development. Journal of Speech, Language, and Hearing Research, 65(6), 2114–2127.
Staiger, A., Finger-Berg, W., Aichert, I., & Ziegler, W. (2012). Error variability in apraxia of speech: A matter of controversy. Journal of Speech, Language, and Hearing Research, 55(5), 1544–1561.
Staiger, A., Schölderle, T., Brendel, B., Bötzel, K., & Ziegler, W. (2017). Oral motor abilities are task-dependent: A factor analytic approach to performance rate. Journal of Motor Behavior, 49(5), 482–493.
Urban, P. P., Wicht, S., Vukurevic, G., Fitzek, C., Fitzek, S., Stoeter, P., Massinger, C., & Hopf, H. C. (2001). Dysarthria in acute ischemic stroke: Lesion topography, clinicoradiologic correlation, and etiology. Neurology, 56(8), 1021–1027.
Utianski, R. L., Duffy, J. R., Clark, H. M., Strand, E. A., Botha, H., Schwarz, C. G., Machulda, M. M., Senjem, M. L., Spychalla, A. J., Jack, C. R., Jr., Petersen, R. C., Lowe, V. J., Whitwell, J. L., & Josephs, K. A. (2018). Prosodic and phonetic subtypes of primary progressive apraxia of speech. Brain and Language, 184, 54–65.
Wilson, E. M., Abbeduto, L., Camarata, S. M., & Shriberg, L. (2019). Estimates of the prevalence of speech and motor speech disorders in adolescents with Down syndrome. Clinical Linguistics & Phonetics, 33(8), 772–789.
Yorkston, K. M., Dowden, P. A., & Beukelman, D. R. (1992). Intelligibility measurement as a tool in the clinical management of dysarthric speakers. In R. D. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement and management (pp. 265–285). John Benjamins.
Ziegler, W., Aichert, I., & Staiger, A. (2012). Apraxia of speech: Concepts and controversies. Journal of Speech, Language, and Hearing Research, 55(5), 1485–1501.
Ziegler, W., Aichert, I., Staiger, A., Willmes, K., Baumgaertner, A., Grewe, T., Flöel, A., Huber, W., Rocker, R., Korsukewitz, C., & Breitenstein, C. (2022). The prevalence of apraxia of speech in chronic aphasia after stroke: A Bayesian hierarchical analysis. Cortex, 151, 15–29.
Ziegler, W., Lehner, K., Pfab, J., & Aichert, I. (2021). The nonlinear gestural model of speech apraxia: Clinical implications and applications. Aphasiology, 35(4), 462–484.
Ziegler, W., Schölderle, T., Brendel, B., Risch, V., Felber, S., Ott, K., Goldenberg, G., Vogel, M., Bötzel, K., Zettl, L., Lorenzl, S., Lampe, R., Strecker, K., Synofzik, M., Lindig, T., Ackermann, H., & Staiger, A. (2023). Speech and nonspeech parameters in the clinical assessment of dysarthria: A dimensional analysis. Brain Sciences, 13(1), 113. https://doi.org/10.3390/brainsci13010113
Ziegler, W., & Staiger, A. (2019). Dysarthrie und Sprechapraxie. In H. C. Diener, H. Steinmetz, & O. Kastrup (Eds.), REFERENZ Neurologie (pp. 112–119). Thieme.
Ziegler, W., Staiger, A., Schölderle, T., & Vogel, M. (2017). Gauging the auditory dimensions of dysarthric impairment: Reliability and construct validity of the Bogenhausen Dysarthria Scales (BoDyS). Journal of Speech, Language, and Hearing Research, 60(6), 1516–1534.
39 Coarticulation and Speech Impairment
IVANA DIDIRKOVÁ
39.1 Introduction
The act of speaking is a complex activity that requires fine-grained management of cognitive processes and muscle activation. Respiration, laryngeal activity, and supraglottic articulatory activity need to be timed so that what the hearer perceives is an effortlessly produced sequence of speech sounds. Speech sounds are rarely produced in isolation (exceptions include hesitation markers such as “er” in English). Most of the time they are surrounded by other speech sounds belonging to the same (or an adjacent) syllable, word, or utterance. Speech sounds are therefore said to be co-produced, or coarticulated. This phenomenon is well documented in both speech production and perception, and it results in modifications of the speech sounds themselves: a speech sound can be slightly altered to become more like its neighboring sounds, especially in fast, casual speech. When adjacent sounds become more similar, articulatory movements can be reduced, making speech easier to produce. For instance, in English, in a word sequence such as “as you know,” the /z/ of “as” can be adapted to the palatal /j/ of “you”: the alveolar fricative /z/ becomes the postalveolar /ʒ/, closer to the tongue position required for the palatal approximant /j/, yielding /əʒjənəʊ/ instead of /əzjənəʊ/. In French, the voicing of a sound can be influenced by a neighboring sound, as in “anecdote” (= anecdote), where the usually voiceless /k/ can become voiced under the influence of the immediately following /d/, yielding /anɛgdɔt/ rather than /anɛkdɔt/. Such observations rest on the assumption that there is an expected sound (/z/ in the first example, /k/ in the second), called the invariant. Each word consists of one or several phonemes that the listener must recognize in order to reconstitute the word. In other words, a particular phoneme needs to be identified, which is possible thanks to certain invariant characteristics (e.g., an /i/ will always be produced with the tongue higher in the oral cavity than an /a/, and this contrast allows the listener to classify the two correctly). However, because speech consists of neighboring sounds, the target can be slightly modified in actual speech, as articulatory gestures overlap in the speech chain. As these examples illustrate, such processes can be observed within a word, within a syllable, or between two adjacent syllables or words. They demonstrate the phenomenon of coarticulation, “the articulatory modification of a given speech sound arising from coproduction or overlap with neighboring sounds in the speech chain” (Recasens, 2018).
39.2 Coarticulation in Speech Production
Many theories aim to explain coarticulation in speech. They focus on several parameters, such as the extent of coarticulation (e.g., between two vowels separated by one or several consonants, or between two adjacent sounds); its direction (i.e., whether coarticulatory effects arise because a sound spreads its features to upcoming sounds or because a sound's features surface in preceding sounds); its articulatory characteristics (i.e., the overlap between two different tongue gestures, movement direction, amplitude, stiffness); the phonological features involved (e.g., voicing, place of articulation, nasalization); and differences between languages.
39.2.1 Coarticulation Extent
The extent of coarticulation refers to its temporal domain or, in other words, to the distance over which coarticulatory effects are observed. Two categories are generally acknowledged: long-distance coarticulation, in which the effect of one speech sound on another extends across several sounds, and adjacent coarticulation, which involves two neighboring sounds. Long-distance coarticulation is often observed between vowels separated by one or several consonants. For example, in an /abi/ sequence, the first vowel (/a/) is low, meaning the tongue is relatively low in the mouth, while the second vowel (/i/) is high. Within such a sequence, one of the two vowels can transmit its tongue height to the other: /a/ can influence /i/, making /i/ lower than usual (or vice versa). Many studies have illustrated such coarticulation. In English, unstressed vowels have been shown to undergo a strong coarticulatory influence from stressed vowels (Cho, 2004), and such effects can persist through a medial schwa in trisyllabic sequences (Magen, 1997). Coarticulation is also observed between two adjacent sounds, and consonants are affected as well. For instance, in French, in a consonant-vowel (CV) sequence the consonant's frontness is expected to be influenced by the vowel's place of articulation (PoA). Hence, the acoustic correlate of frontness, the second formant (F2), of the /k/ is expected to be higher (i.e., the tongue is further front in the oral cavity) in /ki/ (“qui,” = who), where /k/ is followed by the front vowel /i/, than in /ku/ (“cou,” = neck), where the back consonant /k/ is followed by the back vowel /u/. Another phonological feature, lip rounding, has been observed to spread over several consonants preceding a rounded vowel (such as /u/) in several languages, including English (Daniloff & Moll, 1968), French (Benguerel & Cowan, 1974), and Italian (Caldognetto et al., 1992).
39.2.2 Direction of Coarticulation
In the two previous examples, coarticulation arose from the last sound imposing one of its features (frontness, rounding) on preceding sounds. Such coarticulation is called anticipatory, since features of an upcoming sound are anticipated in the preceding sound(s). The opposite is also possible: in other configurations, a sound's characteristics can spread to the following sounds. Such coarticulation is referred to as carryover or perseverative coarticulation; in English it is observed, for example, in sequences such as /pliːz/ (“please”), where the voiceless /p/ carries its voicelessness over to the /l/.
39.2.3 Articulatory Features Concerned with Coarticulation and Language Variation
It is important to note that various articulatory or phonological features can be affected by coarticulation; that is, several different features can be extended to upcoming or preceding sounds. For example, tongue height, tongue frontness, voicing, nasality, tone, and lip rounding can all be influenced by another sound's characteristics.
39.2.4 Variation in Coarticulation
The features, directions, and extents of coarticulation mentioned above are not universal: they vary across languages and speakers. Generally, languages with smaller phonemic inventories tend to allow more coarticulation, since the risk of perceptual confusion is believed to be smaller than in languages with rich inventories (Manuel, 1990). Studies also show cross-language differences in the preference for carryover or anticipatory coarticulation. For vowels, languages such as Catalan (Recasens, 1984) or English (Beddor et al., 2002) favor carryover coarticulation, while in Shona anticipatory effects are more prominent than carryover influences (Beddor et al., 2002). Similarly, while in English or Greek stressed vowels exert an influence on unstressed vowels (Cho, 2004; Nicolaidis, 1999), this effect seems weaker in Shona (Beddor et al., 2002). Finally, like any other aspect of speech, coarticulation is subject to interindividual variation (Butcher & Weiher, 1976; Yu, 2016).
39.2.5 Coarticulation Models
Coarticulation teaches us a great deal about how speech is produced. The fact that speech sounds can influence other sounds, sometimes four or more segments away (see, for example, Benguerel & Cowan, 1974, on lip protrusion anticipation in the French sequence /sinistʁstʁyktyʁ/ (“sinistre structure,” = sinister structure), where the rounded /y/ spreads its lip rounding as far as six segments to the left (/stʁstʁy/)), reasonably supports several ideas: (1) speech sounds are not produced in isolation, one by one; (2) speech sounds can be influenced by upcoming segments, which implies that speech programming is not limited to a single segment; and (3) despite coarticulatory modifications, listeners can identify the underlying invariants (and do not perceive such modifications as “new” phonemes). It is, however, essential to note that there is a limit to the influence coarticulation can exert on other speech sounds. This limit is known as coarticulatory resistance, which denotes the capacity of a segment to resist the influence of adjacent segments (Bladon & Al-Bamerni, 1976). The implications of coarticulation for speech production are thus far-reaching, and studies of coarticulation contribute to our knowledge of the units of speech planning and speech motor programming, as well as of the physiological and perceptual constraints on them. The role of coarticulation in different speech production models has been discussed for several decades (see Farnetani & Recasens, 1999; Kent & Minifie, 1977 for detailed discussions).
39.3 Principal Measurements of Coarticulation
Coarticulation concerns both the articulation of speech sounds and the acoustic output resulting from the articulatory movements. Consequently, studies of the phenomenon use acoustic as well as articulatory cues to investigate the consequences of coarticulatory changes.
39.3.1 Acoustic Measurements
Acoustic measurements are convenient because they do not require heavy equipment, and the tools now available for acoustic analysis allow efficient and fast processing of substantial amounts of data. Even though such analyses are not direct observations, they allow underlying articulatory behaviors to be inferred from acoustic data on the basis of acoustic theories of speech production. However, articulatory-acoustic relationships are not systematically linear.
39.3.1.1 Formant Frequencies
Coarticulation effects are often studied through formant frequencies. F1 and F2 values provide information on vowel identity (tongue height and tongue frontness, respectively) and thus indicate coarticulation of the tongue body in vowel-to-vowel (V-to-V) or consonant-to-vowel (C-to-V) sequences. Cole et al. (2010) used F1 and F2 values at the midpoint of the target vowels in their study of variability in V-to-V anticipatory coarticulation, examining the impact of /i/, /æ/, or /ɑ/ on the target vowels /ʌ/ and /ɛ/. The target vowels here are central, allowing the tongue to move on both the horizontal and vertical axes. Their results indicated that coarticulatory effects are seen on both the first and the second formants, though they are stronger on F2 than on F1. The authors also confirmed coarticulatory resistance when the medial consonant is velar. In a C-to-V coarticulation study of differences between native and non-native speakers of English and French, Oh (2008) compared back vowel fronting in the coronal context (i.e., when the vowel is preceded by a coronal consonant) by measuring F2 at the first glottal pulse after the burst and at the vowel midpoint and comparing these with the target F2 of the isolated vowel of interest (/u/). The C-to-V coarticulation index for /u/ is then calculated as the difference between the estimated target F2 of the vowel and the F2 of the vowel when preceded by the consonant of interest. The author also calculated the same index for the coronal consonant, by subtracting the F2 of the consonant followed by a vowel from the target F2 value of the consonant. According to the study, coarticulatory effects of the coronal consonant on the adjacent back vowel /u/ are greater in English than in French, probably because French limits fronting to preserve the phonological contrast between /u/ and /y/, a front rounded vowel that does not exist in English. Such F2 transitions underlie locus equations (Sussman et al., 1991), which relate the F2 onset at the first glottal pulse after the release burst to the F2 target at the vowel midpoint. These two points are plotted in a two-dimensional plane, with F2 onset on the y-axis and F2 target on the x-axis, and a standard linear equation, y = mx + c (where m is the slope and c is a constant), is fitted; a sketch of this computation is given at the end of this subsection. The slope indicates the amount of coarticulation, with zero indicating no effect and 1.0 indicating maximum coarticulation. Importantly, locus equations allow both anticipatory and carryover coarticulation to be studied. Locus equations are widely used in studies of coarticulation in speech impairments. In their paper on stuttering, Verdurand et al. (2020) investigated coarticulation in French and Italian persons who stutter and control subjects in CV sequences where C = /b, d, g/ and V = /a, i, u/. Participants were recorded under two conditions, normal and altered auditory feedback, and F2 was measured at the vowel onset, at the first 10% of the vowel duration, and at 50% of the vowel duration. Their results show less coarticulation in persons who stutter than in fluent controls.
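As a rough illustration of the locus equation computation just described, the sketch below fits F2 onset against F2 target by least squares. The F2 values are hypothetical and simply stand in for measurements of the kind reported in the studies cited above.

```python
import numpy as np

def locus_equation(f2_onset, f2_target):
    """Fit the locus equation F2_onset = m * F2_target + c by least
    squares. A slope m near 1.0 indicates maximal coarticulation (the
    consonant onset tracks the vowel target); a slope near 0 indicates
    coarticulatory resistance."""
    m, c = np.polyfit(np.asarray(f2_target, float),
                      np.asarray(f2_onset, float), deg=1)
    return m, c

# Hypothetical F2 values (Hz) for five CV tokens: onset taken at the
# first glottal pulse after the burst, target at the vowel midpoint.
f2_target = [2300, 1900, 1200, 900, 1700]
f2_onset = [2100, 1850, 1450, 1300, 1750]
m, c = locus_equation(f2_onset, f2_target)
print(f"slope = {m:.2f}, intercept = {c:.0f} Hz")
```

The same pairs of measurements can be taken at a vowel's offset and midpoint rather than its onset and midpoint, so as to index carryover rather than anticipatory coarticulation.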
39.3.1.2 Nasality Measurements
For nasality, the acoustic correlates of nasalized vowels proposed by Chen (1997) are used in several studies of coarticulatory effects. Chen (1997) proposes two measures: A1-P1, the difference between the amplitude of the first formant (A1) and the nasal prominence (P1), the amplitude of the extra peak above the first formant; and A1-P0, the difference between the amplitude of the first formant (A1) and the amplitude of the low-frequency nasal peak (P0). While A1 decreases with nasalization, P0 increases as nasalization increases; hence, the smaller the A1-P0 difference, the greater the nasality. The latter measure is used, among others, in a study of individual differences in nasal coarticulation (Zellou, 2017), which measured acoustic nasality at the vowel midpoint.
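A minimal sketch of the A1-P0 computation is given below, assuming F0 and F1 have already been estimated for the analysis frame (e.g., with a standard pitch and formant tracker). Taking the low-frequency nasal peak to be the stronger of the first two harmonics is a simplification of Chen's procedure.

```python
import numpy as np

def a1_p0(frame, fs, f0, f1):
    """Approximate Chen's (1997) A1-P0 nasality correlate (in dB) for
    one vowel frame: A1 = amplitude of the harmonic nearest F1, P0 =
    amplitude of the low-frequency nasal peak (here, the stronger of
    the first two harmonics). Smaller A1-P0 means greater nasality."""
    windowed = frame * np.hanning(len(frame))
    spec = 20 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)
    freqs = np.fft.rfftfreq(len(windowed), d=1 / fs)

    def harmonic_amp(k):
        # amplitude of the spectral bin closest to the k-th harmonic
        return spec[np.argmin(np.abs(freqs - k * f0))]

    a1 = harmonic_amp(max(1, round(f1 / f0)))
    p0 = max(harmonic_amp(1), harmonic_amp(2))
    return a1 - p0
```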
39.3.1.3 Other Spectral Measurements
Studies of coarticulation in consonants frequently use other acoustic cues based on spectral analysis. Centroid frequency, a weighted spectral average of frequencies believed to reflect the resonances of the oral cavity, can be used to measure anticipatory C-to-V coarticulation in stops (Ryalls et al., 1993) and fricatives (Baum, 1998). Other spectral moments, such as M2 (the standard deviation), have also been used in coarticulation studies, with mixed results (Körkkö, 2015; Shadle & Mair, 1996). Centroid frequency was used by Waldstein and Baum (1991) to investigate anticipatory coarticulation in children with hearing impairment. The authors used the CV syllables /ʃi, ʃu, ti, tu, ki, ku/ and measured centroid frequency in both the fricatives and the stop consonants, at early and late locations within the consonant. Their study suggests limited anticipatory coarticulation in children with hearing impairment compared to their normally hearing peers.
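The spectral moments mentioned above can be computed by treating the power spectrum of a windowed frame as a probability distribution over frequency, as in the following sketch; frame length, windowing, and pre-emphasis choices vary across the studies cited.

```python
import numpy as np

def spectral_moments(frame, fs):
    """First two spectral moments of a frame: M1 (centroid, Hz) and
    M2 (standard deviation, Hz)."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1 / fs)
    p = power / power.sum()                        # spectrum as a distribution
    m1 = np.sum(freqs * p)                         # centroid
    m2 = np.sqrt(np.sum(((freqs - m1) ** 2) * p))  # spread
    return m1, m2
```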
39.3.1.4 Tone Measurements
Coarticulation effects are also studied on suprasegmental features, specifically on tones in contour-tone languages. The measures are comparable to those used for segments. For example, in a study of Mandarin (Shen, 1990), three values were used: the F0 of the tonal onset (the first F0 value coinciding with reasonable formant frequencies after voiceless sounds, and the first F0 value after the initial F0 rise after voiced stops), the F0 of the turning point (the point where the direction of the tone changes from rising to falling or vice versa), and the F0 of the tonal offset (the last F0 value). The results indicate assimilatory tonal coarticulation, with low/high offsets lowering/raising the following onset and low/high onsets lowering/raising the preceding offsets; that is, carryover effects are exerted by offsets and anticipatory effects by onsets. More recent studies of Tianjin and Standard Mandarin have used a different method, comparing overall F0 contours rather than single points (Li & Chen, 2016; Sun & Shih, 2021); they, too, confirmed tonal coarticulatory effects. Tonal coarticulation is an exciting subject because it allows the segmental and suprasegmental levels of speech production to be differentiated, especially in impairments related to brain damage, such as aphasia. For instance, while several studies show little or no difference between aphasic patients and healthy controls in segmental coarticulation (Katz, 1988, 1990), Gandour et al. (1996), in their study of suprasegmental coarticulation in Thai, observed coarticulation impairment in both fluent and nonfluent aphasic patients, suggesting possible differences in how the brain processes the two levels.
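Assuming an F0 contour has already been extracted for the voiced portion of a syllable, the three landmarks used by Shen (1990) can be located roughly as in the sketch below. This simplified version ignores octave errors and microprosodic perturbations, which real tonal studies must handle.

```python
import numpy as np

def tone_landmarks(f0, times):
    """Return the tonal onset, turning point, and offset of a voiced
    F0 contour (Hz) with matching time stamps. The turning point is
    the first sample where the contour changes from rising to falling
    or vice versa; it is None for a monotonic or level contour."""
    f0 = np.asarray(f0, float)
    onset, offset = f0[0], f0[-1]
    slope_signs = np.sign(np.diff(f0))
    turning_point = None
    changes = np.where(np.diff(slope_signs) != 0)[0]
    if changes.size:
        i = changes[0] + 1
        turning_point = (times[i], f0[i])
    return onset, turning_point, offset
```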
39.3.2 Articulatory Measurements
While acoustic measurements are faster and easier to obtain, articulatory data allow more direct observation of glottal and supraglottic articulatory activity, and they have therefore long interested researchers studying coarticulation.
39.3.2.1 Ultrasound Imaging
Ultrasound has the undeniable advantage of being portable: the speaker's tongue movements can be recorded during speech by placing a probe under the chin. Such data need to be processed to obtain exploitable values that allow coarticulation measurements, and the articulatory data are usually segmented on the basis of acoustic landmarks. In a recent study, Mousikou et al. (2021) used ultrasound imaging to investigate coarticulation across morpheme boundaries. The authors used two types of analysis, whole-image and tongue-contour: while the former allows a global difference in coarticulation to be established, the latter complements the results with an articulatory dimension. Tongue contour analysis was also used by Zharkova and Hewlett (2009), who calculated the across-environment distance between the curves of a phoneme studied in one phonemic environment and the curves of the same phoneme in other phonemic environments. In their study of coarticulation in children who stutter, Lenoci and Ricci (2018) applied the locus equation principle to tongue contour data. Their results show that the comparison between children who do and do not stutter depends on the PoA of the consonant in CV sequences: children who stutter showed either the same degree of coarticulation as children who do not stutter or a higher degree than the control group.
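The across-environment comparison of tongue contours can be illustrated with a mean nearest-neighbour distance between two point sets, in the spirit of Zharkova and Hewlett's (2009) measure; their published procedure differs in details such as contour normalization, so this is only a sketch.

```python
import numpy as np

def contour_distance(curve_a, curve_b):
    """Mean nearest-neighbour distance between two tongue contours,
    each an (n, 2) array of (x, y) points, symmetrized by averaging
    the two directed distances."""
    a = np.asarray(curve_a, float)
    b = np.asarray(curve_b, float)
    # pairwise Euclidean distances, shape (len(a), len(b))
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return (d.min(axis=1).mean() + d.min(axis=0).mean()) / 2
```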
39.3.2.2 Electropalatography (EPG)
Electropalatography provides information on tongue contact with the hard palate during speech. Its drawback is that it captures neither the tongue root nor movements in which the tongue does not touch the palate, but it has long been used in coarticulation studies. For example, in their research on V-to-V coarticulation in German, Butcher and Weiher (1976) observed palatal contact in VCV sequences with /t/ or /k/ as the consonant. More recently, EPG investigations of the contrast between alveolar and retroflex apical consonants (Tabain, 2009) and of assimilation in nasal-to-velar clusters (Celata et al., 2013) have used the center of gravity (CoG) index, since it allows observations across the entire palate: the higher the CoG, the further front the contacted electrodes, and vice versa. In clinical settings, EPG is used not only in studies of coarticulation but also in speech therapy, for the assessment of articulation and coarticulation and for providing direct feedback to patients (Lee et al., 2022).
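A CoG index for a single EPG frame can be sketched as below, assuming the common 8 x 8 electrode layout with row 0 at the front of the palate; the exact row weighting varies across studies.

```python
import numpy as np

def epg_cog(frame):
    """Center of gravity of an 8x8 binary EPG frame (row 0 = most
    anterior). Higher values mean more anterior contact; returns
    np.nan for a frame with no contact."""
    frame = np.asarray(frame)
    contacts_per_row = frame.sum(axis=1)
    if contacts_per_row.sum() == 0:
        return np.nan
    row_weights = np.arange(8, 0, -1)  # 8 for the front row, 1 for the back
    return (row_weights * contacts_per_row).sum() / contacts_per_row.sum()
```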
39.3.2.3 Electromagnetic Articulography (EMA)
Electromagnetic articulography allows three-dimensional observation of tongue, lip, and jaw movements using coils fixed to the supraglottic articulators. It is considered a relatively precise technique, with the important inconvenience of being invasive. In coarticulation studies, EMA can provide valuable information on vertical and horizontal movements, their onsets and offsets, and their first and second derivatives (velocity, acceleration) (Recasens & Espinosa, 2010). This technique was recently used in a study of coarticulation in adults who stutter, aiming to verify whether coarticulation is disrupted during disfluencies (Didirková & Hirsch, 2020).
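The first and second derivatives of a coil trajectory can be approximated by numerical differentiation, as in the sketch below; in practice, EMA position signals are low-pass filtered before differentiation so that measurement noise is not amplified.

```python
import numpy as np

def kinematics(position, fs):
    """Velocity and acceleration of one EMA coil coordinate sampled
    at fs Hz, by central-difference differentiation (the input is
    assumed to have been smoothed already)."""
    velocity = np.gradient(position, 1 / fs)
    acceleration = np.gradient(velocity, 1 / fs)
    return velocity, acceleration
```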
39.3.2.4 Magnetic Resonance Imaging (MRI)
Recent adaptations of MRI allow speech production to be studied by imaging articulatory movements in the midsagittal plane. Unlike EMA, MRI is non-invasive, although the recording environment is constraining; it is a valuable technique of articulatory data acquisition and is currently used, among other areas, in coarticulation studies. MRI provides information on the supraglottic speech organs, the larynx, and the velum. A study of coarticulation in one multilingual speaker (Badin et al., 2019) used semi-automatic contour segmentation of the speech articulators before superimposing the contours for a given set of phonemes of interest.
39.3.3 Speech Perception and Coarticulation
Coarticulation studies are not limited to speech production. It has long been established that listeners can anticipate an upcoming sound on the basis of acoustic and visual cues. This ability is not limited to coarticulation per se, since a typical adult listener can also reconstruct a missing or mispronounced sound (Samuel, 1981). Perception studies often use the gating paradigm (Grosjean, 1980), which consists of presenting a stimulus repeatedly while increasing its presentation duration from onset (a sketch of gate construction is given at the end of this section). For a /bog/ sequence, for example, the participant is first presented with the first 30 milliseconds from the beginning of the initial consonant /b/, then with the first 45 milliseconds, and so on. The participant is asked to identify the syllable they are about to hear, usually from a set of syllables presented to them. This paradigm allows the identification of the acoustic or visual cues a listener uses to anticipate an upcoming sound, and of the temporal extent over which a hearer can predict the subsequent sound or syllable. For example, in a study of anticipatory lip-rounding coarticulation in /izi/ vs. /izy/ sequences, Ménard et al. (2015) used the gating paradigm to assess the ability of congenitally blind adults to identify an upcoming rounded vowel, as compared to listeners with no impairment. Participants were randomly presented with gated /izi/ and /izy/ stimuli in 16-ms steps, such that each stimulus began with the onset of the first vowel /i/ and ended at a different time point within the /zi/ or /zy/ sequence, and they were asked to guess the second vowel of the sequence. The study shows that both groups of listeners were able to identify the upcoming rounded /y/ before its acoustic onset. Since speech is a multimodal activity and is not limited to acoustic cues, other studies couple acoustic gating with audiovisual or visual gating, using visual cues in addition to or separately from acoustic stimuli. An example of such a paradigm can be found in Howson et al. (2021), who used audiovisual gating to measure the perception of both anticipatory and carryover coarticulation in children and adults. To sum up, coarticulation is a widely studied phenomenon, with numerous possible approaches based on acoustic and articulatory observations. The phenomenon is of interest to researchers in phonetics and phonology because it contributes to our knowledge of the influence that segments, under speech motor control and mechanical constraints, can exert on their neighbors. Coarticulation also provides important information on invariants and their variable counterparts in speech production and perception, and it illuminates characteristics of speech planning, particularly in the case of anticipatory coarticulation.
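Returning briefly to the gating paradigm described above, the construction of gated stimuli can be sketched as follows. The 16-ms step mirrors Ménard et al. (2015), while the signal itself and the sampling rate are placeholders.

```python
def make_gates(signal, fs, step_ms=16):
    """Cut a stimulus into cumulative gates: the first gate contains
    the first step_ms of the signal, the second the first 2 * step_ms,
    and so on (a final partial gate is dropped for simplicity)."""
    step = int(fs * step_ms / 1000)
    return [signal[:end] for end in range(step, len(signal) + 1, step)]

# e.g., gates = make_gates(stimulus, fs=44100)  # 'stimulus' is a 1-D
# sample array previously loaded from a recording
```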
39.4 Coarticulation in Impaired Speech
Since coarticulation provides valuable information on speech production, speech planning, and speech motor control, it is naturally of great interest in the study of speech impairment, not only because it can teach us more about a given disorder but also because disordered coarticulation informs us about the expected functioning of speech. Indeed, if we consider impaired speech as a variation of unimpaired speech (rather than a completely different category of speech production), coarticulation studies in speech impairments inform us about the different stages of (non-impaired) speech production. The observation in aphasia that coarticulation is preserved in some syllables and disrupted in others (see the following section), for example, clearly indicates that speech production models must account for the corresponding articulatory features.
Speech impairments constitute a group of various disorders with variable underlying conditions, including purely motor disorders and speech planning disorders. This variety of underlying conditions is of great interest for coarticulation studies, since it allows the dynamics of motor mechanisms (see, for example, Ostry et al., 1996) to be separated from processes relating to speech planning per se. On the other hand, studying coarticulation in speech-impaired populations requires essential adaptations of study design, owing to smaller groups and sometimes increased fatigability. Studying coarticulation across such a variety of speech behaviors is thus challenging but also very informative, especially regarding the different stages of speech production. This section offers an overview of coarticulation studies in the major speech impairments.
39.4.1 Aphasia
Aphasia is a complex disorder traditionally split into several groups depending roughly on the origin of the condition: patients with Wernicke's (fluent) aphasia are held to present speech errors arising at the phonemic stages of speech production (roughly corresponding to the selection of speech invariants), while the errors of patients with Broca's (nonfluent) aphasia are attributed to failures of speech motor control (at the lower, variant level of speech production). Anomic aphasia leads to word-finding difficulties, and patients with conduction aphasia have difficulty repeating words. Although this view is no longer considered clear-cut, and authors tend to suggest a continuum rather than distinct types of aphasia, it remains generally accepted that nonfluent aphasic patients present a motor speech disorder. As such, this population offers interesting support for coarticulation theories. Patients with Broca's aphasia are considered to experience difficulties with speech motor timing, with several complications linked to consonantal production, namely disruptions of voice onset time (VOT) (e.g., Gandour & Dardarananda, 1984) and of voicing (Nespoulous et al., 2013), even though vowel systems seem well preserved (Katz, 2000). In their EMA study, Katz et al. (1990) show increased variability of coarticulatory patterns in aphasic patients compared to control speakers for both velar (nasality) and labial (rounding) coarticulation, but without temporal particularities. In other words, they found that aphasic patients' articulatory displacements were disrupted but well timed. This is an important point, since it underlines the value of kinematic studies in the field: as hypothesized by Tuller and Story (1988), a smaller extent of coarticulation may appear in acoustic data because movements are slower, rather than because anticipation is delayed or reduced. However, other studies of velar coarticulation in aphasia did show early velar lowering compared to control subjects (e.g., Ziegler & Cramon, 1986). Regarding V-to-V coarticulation, results indicate disrupted patterns in aphasic patients (Katz et al., 1990). As for lip rounding, anticipatory coarticulation does not seem disrupted in aphasia (Katz, 1988, 1990). Some perception studies confirm these results (at least from an acoustic-perceptual viewpoint), indicating normal coarticulation in aphasic patients: naïve listeners can correctly identify the upcoming sound from both control and aphasic speakers' productions (Katz, 1988). Katz (2000) offers an interesting discussion of the implications of coarticulation studies in aphasic patients for phonetic theories, underlining, among other things, the importance of considering several articulators, since the studies on aphasia cited above show inter-articulator differences in coarticulation management.
39.4.2 Apraxia of Speech (AoS)
In patients with AoS, phonological representations are correct, but the patients cannot program the corresponding articulatory movements (Ziegler et al., 2012). This description matches the general profile of apraxia, which also affects other domains of motor action control, as in oral or limb apraxia (Code, 1998). Patients with AoS thus do not have muscular problems per se but are unable to translate phonological invariants into efficient motor execution (Buckingham, 1979), which differentiates them from aphasic patients. Consequently, if we assume that phonological planning remains intact in these patients while the translation of speech segments into motor patterns is disrupted, studying coarticulation should provide interesting insights into how patients with AoS plan and execute their speech movements in connected speech. An interesting approach to this disorder is to investigate the production of pseudo-words, since such stimuli represent a typical case of using less-stored or non-stored motor patterns and require online speech processing even in speakers with no speech-related impairment (e.g., Sasisekaran et al., 2010). The speech of patients with AoS should then be comparable to pseudo-word production in control speakers. Whiteside and Varley (1998) investigated this question using locus equations for anticipatory coarticulation in CV sequences. Their results show a higher amount of coarticulation in the control speaker than in the AoS speaker, which, according to the authors, corroborates the idea of an overreliance on indirect motor encoding mechanisms. Moreover, such disruption of lingual coarticulation grows with utterance complexity (Whiteside et al., 2010). An acoustic, perceptual, and EPG study by Southwood et al. (2009) also indicated disruption of anticipatory lingual and labial coarticulation in CV sequences in AoS, with delayed lingual coarticulation at faster speech rates, suggesting inappropriate timing and/or direction of the expected lingual placement. A delay in labial coarticulation was also confirmed in the AoS patient, as previously observed in other studies (Ziegler & von Cramon, 1985).
39.4.3 Dysarthria
Dysarthria covers speech disorders caused by neurological injury, in which articulation is impaired by muscular deficits. Seven types of dysarthria are commonly mentioned in the literature: flaccid (cranial and spinal nerves), spastic (motor control areas), unilateral upper motor neuron (motor control areas), hypokinetic (e.g., Parkinson's disease; slow movements and rigidity), hyperkinetic (e.g., Huntington's disease or Tourette syndrome; unpredictable speech production), ataxic (connections between the cerebellum and other brain areas), and mixed dysarthria (e.g., Perrotta, 2020). This group of disorders differs from the aphasias and apraxia in that it is characterized by deficits in motor execution. Coarticulation studies in dysarthria are not numerous, and their results vary according to the dysarthria subtype. In hypokinetic dysarthria, studied primarily through patients with Parkinson's disease, the rationale for studying coarticulation rests on the supposition that reduced amplitude of articulatory movements and changes in segmental duration are essential factors in coarticulation, including in control subjects (Martel Sauvageau et al., 2015). In this subtype, however, studies seldom find a significant difference between coarticulation in control subjects and in patients (D'Alessandro et al., 2019; Kim, 2017; Tjaden, 2003; Tjaden & Wilding, 2005), even though Tjaden (2000) reported increased coarticulation in Parkinson's patients compared to controls. Moreover, in a perceptual study, Tjaden and Sussman (2006) found no significant difference between naïve participants' perception of speech produced by patients with multiple sclerosis or Parkinson's disease and their perception of control subjects' speech.
39.4.4 Stuttering
Stuttering is a neurodevelopmental speech disorder mainly characterized by an increased number of so-called disfluencies, that is, blocks, prolongations, and repetitions (Ward, 2018). It is sometimes considered a motor speech disorder resulting in a disturbance of speech production processing (Ludlow & Loucks, 2003). Stuttering has even been claimed to be a coarticulation problem, since a disfluency can occur within a syllable (Wingate, 1977), and this claim has motivated a substantial number of studies of coarticulation in the disorder. It is important to note that, unlike other speech disorders, stuttering is not consistent across speech: disfluent moments represent only part of a person's speech production. Hence, studies of coarticulation in stuttering can be divided into those examining the perceptually fluent speech (i.e., speech with no perceptible disfluency) of persons who stutter and those examining their disfluent speech (i.e., speech during a disfluent moment). For perceptually fluent speech, there is little consensus: some studies conclude that coarticulation is weaker (Robb & Blomgren, 1997), while others fail to find any disruption (Dehqan et al., 2016). Regarding disfluent speech, in his seminal acoustic-electropalatographic study of coarticulation in stuttering, Harrington (1987) confirmed disrupted coarticulation patterns in the disfluent speech of persons who stutter, including insufficient tongue movement amplitude and repetition of the tongue movement with little or no anticipation of the upcoming sound. Reduced F2 transitions were also reported in other studies (Howell et al., 1987). Recently, Didirková and Hirsch (2020) used EMA to investigate coarticulatory patterns in the disfluent speech of persons who stutter. Their conclusions show that, most of the time, the disfluent sound itself is correctly anticipated, regardless of the articulator recruited for its production. However, they observed disrupted coarticulation between the disfluent sound (i.e., the repeated, blocked, or prolonged sound) and the subsequent sound. According to the authors, these observations rule out the hypothesis that stuttering is exclusively a coarticulation problem, since there were disfluencies in which both the disfluent and the subsequent sound were anticipated. Implications for stuttering theories are discussed in the paper (Didirková & Hirsch, 2020).
39.5 Coarticulation in Other Impairments
Although speech analyses mainly concern disorders directly related to speech and language, coarticulation in hearing and vision impairments is also widely studied. In hearing impairments, the rationale for studying coarticulation is based on the modifications observed at both segmental and suprasegmental levels in the speech of hearing-impaired subjects compared to their peers with no impairment. More specifically, children and adults with hearing impairment tend to restrict their vowel space, which results in less contrast between vowels because of vowel centralization (Núñez-Batalla et al., 2019; Sfakianaki & Nicolaidis, 2016). Another study, of different groups of hearing-impaired participants, observed reduced spectral frequencies in those with lower intelligibility compared to more intelligible, mostly post-lingually impaired subjects (Mendel et al., 2017). From the suprasegmental viewpoint, restricted transitions in CV sequences have been reported (Rothman, 1976). Coarticulation studies in hearing impairments have shown reduced anticipatory coarticulation in CV sequences in children with cochlear implants compared to children with no impairment (Grandon & Vilain, 2020). The same is true of perseveratory coarticulation in children (Baum & Waldstein, 1991). Results were more nuanced in adolescents and adults with hearing impairment, where a context effect is observed (McCaffrey Morrison, 2008; Sfakianaki & Nicolaidis, 2016).
Vision impairments, on the other hand, raise the question of speech multimodality and its influence on the perception of coarticulation. Studies of the perception of anticipatory coarticulation in blind or visually impaired subjects show that blind participants perform better than sighted subjects in discrimination tasks, that is, when asked to judge two gated sequences as identical or different, but not in identification tasks, that is, when asked to choose one of several proposed sequences (Delvaux et al., 2018; Ménard et al., 2015).
39.6 Conclusion
In clinical phonetics, coarticulation is widely studied, especially in speech impairments involving speech production deficits, whether phonological (at the invariant level), in motor planning, or in motor execution. Coarticulation perception in blind listeners is also a center of interest, especially for determining the relative importance of visual and acoustic cues. While the lack of consensual conclusions in several speech disorders calls for further studies, perhaps combining acoustic and kinematic information, the results obtained so far are of interest for both non-pathological and pathological speech: they confirm cross-language differences in coarticulation and variation in coarticulatory functioning depending on the articulator. In the coming years, improved acoustic measurements, together with tools allowing direct observation of supraglottic articulatory behavior, promise new and exciting insights into coarticulation in clinical populations.
REFERENCES
Badin, P., Tabain, M., & Lamalle, L. (2019). Comparative study of coarticulation in a multilingual speaker: Preliminary results from MRI data. Proceedings of the 19th International Congress of Phonetic Sciences, 3453–3457.
Baum, S. R. (1998). Anticipatory coarticulation in aphasia: Effects of utterance complexity. Brain and Language, 63(3), 357–380. https://doi.org/10.1006/brln.1997.1938
Baum, S. R., & Waldstein, R. S. (1991). Perseveratory coarticulation in the speech of profoundly hearing-impaired and normally hearing children. Journal of Speech, Language, and Hearing Research, 34(6), 1286–1292. https://doi.org/10.1044/jshr.3406.1286
Beddor, P. S., Harnsberger, J. D., & Lindemann, S. (2002). Language-specific patterns of vowel-to-vowel coarticulation: Acoustic structures and their perceptual correlates. Journal of Phonetics, 30(4), 591–627. https://doi.org/10.1006/jpho.2002.0177
Benguerel, A. P., & Cowan, H. A. (1974). Coarticulation of upper lip protrusion in French. Phonetica, 30(1), 41–55. https://doi.org/10.1159/000259479
Bladon, R. A. W., & Al-Bamerni, A. (1976). Coarticulation resistance in English /l/. Journal of Phonetics, 4(2), 137–150. https://doi.org/10.1016/S0095-4470(19)31234-3
Buckingham, H. W. (1979). Explanation in apraxia with consequences for the concept of apraxia of speech. Brain and Language, 8, 202–226.
Butcher, A., & Weiher, E. (1976). An electropalatographic investigation of coarticulation in VCV sequences. Journal of Phonetics, 4(1), 59–74. https://doi.org/10.1016/S0095-4470(19)31222-7
Caldognetto, E. M., Vagges, K., Ferrigno, G., & Busà, M. G. (1992). Lip rounding coarticulation in Italian. Proceedings of the 2nd International Conference on Spoken Language Processing, 61–64. https://www.isca-speech.org/archive/pdfs/icslp_1992/caldognetto92_icslp.pdf
Celata, C., Calamai, S., Ricci, I., & Bertini, C. (2013). Nasal place assimilation between phonetics and phonology: An EPG study of Italian nasal-to-velar clusters. Journal of Phonetics, 41(2), 88–100. https://doi.org/10.1016/j.wocn.2012.10.002
Chen, M. Y. (1997). Acoustic correlates of English and French nasalized vowels. The Journal of the Acoustical Society of America, 102(4), 2360–2370. https://doi.org/10.1121/1.419620
Cho, T. (2004). Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics, 32(2), 141–176. https://doi.org/10.1016/S0095-4470(03)00043-3
Code, C. (1998). Models, theories and heuristics in apraxia of speech. Clinical Linguistics & Phonetics, 12(1), 47–65. https://doi.org/10.3109/02699209808985212
Cole, J., Linebaugh, G., Munson, C., & McMurray, B. (2010). Unmasking the acoustic effects of vowel-to-vowel coarticulation: A statistical modeling approach. Journal of Phonetics, 38(2), 167–184. https://doi.org/10.1016/j.wocn.2009.08.004
D’Alessandro, D., Pernon, M., Fougeron, C., & Laganaro, M. (2019). Anticipatory V-to-V coarticulation in French in several motor speech disorders. Proceedings of the 3rd Phonetics and Phonology in Europe conference, 82–84. http://conference.unisalento.it/ocs/public/conferences/27/fmgr_upload/Abstracts/oral_4_D_Alessandro-etal.pdf
Daniloff, R., & Moll, K. (1968). Coarticulation of lip rounding. Journal of Speech and Hearing Research, 11(4), 707–721. https://doi.org/10.1044/jshr.1104.707
Dehqan, A., Yadegari, F., Blomgren, M., & Scherer, R. C. (2016). Formant transitions in the fluent speech of Farsi-speaking people who stutter. Journal of Fluency Disorders, 48, 1–15. https://doi.org/10.1016/j.jfludis.2016.01.005
Delvaux, V., Huet, K., Piccaluga, M., & Harmegnies, B. (2018). The perception of anticipatory labial coarticulation by blind listeners in noise: A comparison with sighted listeners in audio-only, visual-only and audiovisual conditions. Journal of Phonetics, 67, 65–77. https://doi.org/10.1016/j.wocn.2018.01.001
Didirková, I., & Hirsch, F. (2020). A two-case study of coarticulation in stuttered speech. An articulatory approach. Clinical Linguistics & Phonetics, 34(20), 517–535. https://doi.org/10.1080/02699206.2019.1660913
Farnetani, E., & Recasens, D. (1999). Coarticulation models in recent speech production theories. In N. Hewlett & W. J. Hardcastle (Eds.), Coarticulation: Theory, data and techniques (pp. 31–66). Cambridge University Press. https://doi.org/10.1017/CBO9780511486395.003
Gandour, J., & Dardarananda, R. (1984). Voice onset time in aphasia: Thai. II. Production. Brain and Language, 23(2), 177–205. https://doi.org/10.1016/0093-934x(84)90063-4
Gandour, J., Potisuk, S., Ponglorpisit, S., Dechongkit, S., Khunadorn, F., & Boongird, P. (1996). Tonal coarticulation in Thai after unilateral brain damage. Brain and Language, 52(3), 505–535. https://doi.org/10.1006/brln.1996.0027
Grandon, B., & Vilain, A. (2020). Development of fricative production in French-speaking school-aged children using cochlear implants and children with normal hearing. Journal of Communication Disorders, 86, 105996. https://doi.org/10.1016/j.jcomdis.2020.105996
Grosjean, F. (1980). Spoken word recognition processes and the gating paradigm. Perception & Psychophysics, 28(4), 267–283. https://doi.org/10.3758/BF03204386
Harrington, J. (1987). Coarticulation and stuttering: An acoustic and electropalatographic study. In H. F. M. Peters & W. Hulstijn (Eds.), Speech motor dynamics in stuttering (pp. 381–392). Springer. https://link.springer.com/chapter/10.1007/978-3-7091-6969-8_30
Howell, P., Williams, M., & Vause, I. (1987). Acoustic analysis of repetitions in stutterers’ speech. In H. F. M. Peters & W. Hulstijn (Eds.), Speech motor dynamics in stuttering (pp. 371–380). Springer.
Howson, P. J., Kallay, J. E., & Redford, M. A. (2021). A psycholinguistic method for measuring coarticulation in child and adult speech. Behavior Research Methods, 53(2), 846–863. https://doi.org/10.3758/s13428-020-01464-7
Katz, W. (1990). A kinematic analysis of anticipatory coarticulation in the speech of anterior aphasic subjects using electromagnetic articulography. Brain and Language, 38(4), 555–575. https://doi.org/10.1016/0093-934X(90)90137-6
Katz, W., Machetanz, J., Orth, U., & Schönle, P. (1990). A kinematic analysis of anticipatory coarticulation in the speech of anterior aphasic subjects using electromagnetic articulography. Brain and Language, 38(4), 555–575. https://doi.org/10.1016/0093-934x(90)90137-6
Katz, W. F. (1988). Anticipatory coarticulation in aphasia: Acoustic and perceptual data. Brain and Language, 35(2), 340–368. https://doi.org/10.1016/0093-934x(88)90116-2
Katz, W. F. (2000). Anticipatory coarticulation and aphasia: Implications for phonetic theories. Journal of Phonetics, 28(3), 313–334. https://doi.org/10.1006/jpho.2000.0118
Kent, R. D., & Minifie, F. D. (1977). Coarticulation in recent speech production models. Journal of Phonetics, 5(2), 115–133. https://doi.org/10.1016/S0095-4470(19)31123-4
Kim, Y. (2017). Acoustic characteristics of fricatives /s/ and /∫/ produced by speakers with Parkinson’s disease. Clinical Archives of Communication Disorders, 2(1), 7–14. https://doi.org/10.21849/cacd.2016.00080
Körkkö, P. (2015). Spectral moments analysis of /s/ coarticulation development in Finnish-speaking children. Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS), Glasgow, UK.
Lee, A., Fujiwara, Y., Liker, M., Yamamoto, I., Takei, Y., & Gibbon, F. (2022). Electropalatography (EPG) activities in Japan and the impact of the COVID-19 pandemic on EPG research and therapy: A report of presentations at the 7th EPG Symposium. International Journal of Language & Communication Disorders, 57(4), 906–917. https://doi.org/10.1111/1460-6984.12720
Lenoci, G., & Ricci, I. (2018). An ultrasound investigation of the speech motor skills of stuttering Italian children. Clinical Linguistics & Phonetics, 32(12), 1126–1144. https://doi.org/10.1080/02699206.2018.1510983
Li, Q., & Chen, Y. (2016). An acoustic study of contextual tonal variation in Tianjin Mandarin. Journal of Phonetics, 54, 123–150. https://doi.org/10.1016/j.wocn.2015.10.002
Ludlow, C. L., & Loucks, T. (2003). Stuttering: A dynamic motor control disorder. Journal of Fluency Disorders, 28(4), 273–295. https://doi.org/10.1016/j.jfludis.2003.07.001
Magen, H. S. (1997). The extent of vowel-to-vowel coarticulation in English. Journal of Phonetics, 25(2), 187–205. https://doi.org/10.1006/jpho.1996.0041
Manuel, S. Y. (1990). The role of contrast in limiting vowel-to-vowel coarticulation in different languages. The Journal of the Acoustical Society of America, 88(3), 1286–1298. https://doi.org/10.1121/1.399705
Martel Sauvageau, V., Roy, J.-P., Langlois, M., & Macoir, J. (2015). Impact of the LSVT on vowel articulation and coarticulation in Parkinson’s disease. Clinical Linguistics & Phonetics, 29(6), 424–440. https://doi.org/10.3109/02699206.2015.1012301
McCaffrey Morrison, H. (2008). The locus equation as an index of coarticulation in syllables produced by speakers with profound hearing loss. Clinical Linguistics & Phonetics, 22(9), 726–740. https://doi.org/10.1080/02699200802176402
Ménard, L., Cathiard, M.-A., Troille, E., & Giroux, M. (2015). Effects of congenital visual deprivation on the auditory perception of anticipatory labial coarticulation. Folia Phoniatrica et Logopaedica, 67(2), 83–89. https://doi.org/10.1159/000434719
Mendel, L. L., Lee, S., Pousson, M., Patro, C., McSorley, S., Banerjee, B., Najnin, S., & Kapourchali, M. H. (2017). Corpus of deaf speech for acoustic and speech production research. The Journal of the Acoustical Society of America, 142(1), EL102–EL107. https://doi.org/10.1121/1.4994288
Mousikou, P., Strycharczuk, P., Turk, A., & Scobbie, J. M. (2021). Coarticulation across morpheme boundaries: An ultrasound study of past-tense inflection in Scottish English. Journal of Phonetics, 88, 101101. https://doi.org/10.1016/j.wocn.2021.101101
Nespoulous, J.-L., Baqué, L., Rosas, A., Marczyk, A., & Estrada, M. (2013). Aphasia, phonological and phonetic voicing within the consonantal system: Preservation of phonological oppositions and compensatory strategies. Language Sciences, 39, 117–125. https://doi.org/10.1016/j.langsci.2013.02.015
Nicolaidis, K. (1999). The influence of stress on V-to-V coarticulation: An electropalatographic study. Proceedings of the XIV International Congress of Phonetic Sciences, 1087–1090.
Núñez-Batalla, F., Vasile, G., Cartón-Corona, N., Pedregal-Mallo, D., Menéndez de Castro, M., Guntín García, M., Gómez-Martínez, J., Carro Fernández, P., & Llorente-Pendás, J. L. (2019). Vowel production in hearing impaired children: A comparison between normal-hearing, hearing-aided and cochlear-implanted children. Acta Otorrinolaringologica (English Edition), 70(5), 251–257. https://doi.org/10.1016/j.otoeng.2018.05.004
Oh, E. (2008). Coarticulation in non-native speakers of English and French: An acoustic study. Journal of Phonetics, 36(2), 361–384. https://doi.org/10.1016/j.wocn.2007.12.001
Ostry, D., Gribble, P., & Gracco, V. (1996). Coarticulation of jaw movements in speech production: Is context sensitivity in speech kinematics centrally planned? The Journal of Neuroscience, 16(4), 1570–1579. https://doi.org/10.1523/JNEUROSCI.16-04-01570.1996
Perrotta, G. (2020). Dysarthria: Definition, clinical contexts, neurobiological profiles and clinical treatments. Archives of Community Medicine and Public Health, 6(2), 142–145.
Recasens, D. (1984). Vowel-to-vowel coarticulation in Catalan VCV sequences. The Journal of the Acoustical Society of America, 76(6), 1624–1635. https://doi.org/10.1121/1.391609
Recasens, D. (2018). Coarticulation. In Oxford research encyclopedia of linguistics. https://doi.org/10.1093/acrefore/9780199384655.013.416
Recasens, D., & Espinosa, A. (2010). Lingual kinematics and coarticulation for alveolopalatal and velar consonants in Catalan. The Journal of the Acoustical Society of America, 127(5), 3154–3165. https://doi.org/10.1121/1.3372631
Robb, M., & Blomgren, M. (1997). Analysis of F2 transitions in the speech of stutterers and nonstutterers. Journal of Fluency Disorders, 22(1), 1–16. https://doi.org/10.1016/S0094-730X(96)00016-2
Rothman, H. B. (1976). A spectrographic investigation of consonant-vowel transitions in the speech of deaf adults. Journal of Phonetics, 4(2), 129–136. https://doi.org/10.1016/S0095-4470(19)31233-1
Ryalls, J., Baum, S., Samuel, R., Larouche, A., Lacoursière, N., & Garceau, J. (1993). Anticipatory co-articulation in the speech of young normal and hearing-impaired French Canadians. European Journal of Disorders of Communication, 28(1), 87–101. https://doi.org/10.3109/13682829309033144
Samuel, A. G. (1981). Phonemic restoration: Insights from a new methodology. Journal of Experimental Psychology: General, 110(4), 474–494.
Sasisekaran, J., Smith, A., Sadagopan, N., & Weber-Fox, C. (2010). Nonword repetition in children and adults: Effects on movement coordination. Developmental Science, 13(3), 521–532. https://doi.org/10.1111/j.1467-7687.2009.00911.x
Sfakianaki, A., & Nicolaidis, K. (2016). Acoustic aspects of segmental and suprasegmental productions of Greek hearing-impaired speech: A qualitative analysis. Selected Papers on Theoretical and Applied Linguistics, 21, 401–425. https://doi.org/10.26262/istal.v21i0.5239
Shadle, C. H., & Mair, S. J. (1996). Quantifying spectral characteristics of fricatives. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP ’96), 3, 1521–1524. https://doi.org/10.1109/ICSLP.1996.607906
Shen, X. S. (1990). Tonal coarticulation in Mandarin. Journal of Phonetics, 18(2), 281–295. https://doi.org/10.1016/S0095-4470(19)30394-8
Southwood, H., Dagenais, P., Sutphin, S., & Garcia, J. (2009). Coarticulation in apraxia of speech: A perceptual, acoustic, and electropalatographic study. Clinical Linguistics & Phonetics, 11(3), 179–203. https://doi.org/10.3109/02699209708985190
Sun, Y., & Shih, C. (2021). Boundary-conditioned anticipatory tonal coarticulation in Standard Mandarin. Journal of Phonetics, 84, 101018. https://doi.org/10.1016/j.wocn.2020.101018
Sussman, H. M., McCaffrey, H. A., & Matthews, S. A. (1991). An investigation of locus equations as a source of relational invariance for stop place categorization. The Journal of the Acoustical Society of America, 90(3), 1309–1325. https://doi.org/10.1121/1.401923
Tabain, M. (2009). An EPG study of the alveolar vs. retroflex apical contrast in Central Arrernte. Journal of Phonetics, 37(4), 486–501. https://doi.org/10.1016/j.wocn.2009.08.002
Tjaden, K. (2000). An acoustic study of coarticulation in dysarthric speakers with Parkinson disease. Journal of Speech, Language, and Hearing Research, 43(6), 1466–1480. https://doi.org/10.1044/jslhr.4306.1466
Tjaden, K. (2003). Anticipatory coarticulation in multiple sclerosis and Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 46(4), 990–1008. https://doi.org/10.1044/1092-4388(2003/077)
Tjaden, K., & Sussman, J. (2006). Perception of coarticulatory information in normal speech and dysarthria. Journal of Speech, Language, and Hearing Research, 49(4), 888–902. https://doi.org/10.1044/1092-4388(2006/064)
Tjaden, K., & Wilding, G. E. (2005). Effect of rate reduction and increased loudness on acoustic measures of anticipatory coarticulation in multiple sclerosis and Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 48(2), 261–277. https://doi.org/10.1044/1092-4388(2005/018)
Tuller, B., & Story, R. S. (1988). Anticipatory and carryover coarticulation in aphasia: An acoustic study. Cognitive Neuropsychology, 5(6), 747–771. https://doi.org/10.1080/02643298808253281
Verdurand, M., Rossato, S., & Zmarich, C. (2020). Coarticulatory aspects of the fluent speech of French and Italian people who stutter under altered auditory feedback. Frontiers in Psychology, 11. https://www.frontiersin.org/articles/10.3389/fpsyg.2020.01745
Waldstein, R. S., & Baum, S. R. (1991). Anticipatory coarticulation in the speech of profoundly hearing-impaired and normally hearing children. Journal of Speech and Hearing Research, 34(6), 1276–1285. https://doi.org/10.1044/jshr.3406.1276
Ward, D. (2018). Stuttering and cluttering: Frameworks for understanding and treatment (2nd ed.). Routledge.
Whiteside, S., & Varley, R. A. (1998). Coarticulation in apraxia of speech: An acoustic study of non-words. Logopedics Phoniatrics Vocology, 23(4), 155–163. https://doi.org/10.1080/140154398434059
Whiteside, S. P., Grobler, S., Windsor, F., & Varley, R. (2010). An acoustic study of vowels and coarticulation as a function of utterance type: A case of acquired apraxia of speech. Journal of Neurolinguistics, 23(2), 145–161. https://doi.org/10.1016/j.jneuroling.2009.12.002
Wingate, M. E. (1977). The immediate source of stuttering: An integration of evidence. Journal of Communication Disorders, 10(1–2), 45–51. https://doi.org/10.1016/0021-9924(77)90012-0
Yu, A. C. L. (2016). Vowel-dependent variation in Cantonese /s/ from an individual-difference perspective. The Journal of the Acoustical Society of America, 139(4), 1672–1690. https://doi.org/10.1121/1.4944992
Zellou, G. (2017). Individual differences in the production of nasal coarticulation and perceptual compensation. Journal of Phonetics, 61, 13–29. https://doi.org/10.1016/j.wocn.2016.12.002
Zharkova, N., & Hewlett, N. (2009). Measuring lingual coarticulation from midsagittal tongue contours: Description and example calculations using English /t/ and /ɑ/. Journal of Phonetics, 37(2), 248–256. https://doi.org/10.1016/j.wocn.2008.10.005
Ziegler, W., Aichert, I., & Staiger, A. (2012). Apraxia of speech: Concepts and controversies. Journal of Speech, Language, and Hearing Research, 55(5), S1485–S1501. https://doi.org/10.1044/1092-4388(2012/12-0128)
Ziegler, W., & von Cramon, D. (1985). Anticipatory coarticulation in a patient with apraxia of speech. Brain and Language, 26(1), 117–130. https://doi.org/10.1016/0093-934X(85)90032-X
Ziegler, W., & von Cramon, D. (1986). Timing deficits in apraxia of speech. European Archives of Psychiatry and Neurological Sciences, 236(1), 44–49. https://doi.org/10.1007/BF00641058
40 Prosodic Impairments

BILL WELLS AND TRACI WALKER

40.1 What is a Prosodic Impairment?

Every time we speak, we have to do something with the pitch, loudness and duration of the utterance. Linguists sometimes refer to these features as "suprasegmental," suggesting that they are somehow above a string of consonants and vowels. This connotation is misleading: rather, in speech the string of consonants and vowels is overlaid onto a base of phonation (voicing), generated by an airstream from the lungs passing through the larynx, which results in fluctuations in pitch (height and movement) and loudness distributed over phonatory chunks of varying durations. The term "prosody," and the related adjective "prosodic," are commonly used to refer to features of pitch, loudness, and duration in speech, in a broad sense – encompassing their use on individual words (e.g. in lexical stress; duration of the syllable or part of a syllable; lexical tones), as well as the use of these features over longer stretches of speech (phrases, complete utterances, conversational turns), the latter being the focus of the present chapter.

There are at least two good reasons why clinical linguists and speech and language pathology professionals should study prosody. First, some clients present with unusual prosodic patterns, and it is important to investigate why this might be. Second, prosody is a relative strength for many people with speech and language difficulties, which raises the question of how it might be used to support or compensate for other aspects of language.

In the case of prosody, the basis for postulating an impairment is likely to be the auditory impression of listeners that the speaker's use of prosodic features is in some way atypical for that speech community, yet its atypicality cannot be attributed to other causes, for example, being a nonnative speaker whose prosody in the second language is influenced by the mother tongue. Beyond that, identification, description, and explanation of the impairment are theory-dependent. The investigator can adopt one or more relatively distinct though complementary approaches.
40.2 Phonetic Approach

Atypical prosody in both developmental disorders (e.g. childhood apraxia of speech (CAS), Williams syndrome, autism and Asperger's syndrome) and acquired disorders of speech (e.g. acquired apraxia of speech (AOS) and the acquired dysarthrias, such as hypokinetic and hyperkinetic dysarthria) can be explored using a range of methodologies, both auditory-perceptual and acoustic-phonetic/experimental (e.g. Odell & Shriberg, 2001).
Although this discussion will focus on the latter set of approaches, the combination of acoustic and perceptual methods is both strongly advocated (Kent & Kim, 2003) and employed in the profiling of prosody (Bunton et al., 2000).

Prosody relates to a number of perceptual dimensions, including (though not exhaustively) features of pitch, loudness and duration. All of these features can be explored instrumentally using a range of techniques, which include acoustic-based methods (Kent & Kim, 2003) and other techniques such as laryngography (see Fourcin, 1986, for a description of the methodology). Quantitative information on F0, intensity and duration can be employed independently to profile prosody. However, a multidimensional approach which combines aspects of F0, intensity and duration allows a more comprehensive assessment and characterization of atypical prosody.

The perception of pitch is partly determined by fundamental frequency (F0), which therefore plays an important part in the investigation and profiling of atypical prosody. Various F0-related parameters can be explored instrumentally and quantified for any isolated word or any type of connected speech utterance with this aim in mind. These parameters include mean F0, the standard deviation of F0, F0 range, and the shape of fundamental frequency contours. Information gleaned from parameters such as a limited F0 range can provide quantifiable acoustic evidence on an adult with dysarthria, or a child with autism, who may, for example, present with monotonous-sounding speech.

The perception of loudness is partly determined by intensity, which can be quantified in a number of ways, including mean intensity, the standard deviation of intensity, intensity range, the shape of the intensity envelope, and intensity decay. For example, patterns of diminishing intensity (or intensity decay) have been observed in the speech of individuals with Parkinsonian dysarthria (Ho et al., 2001). However, there is also evidence to suggest that intensity decay is not consistent across different individuals with Parkinson's disease, and that patterns of intensity decay may also vary across different speech tasks (Rosen et al., 2005). This highlights the role of individual variability, and the effect of different speech tasks, in the assessment of prosody (Lowit-Leuschel & Docherty, 2001). In contrast to Parkinsonian dysarthria, speakers with AOS who are perceived as having abnormal stress patterns may display limited variation in the intensity of syllables across an utterance.

The durational dimension of prosody includes parameters such as utterance duration, articulation rate, mean pause duration, the incidence of pauses, the ratio between articulation time and pause time, and mean stressed vowel duration. As an illustration, speakers with moderate to severe AOS may show a high number of pauses relative to articulation time, which, when combined with excessively long syllables, goes some way toward explaining their atypical prosodic patterns (Whiteside & Varley, 1998). Haley and Jacks (2019) show that syllable duration measures are best at capturing the differences in word-level prosody needed to differentially diagnose people with AOS from people with aphasia and healthy controls.
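The kind of multidimensional acoustic profile just described can be assembled with freely available tools. The sketch below uses the Parselmouth library (a Python interface to Praat); it is a minimal illustration only, and the file name and the 25 dB pause threshold are our own assumptions rather than values prescribed in the studies cited above.

```python
# A minimal sketch of the multidimensional acoustic profiling described
# above, using Parselmouth (a Python interface to Praat). The file name
# and the 25 dB pause threshold are illustrative assumptions, not
# values prescribed in the studies cited in this chapter.
import numpy as np
import parselmouth

snd = parselmouth.Sound("utterance.wav")  # hypothetical recording

# F0-related parameters (acoustic correlates of perceived pitch).
pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]
f0 = f0[f0 > 0]  # Praat reports unvoiced frames as 0 Hz; exclude them
print("mean F0 (Hz):", np.mean(f0), " SD:", np.std(f0),
      " range:", np.max(f0) - np.min(f0))  # a narrow range may index monotony

# Intensity-related parameters (correlates of perceived loudness),
# including a crude index of intensity decay: the slope of a straight
# line fitted to the dB contour over time (dB per second).
intensity = snd.to_intensity()
times = intensity.xs()
db = intensity.values.flatten()
decay_slope = np.polyfit(times, db, 1)[0]
print("mean dB:", np.mean(db), " range:", np.max(db) - np.min(db),
      " decay slope:", decay_slope)

# Duration-related parameters: frames more than 25 dB below the peak
# are treated as pauses, giving a pause-to-articulation ratio.
frame_dur = times[1] - times[0]
silent = db < (np.max(db) - 25)
pause_time = np.sum(silent) * frame_dur
articulation_time = np.sum(~silent) * frame_dur
print("utterance duration (s):", snd.xmax - snd.xmin,
      " pause/articulation ratio:", pause_time / articulation_time)
```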
Most studies of atypical prosody focus on the acoustic parameters related to the primary features of pitch (via fundamental frequency), loudness (via amplitude) and duration. However, other acoustic cues, related to various dimensions of voice quality at both the laryngeal and supralaryngeal levels, have been identified in the perception and production of emotion and affect in healthy individuals (Banse & Scherer, 1996). These cues therefore deserve further investigation in studies of individuals who may display impairment in the production and perception of affective prosody. A more comprehensive set of acoustic parameters may also prove useful in furthering our understanding of the perception and production of atypical linguistic prosody. For example, multivariate analyses of prosody seem to be a promising avenue for uncovering acoustic markers of ASD, but more systematicity is required in terms of the parameters being measured and compared (Fusaroli et al., 2016).
40.3 Linguistic Approach

Prosodic features serve to realize linguistic systems such as tone (in tone languages), stress and intonation. From this perspective a prosodic impairment impacts on the linguistic system in question, with the result that the meaning (in its broadest sense) of the speaker's utterance may be obscured. Identification of a linguistic impairment of intonation, for example, is therefore dependent on the analyst having a description of the intonation system of the language. Such descriptions are available for a growing number of languages; in this chapter examples are taken from English to illustrate general principles.

It is likely that all languages have an equivalent of the English systems of Tonality and Tone, that is, a way of chunking the utterance prosodically in different ways, and a choice of pitch direction (Halliday, 1967). English intonation makes use of a third major system that is more restricted in its distribution across languages. This is Tonicity, the location of the tonic syllable, which serves to highlight important information in the utterance and to background less important information. Each of these systems may be impaired, and one analytic strategy is to transcribe data from prosodically impaired speakers using the conventions of English intonation. This can serve to highlight the impact on meaning caused by atypical prosodic patterns.

This linguistic approach to prosodic assessment is associated particularly with David Crystal, who formalized it into a profile (Crystal, 1992). An example of a fully worked-through case study using this approach, of a client with dysarthria, is presented in Vance (1994), and the use of a linguistically based test battery can be found in Samuelsson and Nettelbladt (2004). A particular value of the linguistic approach is that it tells us what is going wrong linguistically with a speaker's output; for example, how the client succeeds or fails in realizing intonation systems. While a linguistic description does not itself indicate directly what the underlying causes of the client's prosodic problems might be, it provides a systematic basis for generating hypotheses about causation. However, the linguistic approach has for the most part been superseded by the interactional approach, described below.
40.4 Interactional Approach

The interactional approach resembles the linguistic approach in that it involves the analysis of spontaneous speech data, on the basis of careful transcription, including phonetic notation. Studies of prosody in conversational interaction have revealed the complex and subtle ways in which speakers deploy prosodic features in order to negotiate everyday talk (Couper-Kuhlen & Selting, 1996). This approach tends to take as its starting point basic interactional phenomena, frequently turn organization, though it could also be, for example, repair organization or topic management. In clinical analysis, the focus is on how the client manages (or doesn't manage) the business of maintaining the social interaction, in the face of prosodic or other limitations.

Beeke et al. (2009) exemplify this approach in their analysis of conversations between three speakers with agrammatism and family members or friends. They show how intact prosodic skills compensated for impaired grammatical ability, allowing the speakers with agrammatism to reliably indicate turn-transition points, as well as turn-continuation. Their use of prosodic patterns is observably oriented to by their interlocutors, even when the flow of speech is severely disrupted by pausing and word searches.

In the more extensive interactional literature on prosody in atypical development, children on the autistic spectrum are well represented, with a focus on immediate or delayed echolalia (Local & Wootton, 1995; Tarplee & Barrow, 1999; Sterponi & Shankey, 2014) and turn-taking (Kelly & Beeke, 2011). There has also been interactional prosodic research involving children with hearing impairments (Anstey & Wells, 2013) and developmental speech and language difficulties, as reviewed in Chapter 10 of Wells and Stackhouse (2015).
A case study by Wells and Local (1993) of a boy with speech and language difficulties illustrates the approach. At the age of 5;4, David invariably located the main pitch movement on the final syllable of his turn at talk, and it was invariably a rising pitch. Words preceding this final syllable were produced with level pitch around the middle of the pitch range. The direction of final pitch movement was more or less appropriate for declaratives in the variety of English which David was exposed to (West Midlands of England), but, from a linguistic perspective, its invariable location on the final syllable was not. David was receiving speech and language therapy, but this was not targeted at prosodic features. Recordings made one year later showed that this pattern had been superseded by the more usual one for his variety of British English, whereby the position of the nuclear tone is determined by considerations of information focus as well as turn completion. Partly as a consequence, David displayed much greater variety in pitch height and movement. This was accompanied by a marked improvement in his overall intelligibility. Wells and Local argued that at 5;4, David's idiosyncratic prosodic pattern served to mark the end of his turns at talk in a clear, consistent and unambiguous way, which was useful for him and his co-participants given the unintelligibility of his speech. By clearly signaling the end of his turn at talk, David managed to maintain interactions with others without an unusual amount of overlap or interruption by co-participants.

An important feature of the interactional prosodic approach, evident in the studies just cited, is that it can reveal not only the negative communicative consequences of "impairments" of prosody but also, and perhaps more importantly, the ways in which prosody can act as a communicative resource in the face of other cognitive and/or linguistic difficulties. Where the child or adult is able to manipulate pitch, loudness, duration and so on, these may be used for conversational tasks such as managing turn-exchange (as in the case of David), initiating a new conversational action such as repair, highlighting topics, and aligning with co-participants in the conversation (e.g. Tempest & Wells, 2012). With this in mind, Wells and Stackhouse (2015: Appendix 3) have devised a prosodic profile where the primary analytical categories are interactional rather than intonational.
40.5 Psycholinguistic Approach

With prosody, as with other levels of linguistic analysis, the structures posited by linguists can be taken as a testable hypothesis as to how linguistic knowledge is represented in adult speakers' minds. It can thus be hypothesized that the English speaker has to learn to draw on representational distinctions, of the type encoded in the three systems of intonation (Tonality, Tonicity, Tone) described above, both in comprehension and in production (cf. Levelt, 1989, Ch. 10). For example, if in comprehension a client interprets /chocolate and HONEY/ as having its main focus on chocolate rather than honey, one possibility is that s/he has not learnt the systemic significance of Tonicity. There may be consequences of such an immature/inaccurate prosodic representation for the speaker's own production. If the distinction between non-final and final tonic placement (Tonicity) is lacking at the representational level, we might anticipate that the speaker will mix up the forms in his/her own production, that is, on occasion use a final tonic in a context which requires non-final focus, and vice versa. Thus, inaccurate uses of intonation, in terms of both comprehension and production, may be attributed to imprecise representations, that is, to "high level" factors.

However, low-level influences may also be involved. On the input side, the client may have deficits in hearing or in auditory processing that block access to prosodic details of the incoming signal (Barry et al., 2002). Such deficits are likely to give rise to imperfect processing and comprehension of the heard utterance, and in the longer term to the construction of inaccurate prosodic representations.
On the output side, a speaker with a prosodic impairment may have accurate representations but be unable to realize them accurately, due to limitations on their ability to execute complex prosodic patterns/contours, for example arising from laryngeal anomalies or respiratory difficulties (Heselwood et al., 1995).

Within this psycholinguistic framework, in the case of a person with impaired prosodic output we can ask: what is the level of breakdown? Is it in input, representations, or output, or a mix of these? In order to address such questions systematically, it is usual to use a battery of tasks that tap different levels of processing. PEPS-C represents an attempt to devise a systematic and comprehensive prosodic test battery (Wells & Peppé, 2003). It incorporates the following dimensions: Input (perception/comprehension) vs. Output (generation/production); and Form (referring to lower-level phonetic processing, where meaning is not involved) vs. Function (involving higher-level processing, drawing on stored knowledge, relating phonetic form to meaning). PEPS-C covers four communicative areas where intonation is generally agreed to have an important role:

a) Chunking: prosodic delimitation of the utterance into units for grammatical, semantic or pragmatic purposes, for example, /coffee-cake / and honey/ vs. /coffee / cake / and honey/.
b) Affect: expressing strong liking as opposed to reservation with the syllable "m," by using rise-fall vs. fall-rise pitch movement respectively.
c) Interaction: the prosodic opposition between a low fall meaning "yes, I understand," as opposed to a high rise meaning "no, I didn't understand, please repeat."
d) Focus: the speaker's use of phonetic prominence (tonicity) to indicate which item is most important in an utterance, e.g. /CHOCOLATE and honey/ vs. /chocolate and HONEY/.

Each of the four communicative areas is tested for both Input and Output, with different assessments for Form and Function, giving a total of sixteen tasks. The battery has been employed to characterize clinical groups prosodically, compared to typical populations, e.g. developmental speech and language impairment (Wells & Peppé, 2003). It can also be used to profile the prosodic abilities of individual clients, thus providing a basis for individually tailored intervention.

Wells and Peppé (2001) profile two contrasting eight-year-old boys, diagnosed as having a specific language impairment, and compare their patterns of performance on various prosodic tasks against data from a normative study of prosodic development (Wells et al., 2004). Jonathan and Robin had each been identified as having language difficulties serious enough to warrant special educational provision. The following observations were made of Jonathan's prosody in spontaneous speech, compared to normally developing children of a similar age: (i) many syllables are unusually loud; (ii) his speech is slow overall; (iii) at the end of utterances, Jonathan often has level pitch or moves rapidly from one level to another; (iv) he lengthens vowels very noticeably in the final syllables of his utterances. Jonathan's speech has a "sing-song" character, deriving from his pervasive use of level pitch, as well as sustained vowels in some positions (as opposed to dynamic falls and rises).
This intonation was regarded as unusual by his parents, as well as by professionals and others outside the family. His parents noted that this feature started sometime after his seventh birthday and had become increasingly evident.

Pervasively poor performance on the PEPS-C Output tasks suggests that Jonathan may have problems with output representation for some items in the intonation lexicon. He may also have low-level motor execution deficits. This ties in with the observations of his conversational speech. However, on the Affect Output Function task his performance was flawless, suggesting that Jonathan is not incapable of using prosody deliberately to express his meaning; moreover, it was for Affect that he made his highest score on an Input Function task (15/16).
On other Input Function tasks, he scored less well, suggesting that his representation of some intonational meanings may be imprecise. This may in turn contribute to inaccurate output.

In Robin's spontaneous speech, by contrast, there are few strikingly unusual prosodic features. It is therefore quite surprising to discover that on the PEPS-C he had difficulties with both Input and Output. On Input Function, he scored below normal limits on all communicative areas except Chunking; in fact, he performed worse than Jonathan. Robin's difficulties with Input Function suggest that he has problems interpreting pragmatic aspects of prosody. This is likely to be one of the factors responsible for his difficulties with social interaction, and may therefore be a suitable area for intervention. Robin was successful on three of the four Output Function tasks. This is somewhat paradoxical, given his failure on two of these (Affect, Focus) on Input Function. It suggests that a child may sound quite typical in terms of his own prosody, yet still have problems making sense of intonation. This can be described as a covert prosodic deficit.

The contrasting profiles of Jonathan and Robin indicate the value of psycholinguistic profiling in the area of prosody, potentially as a basis for targeted intervention. The method enables the identification of areas of prosodic strength and prosodic weakness, neither of which may be evident from the study of spontaneous output alone. That said, psycholinguistic testing of intonation presents considerable challenges. For example, production on output tasks is subject to contextual effects: the test situation is a social interaction of a kind, and intonation is very susceptible to interactional factors. The test demands of a battery like PEPS-C are such as to preclude its use with preschool children, and with older clients who are not at a sufficient cognitive level. Furthermore, while the interpretation of scores depends on comparison with matched typical children or adults, there is a lot of variability in the adult population (Peppé et al., 2000) and among children of different age ranges (Dankovičová et al., 2004; Wells et al., 2004), which means that some caution must be exercised when diagnosing a prosodic impairment on the basis of such test results. Nonetheless, PEPS-C remains one of the most widely used tools for assessing prosodic ability (Loveall et al., 2021), and has been widely translated (e.g., Peppé et al. (2010) for Spanish, Flemish, French and Norwegian; Foley et al. (2011) for Irish English; Filipe et al. (2017) for Portuguese).
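To make the battery's factorial design concrete, the following sketch enumerates the sixteen tasks (four communicative areas crossed with Input/Output and Form/Function) and summarizes a profile along each dimension. It is purely illustrative: this is not the PEPS-C software, and the baseline score of 10 and the two Affect scores (modeled loosely on Jonathan's profile) are hypothetical.

```python
# An illustrative enumeration of the PEPS-C design (not the actual
# PEPS-C software): four communicative areas crossed with Input/Output
# and Form/Function give the sixteen tasks. Scores are hypothetical.
from itertools import product

AREAS = ["Chunking", "Affect", "Interaction", "Focus"]
MODES = ["Input", "Output"]
LEVELS = ["Form", "Function"]

tasks = [f"{a} {m} {l}" for a, m, l in product(AREAS, MODES, LEVELS)]
assert len(tasks) == 16  # 4 areas x 2 modes x 2 levels

# A hypothetical client profile, loosely modeled on Jonathan: an
# arbitrary baseline of 10/16, a flawless Affect Output Function task,
# and Affect Input Function as the highest Input Function score.
scores = {t: 10 for t in tasks}
scores["Affect Output Function"] = 16
scores["Affect Input Function"] = 15

def mean_over(label: str) -> float:
    """Mean score across all tasks whose name contains the label,
    e.g. 'Input', 'Output', 'Form' or 'Function'."""
    relevant = [v for t, v in scores.items() if label in t.split()]
    return sum(relevant) / len(relevant)

for label in MODES + LEVELS:
    print(f"{label}: {mean_over(label):.1f}/16")
```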
40.6 Prosodic Impairments in Developmental Disorders

Prosody in people with autistic spectrum disorders (ASD) is the subject of an early review by McCann and Peppé (2003), updated by Fusaroli et al. (2022). Several group studies of prosody in children with varying levels of ASD show evidence of impairment in both receptive and expressive prosodic ability (McCann et al., 2007; Paul et al., 2005; Peppé et al., 2007, 2011). However, in Dahlgren et al.'s (2018) acoustic and perceptual study of a cohort of Swedish-speaking children with high-functioning autism, listeners failed to identify the speakers with autism, and the only measure that differentiated the groups was words per utterance (with the children with autism using more).

Single case studies of children with severe autism (Local & Wootton, 1995; Tarplee & Barrow, 1999), focusing on immediate and delayed echoes, have illustrated the value of the interactional approach in cases where testing is impossible and linguistic comparison with the adult system would be unrevealing: the child's echoes are prosodically well-formed, and it is their frequency and precise distribution, in relation to the interaction in progress, that can appear anomalous but that, when studied in situ, can show evidence of interactional competence. For example, Sterponi and Shankey (2014) show how "Aaron" used both the placement of and changes to the prosodic pattern of his echoes to display acquiescence with, or divergence from, directives (and other actions) initiated by his interlocutors.
Loveall et al. (2021) present a meta-analysis of prosody in autism, Williams syndrome, and Down syndrome. To ensure comparability of results, use of PEPS-C was one of the criteria for inclusion in the review. All the groups struggled with the tests of prosodic form, but particular strengths (and weaknesses) relating to prosodic function differentiated the groups from one another. The search terms included a range of specific developmental and intellectual disabilities as well as generic terms; however, autism, Williams syndrome and Down syndrome were the only developmental disabilities returned, thus highlighting the need for more research on prosody in developmental disabilities.

Wells and Stackhouse (2015) present an overview of work by Stojanovik and colleagues on the development of intonation in children with the genetic disorders of Williams syndrome and Down syndrome. Children with Williams syndrome have a rate of development of intonation in line with their mental age, but slower than that of chronologically age-matched controls (Stojanovik, 2010), and children with Down syndrome show a similarly delayed intonational development, in line with their overall delay in language development (Stojanovik, 2011). Stojanovik and Setter (2011) compared the performance of nine children each with Williams syndrome and Down syndrome, matched for mental age: the children with Williams syndrome did significantly better on all expressive aspects of prosody, despite the children with Down syndrome having comparable language skills. This points to a difference in prosodic ability linked to a genetic disorder. Wells and Stackhouse (2015) note, however, that children (and adults) with Down syndrome have other physical impairments that are part of the syndrome, such as poor muscle tone and respiratory difficulties, which can impact on the control needed for intonation.
40.6.1 Developmental Language Disorder (DLD) [Previously Specific Language Impairment (SLI)]

As a clinical entity SLI is notoriously difficult to define (Bishop, 1997), and the studies of prosody in SLI use a wide range of inclusion/exclusion criteria, with a wide age range. Complicating matters for an overview of the area is the fact that from around 2017, Developmental Language Disorder (DLD) has become the preferred term (e.g., Bishop et al., 2017). This term captures the language disorder aspect, which is important for validating the experiences of children whose difficulties with language exclude them from full participation in social and educational life, as well as making access to medical interventions available. It is of course not without drawbacks, as some dislike the medicalizing nature of the term "disorder." Additionally, the change from SLI to DLD complicates not only such work as compiling or revising a handbook, but also conducting systematic reviews and meta-analyses. Even before the adoption of the new terminology, the findings on prosodic impairment in SLI/DLD were rather mixed and hard to interpret. Here, for studies published using the now-defunct SLI terminology, we have added the admittedly inelegant "/DLD," but have not added SLI to studies published using the DLD label.

Snow (1998) examined two specific prosodic features associated with sentence-final position: final pitch movement and final lengthening. Ten children with SLI and ten children with normally developing language between the ages of 4;0 and 4;11 were age-matched within 3 months of each other. The children were recorded in play sessions, and specific spontaneous utterances were then measured for mean length of utterance (MLU), duration and F0 contour. Analysis of variance was used to evaluate the mean final-nonfinal differences between language groups, alongside a minimum perceptual criterion known as the "just noticeable difference" (JND). Snow found that both groups used final syllable lengthening to some degree, and all children had control of the final pitch fall.
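Snow's durational comparison is straightforward to compute once syllable boundaries are available. The sketch below is a minimal illustration under assumed annotations: the syllable timings are fabricated placeholders standing in for boundaries exported from, say, a Praat TextGrid, and the perceptual JND criterion is reduced here to a simple ratio.

```python
# A minimal sketch of Snow's final-lengthening measure, assuming
# syllable boundaries are already annotated. The utterances and timings
# below are fabricated placeholders (in practice they might be exported
# from a Praat TextGrid); the JND criterion is reduced to a plain ratio.
import numpy as np

# Each utterance is a list of (syllable, start_s, end_s) tuples.
utterances = [
    [("ba", 0.00, 0.18), ("na", 0.18, 0.34), ("na", 0.34, 0.62)],
    [("the", 0.00, 0.12), ("dog", 0.12, 0.37), ("gy", 0.37, 0.68)],
]

final_durs, nonfinal_durs = [], []
for utt in utterances:
    durations = [end - start for _, start, end in utt]
    final_durs.append(durations[-1])       # utterance-final syllable
    nonfinal_durs.extend(durations[:-1])   # all earlier syllables

# Final lengthening expressed as the ratio of mean final to mean
# non-final syllable duration: values above 1 indicate lengthening.
ratio = np.mean(final_durs) / np.mean(nonfinal_durs)
print(f"final/non-final duration ratio: {ratio:.2f}")
```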
Snow had hypothesized that the phrase-final features of greater tone contour and syllable lengthening might not be found to the same degree in the SLI/DLD group, since their grammatical abilities were lower than those of the normally developing children. In the event, both groups showed similar use of these parameters. This suggests that the prosodic features studied by Snow are not associated directly with syntax, that is, they are not serving as exponents of syntactic boundaries in his data. Snow's results support the view that these children are not impaired in this area of prosodic output, and his subsequent investigation comparing prosodic production in children with SLI/DLD with chronologically age-matched controls revealed no differences between the two groups (Snow, 2001).

Weinert (1996) posits a specific deficit in prosodic processing in children with SLI/DLD, based on a comparison of children with SLI/DLD to younger normal-language children matched for memory span. Weinert and Mueller (1996) tested 11 children with SLI/DLD in a similar way, but this time using three prosodic conditions rather than two. In this case, some of the older SLI/DLD children, who had better language abilities, did improve their sentence reproductions under the exaggerated condition. This suggests that for these children prosodic processing ability may be relatively preserved, and a similar finding for Dutch is reported by Van der Meulen et al. (1997). In their study, children with language impairment performed consistently more poorly on an imitation task than typically developing children matched for chronological age, though the performance of both groups improved significantly with age. The authors point out that the discrepancy between the groups does not necessarily mean that the language-impaired children have a primary deficit in prosodic production. Their poor performance may be the consequence of other speech and/or language difficulties.

Wells and Peppé (2003) studied a group of eighteen eight-year-old children with speech and/or language impairments (LI) using the PEPS-C battery, described above. Scores were compared to those from a chronological-age-matched (CA) group and a group matched individually for grammatical comprehension (LA). The LI group performed below the LA group on just 2/16 tasks. On 7/16 tasks, the LI group did not differ significantly even from the CA group. There were few significant correlations between the prosodic measures and measures of grammatical comprehension, expressive language and articulation. The results support the view that in general intonation is relatively discrete from other levels of speech and language. A basic dichotomy in the PEPS-C procedure is between Form and Function. The generally good scores on Function tasks suggest that for children with speech and language impairment, intonation may be an island of relative strength in their communicative repertoire, enabling them to convey linguistically important areas of meaning without having recourse solely to grammatical and lexical means. Nevertheless, some specific problems were indicated. For the group as a whole, the pattern of results on the Input Form tasks suggested that the children with language impairments may find it difficult to store and process long prosodic strings. This points to an auditory memory deficit that may be responsible for their difficulties with language development.
The other area of difficulty for the group was in using prosody for pragmatic/interactional purposes. However, there was a lot of variation across individuals in the profile of scores on the PEPS-C battery. This points to the importance of individual profiling as a basis for clinical intervention: Robin and Jonathan, described earlier in the chapter, were participants in this study.

Marshall, Harcourt-Brown, Ramus and van der Lely (2009) used the PEPS-C battery with a group of older children (10–14) with SLI/DLD and/or dyslexia, and report that few children with SLI/DLD and dyslexia have difficulty with the form-based tasks (e.g., imitation of a prosodic pattern), but that evidence of impairment surfaces in their performance on more functional tasks (e.g., interpreting sentences where prosody interacts with syntax or pragmatics, as in chunking and focus placement).
Note that these findings for DLD parallel those for ASD, as reported in the meta-analysis conducted by Loveall et al. (2021).

It could be predicted that expressive prosodic difficulties may give rise to pragmatic difficulties in conversation and other forms of spoken interaction, given that the functions of prosody, particularly intonation, include the conveying of interactional and affective meaning. Having found that LI children with morphosyntactic difficulties have normal prosody, Snow (1998) leaves open the possibility that other groups of LI children may have prosodic output deficits, for example, children with pragmatic language impairment. This idea has been around among clinicians for a long time. Nevertheless, there is a lack of studies investigating prosodic processing in children diagnosed as having pragmatic language impairments. Wells and Peppé (2003) included a subgroup of SLI/DLD children with pragmatic language difficulties. Interestingly, their performance on the more pragmatically oriented of the PEPS-C tasks was at least as good as that of children with SLI/DLD who were not thought to have pragmatic difficulties.

Leonard and Kueser (2019) conducted cross-linguistic comparisons (of English to various Romance and Germanic languages) of the strengths and weaknesses of children with DLD, and posit that prosodic structure (e.g., strong-weak syllabic stress patterns) must be considered when evaluating or designing therapy for a child with DLD. This is because the prosodic structure of a language interacts with morphosyntactic and morphophonemic structure. For instance, English-speaking children with DLD tend to omit articles, whereas Swedish-speaking children do not. This may, however, be explained by the English weak-strong syllable construction of article-noun, compared to the Swedish strong-weak noun-article structure. Their results remind us that therapy for children with DLD should consider whether particular targets present prosodic as well as grammatical challenges.

In sum, research suggests that while many children diagnosed with SLI/DLD do not appear to have overt prosodic difficulties in their speech output (at least by the time they are diagnosed), they may still have subtle hidden problems with processing prosody, which might affect their language production, their comprehension, or both.
40.6.2 Speech Sound Disorder

The terminology used to describe speech sound disorders, of which childhood apraxia of speech is considered one subgroup, has undergone major change since the first version of this handbook. As described by Malmenholt et al. (2022, p. 157), speech sound disorder (SSD) is now the preferred umbrella term for childhood speech disorders, and there are three subgroups of SSD, related to problems with articulation, phonology or motor control. Childhood apraxia of speech (CAS) sits within this last subgroup, and is characterized by inconsistent vowel and consonant errors, errors in articulatory transitions, and inappropriate prosody.

In the 1990s, Shriberg et al. (1997) investigated the possibility that there is a particular subgroup of speech-impaired children for whom the diagnostic marker is prosodic: a deficit in lexical and phrasal stress production. The aim was to determine whether there was a single diagnostic criterion that would distinguish children with what would now be called suspected CAS (known then as developmental apraxia of speech) from others with speech delay. Although Shriberg and colleagues argued that inappropriate stress is likely to result from a deficit in the linguistic representation of stress, rather than in motor planning or execution, and even that the stress deficit they discovered is independent of segmental phonological difficulties, thinking about the presentation of, and therapy for, CAS has moved on. It is currently not thought to be possible to identify subgroups of CAS; rather, it is acknowledged that strengths and weaknesses in a child's presentation can depend on many factors, including the language spoken (e.g., Malmenholt et al., 2022). The differential diagnosis of CAS from other subgroups of SSD is important, however, to ensure that children receive the right kind of interventions (see, e.g., Murray et al., 2015).
Inappropriate prosody, particularly to do with stress, remains a defining characteristic of CAS, and therapies focused on motor control have shown promise in improving both prosodic and segmental output (e.g., Ballard, Robin, McCabe, & McDonald, 2010; Miller et al., 2021). As Wells and Peppé (2003) showed, children who had speech difficulties at the segmental level were more likely to have a low score on the PEPS-C Output Form tasks, which involved imitation of a short phrase, including its prosodic pattern. This result suggests a relationship between the ability to pronounce segments accurately and the ability to reproduce prosodic contours. It therefore still seems plausible that difficulty with segmental articulation might disrupt the planning and execution of prosodic structures and systems, but the opposite relation cannot be discounted: problems with prosodic organisation may affect the production of segments (see also Howard & Wells, this volume).
40.7 Prosodic Impairments in Acquired Disorders

The first part of this section provides a brief review of research into prosody in acquired apraxia of speech (AOS). The second part focuses on prosody in individuals with left-hemisphere (LH) and right-hemisphere (RH) brain damage.
40.7.1 Prosody in Acquired Apraxia of Speech (AOS)

Acquired apraxia of speech (AOS) is a motor deficit, typically associated with left-hemisphere (LH) damage, which interferes with the programming and sequencing of movements in the volitional articulation of speech (Varley & Whiteside, 2001). Studies on AOS have reported a wide range of speech characteristics that typify this motor speech disorder. In addition to the groping behaviors commonly observed in speakers with AOS, at the segmental level their speech characteristics include inconsistent and variable articulatory movements; increased word and vowel durations; voicing errors; segmental errors; and reduced coarticulation (see Whiteside & Varley, 1998, for a review). In addition, effortful speech produced in a word-by-word, phrase-by-phrase fashion has been observed, together with a generally slowed speaking rate and prolonged transitions, segments and intersyllabic pauses. These features, together with limited variation in relative peak intensity across syllables, result in the perception of abnormal stress and rhythm patterns, and a general impression of atypical prosody in AOS (Kent & Rosenbek, 1983).

Because speech production in AOS is impaired phonetically at both the segmental and suprasegmental levels, it is likely that suprasegmental characteristics of emotional or affective prosody may also be affected. On this basis, Van Putten and Walker (2003) investigated the ability of one healthy speaker, one speaker with moderate AOS, and another with mild AOS to produce emotional prosody using sentences. The emotions investigated were happy, sad and neutral. Both repetition and reading tasks were employed in the study. Ten sets of phonetically balanced and semantically neutral sentences were produced with a happy, sad or neutral voice. The sentences produced by all subjects were analyzed using a range of acoustic parameters. Results indicated that the speakers with AOS were not able to produce significant differences in F0, duration and amplitude to signal the three different emotions, as a consequence of their groping, intersyllabic pauses and word initiation difficulties. The severity of AOS did not appear to be a factor: both subjects had an impaired capacity to signal emotional prosody. In addition, although naïve listeners were able to identify the emotional intent of the control subject's productions, this was not the case for the AOS samples. These results suggest that, in addition to linguistic prosody, the production of affective prosody is impaired in speakers with AOS.
Several studies show the effect of different metrical structures on segmental accuracy in speakers with AOS. Bailey et al. (2019) replicate, with English-speaking participants, a study of German speakers with AOS (Aichert et al., 2016) and report similar findings: words with strong-weak stress patterns (as opposed to those with weak-strong patterns) are more resistant to segmental errors. This shows that in actual speech production, the separation between the segmental and the so-called suprasegmental levels is not clear-cut.
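The statistical logic of acoustic comparisons such as Van Putten and Walker's can be sketched in a few lines. The values below are fabricated placeholders (a real analysis would extract per-sentence means from the recordings), and the same one-way analysis of variance applies equally to duration and amplitude measures.

```python
# A sketch of testing whether a speaker acoustically differentiates
# three emotions, in the spirit of Van Putten and Walker (2003). The
# per-sentence mean F0 values are fabricated placeholders; a real
# analysis would extract them from the recordings.
import numpy as np
from scipy import stats

# Hypothetical mean F0 (Hz) for ten sentences per emotion condition.
happy = np.array([212, 225, 218, 230, 221, 215, 228, 224, 219, 226])
sad = np.array([178, 172, 181, 169, 175, 180, 171, 177, 174, 176])
neutral = np.array([195, 190, 198, 192, 196, 189, 194, 197, 191, 193])

# One-way ANOVA across conditions: a non-significant result would
# parallel the AOS speakers' failure to differentiate the emotions in
# F0. The same test can be run on duration and amplitude measures.
f_stat, p_value = stats.f_oneway(happy, sad, neutral)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```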
40.7.2 Processing Prosody in Individuals with Left-hemisphere and Right-hemisphere Lesions

Because prosody signals both linguistic and emotional information, it provides a useful basis for exploring the traditional view of left-hemisphere dominance for linguistic processing versus right-hemisphere dominance for the processing of emotion and affect. One view is that individual acoustic cues to prosody are lateralized to different hemispheres, so that frequency-determined acoustic cues such as fundamental frequency (F0) are processed by the right hemisphere (RH), whereas those that have a temporal component are processed by the left hemisphere (LH) (e.g. Van Lancker & Sidtis, 1992). Neuroscientific studies which have investigated the processing of acoustic cues provide some evidence for hemispheric specialization in the perception of prosody (e.g., Zatorre & Belin, 2001), as does behavioral (Brancucci et al., 2005) and imaging evidence (e.g. Liebenthal et al., 2005). However, Weed and Fusaroli's (2020) systematic review found evidence for reduced F0 variation after both right-hemisphere damage (RHD) and left-hemisphere damage (LHD) (see also Perkins Walker et al., 2002), and no meta-analytic evidence for an effect of what they call prosody type, that is, linguistic vs. affective prosody. Weed and Fusaroli's review also points to the need for larger sample sizes and better detail on lesion location in RHD, a point echoed by Durfee et al. (2021), who demonstrate the importance of taking the lesion site into account when planning therapy for aprosodia after RHD. Sheppard et al. (2021) present a study showing how various subcortical structures play different roles in emotional prosody comprehension, and how receptive aprosodia can result from impairments at different processing stages.

Additionally, although a wide range of acoustic cues have been identified in the processing of emotion and affect in healthy individuals (Banse & Scherer, 1996), many studies of prosodic processing have focused only on the primary features of pitch (via fundamental frequency) and duration (e.g. Baum, 1998; Pell, 1998). This deserves further consideration in experimental studies of individuals who display impairments in the processing of affective and linguistic prosody.
40.8 Future Directions

Developmental and acquired disorders of spoken communication make up the bulk of the caseload of speech and language pathologists and are an important focus for research. A reliable picture of typical prosodic systems and their development is therefore very important. However, it is not at all easy to attain. Prosody is resistant to testing, and the measures used throw up large individual differences (cf. Peppé et al., 2000). Hawthorne and Fischer (2020) surveyed 245 speech and language pathologists, who reported that assessment and/or treatment of prosodic disorders lags far behind that of other impairments, due to a perceived lack of understanding of prosody and a lack of knowledge of assessments and treatments. The good news, however, is that Hargrove et al.'s (2009) critical review of interventions targeting prosody showed that, despite the diversity among types of prosodic impairment and communication disorders, treatments were overall effective.
While the phonetic, linguistic, interactional, and psycholinguistic approaches to prosodic impairment have been differentiated here, it will be clear that they are interlocking. Phonetic observation and description of prosodic patterns remain the indispensable foundations from which a linguistic account, relating prosodic form to meaning, can be derived. Psycholinguistic investigations are themselves dependent on linguistic and phonetic expertise, notably in the construction of test stimuli, if they are to produce valid and reliable results. Interactional analysis is an extension of the linguistic approach that takes systematic account of the place of utterances, and their prosodic components, within sequences of talk.

Group studies are important in establishing the prosodic components of particular clinical entities. However, case studies have a key role in the research endeavor, in generating hypotheses to be tested experimentally. Clinically, case studies can provide concrete suggestions as to how to identify and describe the prosodic difficulties of individual clients – an essential prerequisite for intervention.
ACKNOWLEDGMENT

Our thanks to Suzanne Churcher for thoughtful discussion and invaluable advice on some of the topics contained in this chapter.
REFERENCES

Aichert, I., Spaeth, M., & Ziegler, W. (2016). The role of metrical information in apraxia of speech: Perceptual and acoustic analyses of word stress. Neuropsychologia, 82, 171–178.
Anstey, J., & Wells, B. (2013). The uses of overlap: Carer-child interaction involving a nine-year-old boy with auditory neuropathy. Clinical Linguistics & Phonetics, 27(10–11), 746–769. https://doi.org/10.3109/02699206.2013.803602
Bailey, D. J., Bunker, L., Mauszycki, S., & Wambaugh, J. L. (2019). Reliability and stability of the metrical stress effect on segmental production accuracy in persons with apraxia of speech. International Journal of Language & Communication Disorders, 54(6), 902–913.
Ballard, K. J., Robin, D. A., McCabe, P., & McDonald, J. (2010). A treatment for dysprosody in childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 53(5), 1227–1245. https://doi.org/10.1044/1092-4388(2010/09-0130)
Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70(3), 614–636.
Barry, J. G., Blamey, P., Martin, L., Lee, K., Tang, T., Ming, Y. Y., & van Hasselt, C. (2002). Tone discrimination in Cantonese-speaking children using a cochlear implant. Clinical Linguistics and Phonetics, 16(2), 79–99.
Baum, S. R. (1998). The role of fundamental frequency and duration in the perception of linguistic stress by individuals with brain damage. Journal of Speech, Language, and Hearing Research, 41, 31–40.
Beeke, S., Wilkinson, R., & Maxim, J. (2009). Prosody as a compensatory strategy in the conversations of people with agrammatism. Clinical Linguistics and Phonetics, 23(2), 133–155. https://doi.org/10.1080/02699200802602985
Bishop, D. (1997). Uncommon understanding: Development and disorders of language comprehension in children. Psychology Press.
Bishop, D., Snowling, M., Thompson, P., Greenhalgh, T., & the CATALISE-2 consortium. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology and Psychiatry, 58, 1068–1080. https://doi.org/10.1111/jcpp.12721
Brancucci, A., Babiloni, C., Rossini, P. M., & Romani, G. L. (2005). Right hemisphere specialization for intensity discrimination of musical and speech sounds. Neuropsychologia, 43, 1916–1923.
Bunton, K., Kent, R. D., Kent, J. F., & Rosenbek, J. C. (2000). Perceptuo-acoustic assessment of prosodic impairment in dysarthria. Clinical Linguistics and Phonetics, 14, 13–24.
Couper-Kuhlen, E., & Selting, M. (1996). Prosody in conversation: Interactional studies. Cambridge University Press.
Crystal, D. (1992). Profiling linguistic disability (2nd ed.). Whurr.
Dahlgren, S., Sandberg, A. D., Strömbergsson, S., Wenhov, L., Råstam, M., & Nettelbladt, U. (2018). Prosodic traits in speech produced by children with autism spectrum disorders – Perceptual and acoustic measurements. Autism and Developmental Language Impairments, 3. https://doi.org/10.1177/2396941518764527
Dankovičová, J., Pigott, K., Peppé, S., & Wells, B. (2004). Temporal markers of prosodic boundaries in children's speech production. Journal of the International Phonetic Association, 34, 17–36.
Durfee, Z. A., Sheppard, S. M., Meier, E. L., Bunker, L., Cui, E., Crainiceanu, C., & Hillis, A. E. (2021). Explicit training to improve affective prosody recognition in adults with acute right hemisphere stroke. Brain Sciences, 11(5). https://doi.org/10.3390/brainsci11050667
Filipe, M. G., Peppé, S., Frota, S., & Vicente, S. G. (2017). Prosodic development in European Portuguese from childhood to adulthood. Applied Psycholinguistics, 38(5), 1045–1070. https://doi.org/10.1017/S0142716417000030
Foley, M., Gibbon, F. E., & Peppé, S. (2011). Benchmarking typically developing children's prosodic performance on the Irish-English version of the Profiling Elements of Prosody in Speech-Communication (PEPS-C). Journal of Clinical Speech and Language Studies, 20, 19–40.
Fourcin, A. (1986). Electrolaryngographic assessment of vocal fold function. Journal of Phonetics, 14, 435–442.
Fusaroli, R., Grossman, R., Bilenberg, N., Cantio, C., Jepsen, J. R. M., & Weed, E. (2022). Toward a cumulative science of vocal markers of autism: A cross-linguistic meta-analysis-based investigation of acoustic markers in American and Danish autistic children. Autism Research, 15(4), 653–664.
Fusaroli, R., Lambrechts, A., Bang, D., Bowler, D. M., & Gaigg, S. B. (2016). Is voice a marker for autism spectrum disorder? A systematic review and meta-analysis. Autism Research. https://doi.org/10.1002/aur.1678
Haley, K. L., & Jacks, A. (2019). Word-level prosodic measures and the differential diagnosis of apraxia of speech. Clinical Linguistics and Phonetics, 33(5), 479–495. https://doi.org/10.1080/02699206.2018.1550813
Halliday, M. A. K. (1967). Intonation and grammar in British English. Mouton.
Hargrove, P., Anderson, A., & Jones, J. (2009). A critical review of interventions targeting prosody. International Journal of Speech-Language Pathology, 11(4), 298–304. https://doi.org/10.1080/17549500902969477
Hawthorne, K., & Fischer, S. (2020). Speech-language pathologists and prosody: Clinical practices and barriers. Journal of Communication Disorders, 87. https://doi.org/10.1016/j.jcomdis.2020.106024
Heselwood, B., Bray, M., & Crookston, I. (1995). Juncture, rhythm and planning in the speech of an adult with Down's syndrome. Clinical Linguistics & Phonetics, 9(2), 121–137.
Ho, A. K., Iansek, R., & Bradshaw, J. L. (2001). Motor instability in Parkinsonian speech intensity. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 14, 109–116.
Kelly, D., & Beeke, S. (2011). The management of turn taking by a child with high-functioning autism: Re-examining atypical prosody. In V. Stojanovik & J. Setter (Eds.), Speech prosody in atypical populations: Assessment and remediation (pp. 71–98). J & R Press.
Kent, R. D., & Kim, Y. (2003). Toward an acoustic typology of motor speech disorders. Clinical Linguistics & Phonetics, 17(6), 427–445.
Kent, R. D., & Rosenbek, J. C. (1983). Acoustic patterns of apraxia of speech. Journal of Speech and Hearing Research, 26, 231–249.
Leonard, L. B., & Kueser, J. B. (2019). Five overarching factors central to grammatical learning and treatment in children with developmental language disorder. International Journal of Language and Communication Disorders, 54(3), 347–361. https://doi.org/10.1111/1460-6984.12456
Levelt, W. (1989). Speaking: From intention to articulation. MIT Press.
Liebenthal, E., Binder, J. R., Spitzer, S. M., Possing, E. T., & Medler, D. A. (2005). Neural substrates of phoneme perception. Cerebral Cortex, 15, 1621–1631.
Local, J., & Wootton, T. (1995). Interactional and phonetic aspects of immediate echolalia in autism: A case study. Clinical Linguistics & Phonetics, 9, 155–184.
Loveall, S. J., Hawthorne, K., & Gaines, M. (2021). A meta-analysis of prosody in autism, Williams syndrome, and Down syndrome. Journal of Communication Disorders, 89, 106055. https://doi.org/10.1016/j.jcomdis.2020.106055
Lowit-Leuschel, A., & Docherty, G. J. (2001). Prosodic variation across sampling tasks in normal and dysarthric speakers. Logopedics Phoniatrics Vocology, 26, 151–164.
Malmenholt, A., McAllister, A., Lohmander, A., & Östberg, P. (2022). Speech feature profiles in Swedish 5-year-olds with speech sound disorder related to suspected childhood apraxia of speech or cleft palate. International Journal of Speech-Language Pathology, 24(2), 156–167. https://doi.org/10.1080/17549507.2021.1968951
Marshall, C. R., Harcourt-Brown, S., Ramus, F., & van der Lely, H. K. J. (2009). The link between prosody and language skills in children with specific language impairment (SLI) and/or dyslexia. International Journal of Language & Communication Disorders, 44(4), 466–488. https://doi.org/10.1080/13682820802591643
McCann, J., & Peppé, S. (2003). Prosody in autism spectrum disorders: A critical review. International Journal of Language and Communication Disorders, 38, 325–350.
McCann, J., Peppé, S., Gibbon, F. E., O'Hare, A., & Rutherford, M. (2007). Prosody and its relationship to language in school-aged children with high-functioning autism. International Journal of Language & Communication Disorders, 42, 682–702. https://doi.org/10.1080/13682820601170102
Miller, H., Ballard, K., Campbell, J., Smith, M., Plante, A., Aytur, S., & Robin, D. (2021). Improvements in speech of children with apraxia: The efficacy of treatment for establishing motor program organization (TEMPO). Developmental Neurorehabilitation, 24(7), 494–509. https://doi.org/10.1080/17518423.2021.1916113
Murray, E., McCabe, P., Heard, R., & Ballard, K. J. (2015). Differential diagnosis of children with suspected childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 58, 43–60. https://doi.org/10.1044/2014_JSLHR-S-12-0358
Odell, K. H., & Shriberg, L. D. (2001). Prosody-voice characteristics of children and adults with apraxia of speech. Clinical Linguistics and Phonetics, 15, 275–307.
Paul, R., Augustyn, A., Klin, A., & Volkmar, F. R. (2005). Perception and production of prosody by speakers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 35, 205–220.
Pell, M. D. (1998). Recognition of prosody following unilateral brain lesion: Influence of functional and structural attributes of prosodic contours. Neuropsychologia, 36, 701–715.
Peppé, S., Cleland, J., Gibbon, F., O'Hare, A., & Castilla, P. M. (2011). Expressive prosody in children with autism spectrum conditions. Journal of Neurolinguistics, 24(1), 41–53. https://doi.org/10.1016/j.jneuroling.2010.07.005
Peppé, S., Martínez-Castilla, P., Coene, M., Hesling, I., Moen, I., & Gibbon, F. (2010). Assessing prosodic skills in five European languages: Cross-linguistic differences in typical and atypical populations. International Journal of Speech-Language Pathology, 12(1), 1–7. https://doi.org/10.3109/17549500903093731
Peppé, S., Maxim, J., & Wells, B. (2000). Prosodic variation in Southern British English. Language and Speech, 43(3), 309–334.
Peppé, S., McCann, J., Gibbon, F., O'Hare, A., & Rutherford, M. (2007). Receptive and expressive prosody in children with high-functioning autism. Journal of Speech, Language, and Hearing Research, 50(4), 1015–1028.
Perkins Walker, J., Daigle, T., & Buzzard, M. (2002). Hemispheric specialisation in processing prosodic structures: Revisited. Aphasiology, 16, 1155–1172.
Rosen, K. M., Kent, R. D., & Duffy, J. R. (2005). Task-based profile of vocal intensity decline in Parkinson's disease. Folia Phoniatrica et Logopaedica, 57, 28–37.
Samuelsson, C., & Nettelbladt, U. (2004). Prosodic problems in Swedish children with language impairment: Towards a classification of subgroups. International Journal of Language and Communication Disorders, 39(3), 325–344.
Sheppard, S. M., Meier, E. L., Zezinka Durfee, A., Walker, A., Shea, J., & Hillis, A. E. (2021). Characterizing subtypes and neural correlates of receptive aprosodia in acute right hemisphere stroke. Cortex, 141, 36–54. https://doi.org/10.1016/j.cortex.2021.04.003
Shriberg, L., Aram, D., & Kwiatkowski, J. (1997). Developmental apraxia of speech: III. A subtype marked by inappropriate stress. Journal of Speech, Language, and Hearing Research, 40, 313–337.
Snow, D. (1998). Prosodic markers of syntactic boundaries in the speech of four-year-old children with normal and disordered language development. Journal of Speech, Language, and Hearing Research, 41, 1158–1170.
Snow, D. (2001). Imitations of intonation contours by children with normal and disordered language development. Clinical Linguistics and Phonetics, 15(7), 567–584.
Sterponi, L., & Shankey, J. (2014). Rethinking echolalia: Repetition as interactional resource in the communication of a child with autism. Journal of Child Language, 41(2), 275–304. https://doi.org/10.1017/S0305000912000682
Stojanovik, V. (2010). Understanding and production of prosody in children with Williams syndrome: A developmental trajectory approach. Journal of Neurolinguistics, 23(2), 112–126. https://doi.org/10.1016/j.jneuroling.2009.11.001
Stojanovik, V. (2011). Prosodic deficits in children with Down syndrome. Journal of Neurolinguistics, 24(2), 145–155. https://doi.org/10.1016/j.jneuroling.2010.01.004
Stojanovik, V., & Setter, J. (2011). Prosody in two genetic disorders: Williams and Down's syndrome. In V. Stojanovik & J. Setter (Eds.), Speech prosody in atypical populations: Assessment and remediation (pp. 25–43). J & R Press.
Tarplee, C., & Barrow, E. (1999). Delayed echoing as an interactional resource: A case study of a 3-year-old child on the autistic spectrum. Clinical Linguistics & Phonetics, 13(6), 449–482.
Tempest, A., & Wells, B. (2012). Alliances and arguments: A case study of a child with persisting speech difficulties in peer play. Child Language Teaching and Therapy, 28(1), 57–72. https://doi.org/10.1177/0265659011419233
Van der Meulen, S., Janssen, P., & Den Os, E. (1997). Prosodic abilities in children with specific language impairment. Journal of Communication Disorders, 30, 155–170.
Van Lancker, D., & Sidtis, J. (1992). The identification of affective-prosodic stimuli by left- and right-hemisphere-damaged subjects: All errors are not created equal. Journal of Speech and Hearing Research, 35, 963–970.
Van Putten, S. M., & Walker, J. P. (2003). The production of emotional prosody in varying degrees of severity of apraxia of speech. Journal of Communication Disorders, 36, 77–95.
Vance, J. (1994). Prosodic deviation in dysarthria; a case study. European Journal of Disorders of Communication, 29, 61–76. Varley, R. A., & Whiteside, S. P. (2001). What is the underlying impairment in acquired apraxia of speech? Aphasiology, 15, 39–49. Weed, E., & Fusaroli, R. (2020). Acoustic measures of prosody in right-hemisphere damage: A systematic review and metaanalysis. Journal of Speech, Language, and Hearing Research, 63(6), 1762–1775. https://doi. org/10.1044/2020_JSLHR-19-00241 Weinert, S. (1996). Prosodie–Gedaechtnis– Geschwindigkeit: Eine vergleichende Studie zu Sprachverarbeitungsdefiziten dysphasischsprachgestoerter Kinder. Sprache & Kognition, 15(1–2), 46–69. Weinert, S., & Mueller, C. (1996). Erleichtert eine akzentuierte Sprachmelodie die Sprachverarbeitung? Eine Untersuchung zur Verarbeitung rhythmisch-prosodischer Informationen bei dysphasischsprachgestoerten Kindern. Zeitschrift fuer Entwicklungspsychologie und Paedagogische Psychologie, 28(3), 228–256. Wells, B., & Local, J. (1993). The sense of an ending: A case of prosodic delay. Clinical Linguistics & Phonetics, 7(1), 59–73. Wells, B., & Peppé, S. (2001). Intonation within a psycholinguistic framework. In J. Stackhouse & B. Wells (Eds.), Children’s speech and literacy difficulties 2: Identification and intervention. Whurr Publishers. Wells, B., & Peppé, S. (2003). Intonation abilities of children with speech and language impairments. Journal of Speech Language and Hearing Research, 46, 5–20. Wells, B., Peppé, S., & Goulandris, A. (2004). Intonation development from five to thirteen. Journal of Child Language, 31, 749–778. Wells, B., & Stackhouse, J. (2015). Children’s intonation: A framework for practice and research. Wiley Blackwell. Whiteside, S. P., & Varley, R. A. (1998). A reconceptualisation of apraxia of speech. Cortex, 34, 221–231. Zatorre, R. J., & Belin, P. (2001). Spectral and temporal processing in human auditory cortex. Cerebral Cortex, 11, 946–953.
41 Speech Intelligibility
JULIE LISS
41.1 Introduction: What Is Speech Intelligibility?
The construct of speech intelligibility (SI) is central to human communication. Estimates of SI are routinely collected in assessments of speech and hearing as a proxy for degree of communicative impairment. Yet the very definition of SI has been elusive because it is not a unitary construct, and both clinicians and researchers continue to face challenges with its implementation and application. At the heart of the issue is the collective tension we experience regarding the locus of intelligibility deficit. On the one hand, speech signal degradation is the obvious cause of intelligibility deficits – whether because of an impaired speech production or reception mechanism (e.g., dysarthria, hearing impairment), the interference of noise or distortion (e.g., cocktail party, reverberation), or alterations of the speech signal through transmission (e.g., cell phones, hearing aids, cochlear implants). Thus, characterizing the nature and extent of the speech degradation is widely regarded as of primary value for understanding and predicting speech intelligibility. On the other hand, speech intelligibility is ultimately a function of a human brain's processing of that degraded speech signal. This is where it gets complicated, because the nature of the degraded acoustic signal is only one of many variables that influence that speech signal's intelligibility, and probably not even the most important one in the context of real-world communication. This is because, when confronted with a degraded speech signal, the brain engages in active problem-solving to match the available acoustic information to lexical items, using all the vast predictive and contextual information at its disposal to crack the code (Davis & Johnsrude, 2007). Indeed, the neural systems involved in perceiving even highly intelligible speech are tuned to the amount of acoustic richness in the speech signal, presumably to facilitate the most effective and efficient processing (Lee et al., 2016). Moreover, recent functional neuroimaging and EEG research suggests that the brain actively shapes what is heard. For example, Kösem et al. (2018) showed magneto-encephalographic (MEG) evidence that entrainment of the brain's endogenous rhythms with speech dynamics serves as a timing predictor that impacts how upcoming words are perceived. In a review of the literature on the time-course of speech perception, Getz and Toscano (2020) concluded that speech perception is organized around two central principles: (a) sensitivity to gradient acoustic differences in the signal, and (b) influence from higher-level linguistic representations on early speech perception. These principles point toward a model of speech perception in which listeners track fine-grained information available in the bottom-up signal, but do so in a contextually-sensitive manner in which the earliest stages of perceptual processing are influenced by top-down information as well. (p. 7, italics added)
Based on this growing body of work, pure bottom-up processing of speech is likely only a theoretical construct that cannot exist when a real human brain is involved. Nonetheless, it is impractical (at least at this time in history) for clinical or industry applications to view SI as an emergent property of a unique brain processing a degraded speech signal. Instead, reliable and valid assessments and indices are needed for documenting disease progression or improvement with intervention; for establishing industry standards and conducting performance comparisons for hearing aids and cochlear implants; and for clinical trials evaluating treatment efficacy for speech disorders. In this chapter, we will explore the ways in which scientists have attempted, and are attempting, to make the challenge of SI assessment more tractable by restricting the influence of the brain on speech intelligibility measures. [For precision, we will keep our focus on SI assessments in adults rather than children, for whom developmental considerations are necessary (interested readers are encouraged to see Hustad et al., 2020).] We will consider adults who suffer from speech disorders that degrade SI (such as dysarthria) as well as adults with hearing loss whose speech reception/perception challenges reduce SI (hearing aid and cochlear implant wearers).
41.2 Signal-Complementary "Confounds"
Weismer (2008) provided a tentative definition of SI in the previous edition of this book: "Speech intelligibility is a relative measure of the degree to which a speaker's speech signal is understood, the relativity depending at a minimum on the identities of speaker and listener, what is spoken and where it is spoken" (p. 568). This definition captures the fact that measures of SI are influenced by a wide array of variables, wherein the quality of the acoustic signal (i.e., signal-dependent information, Lindblom, 1987) is but one. Indeed, the literature is rife with publications that demonstrate the impact of signal-independent or signal-complementary information on speech perception, and on measures of speech intelligibility more specifically. Such signal-independent information includes, but is not limited to: the availability of information (e.g., visualization of face/lip, gestures) concurrent with the acoustic speech signal (e.g., Keintz et al., 2007); the quality of the listening environment (e.g., Ahrens et al., 2019); the listener's familiarity with the speaker (e.g., Liss et al., 2002; Liss, 2007); the listener's knowledge of the topic being discussed (e.g., Utianski et al., 2011); the availability of semantic, syntactic, and pragmatic context (e.g., Davis et al., 2005); the ability of the listener to perceptually learn or adapt to a degraded signal (e.g., Lansford et al., 2019); and, in the case of hearing loss, the listener's audiometric profile and cognitive abilities (e.g., Kates et al., 2013). Thus, any SI measure collected under one set of circumstances (for example, a face-to-face estimate of conversational SI by a clinician who is providing speech therapy to a patient with dysarthria) will likely differ from one collected under another set of conditions (for example, that same patient performing a single-word intelligibility test scored by a clinician who is unfamiliar with that patient). This holds true for all but extremely high or low SI, where ceiling or floor effects will drive common scores across diverse circumstances. The relative nature of SI measures creates a challenge for clinicians and researchers, wherein signal-complementary information is regarded as a confound to be controlled and reduced when trying to obtain valid and reliable SI measures.
41.3 Making the Speech Intelligibility Challenge Tractable
To bring clarity and uniformity to research and clinical practice in communication disorders, a consensus study was conducted in Europe to arrive at a comprehensive definition of SI and to distinguish it from its neighbor construct, speech comprehensibility (Pommée et al., 2022). The study recruited forty international speech experts, including speech clinicians, researchers,
and academics, to participate in a three-round modified Delphi consensus study. The experts were provided with five contemporary definitions from the literature of each term, intelligibility and comprehensibility. In the first round, experts provided open-ended definitions of the terms. In round 2, the synthesized responses from round 1, along with 22 new statements extracted from round 1 responses, were given to the experts for a binary evaluation (agree/disagree) and comments. Round 3 resulted in consensus, defined as at least 75% agreement among the members of the panel. The experts reached agreement that "Intelligibility refers to the reconstruction of an utterance at the acoustic-phonetic level, intelligibility-related information is thus carried by the acoustic signal (i.e., intelligibility focuses on signal-dependent information)" (p. 31). They also agreed that Comprehensibility refers to the reconstruction of a message at the semantic-discursive level, subsequent to the acoustic-phonetic reconstruction. Therefore, intelligibility is a component of comprehensibility. In addition to the acoustic-phonetic decoding, it also includes signal-independent, contextual elements such as the linguistic or the non-verbal context. However, one can be comprehensible without all low-level units necessarily being accurately decoded; therefore, while intelligibility affects comprehensibility, the latter is, however, not fully dependent on it (p. 31)
Thus, comprehensibility is the more functionally relevant of the two constructs, as it involves the successful reception of a message. As such, the study concluded that it should be measured by meaning-related metrics that tap the success of top-down cognitive processes in the face of a degraded speech signal. The recommendation for SI was that it should be assessed using materials that do not convey context or allow other signal-independent factors to influence the acoustic-phonetic decoding. Estimating SI by way of listener performance on identifying phonemes, syllables, pseudo-words, unpredictable sentences, and minimal pairs was regarded as optimal. Further, clinicians who administer and score the tests should be unfamiliar with the patient. The report goes on to suggest that assessment of intelligibility may further reduce subjectivity by "cast(ing) off the listener dimension" (italics added, p. 34) and relying instead on objective instrumental assessment of the acoustic signal.
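The consensus criterion in a Delphi study of this kind reduces to a simple proportion threshold applied to each statement in the final round. As a minimal Python illustration only (the statement labels and vote tallies below are invented, not the Pommée et al. data), the round-3 rule could be computed as follows:

# Minimal sketch of a Delphi-style consensus check: a statement reaches
# consensus when at least 75% of panelists agree. Statement labels and
# votes are invented for illustration, not the Pommée et al. results.
THRESHOLD = 0.75

votes = {  # statement -> binary agree (1) / disagree (0) votes from a 40-member panel
    "intelligibility is carried by the acoustic signal": [1] * 34 + [0] * 6,   # 85% agree
    "comprehensibility subsumes intelligibility": [1] * 28 + [0] * 12,         # 70% agree
}

for statement, panel in votes.items():
    agreement = sum(panel) / len(panel)
    verdict = "consensus" if agreement >= THRESHOLD else "no consensus"
    print(f"{statement}: {agreement:.0%} agreement -> {verdict}")

The point of the sketch is simply that "consensus" here is an explicit, reproducible decision rule rather than an impressionistic judgment.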
41.3.1 Objective Instrumental Assessment of the Acoustic Signal
The use of objective instrumental measures of SI dates to the early days of telephone technology (Amlani et al., 2002; see also Weismer, 2008). French and Steinberg (1947) introduced the articulation index, which was developed as an objective measure of SI that focused on the quality of the acoustic signal in the frequency bands that matter most for speech perception. This index was informed by articulation theory, which specified the frequency bands that carry the most information about speech sounds (Fletcher & Galt, 1950). This use of articulation theory obviated the need to include an actual human listener in the development of the index, thereby eliminating signal-independent influence. This makes good sense for the telecommunication use-case because the impact of the signal processing (e.g., compression, noise reduction) on healthy speech is quantifiable and manipulable. An articulation index score of "0" denotes totally unintelligible speech and a score of "1" completely intelligible speech, which is suitable for characterizing the intelligibility of transmitted speech signals for the general population of hearers. Subsequent iterations and enhancements of the articulation index have advanced the field, but the basic concept remains in use. Indices developed for SI in telecommunication were adopted by the field of audiology and hearing science as objective benchmarks for specifications of industry standards, to predict performance in hearing loss and adverse listening environments, to measure improvement
enabled by hearing aids and cochlear implants, and to compare apples to apples across listening settings and devices. As with telecommunication applications, such indices are relatively straightforward, as they are generated using high-fidelity speech signals of single words or standard utterances produced by a small number of healthy speakers that are then degraded in some systematic way by noise or by (simulated) transmission through a hearing aid or cochlear implant. Comparison of the clean signal to the transmitted signal, weighted preferentially on important speech frequencies, is a reasonable way to estimate SI for the general population. Ellis and Souza (2022) recently described a method by which an existing index could be tailored to a specific individual device user, while still avoiding the influence of signal-independent variables. This involved development of an enhancement to the Spectral Correlation Index (SCI) for improving intelligibility of hearing-aided speech. The original SCI has been a useful index for comparing the amplitude modulation of a baseline speech signal (collected, unprocessed, from one or two male and female healthy speakers in a sound booth) to that of speech signals that have been processed through hearing aid compression simulators (Gallun & Souza, 2008). As with the earlier articulation index, SCI values near "1" indicate that the processed signals are highly similar to the unprocessed versions and are therefore highly intelligible. Conversely, processed signals that are entirely dissimilar to the unprocessed speech yield SCI values near "0" and are unintelligible. The enhancement offered by Ellis and Souza was to use a hearing-impaired listener's audiogram to differentially de-weight modulations in carrier bands that are less audible to that listener, thereby improving intelligibility of the processed signal for that listener. This approach effectively maintains the signal-dependent focus without introducing listener confounds. While intelligibility in real-world situations will certainly involve signal-complementary factors, the enhanced SCI still provides clinical value for optimizing signal processing based on audibility. While advances in signal processing stand to improve hearing outcomes further, caution must be taken, as signal processing itself can introduce its own unintended listener-dependent response. Souza et al. (2019) examined how signal processing techniques (e.g., dynamic range compression, noise suppression) in hearing aids differentially impacted listener performance based on age, hearing loss, and working memory. They compared two distinct sets of settings, one with only mild signal processing that would modify the incoming signal minimally, and one with strong signal processing and large modifications to the incoming signal. Participants were tested for speech recognition in noise after wearing hearing aids with one of the two settings for five weeks, allowing time for acclimation to the settings. They found that participants with lower working memory scores performed worse on the speech-in-noise task in both the mild and strong setting conditions. Those with higher working memory performed better only in the mild setting condition. They hypothesized that the strong signal processing condition may have introduced distortions that outweighed any audibility benefits the processing provided.
This study highlights the need for consideration of human factors and potentially complex interactions among those factors and speech signal processing strategies. While the field of audiology requires SI indices that are largely signal-dependent for the reasons cited earlier (e.g., comparing signal processing settings and devices), this preference may have come at a price. Outcomes for individual hearing aid and cochlear implant users are highly variable, a fact that is widely recognized but poorly understood. Wang and colleagues (2021) reported virtually no relationship between gains in audibility and word recognition in the hearing-aided compared to unaided condition and patient-reported outcomes. That is, the amount of hearing benefit as measured in the sound booth did not predict the hearing aid wearer's experience or satisfaction. This likely contributes to poor persistence in hearing aid use. Understanding what the hearing-impaired listener's brain is
bringing to the table is vital to address the significant variability in outcomes with hearing aids and cochlear implants. Future work may seek to incorporate additional listener-centric variables (e.g., working memory, auditory attention) into existing SI indices for hearing impairment.
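To make the logic of these band-weighted, signal-dependent indices concrete before turning to production disorders, the Python sketch below computes a toy score in the spirit of the SCI: per-band envelope correlations between a clean and a processed signal, combined with weights that can be reduced where a listener's audiogram indicates poor audibility, as in the Ellis and Souza (2022) enhancement. The band edges, weights, filter design, and placeholder signals are all illustrative assumptions of ours, not the published index.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_envelope(x, fs, lo, hi):
    # Amplitude envelope of x within [lo, hi] Hz:
    # 4th-order Butterworth band-pass, then Hilbert magnitude.
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    return np.abs(hilbert(filtfilt(b, a, x)))

def toy_envelope_correlation_index(clean, processed, fs, bands, weights):
    # Weighted mean of per-band envelope correlations: near 1 when the
    # processed signal preserves the clean signal's modulations, near 0
    # when it does not.
    scores = []
    for lo, hi in bands:
        r = np.corrcoef(band_envelope(clean, fs, lo, hi),
                        band_envelope(processed, fs, lo, hi))[0, 1]
        scores.append(max(r, 0.0))  # treat negative correlations as 0
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w / w.sum(), scores))

# Illustrative use with placeholder signals (one second of noise standing in
# for speech) and an invented audiogram-style de-weighting of the top band.
rng = np.random.default_rng(0)
fs = 16000
clean = rng.standard_normal(fs)
processed = clean + 0.5 * rng.standard_normal(fs)  # stand-in for aided output
bands = [(300, 800), (800, 2000), (2000, 4000)]    # illustrative band edges
weights = [1.0, 1.0, 0.3]  # de-weight a band the listener hears poorly
print(toy_envelope_correlation_index(clean, processed, fs, bands, weights))

The point of the sketch is the architecture – band decomposition, envelope comparison, audibility weighting – rather than any particular parameter values, and it shows why no listener need be present when such an index is computed.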
41.3.2 Speech Intelligibility Measures for Speech Production Disorders
In the case of speech degradations caused by neurological conditions that result in dysarthria, objectively quantifying speech intelligibility is not as straightforward as for telecommunication and audiological applications. Dysarthria degrades the speech signal in both predictable and unpredictable ways, with substantial variability across timescales and frequency bands. While many spectral and temporal features of dysarthric speech can be objectively measured, such data have yet to yield robust clinical tools for SI estimation (see Pommée et al., 2021). It is unlikely that an articulation index approach for speech production could be successful because of the need for it to generalize across dysarthria types and severities. Theoretically, advances in speech signal processing, artificial intelligence and machine learning, and automatic speech recognition (ASR) should enable the development of objective intelligibility measures of dysarthric speech. However, there are significant barriers that have prohibited success to date. The most consequential is the lack of large, labeled speech datasets from clinical populations (i.e., speech samples that have associated transcripts or clinical ratings). A large dataset that is representative of the population of interest is necessary to develop robust machine-learning models. The problem is made worse by advanced speech signal processing techniques that allow us to extract a large number of features from the speech signal that may or may not relate to the speaker's clinical condition. When machine-learning algorithms are trained with many features extracted from small clinical datasets, the result is an overfitted model: one that will not generalize to new datasets and is therefore clinically worthless (see Berisha et al., 2021, for a discussion of this problem). Similarly, ASR has long been considered a candidate for objective SI assessment in dysarthria. The rationale is that the ASR will succeed in correctly recognizing intelligible words, but will make errors when the acoustic signal is degraded by dysarthria. To date, no meaningfully successful studies using ASR to estimate SI in dysarthria have been reported (for example, see Gutz et al., 2022). An alternative to objective assessment is to develop clinical tools that primarily capture a listener's acoustic-phonetic decoding for SI assessment with the least amount of signal-independent influence possible. The use of nonwords and pseudo-words as the stimuli for these assessments offers several advantages over real words, provided they are carefully created. An example of a carefully curated corpus is offered by Lalain and colleagues (2020), who developed an intelligibility test specifically for individuals with head and neck cancer. They designed and developed a French pseudo-word directory with 90,000 entries that conform to the phonotactic constraints of the French language, wherein nearly all phonemes are represented in all possible positions and contexts of the nonwords. The directory enables the generation of balanced lists of pseudowords based on the constraint criteria of interest and allows for continuous development and refinement by those who use it. From this directory, Lalain and colleagues constructed balanced lists of 52 pseudowords by using a down-selection algorithm that imposed several phonological constraints.
Orthographic transcriptions collected from naïve, untrained listeners can be scored at a gross level of analysis, such as the percent of pseudowords correctly transcribed. However, the structure allows for a more granular level of analysis by comparing the phonemes in the target with the corresponding transcribed phonemes. Using phonological theory, between-forms distances in terms of the number of distinctive features can be calculated to generate cost matrices, wherein the more deviations in
distinctive features between the target phoneme and transcribed phoneme (Perceived Phonological Deviation, PDD), the poorer the acoustic-phonetic decoding. Because the targets are nonwords, the signal-independent influence of lexical effects is minimized, and all that remains to bias perception are the phonotactic constraints and statistical regularities of the language. To move the PDD towards a clinical tool, this group conducted three experiments to determine how many of the 52 pseudowords must be sampled to achieve similar intelligibility results as the full set (Marczyk et al., 2021). This was motivated by the need for clinicians to have access to quick and easy intelligibility assessments. By enforcing criteria related to the phonotactic complexity of the pseudoword lists, they were able to reduce the original sample size to 32 pseudowords, which elicited performance that correlated highly with the original dataset and with a separate speech impairment index. An SI test of this size is expected to promote its use in clinical practice. While pseudowords are particularly germane to head and neck cancer, where articulation is likely to be the major contributor to reduced intelligibility, the dysarthrias require consideration of contributions from disturbances in articulation, resonance, voice, and prosody. This necessitates evaluation of SI in connected speech; however, strings of pseudowords would likely be read as a list rather than with phrase- or sentence-level prosody. An alternative is to use real words but control maximally for the impact of signal-complementary variables on SI measures. Lehner and Ziegler (2021) reported on such an approach in a large study designed to evaluate lexical and articulatory factors in the automatic generation of sentences for SI in dysarthria. Using their KommPaS clinical tool, they quasi-randomly selected 2700 German words from a large corpus of well-characterized words produced by 100 individuals with various forms and severities of dysarthria. These words were embedded in context-free carrier sentences and transcribed via crowdsourcing by more than 200 listener participants. Importantly, target accuracy scores were calculated as the average of multiple transcribers to mitigate effects of inter-rater variability and individual transcriber bias. The primary motivation of this study design was to maximally control for signal-complementary variables introduced by the listener (i.e., familiarity and performance variability of individual listeners) and by the stimulus materials (i.e., lexical frequency, phonological neighborhood structure, articulatory complexity, lexical familiarity, word class, stimulus length, and embedding position of the target word in the sentence). To evaluate the impact of articulatory and lexical factors on SI, they conducted classification and regression analyses and built generalized linear mixed models. Their rigorous analyses resulted in expected findings based on prior research, namely that higher intelligibility scores were obtained for target words with higher word frequency, sparser phonological neighborhoods, and higher lexical familiarity, across all severity levels. Interestingly, higher SI was also found for target words with higher articulatory complexity. This finding is somewhat unexpected because producing such words will be more challenging for individuals with dysarthria.
From a perceptual standpoint, however, this makes sense: whatever evidence of articulatory complexity listeners gleaned from the acoustic signal benefitted their word recognition. In addition, target words embedded in sentence-initial positions were of lower SI than those in medial or final position, as would be expected if the carrier sentence helps listeners tune their listening to that speaker's specific speech characteristics (Choi & Perrachione, 2019). Neither the number of syllables in the target word nor the word class had an effect on SI. These findings inform the parameters for automatically generating balanced stimuli sets for an SI assessment. Finally, our work has attempted to move SI assessments of dysarthria forward by accounting for some signal-complementary influence, namely by modeling how an average listener would transcribe a given phrase based on its speech acoustics. Like Lehner and Ziegler (2021), we start with the assumption that the listener's brain is the final arbiter of speech intelligibility, and that "the wisdom of the crowd" can be leveraged to obtain stable and valid estimates of SI that do not reflect any single transcriber's biases. Further, we
assume that the perceptual errors made when transcribing a dysarthric speech signal are a function of the nature of the speech degradation and the perceptual processes the listener brings to bear on that signal. Finally, we assume that these relationships between degradation patterns and perceptual errors are nonrandom and learnable by artificial intelligence. The goal of this work is to create an automated clinical tool that returns measures of SI at multiple levels of analysis (phoneme, word, lexical segmentation) and thereby provides actionable insights for treatment targets. Our stimulus set consists of a carefully curated bank of six-syllable phrases that alternate in syllable strength (strong-weak or weak-strong), allowing us both to sample across the phonetic inventory of English and to balance trochaic and iambic meters. This enables the evaluation of not only the words and phonemes of the transcription but also the location and type of lexical segmentation errors. This is an important consideration, as acoustic cues to lexical segmentation are frequently degraded in dysarthric speech, and this makes a unique contribution to the intelligibility deficit (Liss, 2007). To reduce the impact of signal-independent semantic cues, the phrases are of low inter-word predictability. Crowdsource platforms are used to collect transcripts, wherein listeners transcribe unique phrases produced by different speakers with dysarthria to avoid familiarization with the speaker or the stimulus phrases. We have developed a coding strategy that allows us to automatically score listener transcripts to document the number and types of errors at the levels of the phoneme, word, and lexical segmentation (for details, see Jiao et al., 2019). We refer to this output as the Multidimensional Intelligibility Profile (MIP), as it represents global and granular characterization of the listener's errors and accuracy. We pair the error data from the transcripts with the acoustic signal of the associated phrase to train machine-learning models to learn the relationship between the acoustics and the perceptual error patterns. The model, once fully developed and validated, will be able to produce a set of MIP data from the speech of people it has never seen before. The model will essentially predict how an individual's speech will be perceived by an average listener. The MIP data will identify targets for intervention based on error profiles, and the model can be used to objectively document SI improvements when errors are reduced in post-treatment speech samples.
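The phoneme-level scoring step shared by PDD- and MIP-style approaches can be illustrated with a minimal Python sketch: compare a target phoneme string against a listener's transcription and count mismatching distinctive features at each position. The tiny feature table, the position-by-position alignment, and the example strings are illustrative simplifications of ours, not the published systems; a real tool would use a full feature set and an edit-distance alignment with feature-based substitution costs, as in the cost matrices described above.

# Toy binary feature table; the inventory and feature values are
# illustrative, not a published feature system.
FEATURES = {
    "p": {"labial": 1, "continuant": 0, "voice": 0},
    "b": {"labial": 1, "continuant": 0, "voice": 1},
    "t": {"labial": 0, "continuant": 0, "voice": 0},
    "d": {"labial": 0, "continuant": 0, "voice": 1},
    "s": {"labial": 0, "continuant": 1, "voice": 0},
    "z": {"labial": 0, "continuant": 1, "voice": 1},
    "a": {"labial": 0, "continuant": 1, "voice": 1},  # vowels grossly simplified
}

def feature_distance(target, perceived):
    # Number of binary features on which the two phonemes disagree.
    ft, fp = FEATURES[target], FEATURES[perceived]
    return sum(ft[k] != fp[k] for k in ft)

def phrase_deviation(target_phones, transcribed_phones):
    # Summed feature mismatch over position-aligned strings: a crude
    # stand-in for a PDD-style deviation score. Assumes equal lengths;
    # a real tool would align target and response by edit distance,
    # using these distances as substitution costs.
    return sum(feature_distance(t, p)
               for t, p in zip(target_phones, transcribed_phones))

# Target /pasa/ heard as /basa/: one voicing error -> deviation 1.
print(phrase_deviation("pasa", "basa"))  # 1
# Target /pasa/ heard as /bata/: a voicing error plus a stop for a fricative -> 2.
print(phrase_deviation("pasa", "bata"))  # 2

A deviation of 0 thus corresponds to perfect acoustic-phonetic decoding, and larger totals index progressively poorer decoding, independently of whether the response happens to form a real word.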
41.4 Conclusion
There is a tension between the utility of signal-dependent measures of speech intelligibility and the reality that signal-complementary variables exert an outsized influence on these measures in the real world. Technological advances in signal processing, artificial intelligence, and ASR will likely be used to cast off the listener dimension more efficiently and effectively in the future. However, a more valuable goal is to use such advances to calculate SI metrics that include ever more signal-complementary variables. This must be guided by our increasing understanding of how the brain actively shapes perception and of the human factors that drive an individual listener's percepts. Ultimately, modeling SI as an emergent property of a unique brain processing a degraded speech signal will allow for the development of precision solutions for those with communication impairments secondary to reduced SI.
REFERENCES
Ahrens, A., Marschall, M., & Dau, T. (2019). Measuring and modeling speech intelligibility in real and loudspeaker-based virtual sound environments. Hearing Research, 377(3), 307–317. https://doi.org/10.1016/j.heares.2019.02.003
Amlani, A. M., Punch, J. L., & Ching, T. Y. (2002). Methods and applications of the audibility index in hearing aid selection and fitting. Trends in Amplification, 6(3), 81–129. https://doi.org/10.1177/108471380200600302
Berisha, V., Krantsevich, C., Hahn, P. R., Hahn, S., Dasarathy, G., Turaga, P., & Liss, J. (2021). Digital medicine and the curse of dimensionality. NPJ Digital Medicine, 4(1), 153. https://doi.org/10.1038/s41746-021-00521-5
Choi, J. Y., & Perrachione, T. K. (2019). Time and information in perceptual adaptation to speech. Cognition, 192, 103982. https://doi.org/10.1016/j.cognition.2019.05.019
Davis, M. H., & Johnsrude, I. S. (2007). Hearing speech sounds: Top-down influences on the interface between audition and speech perception. Hearing Research, 229(1–2), 132–147.
Davis, M. H., Johnsrude, I. S., Hervais-Adelman, A., Taylor, K., & McGettigan, C. (2005). Lexical information drives perceptual learning of distorted speech: Evidence from the comprehension of noise-vocoded sentences. Journal of Experimental Psychology: General, 134(2), 222–241.
Ellis, G. M., & Souza, P. (2022). Updating the spectral correlation index: Integrating audibility and band importance using speech intelligibility index weights. Journal of Speech, Language, and Hearing Research, 65(7), 2720–2727. https://doi.org/10.1044/2022_JSLHR-21-00448
Fletcher, H., & Galt, R. H. (1950). The perception of speech and its relation to telephony. Journal of the Acoustical Society of America, 22, 89–151.
French, N. R., & Steinberg, J. C. (1947). Factors governing the intelligibility of speech sounds. Journal of the Acoustical Society of America, 19, 90–119.
Gallun, F., & Souza, P. (2008). Exploring the role of the modulation spectrum in phoneme recognition. Ear & Hearing, 29(5), 800–813.
Getz, L. M., & Toscano, J. C. (2020). The time-course of speech perception revealed by temporally-sensitive neural measures. Wiley Interdisciplinary Reviews: Cognitive Science, 12(2), e1541. https://doi.org/10.1002/wcs.1541
Gutz, S., Stipancic, K. L., Yunusova, Y., Berry, J. D., & Green, J. R. (2022). Validity of off-the-shelf automatic speech recognition for assessing speech intelligibility and speech severity in speakers with amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 65(6), 2128–2143. https://doi.org/10.1044/2022_JSLHR-21-00589
Hustad, K., Mahr, T. J., Broman, A. T., & Rathouz, P. J. (2020). Longitudinal growth in single-word intelligibility among children with cerebral palsy from 24 to 96 months of age: Effects of speech-language profile group membership on outcomes. Journal of Speech, Language, and Hearing Research, 63(1), 32–48. https://doi.org/10.1044/2019_JSLHR-19-00033
Jiao, Y., LaCross, A., Berisha, V., & Liss, J. (2019). Objective intelligibility assessment by automated segmental and suprasegmental listening error analysis. Journal of Speech, Language, and Hearing Research, 62(9), 3359–3366. https://doi.org/10.1044/2019_JSLHR-S-19-0119
Kates, J., Arehart, K. H., & Souza, P. E. (2013). Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness. The Journal of the Acoustical Society of America, 134(6), 4458–4469. https://doi.org/10.1121/1.4824700
Keintz, C. K., Bunton, K., & Hoit, J. D. (2007). Influence of visual information on the intelligibility of dysarthric speech. American Journal of Speech-Language Pathology, 16(3), 222–234. https://doi.org/10.1044/1058-0360(2007/027)
Kösem, A., Bosker, H. R., Takashima, A., Meyer, A., Jensen, O., & Hagoort, P. (2018). Neural entrainment determines the words we hear. Current Biology, 28(18), 2867–2875.e3. https://doi.org/10.1016/j.cub.2018.07.023
Lalain, M., Ghio, A., Giusti, L., Robert, D., Fredouille, C., & Woisard, V. (2020). Design and development of a speech intelligibility test based on pseudowords in French: Why and how? Journal of Speech, Language, and Hearing Research, 63(7), 2070–2083. https://doi.org/10.1044/2020_JSLHR-19-00088
Lansford, K., Borrie, S. A., & Barrett, T. S. (2019). Regularity matters: Unpredictable speech degradation inhibits adaptation to dysarthric speech. Journal of Speech, Language, and Hearing Research, 62(12), 4282–4290. https://doi.org/10.1044/2019_JSLHR-19-00055
Lee, Y.-S., Min, N. E., Wingfield, A., Grossman, M., & Peelle, J. E. (2016). Acoustic richness modulates the neural networks supporting intelligible speech processing. Hearing Research, 333, 108–117. https://doi.org/10.1016/j.heares.2015.12.008
Lehner, K., & Ziegler, W. (2021). The impact of lexical and articulatory factors in the automatic selection of test materials for a web-based assessment of intelligibility in dysarthria. Journal of Speech, Language, and Hearing Research, 64(6), 2196–2212. https://doi.org/10.1044/2020_JSLHR-20-00267
Lindblom, B. (1987). Explaining phonetic variation: A sketch of the H&H theory. In Speech production and speech modelling (pp. 403–439). Springer. https://doi.org/10.1007/978-94-009-2037-8_16
Liss, J. M. (2007). Perception of dysarthric speech. In G. Weismer (Ed.), Motor speech disorders: Essays for Ray Kent (pp. 187–219). Plural Publishing.
Liss, J. M., Spitzer, S. M., Caviness, J. N., & Adler, C. (2002). The effects of familiarization on intelligibility and lexical segmentation in hypokinetic and ataxic dysarthria. The Journal of the Acoustical Society of America, 112(6), 3022–3030.
Marczyk, A., Ghio, A., Lalain, M., Rebourg, M., Fredouille, C., & Woisard, V. (2021). Optimizing linguistic materials for feature-based intelligibility assessment in speech impairments. Behavior Research Methods, 54(1), 42–53. https://doi.org/10.3758/s13428-021-01610-9
Pommée, T., Balaguer, M., Mauclair, J., Pinquier, J., & Woisard, V. (2022). Intelligibility and comprehensibility: A Delphi consensus study. International Journal of Language & Communication Disorders, 57(1), 21–41. https://doi.org/10.1111/1460-6984.12672
Pommée, T., Balaguer, M., Pinquier, J., Mauclair, J., Woisard, V., & Speyer, R. (2021). Relationship between phoneme-level spectral acoustics and speech intelligibility in healthy speech: A systematic review. Speech, Language and Hearing, 24(2), 105–132. https://doi.org/10.1080/2050571x.2021.1913300
Souza, P., Arehart, K., Schoof, T., Anderson, M., Strori, D., & Balmert, L. (2019). Understanding variability in individual response to hearing aid signal processing in wearable hearing aids. Ear and Hearing, 40(6), 1280–1292. https://doi.org/10.1097/AUD.0000000000000717
Utianski, R., Lansford, K. L., Liss, J. M., & Azuma, T. (2011). The effects of topic knowledge on intelligibility and lexical segmentation in hypokinetic and ataxic dysarthria. Journal of Medical Speech-Language Pathology, 19(4), 25–36.
Wang, X., Zheng, Y., Li, G., Lu, J., & Yin, Y. (2021). Objective and subjective outcomes in patients with hearing aids: A cross-sectional, comparative, associational study. Audiology & Neurotology, 27(2), 166–174. https://doi.org/10.1159/000516623
Weismer, G. (2008). Speech intelligibility. In M. J. Ball, M. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 568–582). Blackwell Publishers.
42 Sociophonetics and Clinical Linguistics
GERARD DOCHERTY AND GHADA KHATTAB
42.1 Introduction
The term sociophonetics refers to the study of those aspects of phonetic realization that vary as a function of a range of social factors, such as age, gender, ethnicity, class, style, individual identity, etc. In recent decades there has been a sharply growing awareness that developing our understanding of how speakers are influenced by extra-linguistic factors applying in particular communicative situations is fundamental to building models of speech production, and this interface between the perspectives and paradigms conventionally adopted by sociolinguistic and phonetic research has come to be seen as the domain of sociophonetics (Damico & Ball, this volume; Docherty, 2022; Drager & Kettig, 2021; Foulkes & Docherty, 2006; Foulkes et al., 2010; Kendall & Fridland, 2021; Thomas, 2011). While the bulk of sociophonetic investigation focuses on variation in phonetic realization, it is also insightful to consider how the social-indexical information conveyed within the speech signal is accessed and interpreted by listeners, and the scope of sociophonetics now extends uncontroversially to include issues relating to speech processing and perception (e.g. Bent & Pisoni, this volume; Clopper, 2004; Drager, 2010a; Foulkes, 2005; Hay et al., 2006). Likewise, since early perception shapes the child's phonological acquisition and representations, an understanding of sociophonetic variation is also fundamental to understanding how children acquire the ability to interpret and generate the social-indexical properties of speech through the various stages of phonological development. Sociophonetic variation is also increasingly investigated in the context of speakers operating within a multilingual environment. While much work on bilingualism focuses on the interactions between the two or more languages deployed by an individual, much less attention has been paid to how phonetic variability across both languages is harnessed as a means of signaling individual identity in different contexts. One of the aims of this chapter is to highlight this area as one that needs to be factored into clinical speech assessment. The study of sociophonetic variation is closely associated with theories of phonological change and, as part of this, with studies of geographically determined accent variation and of the phenomena which are observed when accents come into contact or when a shift in social structures leads to greater or lesser differentiation in the social-indexical properties of language performance.1 However, for the purposes of this chapter, we do not focus on
comparing different "accents" or on how they have changed over time; rather, our objective is to highlight factors that can give rise to variation within the same community and to consider the implications of these for clinical phonological assessment.
42.2 The Nature of Sociophonetic Variability
The study of sociophonetic variability has been heavily influenced by methodologies arising from sociolinguistic research. A key aspect of this is the adoption of the linguistic variable as the fundamental object of study (Milroy & Gordon, 2003). Linguistic variables can be identified at different levels of analysis (phonological, discourse, lexical, syntactic, morphological), but in each case they are defined by representing a locus of socially correlated variation in speaker performance. Note that an aspect of speech performance that could constitute a linguistic variable in one variety of a language may not apply in another variety, and, likewise, different variables may be relevant across languages. Once a linguistic variable has been identified as a focus for analysis, the analysis proceeds by scoring the relative frequencies of the range of variants which are found for all of the occurrences of a particular variable within a corpus of speech. This is often combined with acoustic analysis, particularly in the case of vowel variables, where dynamic analysis can reveal differences along the trajectory of the various realizations. So, for example, in a study of sociophonetic variation in Newcastle upon Tyne, Warburton (2020) investigated recent developments in the realization of the vowels in the goat (e.g. go, load, slow) and thought (e.g. bought, fought) lexical sets,2 which had been predicted to be undergoing a merger in the community in an earlier study by Watt (2002). Warburton tracked the static and dynamic acoustic patterns in these two vowels across the performance of 28 speakers equally distributed across groups defined by social class, sex, and age. Results from the dynamic analyses of these two vowels are presented in Figure 42.1.
[Figure 42.1 Predicted formant trajectories (F1 and F2) for goat and thought by Tyneside females (top) and males (bottom) across three age groups: young, middle, older (Warburton, 2020).]
Generalized Additive Mixed Modelling (Sóskuthy, 2021) was used to predict the F1 and F2 trajectories for goat and thought across age groups and provided evidence for a change in progress. However, the patterns differed by sex and age: young females produced more similar goat and thought trajectories than older females, in both F1 and F2; on the other hand, young males produced more differentiated F2 trajectories for these vowels than older males, but more similar F1 trajectories. Auditory analyses (not presented here) revealed that continued use of multiple goat variants among younger Tyneside males may be preventing a goat–thought merger in their speech. These results reflect the social-indexicality conveyed by variation in vowel production in Newcastle English. This variation does not typically come about by virtue of one group of speakers adopting a particular variant 100% of the time while another group adopts a different variant in 100% of tokens. More typically, investigators find non-categorical distributions in which the speakers are differentiated by the relative frequency with which particular variants occur; dynamic analyses further reveal that the differences across groups may be located in particular portions of the vowel trajectories, which may show formant movement even when the majority of productions sound monophthongal. Sociophonetic variation of this sort has been most thoroughly investigated in relation to vowel production, but in recent years there has been an increasing focus on consonants, revealing very similar patterns of variation.
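The variant-scoring step itself is computationally simple: each token of the variable is coded for its variant and for the speaker's group, and relative frequencies are then tabulated per group. A minimal Python sketch follows; the tokens, variant labels, and group names are invented for illustration rather than drawn from any of the studies cited here.

from collections import Counter

# Each coded token pairs the speaker's group with the variant produced;
# the tokens below are invented goat-vowel codings for illustration.
tokens = [
    ("young female", "[oʊ]"), ("young female", "[oː]"), ("young female", "[oʊ]"),
    ("older male", "[oː]"), ("older male", "[oː]"), ("older male", "[oʊ]"),
]

by_group = {}
for group, variant in tokens:
    by_group.setdefault(group, Counter())[variant] += 1

for group, counts in by_group.items():
    total = sum(counts.values())
    shares = ", ".join(f"{v}: {n / total:.0%}" for v, n in counts.most_common())
    print(f"{group} (n={total}): {shares}")

The output of this kind of tabulation is exactly the sort of non-categorical, group-differentiated frequency distribution described above, to which acoustic and statistical analyses can then be added.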
For example, Figure 42.2 shows the distribution of (ð) variants for children and adolescent speakers from a Bedouin community of Palestinian refugees living in a village near Damascus in Syria (Shetewi, 2018). [ð] is the local Bedouin variant, whereas [d] is the urban Damascene variant. As can be seen, while tokens of both variants are found across all speaker groups, there is a clear differentiation in the relative frequencies of the fricative and stop variants as a function of speaker sex and age, with the local fricative form being much more strongly associated
with male speakers from age 9, while female speakers maintain the urban variants until age 15, when the two groups converge. Many other consonantal variables have been investigated in recent sociophonetic studies (Thomas, 2016), with ample evidence now available regarding the diverse ways in which variation in consonantal realization can be associated with speakers' social characteristics and orientation. Perhaps not surprisingly, social-indexical marking in speech performance is not the exclusive domain of segmental units, and there is an increasing body of evidence demonstrating the role of prosody and voice quality as carriers of social-indexicality. Stuart-Smith (1999) found variation in voice quality and vocal timbre in speakers from Glasgow as a function of age, sex, and (most of all) social class. Other investigators have suggested that the social-indexical role of creaky voice quality differs across UK and USA varieties of English, being predominantly a marker of young female speakers in the latter while being more of a marker of maleness in the former (Henton & Bladon, 1988), although the recent systematic review by Dallaston and Docherty (2020) suggests that the nature of variability in the use of vocal creak is still poorly understood.
[Figure 42.2 Frequency of occurrence (%) of fricative and stop variants of (ð), by sex and age group (3–5, 6–8, 9–11, 12–14, and 15–17 years), in Shetewi's (2018) data from Khan Eshieh Camp in Syria.]
Sociophonetics and Clinical Linguistics 619 Docherty, 2006), but which is probably heavily influenced by growing awareness of gender identity, encountering peer groups at the onset of schooling, and, perhaps most of all by the process of adolescence (Eckert, 2017; Kerswill, 1996).
42.4 Interpreting Sociophonetic Variability As might be expected, the ability to execute what is often very fine-grained tuning of speech performance in the interest of identity projection is mirrored by the fact that individuals are extremely adept at interpreting this dimension of the speech signals that they are exposed to. While listeners often find it difficult to articulate what it is in the physical manifestation of speech that drives their evaluations, anecdotal evidence and personal experience suggests that listeners readily make a wide range of judgments about interlocutors based on properties of an interlocutor’s speech. Research into the perception and interpretation of social-indexical properties of speech is far less well-advanced than the work on speaker performance. One line of research (commonly referred to as “perceptual dialectology” – see, for example, Preston (2017), Cramer (2014), and Clopper (2004) as good sources of background on this) has focused on the extent to which listeners are able to identify the regional provenance of speakers. Another, exemplified in the work of Strand (1999), Niedzielski (1999), and Drager (2010b) has begun to explore how listeners’ implicit knowledge (built up over time and experience) of how a particular category of speaker typically performs (e.g. males vs females in Strand’s case) impacts on a range of speech perception tasks thus demonstrating the likely integration of the social-indexical channel into speech processing. In reviewing the impact of sociophonetic variation on speech communication, it is also important to highlight that over time in most communities ideologies evolve arising from conventional beliefs about the social meaning of particular phonetic forms (Lippi-Green, 2011; Milroy, 2006). For example, within the UK, while use of a glottal stop variant of word medial /t/ (in words like “water” or “bottle”) is known to be a rapidly spreading feature of the speech performance of younger speakers across geographically diverse urban centers, a full account of this pattern of variation should reflect the fact that these variants are at the same time somewhat stigmatized for many members of the same community (most prominently for older speakers) and considered to be quite acceptable by many other members of the same community (typically the younger speakers). Socially constructed beliefs center around factors such as perceived prestige, or the alleged aesthetic qualities of a particular variant (or accent as a whole) – see the case of the Birmingham variety of English in the UK which is much-maligned in popular discourse and about which many individuals from outside [and even within] that region will readily express negative opinions (Bishop et al., 2005) – or around other collective stereotypical judgments, e.g. what characteristics are typically thought to correlate with an individual’s sexual orientation (Campbell-Kibler, 2011) or with particular ethnic groups (Purnell et al., 2016). These beliefs further shape and reinforce individuals’ behavior and lead to the situation where some forms are highly salient (carrying either a positive or negative overtone) whereas others are abundantly present but with far lower overt awareness on the part of speakers and listeners.
42.5 Sociophonetic Variability in Multilingual Contexts

Few bilingual studies have considered the sociophonetic dimensions of phonological acquisition (cf. Khattab, 2013; Sim, 2021), or their impact on the cognitive representation of two languages. This may be because, until recently, linguists interested in early bilingualism have mainly focused on whether bilingual children start with one phonological
system for both of their languages or whether they differentiate their systems from the start (e.g. De Houwer, 2017; Mennen, 2011; Paradis, 2001). Since the emphasis has often been on the issue of separation rather than on the phonetic detail and the potential variation within each system, researchers have mainly been interested in the child's ability to acquire aspects of the phonology that are important for lexical contrast; where variation is considered, it is often in relation to the delay it incurs in the consolidation of the bilingual child's phonological contrasts (e.g. Bosch & Ramon-Casas, 2011). The targets for each language are often based on the standard dialect, and little attempt is made to look at social-indexical sound features that identify the child as belonging to a particular community, age, gender, social class, etc. The targets are also generally assumed to be invariable (i.e. only one realization is expected for each target sound under investigation). While more and more studies of monolingual acquisition are pointing to the importance of looking at variation in the input that the child receives (e.g. Docherty et al., 2006; Foulkes et al., 2005; Nycz, 2015; Smith & Durham, 2019; Smith et al., 2007), variable targets are even more pertinent to any discussion of bilingual input, as the child's linguistic input may consist of standard, non-standard, and non-native varieties of two languages (Levy & Hanulíková, 2019). In many minority communities, first-generation immigrants often learn the host language as adults and speak it with a foreign accent, while it is assumed that their offspring will acquire a native-like accent, owing to the more naturalistic context and increasing peer influence, and will eventually "catch up" with their monolingual peers. Chambers (2002, p. 121) refers to this phenomenon as the "Ethan experience," named after the son of eastern European immigrants to Toronto. Ethan's parents were advanced speakers of English with a pronounced foreign accent, but Ethan learned English with a native-like accent by "filtering out" the foreign-accent features that were present in his parents' input. While it is true that many children of immigrants end up sounding more like their monolingual peers than their parents, the possibility that they may possess multiple representations of the same lexical, phonological, and/or phonetic phenomena, which they can call upon according to context, cannot be discounted. Evidence for this position comes from comparing the English spoken by bilingual children in the presence of their monolingual peers with that addressed to their parents or other bilinguals/second language learners. For instance, Khattab (2013) found that English-Arabic bilingual children growing up in Yorkshire acquire native accent features that are more typical of their immediate rather than their wider community, and in certain contexts they may also produce L2 features that are typical of their parents' speech. For example, while the bilingual children's production paralleled their monolingual friends' use of northern realizations of bath and the fronted [əː] realization of the goat vowel, which was undergoing change (e.g.
Watt & Tillotson, 2001), their realization of start, face, and strut was more typical of the standard-like [ɑː], [eɪ], and [ʌ] realizations that were found in their circle of monolingual friends and families, despite evidence for the use of [aː], [ɛː], and [ʊ] respectively in the wider community (Grabe & Nolan, 2001). The bilinguals’ parents produced foreign-accented variants that were typical of L1 interference, e.g. [ɛ] for bath, [eː] for face, [oː] for goat, syllable-final clear [l]s, taps or trills for /r/s and a rhotic accent. However, the bilingual children also produced these features when communicating with their parents and code-switching to English from an Arabic base. A detailed analysis of the use of these features showed a strong influence of the base language (Arabic) but also a tendency for the children to accommodate to their parents’ English accent. The features found in the code-switched data reflect the bilingual children’s wider linguistic repertoire and suggests that the foreign-accent features that are present in their parents’ speech are not ignored or filtered out. Instead, these are learned and stored as knowledge that is only activated in particular social contexts and that has particular socialindexical value. Bilingual speakers may also choose to use features of their L1 when producing their L2 as a way to preserve their ethnic identity through the L2 accent (Alam, 2015).
Khattab's study underlines the importance of awareness of the variable native targets in a particular community in order to establish what is acceptable in a bilingual speaker's production. The study also underlines the importance of collecting data from controls who have close links with the bilingual participants, since these are essential for the identification of the bilingual speaker's targets, especially where English is being learned mainly outside the home and peer influence is all the more pervasive. The notion of various degrees of activation depending on the context is not new to the discussion of bilingual competence, but it has often been limited to language choice. For instance, it has been shown that a bilingual speaker's choice of language in a particular conversational setting is influenced by factors like topic, interlocutor(s), and social context (Gafaranga, 2017). Knowledge of these choices constitutes part of the sociolinguistic repertoire which children acquire. Sociophonetic research suggests that the bilingual individual's choices may not be limited to which language to use with whom, but may also extend to which particular phonetic variants to use depending on the linguistic context and the accent of the interlocutor. The ability shown by children to accommodate their speech to their interlocutor is part of the development of sociolinguistic competence that has been reported in monolingual situations (e.g. Johnson & White, 2020) and in cases of contact between different varieties of English (e.g. Mooney, 2020).
42.6 Sociophonetic Variability and Clinical Assessment

We now consider some of the implications of the presence of abundant social-indexical variation in speech for clinical assessment of speech production, focusing on three issues: (a) establishing a baseline against which to score a speaker's performance in the presence of abundant variability in typical speech production; (b) the approach to sociophonetic variation in the context of assessing a multilingual speaker; and (c) the extent to which ideologies relating to sociophonetic variation could influence clinical assessment.
42.6.1 Establishing a Baseline

A key tension arising from the findings of sociophonetic research concerns their implications for assessment tools which require a fixed frame of reference against which to score an individual's speech performance (whether or not standardized against a population sample). Phonological assessment tools built around a "standard" phonological inventory of some sort are clearly problematic, because even regional "standards" are unlikely to be representative of the speech performance of many speakers from that region. For example, within the UK context, it is possible to conceive of assessments being adapted to a Scottish standard, or to a north of England standard, in order to deal with some of the differentiation from a southern English standard; but what we know of sociophonetic variation suggests that such regional adaptations would struggle to capture the range of variation encountered within a single population. That is, while regional adaptations would go some way toward tackling some of the more obvious aspects of regional variation, they would not address the fact that within a single region individual speech patterns are heavily shaped by social factors such as those reviewed above. In light of this, it is striking that the vast majority of off-the-shelf tests make no attempt to deal even with coarse-grained regional variation, and indeed some of them sidestep the problem by not explicitly considering vowel production at all (for example, in the UK, the South Tyneside Assessment of Phonology or STAP (Armstrong & Ainley, 1988) and the Diagnostic Evaluation of Articulation and Phonology or DEAP (Dodd et al., 2002)), despite the contribution of vowels to understanding phonological development and disorders (e.g. Ball & Gibbon, 2013).
Low et al. (2019) demonstrate how the frequency of recorded error patterns on the DEAP can be reduced when dialectal variation is taken into consideration. Similar findings have been reported when using non-standardized assessments, with researchers pointing out the increased risk of misdiagnosis of language impairment if variation is not taken into account (e.g. Laffey et al., 2014). A second problematic issue raised by sociophonetic work with respect to assessment against some type of "standard" is the need for assessment tools to be able to deal with the token-to-token variation in phonetic realization which sociophonetic work suggests will be readily encountered within the performance of speakers, whether children or adolescents following typical patterns of development or adults. We can exemplify this by referring to the findings of a study (Docherty et al., 2006; Foulkes et al., 2005) that set out to investigate the emergence of sociophonetic variation in children alongside other aspects of their acquisition of native-language sound patterning. Focusing on a cross-sectional sample of 39 children, aged 2;0, 2;6, 3;0, 3;6, and 4;0, from a "working-class" speech community in Newcastle upon Tyne, this study analyzed the realization of /t/ in a number of different contexts in utterances produced by children recorded interacting with their mothers in a play situation. The analysis focused on the extent to which the children were reproducing the patterns of phonetic realization which previous research on inter-adult speech in the same community had shown to be social-indexically structured. One such pattern was the use of a pre-aspirated variant in the realization of /t/ in word-final pre-pausal position (e.g. "bet" realized as [bɛʰt]), which in inter-adult speech was shown to be predominantly a feature of female speech performance. Figure 42.3 shows, for the children in the Newcastle study, the percentage usage of pre-aspirated variants in the 1,396 tokens of word-final pre-pausal /t/ that were analyzed. It can readily be seen that the pre-aspirated variant is amply present across the sample of speakers as a whole (on average, 38% of all /t/ tokens were produced with this variant).
[Figure 42.3 near here: a bar chart plotting, for each child (girls and boys shown separately), the percentage of word-final pre-pausal /t/ tokens produced with pre-aspiration; y-axis: "% of tokens with pre-aspiration" (0–100); x-axis: "Child."]
Figure 42.3 Frequency of occurrence of pre-aspirated variants of (t) in Docherty et al.'s (2006) data from Newcastle children.
While the children's performance did not in general reflect the social-indexical structuring of the usage of pre-aspirated /t/ in the adult community (with the exception of the 3;6 speakers), the results nevertheless suggest that any attempt to state what is age-typical in respect of the realization of /t/ in this environment for this variety of English cannot neglect the pre-aspirated forms (i.e. in considering what is "normal" for children who are learning this variety, an expectation referring only to a canonical /t/ realization would not be appropriate). A further striking aspect of these data is the varying degree of inconsistency both within and between speakers. While three speakers attain levels of ≥80% usage of the pre-aspirated variant, others do not evince any usage of it at all, and for the majority of speakers pre-aspirated /t/ is one of a number of variants which they produce for /t/ in this context (the principal others being a canonical released /t/ and a glottal or glottalized variant). This suggests that some speakers are quite consistent in their degree of use of this particular variant, and others much less so. While the Newcastle study reports on variants which reflect the sociophonetic variability within the child's immediate speech community, there are of course many reports of variability in the speech of typically developing children arising from other factors such as motor control and articulatory coordination (see, for example, Yu et al.'s (2015) study of VOT variability as a function of age and sex in children ranging from age 5 to 18, Gerosa et al.'s (2007) study of temporal variability in children's productions, or Levy and Hanulíková's (2019) study of vowel variability in children's production). The presence of these sources of variability in the performance of typically developing children points to the need for phonological assessment tools to avoid two particular pitfalls: forcing the clinician to base a judgment of typical/atypical performance on a single token of a particular target sound in a particular context, and requiring a decision with respect to typicality by reference to a "standard" which provides only a single "correct" target for each particular context. The risk is that, where this cannot be achieved, children may be scored as performing atypically when in fact they are within the range of normal variability for a particular target, or vice versa. It is positive that the presence of variability such as that discussed above is acknowledged in the design of some phonological assessment tools. For example, the DEAP (Dodd et al., 2002) specifically sets out to rate the degree of consistency evinced by a speaker over three repetitions of single-word responses to 25 picture stimuli. While the primary motivation for this is Dodd's (1995) work suggesting that inconsistency above a certain threshold is a strong diagnostic indicator, this approach clearly embodies a view that typical speech production is not free of token-to-token variability. However, by adopting this approach, the DEAP places demands on the clinician in respect of understanding the nature of typical variability within the client's speech community. When scoring the inconsistency test, if the within-word inconsistency rate crosses the threshold of 40% which Dodd et al.
(2002) identify as the diagnostic threshold, the clinician is advised to re-assess the inconsistencies which have been identified and to exclude from the percentage calculation those that are "developmentally age-appropriate." But how would a clinician handle the Newcastle children's pre-aspirated variants discussed above? They do not fall into the category of a typical developmental process, and yet, since they simply reflect variability in the speech community of which the child is part, they clearly should not be classified as atypical. This suggests that the laudable move to incorporate into clinical assessment what we know about typical patterns of variability in speech needs to be mediated by an awareness of the parameters of variation within the client's immediate speech community. Interestingly, while the DEAP does not bring these together in respect of the inconsistency analysis, it does to some extent in respect of its approach to error analysis, where clinicians are urged not to class as "errors" tokens which are present within the regional accent concerned. As indicated above, it is not simply regional accent features that will be relevant in this case, but also the social-indexical variation present within the client's speech community.
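To make the scoring logic just described concrete, the following sketch (in Python) shows one way a within-word inconsistency rate might be computed over three repetitions per item, with a facility for discounting variants that are typical of the client's speech community. This is purely illustrative: it is not the published DEAP scoring protocol, and the function names, toy data, and community-variant table are our own inventions; only the three-repetition design and the 40% threshold are taken from Dodd et al.'s (2002) procedure as described above.

# Illustrative sketch of inconsistency scoring over three repetitions.
# Hypothetical code, not the published DEAP protocol.

def inconsistency_rate(productions, community_variants=None):
    """productions: maps each target word to its three transcriptions.
    community_variants: maps a word to the set of realizations typical of
    the child's community (e.g. pre-aspirated or glottalized /t/ in
    Newcastle), which should not be counted as inconsistency."""
    community_variants = community_variants or {}
    inconsistent = 0
    for word, repetitions in productions.items():
        acceptable = community_variants.get(word, set())
        # Collapse all community-acceptable realizations into one category
        # before asking whether the repetitions differ from one another.
        collapsed = {"OK" if form in acceptable else form
                     for form in repetitions}
        if len(collapsed) > 1:
            inconsistent += 1
    return 100 * inconsistent / len(productions)

data = {
    "bet":   ["bɛʰt", "bɛt", "bɛʔ"],    # three community-typical variants
    "sheep": ["siːp", "ʃiːp", "ʃiːp"],  # a genuine within-word inconsistency
}
community = {"bet": {"bɛʰt", "bɛt", "bɛʔ"}}

print(inconsistency_rate(data))             # 100.0: both items vary
print(inconsistency_rate(data, community))  # 50.0: only "sheep" now counts

The point of the sketch is simply that the same raw transcriptions yield very different inconsistency rates depending on whether community-typical variation is filtered out, which is precisely the judgment the DEAP asks the clinician to make.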
42.6.2 Working with Multilingual Speakers

Bilingual phonological assessments are relatively scarce (cf. Peña et al., 2018; Stow & Pert, 1998b) and a long way from being able to account for sociophonetic variability in one or more of the bilingual/multilingual speaker's languages. In some cases there are more urgent issues to deal with, like finding out what other language(s) and/or dialects the client speaks (Stow et al., 2012), given the scarcity of data on languages other than English (cf. McLeod, 2007). For instance, in the case of the Pakistani heritage languages spoken in the UK, the prestige that is attached to Urdu may lead clients to claim it as their mother tongue when they might actually be Mirpuri speakers. The lack of standardized tests for languages other than English may also mean that assessment in these languages is more informal, and, if code-switching is the norm in the bilingual's community, the data elicited by bilingual speech and language therapy (SLT) assistants are bound to contain many instances of code-switching. While in many studies of typical bilingual development it has become standard to account for language mode by using different interlocutors for each language session, this is not always possible in a clinical context, where the clinician might speak only one of the languages of the bilingual client. While bilingual assistants and interpreters may be trained to elicit data in the bilingual client's L1, the SLT is often present and the context is rarely conducive to a "monolingual" state. Any bilingual assessment will need to take into consideration the fact that speakers might be exposed to several varieties of a particular language, or to closely related languages, and that their production might contain phonological (amongst other) features from more than one of these varieties (see Canta et al., 2023, for an example of how the DEAP can be scored against possible transfer from Jamaican Creole phonological targets), especially if code-switching is common in their community. There are hardly any standardized tests that can deal with code-switching in terms of scoring (cf. Stow & Pert, 1998a, 1998b), and until recently many SLTs still viewed code-switching as a sign of lack of competence in one of the bilingual speaker's languages (cf. Kapantzoglou et al., 2021). This may be true in the very early stages of bilingual development: while typically developing bilingual children may reach a state of competence in both of their languages that allows them to function in a near-monolingual mode, many who are referred for SLT may be in the early stages of acquiring one of their languages, and/or may be experiencing delay in one of these languages, and may therefore still be relying on the other to communicate. However, in communities where bilingualism is the norm, the amount of code-switching may actually increase as children grow older and become more competent in both of their languages (Pert & Letts, 2006; Yow et al., 2018), since fluent code-switching preserves the grammatical, syntactic, and morphological rules of both languages. Code-switching in this context should therefore be expected as the norm for the bilingual child's daily interactions, but most clinical assessments are too rigid to accommodate bilingual discourse in terms of their scoring system and what counts as an "acceptable" or "correct" answer.
In cases of balanced bilingualism, the evidence suggests that bilingual individuals can control language activation: competent bilingual speakers have been shown to behave in a "monolingual" manner when speaking to monolingual interlocutors, producing separate phonetic realizations of particular phonological variables that are similar across their languages (Bullock et al., 2004). When code-switching, however, speakers may "carry over" phonetic properties from the base language onto the "guest" language. This could be due either to internal factors like base-language influence (termed "linguistic convergence" by Bullock et al., 2004) or to external factors like the bilingual speaker's accommodation to their interlocutor. These signs of interaction between the bilingual's languages should not be interpreted as interference, since they tend to occur mainly in bilingual contexts. This is
not to suggest that competent bilingual speakers are immune from interference between their languages. For instance, English-Arabic bilingual speakers can develop different realizations of /r/ for each of their languages but occasionally produce taps in English and approximants in Arabic, or even combine patterns from both languages by producing retroflex taps (Khattab, 2002). SLTs need to be aware of these "atypical" realizations, which may not be indicative of a disorder but may be a necessary step in the bilingual child's development as they formulate hypotheses about their languages. Similar observations have been made by Holm and Dodd (1999) for their Cantonese-English bilingual participants, who showed "error" patterns which the authors argue should be considered normal stages in the development of Cantonese-English bilingual speakers even though they would be atypical for monolingual children.
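By way of illustration, a scoring scheme that accommodates this kind of bilingual variability might score a production against a set of attested targets for the relevant interactional context, rather than against a single monolingual form. The Python sketch below is hypothetical: the variant inventories are loosely modeled on the English-Arabic /r/ example above but are invented for illustration, as are all names in the code.

# Hypothetical sketch: scoring a bilingual child's /r/ against variable,
# context-dependent targets rather than one monolingual "correct" form.
# The inventories below are invented for illustration only.

ACCEPTABLE_R = {
    "english_with_monolingual_peers": {"ɹ"},            # approximant expected
    "arabic_base":                    {"ɾ", "r"},       # tap or trill expected
    "code_switched_english":          {"ɹ", "ɾ", "r"},  # parental L2 features may surface
}

def is_within_expected_range(token, context):
    # True if the token matches any attested target for this context.
    return token in ACCEPTABLE_R.get(context, set())

print(is_within_expected_range("ɾ", "code_switched_english"))          # True
print(is_within_expected_range("ɾ", "english_with_monolingual_peers")) # False

On such a scheme, a tap produced while code-switching with a parent falls within the expected range, whereas the same tap addressed to monolingual English peers might merit a note; neither, on its own, licenses a judgment of disorder.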
42.6.3 Ideology and Assessment

As a further illustration of the importance of building a sociophonetic dimension into clinical assessment, we now consider the status of labial and/or labio-dental variants of /r/ in speakers of British English. Historically, the realization of /r/ as a labio-dental approximant [ʋ] was widely interpreted as a developmental misarticulation which for some speakers would persist beyond the age at which it could be classified (e.g. in Dodd et al.'s terms) as developmentally age-appropriate.3 While the most recent edition of Gimson's Pronunciation of English (Cruttenden, 2014) acknowledges reports of [ʋ] as a possible regional variant, it still maintains that the labio-dental /r/ is "regarded as a speech defect in adults" (p. 225). For most of the last century, UK-based speech and language therapists would routinely assess and treat clients presenting with a labio-dental /r/. For contemporary young speakers of UK varieties of English, however, the situation is very different: labio-dental /r/ is now significantly more prevalent, and, crucially, it is no longer interpreted as a "speech defect"; indeed, it is a feature of which speakers and listeners show little awareness at all (the nature of this change is documented in Foulkes & Docherty, 2000). This gives rise to an interesting follow-on question: what role does the social evaluation prevalent within a particular speech community play in determining what is considered to be atypical or "disordered" speech? On the one hand, there are clearly many cases where social evaluation presumably has no part to play – fronting of /k/ to [t], for example, which results in the collapse of a significant consonantal contrast in English and potentially severe problems of intelligibility, whatever the individual's variety of English. On the other hand, there is a range of "misarticulations" whose impact is differentially interpreted by both speakers and listeners. For example, for many speakers, a degree of lisping of /s/ is a feature that they would wish to change in their own performance (or that parents may wish to change in their children's speech), whereas other speakers may feel less motivated to change because it has simply become part of their own phonetic "signature." We can see a similar situation in the case of a rather different phonetic parameter, voice quality. Creak, often referred to as vocal fry (e.g. Gallena & Pinto, 2021), is widespread in the voice quality of many US English speakers but attracts particular criticism in public discourse when produced by women. These judgments are shaped not solely by how individuals perceive normality in respect of their habitual voice quality, but also by constructed societal beliefs about what constitutes a "normal" and/or "appealing" voice quality. Clinicians have a duty not to be influenced by these biased and often discriminatory judgments in their practice (see Winn, Tripp, and Munson's (2022) critique of Gallena and Pinto's recommendations); rather, a clinician's interpretation of what is normal/acceptable for an individual speaker in respect of voice quality needs to be mediated by an understanding that voice quality (like other phonetic parameters referred to above) is one of the aspects of speech production that are closely tied to an individual's identity. Similar awareness is needed of how
non-mainstream and/or so-called non-native varieties, which are typically spoken by speakers from minority backgrounds, are evaluated by listeners in both clinical and non-clinical settings, and of the impact this may have for both the clinician and the client. For the clinician, this may influence the evaluation of their own suitability to provide services in phon-related cases; this seems to be problematized when a clinician has an L2 accent in the dominant language (often English), but it is hardly raised when considering their L2 accent in the client's L1, probably because it is rare for SLTs even to speak the client's other language(s) (Levy & Crowley, 2012). For the client, this may negatively impact how intelligible/comprehensible or target-like their production is perceived to be (e.g. Robinson & Stockman, 2009), potentially leading to over-referral.
42.7 Practical Solutions?

While research into the sociophonetic properties of speech is progressively providing us with a more rounded account of their function within speech communication, painting an apparently ever more complex picture in the process, work on determining how these properties should be dealt with in clinical assessment is some way behind. Clinicians can, of course, refer to sources providing accounts of particular varieties of a language (e.g. for UK English, volumes such as Foulkes and Docherty (1999) or Hughes et al. (2012)), or, for a wider range of varieties, to journals such as English World-Wide, but these tend to capture a snapshot in time and do not necessarily cover in sufficient detail all of the parameters of variability relevant to the assessment of a particular individual. A more productive approach would be the development of tools allowing sociophonetic variation to be factored into a clinical assessment. For example, while, as pointed out above, assessment of vowel production has a relatively low profile in routine clinical assessment, Wells' lexical sets analysis could quite straightforwardly be applied as a means of determining not only the predominant vocalic features of an individual's accent but also the main dimensions along which the speaker varies. A similar approach could reasonably be taken to the assessment of consonants: since consonant assessments are typically based on a sample of words providing exemplars of all consonants in most if not all environments in which they can occur, it would not be difficult to tag those consonant/environment combinations which are known to be most likely to give rise to sociophonetic variation across different varieties of English, so that the clinician can pay particular attention to variability which might be encountered with those particular items. There would probably also be much to gain from a greater focus on stylistic variation in clinical phonological assessment, as assessing speaker performance across different styles is likely to give greater insight into the range of variants that a speaker is able to generate and into the extent to which these can be deployed in a way that reflects the stylistic variation present within that speaker's community. For some time now, the predominant approach in the assessment of phonology has been to elicit sounds based on the production of words in isolation, typically via picture-naming (e.g. Core, 2011). This is time-efficient and allows the comparison of a range of sounds in comparable lexical and phonological contexts across a number of children. There is clearly a cost in time arising from the incorporation of a wider range of styles, but if this enables a more complete assessment of an individual's sound patterning to emerge, then that additional time is arguably well invested.
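As a rough sketch of how such tagging might be implemented, the Python code below flags items in a picture-naming word list whose consonant/environment combinations are common sites of sociophonetic variation in UK Englishes. Everything here is our own illustration: the table is invented, far from exhaustive, and in practice would need to be compiled from published descriptions of the relevant varieties.

# Illustrative sketch: flagging consonant/environment combinations that are
# known sites of sociophonetic variation, so the clinician attends to
# variability there before scoring anything as an error. Invented table.

VARIABLE_CONTEXTS = {
    ("t", "word-medial intervocalic"): "glottal variants widespread (e.g. 'water', 'bottle')",
    ("t", "word-final pre-pausal"):    "glottalized or pre-aspirated variants in some communities",
    ("r", "onset"):                    "labio-dental [ʋ] common among younger UK speakers",
    ("l", "coda"):                     "vocalized variants in south-eastern varieties",
    ("h", "onset"):                    "h-dropping in many urban vernaculars",
}

word_list = [
    ("water",  [("t", "word-medial intervocalic")]),
    ("rabbit", [("r", "onset"), ("t", "word-final pre-pausal")]),
    ("sheep",  []),  # no flagged contexts for this item
]

for word, contexts in word_list:
    for target, environment in contexts:
        note = VARIABLE_CONTEXTS.get((target, environment))
        if note:
            print(f"{word}: /{target}/ in {environment} - {note}")

A clinician-facing version of such a tool would, of course, draw its table from documented accounts of the client's variety rather than a hand-written list; the point is only that the flagging step is mechanically simple once the variable contexts are catalogued.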
Likewise, when dealing with bilingual/multilingual clients, studies by Stow and Pert (1998a, 1998b), Pert and Letts (2003), and Letts and Sinka (2011) have led the way in documenting the normal patterns of discourse of speakers of the Pakistani heritage languages in the UK and in creating assessments and toolkits for languages other than English, as well as bilingual phonological assessments. In the future, more bilingual phonological assessments are needed that can accommodate code-switching and that can take into account the variable phonetic and phonological patterns of bilingual speech depending on issues like language activation and the demands of the situation. Hand in hand with developing tools which would enable greater account to be taken of sociophonetic variability in clinical assessment, it is crucial that the issues discussed within this chapter are given a significant profile within the education and training of clinicians. While it is probably many years now since any program training Speech and Language Therapists/Speech Pathologists (SLTs) advocated a prescriptive approach to the assessment of speech production, it is important that, in learning how to assess clients, trainee clinicians learn to guard against subconscious appeals to a "standard" as somehow representing the "target" against which an individual's performance should be evaluated, and against situations where their clinical judgments could be influenced by their own implicit ideologies relating to phonological variation (as described above in Section 42.4). This is perhaps all the more important in contexts where the social and ethnic make-up of the SLT profession is far from representative of that of society as a whole (ASHA, 2010; Thanki, 2002); interestingly, an increase in diversity in the SLT profession brings its own challenges too (e.g. Levy & Crowley, 2012). There are many approaches which could be taken to addressing this, but it would seem important to build this fundamentally into the values underpinning any SLT program, such that the basic message is reinforced by all of those involved in its delivery at every opportunity (as opposed, for example, to seeing it simply as a point to be handled via the course in sociolinguistics or phonetics). Investigations into SLT trainees' awareness of dialectal and sociolinguistic variation offer cautious grounds for optimism, with training in bilingualism and in cultural and linguistic diversity contributing to better awareness of variation (e.g. Clark et al., 2021; Levey & Sola, 2013; McAlister et al., 2023), but bias in clinical practice unfortunately remains rife (Easton & Verdon, 2021).
NOTES

1 The term "social-indexical" is applied here to those properties of speech which are correlated with relevant social dimensions of a speech community, either at the level of groups of speakers or in relation to individuals.
2 As a means of facilitating the analysis of vowel variables, Wells (1982) introduced the notion of standard "lexical sets" to refer to groups of English words that share the same vowel pronunciation across two or more varieties. Since the vowel that is associated with a particular lexical set may vary across dialects, the vowel differences between dialects can be conveniently expressed in terms of these lexical sets.
3 It should be noted, however, that developmentally [w] is more common than [ʋ], so it is not obvious that this is a persisting developmental pattern.
REFERENCES

Alam, F. (2015). Glaswasian? A sociophonetic analysis of Glasgow-Asian accent and identity [Doctoral dissertation]. University of Glasgow.
American Speech-Language-Hearing Association. (2010). Demographic profile of ASHA members providing bilingual and Spanish-language services. Retrieved from http://www.asha.org/uploadedFiles/Demographic-Profile-Bilingual-SpanishService-Members.pdf
Armstrong, S., & Ainley, M. (1988). South Tyneside assessment of phonology. STASS Publications.
Ball, M. J., & Gibbon, F. E. (2013). Handbook of vowels and vowel disorders. Psychology Press. https://doi.org/10.4324/9780203103890
Bishop, H., Coupland, N., & Garrett, P. (2005). Conceptual accent evaluation: Thirty years of accent prejudice in the UK. Acta Linguistica Hafniensia, 37(1), 131–154. https://doi.org/10.1080/03740463.2005.10416087
Bosch, L., & Ramon-Casas, M. (2011). Variability in vowel production by bilingual speakers: Can input properties hinder the early stabilization of contrastive categories? Journal of Phonetics, 39(4), 514–526.
Bullock, B., Toribio, J. A., Davis, K. A., & Botero, C. G. (2004). Phonetic convergence in bilingual Puerto Rican Spanish. In B. Schmeiser, V. Chand, A. Kelleher, & A. Rodriguez (Eds.), Proceedings of the west coast conference on formal linguistics 23 (pp. 113–125). Cascadilla Press.
Campbell-Kibler, K. (2011). Intersecting variables and perceived sexual orientation in men. American Speech, 86(1), 52–68. https://doi.org/10.1215/00031283-1277510
Canta, A. J., Abu El Adas, S., Washington, K. N., & McAllister, T. (2023). Variability, accuracy, and cross-linguistic transfer in bilingual children speaking Jamaican Creole and English. Clinical Linguistics & Phonetics, 37(4–6), 436–453.
Chambers, J. K. (2002). Dynamics of dialect convergence. Journal of Sociolinguistics, 6(1), 117–130.
Clark, E. L., Easton, C., & Verdon, S. (2021). The impact of linguistic bias upon speech-language pathologists' attitudes towards non-standard dialects of English. Clinical Linguistics & Phonetics, 35(6), 542–559.
Clopper, C. G. (2004). Linguistic experience and the perceptual classification of dialect variation [Doctoral dissertation]. Indiana University.
Core, C. (2011). Assessing phonological knowledge. In E. Hoff (Ed.), Research methods in child language: A practical guide (pp. 77–99). Wiley-Blackwell.
Cramer, J. (2014). Perceptual dialectology. In Oxford handbook topics in linguistics (online ed.). Oxford Academic. https://doi.org/10.1093/oxfordhb/9780199935345.013.60 (accessed 21 March 2023).
Cruttenden, A. (2014). Gimson's pronunciation of English. Taylor & Francis.
Dallaston, K., & Docherty, G. (2020). The quantitative prevalence of creaky voice (vocal fry) in varieties of English: A systematic review of the literature. PLoS ONE, 15(3), e0229960. https://doi.org/10.1371/journal.pone.0229960
De Houwer, A. (2017). Bilingual language acquisition. In P. Fletcher & B. MacWhinney (Eds.), Handbook of child language (2nd ed., pp. 219–250). Blackwell. https://doi.org/10.1111/b.9780631203124.1996.00009
Docherty, G. (2022). Sociophonetics. In Oxford research encyclopedia of linguistics. Retrieved 7 March 2023, from https://oxfordre.com/linguistics/view/10.1093/acrefore/9780199384655.001.0001/acrefore-9780199384655-e-752
Docherty, G. J., Foulkes, P., Tillotson, J., & Watt, D. J. L. (2006). On the scope of phonological learning: Issues arising from socially structured variation. In L. Goldstein, D. H. Whalen, & C. T. Best (Eds.), Laboratory phonology 8 (pp. 393–421). Mouton de Gruyter.
Dodd, B. (1995). Differential diagnosis and treatment of children with speech disorder. Whurr Publishers.
Dodd, B., Zhu, H., Crosbie, S., Holm, A., & Ozanne, A. (2002). Diagnostic evaluation of articulation and phonology. Psychological Corporation.
Drager, K. (2010a). Sociophonetic variation in speech perception. Language and Linguistics Compass, 4(7), 473–480.
Drager, K. K. (2010b). Sensitivity to grammatical and sociophonetic variability in perception. Laboratory Phonology, 1(1), 93–120.
Drager, K., & Kettig, T. (2021). Sociophonetics. In R. Knight & J. Setter (Eds.), The Cambridge handbook of phonetics (pp. 551–577). Cambridge University Press. https://doi.org/10.1017/9781108644198.023
Easton, C., & Verdon, S. (2021). The influence of linguistic bias upon speech-language pathologists' attitudes toward clinical scenarios involving nonstandard dialects of English. American Journal of Speech-Language Pathology, 30(5), 1973–1989.
Eckert, P. (2017). Age as a sociolinguistic variable. In F. Coulmas (Ed.), The handbook of sociolinguistics (pp. 151–167). Blackwell. https://doi.org/10.1002/9781405166256
Foulkes, P. (2005). Sociophonetics. In K. Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 495–500). Elsevier.
Foulkes, P., & Docherty, G. J. (Eds.). (1999). Urban voices: Accent studies in the British Isles. Arnold.
Foulkes, P., & Docherty, G. J. (2000). Another chapter in the story of /r/: "Labiodental" variants in British English. Journal of Sociolinguistics, 4(1), 30–59.
Foulkes, P., & Docherty, G. J. (2006). The social life of phonetics and phonology. Journal of Phonetics, 34(4), 409–438.
Foulkes, P., Docherty, G. J., & Watt, D. J. L. (2005). Phonological variation in child directed speech. Language, 81(1), 177–206.
Foulkes, P., Scobbie, J. M., & Watt, D. J. L. (2010). Sociophonetics. In W. Hardcastle, J. Laver, & F. Gibbon (Eds.), Handbook of phonetic sciences (2nd ed., pp. 703–754). Blackwell.
Gafaranga, J. (2017). Bilingualism as interactional practices. Edinburgh University Press.
Gallena, S. K., & Pinto, J. A. (2021). How graduate students with vocal fry are perceived by speech-language pathologists. Perspectives of the ASHA Special Interest Groups, 6(6), 1554–1565. https://doi.org/10.1044/2021_PERSP-21-00083
Gerosa, M., Giuliani, D., & Brugnara, F. (2007). Acoustic variability and automatic recognition of children's speech. Speech Communication, 49(10–11), 847–860.
Giles, H., Coupland, N., & Coupland, J. (Eds.). (1991). The contexts of accommodation: Dimensions in applied sociolinguistics. Cambridge University Press.
Grabe, E., & Nolan, F. (2001). English intonation in the British Isles: The IViE corpus. CD-ROMs produced as part of ESRC grant R000237145.
Hay, J., Warren, P., & Drager, K. (2006). Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics, 34(4), 458–484.
Henton, C., & Bladon, A. (1988). Creak as a sociophonetic marker. In L. M. Hyman & C. N. Li (Eds.), Language, speech and mind: Studies in honour of Victoria A. Fromkin (pp. 3–29). Routledge.
Holm, A., & Dodd, B. (1999). A longitudinal study of the phonological development of two Cantonese-English bilingual children. Applied Psycholinguistics, 20(3), 349–376.
Hughes, A., Trudgill, P., & Watt, D. (2012). English accents and dialects: An introduction to social and regional varieties of English in the British Isles (5th ed.). Hodder Education.
Johnson, E. K., & White, K. S. (2020). Developmental sociolinguistics: Children's acquisition of language variation. Wiley Interdisciplinary Reviews: Cognitive Science, 11(1), e1515.
Kapantzoglou, M., Brown, J. E., Cycyk, L. M., & Fergadiotis, G. (2021). Code-switching and language proficiency in bilingual children with and without developmental language disorder. Journal of Speech, Language, and Hearing Research, 64(5), 1605–1620.
Kendall, T., & Fridland, V. (2021). Sociophonetics. Cambridge University Press.
Kerswill, P. (1996). Children, adolescents and language change. Language Variation and Change, 8(2), 177–202.
Khattab, G. (2002). /l/ production in English-Arabic bilingual speakers. International Journal of Bilingualism, 6(3), 335–354.
Khattab, G. (2013). Phonetic convergence and divergence strategies in English-Arabic bilingual children. Linguistics, 51(2), 439–472.
Laffey, K., Pearce, W. M., & Steed, W. (2014). Effect of dialect on the identification of speech impairment in Indigenous children. Australian Review of Applied Linguistics, 37(2), 161–177.
Letts, C., & Sinka, I. (2011). Multilingual toolkit. GL-Assessment.
Levey, S., & Sola, J. (2013). Speech-language pathology students' awareness of language differences versus language disorders. Contemporary Issues in Communication Science and Disorders, 40(Spring), 8–14.
Levy, E. S., & Crowley, C. J. (2012). Beliefs regarding the impact of accent within speech-language pathology practice areas. Communication Disorders Quarterly, 34(1), 47–55.
Levy, H., & Hanulíková, A. (2019). Variation in children's vowel production: Effects of language exposure and lexical frequency. Laboratory Phonology, 10(1), Article 9, 1–26. https://doi.org/10.5334/labphon.131
Lippi-Green, R. (2011). English with an accent: Language, ideology, and discrimination in the United States (2nd ed.). Routledge.
Low, A., Koh, G., Young, S. E., & Chandler-Yeo, H. (2019). Effect of dialect on phonological analysis. Clinical Linguistics and Phonetics, 33(5), 457–478. https://doi.org/10.1080/02699206.2018.1550812
McAlister, H., Hopf, S. C., & McLeod, S. (2023). Effect of dialect on identification and severity of speech sound disorder in Fijian children. Speech, Language and Hearing, 26(1), 48–60.
McLeod, S. (Ed.). (2007). The international guide to speech acquisition. Thomson Delmar Learning.
Mennen, I. (2011). Speech production in simultaneous and sequential bilinguals. Multilingual Aspects of Fluency Disorders, 5, 24.
Milroy, J. (2006). Language ideologies. In M. C. Llamas, L. Mullany, & P. Stockwell (Eds.), The Routledge companion to sociolinguistics (pp. 133–139). Routledge.
Milroy, L., & Gordon, M. (2003). Sociolinguistics: Method and interpretation. Blackwell.
Mooney, S. (2020). Child acquisition of sociolinguistic variation [Doctoral dissertation]. Georgetown University.
Niedzielski, N. A. (1999). The effect of social information on the perception of sociolinguistic variables. Journal of Language and Social Psychology, 18(1), 62–85.
Nycz, J. (2015). Second dialect acquisition: A sociophonetic perspective. Language and Linguistics Compass, 9(11), 469–482.
Paradis, J. (2001). Do bilingual two-year-olds have separate phonological systems? International Journal of Bilingualism, 5(1), 19–38.
Peña, E. D., Gutiérrez-Clellen, V. F., Iglesias, A., Goldstein, B. A., & Bedore, L. M. (2018). Bilingual English Spanish Assessment (BESA). Brookes.
Pert, S., & Letts, C. (2003). Developing an expressive language assessment for children in Rochdale with a Pakistani heritage background. Child Language Teaching and Therapy, 19(3), 267–289.
Pert, S., & Letts, C. (2006). Codeswitching in Mirpuri speaking Pakistani heritage preschool children: Bilingual language acquisition. International Journal of Bilingualism, 10(3), 349–374.
Preston, D. R. (2017). Perceptual dialectology. In C. Boberg, J. A. Nerbonne, & D. Watt (Eds.), The handbook of dialectology (pp. 177–203). John Wiley & Sons. https://doi.org/10.1002/9781118827628.ch10
Purnell, T., Idsardi, W., & Baugh, J. (1999). Perceptual and phonetic experiments on American English dialect identification. Journal of Language and Social Psychology, 18(1), 10–30.
Robinson, G. C., & Stockman, I. J. (2009). Cross-dialectal perceptual experiences of speech-language pathologists in predominantly Caucasian American school districts. Language, Speech, and Hearing Services in Schools, 40(2), 138–149.
Sangster, C. (2002). Inter- and intra-speaker variation in Liverpool English: A sociophonetic study [Doctoral dissertation]. University of Oxford.
Shetewi, O. (2018). Acquisition of sociolinguistic variation in a dialect contact situation: The case of Palestinian children and adolescents in Syria [Doctoral dissertation]. Newcastle University.
Sim, J. H. (2021). Sociophonetic variation in English /l/ in the child-directed speech of English-Malay bilinguals. Journal of Phonetics, 88, 101084.
Smith, J., & Durham, M. (2019). Sociolinguistic variation in children's language: Acquiring community norms. Cambridge University Press.
Smith, J., Durham, M., & Fortune, L. (2007). "Mam, my trousers is fa'in doon!": Community, caregiver, and child in the acquisition of variation in a Scottish dialect. Language Variation and Change, 19(1), 63–99.
Sóskuthy, M. (2021). Evaluating generalised additive mixed modelling strategies for dynamic speech analysis. Journal of Phonetics, 84, 101017.
Stow, C., & Pert, S. (1998a). The Rochdale Assessment of Mirpuri Phonology with Punjabi and Urdu (RAMP). Pert.
Stow, C., & Pert, S. (1998b). The development of a bilingual phonology assessment. International Journal of Language and Communication Disorders, 33(Supplement), 338–343.
Stow, C., Pert, S., & Khattab, G. (2012). Translation to practice: Sociolinguistic and cultural considerations when working with the Pakistani heritage community in England, UK. In S. McLeod & B. A. Goldstein (Eds.), Multilingual aspects of speech sound disorders in children (Vol. 6, pp. 24–27). Multilingual Matters.
Strand, E. A. (1999). Uncovering the role of gender stereotypes in speech perception. Journal of Language and Social Psychology, 18(1), 86–99.
Stuart-Smith, J. (1999). Glasgow: Accent and voice quality. In P. Foulkes & G. Docherty (Eds.), Urban voices: Accent studies in the British Isles (pp. 201–222). Arnold.
Thanki, M. (2002). White women only: The challenge for change. RCSLT Diversity Strategy Report. Royal College of Speech and Language Therapists.
Thomas, E. R. (2011). Sociophonetics: An introduction. Palgrave.
Thomas, E. R. (2016). Sociophonetics of consonantal variation. Annual Review of Linguistics, 2, 95–113.
Warburton, J. (2020). The merging of the goat and thought vowels in Tyneside English: Evidence from production and perception [Doctoral dissertation]. Newcastle University. https://theses.ncl.ac.uk/jspui/handle/10443/5342
Watt, D. (2002). "I don't speak with a Geordie accent, I speak, like, the Northern accent": Contact-induced levelling in the Tyneside vowel system. Journal of Sociolinguistics, 6(1), 44–63.
Watt, D., & Tillotson, J. (2001). A spectrographic analysis of vowel fronting in Bradford English. English World-Wide, 22(2), 269–302.
Wells, J. C. (1982). Accents of English (3 vols.). Cambridge University Press.
Winn, M. B., Tripp, A., & Munson, B. (2022). A critique and call for action, in response to sexist commentary about vocal fry [Letter to the editor]. Perspectives of the ASHA Special Interest Groups, 7(6), 1903–1907. https://doi.org/10.1044/2022_PERSP-21-00319
Yow, W. Q., Tan, J. S., & Flynn, S. (2018). Code-switching as a marker of linguistic competence in bilingual children. Bilingualism: Language and Cognition, 21(5), 1075–1090.
Yu, V. Y., De Nil, L. F., & Pang, E. W. (2015). Effects of age, sex and syllable number on voice onset time: Evidence from children's voiceless aspirated stops. Language and Speech, 58(2), 152–167.
Index
accentism, 82, 85 accent modification, 82 accountability, 116–117 acoustic analysis, 523–529, see also harmonic to noise ratio, jitter, shimmer Acoustic Voice Quality Index (AVQI), 527 Dysphonia Severity Index (DSI), 527 Cepstral Spectral Index of Dysphonia (CSID),527 see also cepstral analysis nonlinear analysis, 526 perturbation, 524–528 spectral analysis, 526 voice range profiles (VRP), phonetograms, 524 acoustic-articulatory mapping, 473–474 acoustics, 567–568 acquired disorders, 33–35 acquisition, of liquids, 425–427 of codas, 428–429 of clusters, 429–431 action, 71 adapt, 69 aerometry, 499–500 African-American, 85–87 children, assessing communication in, 87 English, 85 Afrikaans, 409 Akan, 411 allomorph, 203, 208, 209 Alzheimer’s disease, 42, 179–180, 221–222, see also dementia American Sign Language, 287–290, 291–292 anti-phase, 335 aphasia, 107, 123, 134, 139, 178–9, 189, 195–197, 216–219, 237–218, 304, 311, 564 and bilingualism, 280–292 and coarticulation, 580 and cognition, 280–282 and conversational implicatures, 22–24 and pragmatic impairment, 56–59 and sign, 289–290 primary progressive, 564
AphasiaBank, 143 apraxia (of speech), 325, 342, 445–446, 564–566, 589, 598 and coarticulation, 581 and sign, 289–290 progressive, 564, 565 Arabic, 410, 424–425 articulation, 304, 308 see also coarticulation articulatory and acoustic distinctive features, 310, 312 articulatory mechanisms, 313 articulatory phonology, see phonology, articulatory complex, 422, 425 disorder, 458, 461 impairment, 341 scores (AS), 415 shifts, 304 ASDBank, 143 assessment, 303, 305–307, 321–322, 326, 440–443, 448 dynamic, 253 multilingual, 245–255, 327 of cognition, 274–276 of developmental phonology, 321–322 of language, 274–276 sociolinguistic sensitivity in, 86 see also sociophonetics assimilation, 303, 306–307, 338 ataxia, and sign, 294–295 attention, 276–278 atypical interaction, 73 attrition, 251–252 augmentative and alternative communication (AAC), 122–123 autism spectrum disorders (ASD), 17, 28, 29, 33, 42, 76, 107, 159, 172, 173, 180–181, 260, 589, 594 and conversational implicatures, 17–18 and pragmatic impairment, 55–56, 58 automatic speech recognition (ASR), 609 autosegmentalism, 307 autosegmental phonology, 307 avoidance, 171, 204
The Handbook of Clinical Linguistics, Second Edition. Edited by Martin J. Ball, Nicole Müller, and Elizabeth Spencer. © 2024 John Wiley & Sons Ltd. Published 2024 by John Wiley & Sons Ltd.
634 Index babbling, 382, 383 behavioral measures, 541 across languages, 546 across the life span, 546 analytical measures, 544 dual-task measures and listening effort, 546 lexical tasks, 544 masking noise, 545 presentation mode, 544 response format, 544 spatial speech perception measures, 545 towards more real-life ecological measures, 546 bi-dimensional continuum, 34, 35 bilingual-bicultural, 83 bilingualism, 129–140, 245–255 see also multilingualism advantage, 280–281 and assessment of children, 245–255 and developmental language disorder (DLD), 245–255 and language mixing, 281 and language switching, 281 and misdiagnosis, 245 and diagnostic accuracy, 254–255 simultaneous type, 245, 248 successive type, 245, 247 and test norms, 252 and transfer, 250 bilinguals, 411 Czech-English, 411 English-Hungarian, 414 English-Spanish, 414 Greek-English, 411 biofeedback, 490, 492, 494, 497 bradykinesia, 563 British Sign Language, 290–291, 293–295 browsable database, 145 Bulgarian, 411, 429 Cantonese, 410, 425 categorization, 367–368 CHAT, 143 CHILDES, 143 childhood apraxia of speech (CAS), 398–399, 567, 589, 597 and vowel errors, 398–399, 403 markers of, 399 childhood dysarthria, 566 cepstral analysis, Cepstral Peak Prominence (CPP), 526, 528–529 for voice assessment, 526, 528–529 in vowel phonation, 528 in speech, 528 cerebral palsy, 566 Chinese, 359 CLAN, 144 clinical applications, 317–327 developmental, 321–323 developmental phonology, 321–323 neurogenic disorders, 324–325 clinical linguistics and conversational implicatures, 17–18, 24 and pragmatic impairment, 58 clusters, 338 acquisition of, 429–431
deletion, 338 reduction, 338 /s/C clusters, 430–431 coarticulation, 573–583 definition of, 573 in impaired speech, 579–583 measurements of, 575–579 code-switching, 253 cognition and assessment, 274–276 and language, 273–282 and language impairment, 273–282 and pragmatic impairment, 56–58, 63, 65 cognitive-communication disorders, 279 cognitive control, 279–282 cognitive impairment and language, 273–282 and pragmatic impairment, 57–61 and traumatic brain injury, 279 assessment of, 274–276 cognitive phonology, see phonology, cognitive cognitive principle, 30 collaborative commentary, 151 Comanche, 423 communication disorders, 17 communication impairment, 129–140 communicative competence, 130 communicative principle, 30 communicative repertories, 139 compensatory adaptations and strategies and pragmatic impairment, 60–62 control group, 204 consonants, 318–323, 325–326, 407–410 affricates, 410 aspirated obstruents, 411 clicks, 410 dual place, 410 ejectives, 410 fricatives, 410, 514–515 geminates, 410 glides, 410, 511–512 implosives, 410 in government phonology, 356–357 lisp, 410 liquids, 410 nasals, 410, 512 non-pulmonic, 410 non-stridents, 410 obstruents, 410 percentage correct (PCC), 383 plosives, 410 prenasalized obstruents, 411 pulmonic, 410 rhotics, 410 sonorants, 410 stops, 512–514 trill, 410 velarized lateral, 410 consonant clusters, 207, 412, 416 consonantal inventories, 408 early, 408–409 developmental path, 409–411 constituency in phonology, 351–353
Index 635 constraints-based theories, 320–321 clinical application, 321–327 Optimality Theory (OT), 320–321 context, 438 conversation, 69 conversation analysis (CA), 6, 62, 65, 69, 107, 116, 119–122 conversation maxims, 15 manner, 63 quality, 55 quantity, 63 relevance, 55 conversational implicatures, 15 and ad-hoc implicatures, 16 and scalar implicatures, 16 convex hull area, 516 coordinative structure, 334 correct information units (CIUS), 277–278 core vocabulary intervention, 461, 462–463 coupling graph, 335 critical discourse analysis, 99, 107 Croatian, 429–430 cross-linguistic, 321–322, 407–419 comparison, 261, 267 differences, 438 perspective, 129–139 studies of language disorders, 249, 250 culture, 135–136 Cypriot (Greek), 409, 416 Czech, 411, 438 Danish, 409 deaffrication, 340 declarative/procedural model, 206–207, 212 demands, 9 dementia, 76,107, 220–222, 446, see also Alzheimer’s disease and pragmatic impairment, 62 fronto-temporal, 220–222 DementiaBank, 143 development, 303–304, 306, 311–313 atypical, 306 cognitive, 309 phonemic, 303, 311 phonological, 303, 305–306, 310–313 speech, 303–304, 306, 311 typical, 304, 306, 311 developmental disorders, 31–33, 318, 321–324, 326–327 developmental language disorders (DLD), 159–161, 163, 166–170, 172–173, 189, 192–194, 205, 208, 209, 260, 595 and pragmatic impairment, 19–20 diphthongs, see vowels, diphthongs discourse, 129–140 analysis, 3–10 distinctive features, 305, 310–312 Down syndrome, 159–160, 166–167, 169, 173, 210, 260, 566 dual system models, 182 dual-task, costs of, 276–278 Dutch, 209, 409, 426, 430–431, 438 dynamic assessment, 253 dysarthria, 73, 107–108, 343, 609–611, 562–566, 589, 591
and coarticulation, 581 ataxic, 563 flaccid, 562 hyperkinetic, 563, 589 hypokinetic, 589 rigid-hypokinetic, 563 spastic, 562 dysexecutive symptomatology, 279 dyskinesia, 563 dyslexia, 20–21 and conversational implicatures, 20–21 and metaphor, 20 dysphonia, spasmodic, 563 dystonia, 563 echolalia, 180–181 EEG, 183 electroglottography (EGG), 529–531 contact quotient (CQ), 530–531 dEGG, 530 open quotient (OQ), 530 electromagnetic Articulography (EMA), 492–494, 505 in coarticulation studies, 578 electromyography, (EMG) 490 electropalatography (EPG), 498–499 in coarticulation studies, 578 electrophysiological recordings, 41 elements in phonology, 354–357 element geometry, 357 elicitation, 203 emergence, 370–371 emergentist approach to pragmatics, 55–65 English, 352, 409, 423–424, 430 African-American, 85 English-speaking countries, inner circle of, 83 epenthesis, 337 error patterns, 457–458 ethnomethodology, 116–117 executive functions, 17, 47 and language/language impairment, 278–279 and SSD, 460 exemplar theory, 373 eye-tracking, 183 feature, 354 geometry, 319 final consonant deletion, 337 Finnish, 413, 423–424 flanker task, 280 FluencyBank, 143 formant, see vowels, formants formant frequencies, 507 in coarticulation studies, 576 formulaic sequences, 177–186, 440, 448 definition of, 177 FOXP2, 567 French, 323, 425–429 frequency, 208, 411–412, 423–424, 438, 440, 446, 447 and token frequency, 208 and type frequency, 208 lexical, 365–366, 372–373 fronting, 340 functional load, 411–412
636 Index generative phonology, see phonology, generative German, 160, 163, 164–167, 169, 203, 207, 209, 210, 325–327, 359, 409, 425–431, 438, 446 Germanic languages, 262–264, 409 and autism spectrum disorder (ASD), 263 and developmental language disorder (DLD), 263 and hearing impairment (hi), 264 and morphosyntax, 262 gestures, 333 gliding, 341 government phonology, see phonology, government grand rounds, 146 Greek, 264–265, 358, see also Cypriot and developmental language disorder (DLD), 265 Gricean tradition, 41, 43 Conversational implicature, 6 Haitian Creole, 410 harmonics-to-noise ratio (HNR), 526, 528–529 for voice assessment, 526, 528–529 in vowel phonation, 528 in speech, 528 hearing impairment (HI), 73, 260 and coarticulation, 582 Hebrew, 410, 430 high-speed videoendoscopy (HSV), 532–533 Hungarian, 410, 425, 429 Huntington’s disease, 563 hypokinesia, 563 Icelandic, 409, 429 identity, 9, 10 idiom, 182 Igbo, 425 imaging, 495–498 individual differences, 440 inference and pragmatic impairment, 59–62 inflection, 201–213 and agglutinative type, 202 and analytic language, 204–205 and concatenative/nonconcatenative type, 202 and crosslinguistic differences, 201–202, 212 and dual-mechanism model, 207 and fusional type, 202 and inflectional deficits, 201–213 and grammatical dimensions, 201–202, 208–209 and grammatical functions, 202 and grammatical relations, 202 and inflectional paradigm, 208–209 and phonological complexity, 209–210 and single-mechanism model, 207 and synthetic language, 204–205 inflectional deficit, 160, 163–164, 167, 202–213 and agreement, 160, 205, 209, 210 and aphasia, 160–161, 204, 205, 206–207, 209 and aspect, 205 and deficit theories, 211–212 and DLD, 160–161, 205, 208, 209 and Down syndrome, 160–161, 210 and hearing impairment, 161, 210 and language production, 202–204 and markedness, 208–209
  and omission error, 160–161, 203–205
  and productivity, 208
  and substitution error, 160–161, 167, 203–205
  and syllable complexity, 210
  and tense inflection, 166–167, 205–207, 208, 209
  and selective impairment, 205–207
  and regular inflection, 206–207
  and irregular inflection, 206–207
  and task demands, 171, 210–211
  and tree-pruning, 206, 211
  and triangle model, 207
  and Williams syndrome, 207, 212
  and working memory, 211
in-phase, 334
instrumental measures, 477–478, 480, 547
  extensions for binaural hearing, 549
  extensions for time-varying maskers, 549
  extensions of STI and SII, 548
  SII model, 547
  STI model, 548
intelligibility, 383, 384, 561, 563, 568–569
  of speech, see speech intelligibility
intervention, 303–304, 325–326, 446–448, 449, 461–463
  for SSD, 461–462, 464
  program of, 303–304
  with neurogenic disorders, 324–326
inventory of phonemes, 303
isolating languages, 266
  and developmental language disorder (DLD), 266
Italian, 410
Jamaican Creole, 409
Japanese, 410, 425, 428
jitter, 524–525, 528
  for voice assessment, 524–525, 528–529
  in vowel phonation, 528
  in speech, 528
juncture, 438–439, 440, 442–443, 444–445
Kannada, 413
Konkani, 414
Korean, 410, 425, 438
language, 303–313
  acquisition, 303, 309, 313
  and cognitive impairment, 273–282
  and power in the clinic, 87
  and thought, 273
  change, 313
  child language, 304
  disorders, 304
  learners, 313
  pathologies, 304
  standard, 409
language comprehension, 162–164
  and syntactic deficit, 162–164
language families, 409–410
  Algic, 422
  Austronesian, 409
  Dravidian, 411, 413
  East Papuan, 421
  Eskimo-Aleut, 423
  Germanic-West African, 409
  Greek, 409, 416
  Indo-Iranian, 411
  Indo-Aryan, 411, 414
  Japonic, 410
  Khoisan, 421
  Koreanic, 410
  Mayan, 409, 424
  Na-Dene, 422
  Niger-Congo, 410, 411
  Romance, 410, see also Romance languages
  Semitic, 410
  Sinitic, 410
  Slavic, 411, see also Slavic languages
  Turkic, 410
  Uralic, 410
  Uto-Aztecan, 423
language in context, 101–103
  context of culture, 101–102
  context of situation, 101
  metafunctions, 103
  text and genre, 102
language mixing, 281
language processing, 229–238, 305
  classical models of, 229–230
  in real time, 230–231
  semantic integration, 231–233
  syntactic processes in, 233–236
  violations, 236–238
language specificity, 411
language switching, 281
LARSP, 251
laryngoscopy, 531
learning disability, 76
left hemisphere brain damage, 598, 599
lenition, 358–359
linguicism, 82, 85
linguistic analysis, 309
  clinical analysis, 305
  developmental analysis, 305
  lexical analysis, 313
  Natural Process Analysis, 306
  nonlinear analysis, 307
  NPA analysis, 306
  phonemic analysis, 304–305
  phonological analysis, 304–306, 312
literacy skills of children with SSD, 460
LITMUS, 253–254
locus equation, 576
MacArthur-Bates CDI, 254
magnetic resonance imaging (MRI), 498
  in coarticulation studies, 578
Malay, 409
Maltese, 410
Mandarin, 266, 410
markedness
  of consonants, 422–425, 428
  of inventory, 422, 425
  of syllables, 423
  of clusters, 430
measures of speech perception, 539
measuring, 183
melody in phonology, 354–357
memory
  and pragmatic impairment, 56, 58–61, 64
  rich, 366
mental lexicon, 171, 208
  and full-form storage, 206–208
  and lexical access, 208
metaphor, 182
metaphor interpretation, 31
modalities and channels, 116
mora, 318
morphology, 163, 201
  and multilingual/cross-linguistic studies, 246–247, 250
morphosyntactic difficulties, 260
motor equivalence, 489
motor speech disorders, 561–569
multicompetence, 131, 138
multilingualism, 129–139, 410, see also bilingualism
multimodal gestalt, 119, 123
multimodality
  and its analysis, 119–122, 123–124
  and communication disability, 122–124
  and interaction, 115, 117–119, 124
  and multiactivity, 118
  and social activity, 116–118
  and talk-based systems, 117–118, 121–122
  and transcription, 119–121
mutual manifestness, 33
narrative assessment, 254
nasalance, 500
nasality, in coarticulation studies, 577
neurogenic impairment, 317–318, 324–327
neuroimaging, 43–45
neurophysiology, 45–47
neuropragmatics, 41–48
  contribution of, 42–43
  definition of, 41
non-linear phonology, see phonology, non-linear
nonspeech, 568
non-verbal communication
  and pragmatic impairment, 60
non-word repetition, 252, 254
normative data, 453, 455–457
Norwegian, 430–431
optoelectronic systems, 492
oromandibular lingual dystonia, 563
ostensive stimulus, 30
PABIQ, 253
palatalization, 340
palatography, 498–499
palilalia
  and sign, 294
Parkinson’s disease, 563
  and sign, 292–293
patterns, 303–306
  complex patterns, 305
  deviation patterns, 303
  developmental patterns, 306
  patterns of phonemes, 304
  patterns of speech, 306
  phonological patterns, 307, 313
  pronunciation patterns, 305–306
  sound patterns, 305
  stress patterns, 308
pediatric motor speech disorders, 566
PEPS-C (Profiling Elements of Prosody in Speech-Communication), 593, 596
percentage of consonants correct (PCC), 383, 455–456
percentage of vowels correct (PVC), 383
Persian, 411
PhonBank, 143
phoneme, 303–305, 308–309, 311–312
phonemic repertoires, 456
phonetic and phonological systems, 476–477
  and phonetic and phonotactic inventories, 475, 476
  and phonological processes and atypical patterns, 475
  and principled variability, 475, 476
  and lexical inconsistency, 475, 477
phonetic planning, 564–566
phonetic repertoires, 456
phonetic transcription
  and challenges and constraints, 479–481
  and clinical context, 471–473
  and descriptive labels, 473
  and notation systems, 473
  and broad, 474–475
  and narrow, 474–477
  and objective listening, 479–480
phonology, 303–305, 307–313
  articulatory, 332
  autosegmental, 307
  clinical, 303–304, 306, 310, 312
  cognitive, 305, 308–310
  developmental, 304, 307, 310–312, 378
  English, 305
  generative, 305, 317–318
  government, 351–361
  government, clinical applications, 359–361
  metrical, 307
  natural, 305–307
  nonlinear, 305, 307, 318–325
  as human behavior (PHB), 305, 312–313
  prosodic, 307
phonological approach, 304, 308, 313
phonological assessment, 412
phonological complexity, 421–424
phonological development, 407–419, 439–440, see also phonology, developmental
  bilingual, 379, 381
  developmental phase model, 379, 380, 386
  nontypical, 386
  typical, 382
phonological derivation, 357–359
phonological disorder and delay, 386, 387, 457–460
  consistent, 458–461
  inconsistent, 458–461
phonological impairment, 317–318, 324–327, 412
phonological mean length of utterance (pMLU), 412
phonological processes, 383
phonological rules, 305, 310
phonological theory, 407–408
phonological typology, 408
phonological word proximity, 412
physiological measures, 551
  neural encoding of higher-order processes, 553
  neural encoding of spectral features, 552
  neural encoding of temporal envelope features, 552
plethysmography, 499
point tracking systems, 490–495
Polish, 428, 430
Portuguese, 410, 426–427, 438
positioning theory, 10
power, see language, and power in the clinic
pragmatic impairment, 29, 55–65, 133–135, 596–597
  as an emergent phenomenon, 56, 59–63, 65
pragmatic language difficulties (PLD), 31–32, 596–597
pragmatic theory, 55–56, 65
pragmatic training, 24
pragmatics, 41–48, 130–133, 135–136, 138–139, 182
  bilateral networks for processing, 43–45
presupposition triggers, 19
processing deficit, 159, 163, 169, 171
processing regulation, 185
progressive supranuclear palsy
  and sign, 293–294
progressivity, 70
prosody, 182, 589
  impairment of, 589
psycholinguistic framework, 379
  developmental phase model, 379, 380, 386
  phonological input, 379, 382
  phonological output, 379
PsychosisBank, 143
Punjabi, 411
Quiché, 409, 424
relevance theory, 6, 30–31
repair, 72
  other-initiation of, 72
  self-initiation of, 72
requests / recruitments, 118–119
research-practice divide, 481
RHDBank, 143
right basal ganglia, 182
right hemisphere, 182
right hemisphere damage, 33–34, 42, 219–220, 598–599
  and sign, 291
Romance languages, 264
  and autism spectrum disorder (ASD), 264
  and clitics, 264
  and developmental language disorder (DLD), 264
  and morphosyntax, 264
Rotokas, 421
schizophrenia spectrum and psychotic disorders
  and pragmatic impairment, 18–19
semantic control, 279
semantic impairment
  in aphasia, 216–219
  in RHD, 219–220
semantics
  processing of words, 215–223
  word, impairment in, 220–222
  word, two facets of, 215–216
Semitic languages, 265–266
  and autism spectrum disorder (ASD), 265, 266
  and developmental language disorder (DLD), 265, 266
  and hearing impairment (HI), 266
  and morphosyntax, 265, 266
sentence repetition (SRep), 254, 261
  and error patterns, 262, 267
  and LITMUS SRep, 261, 267
  and qualitative analyses of error patterns, 262, 267
sentences
  active, 191, 197
  cleft-object, 196
  passive, 191, 197
  relative clause, 195, 196–197
sequence, 71
Setswana, 410
shimmer, 524–526, 528
  for voice assessment, 524–525, 528–529
  in vowel phonation, 528
  in speech, 528
showing and meaning, 34–36
sign compared to speech, 289
sign language, 266–267
  and developmental language disorder (DLD), 267
  and quantifiers, 23–24
sign linguistics, 287–288
Slave(y), 422
Slavic languages, 265
  and developmental language disorder (DLD), 265
Slovene, 410
social construction theory, 10
social pragmatic communication disorder
  and conversational implicatures, 21–22
  and figurative language, 21
sociolinguistics, 81–92, 99, 129–130, 136–139
  sensitivity in assessment, 86
  variationist, 81
sociophonetics, 615–627
  and clinical assessment, 621–626
  establishing baselines, 621–623
  ideology and assessment, 625–626
  interpreting, 619
  multilingual contexts, 619–621, 624–625
  practical solutions, 626–627
  variability, 616–618
  with individuals, 618–619
sonority, 423–424, 430
sonority dispersion principle (SDP), 423
sonority sequencing principle (SSP), 423
Spanish, 358, 410, 425–429
spatio-temporal index, 516–517
specific language impairment (SLI), 31–32
specific learning disorders
  and pragmatic impairment, 20–21
speech, 303, 306–307, 309–312
  and acceptability and intelligibility, 472, 474, 479
  and characteristics, 472, 473, 478, 479
  and long-domain phenomena, 477
  and progressive change, 473, 475, 476, 477, 478
  and sampling, 478–479
  and Speech Sound Disorder, 471, 475
  and sub-phonemic / covert contrast, 476, 478
  child speech, 305–306
  clinical speech, 313
  disordered speech, 307
  spontaneous speech, 306, 313
  speech capacity, 306
  speech sounds, 304, 307–308
speech act theory, 5
speech comprehensibility, 606–607
  distinguishing from speech intelligibility, 607
speech intelligibility, 605–611
  distinguishing from speech comprehensibility, 607
  instrumental assessment of, 607–609
  measures, 609–611
  perceived phonological deviation (PPD), 610
  signal-complementary confounds, 606
  with speech production disorders, 607–609
speech-language pathologists, 453, 457, 461
speech-language pathology, creating social justice in, 84
speech-language therapist, 304
speech perception, 539–554
  measures of, 539
speech production, 160–161, 202–204
  and probed speech, 202–204
  and spontaneous speech, 160, 202–204
speech sound, 381, see also speech sound disorder
  acquisition, 381
  inventory, 383
  metrics (PCC, PVC), 383
speech sound disorder (SSD), 386, 387, 443–445, 453–454, 597, 625–627
  and vowel errors, 396–398, 403
  classification of, 458
  subgroups, 454, 458–459
spontaneous language analysis, 250–251, 253
stammering, 75, see also stuttering
stimulability testing, 454, 456, 463
stopping, 339
structure, 4, 7
  microstructure, 4
  macrostructure, 4
  superstructure, 4
structuralism, 310
structuralist theory, 310
stuttering, 107, see also stammering
  in coarticulation studies, 582
substitution process, 303, 339
Swahili, 410
Swedish, 415, 417, 438
syllable, 303, 306–308, 317–326, 351–353
  and neurogenic disorders, 323–326
  deletion/omission/reduction, 303, 306
  developmental aspects of, 321–323
  harmony, 303
  production, 311
  structure deviations, 303
syntactic deficit
  and feature specification, 166
  and DLD, 159–161, 163, 166–170, 172–173
  and relativized minimality, 168–169
  and trace deletion, 168–169, 173
  and tree pruning, 165–166, 172–173, 206
  and treatment, 171–172
  and wh-questions, 164–170, 172
  and working memory, 163, 170, 172
syntactic processing, 189–190
  in neurotypical children, 190–191
  in children with DLD, 192–194
  in acquired disorders, 195–197
syntax
  and pragmatic impairment, 56, 58–60, 64–65
systemic functional linguistics, 6, 99–110
  and expression, 106
  and intervention, 108
  and lexicogrammar, 104–106
  and semantics, 103–104
  clinical applications of, 106–108
  glossary of terms, 109
  key concepts of, 100–101
systems, 303–304, 306, 310–312
  consonant and vowel systems, 306
  phonemic system, 311
  phonological system, 303–308, 311–312
  production system, 305
  pronunciation system, 306
  semiotic system, 308
  sound system, 303–304, 311, 313
  stress system, 307
Tagalog, 411
TalkBank, 143
TalkBankDB, 151
talk-in-interaction, 69
task dynamics, 333
task impurity problem, 276
TBIBank, 143
technology, 10
testing, 183
theories, 5–7, 10
theory of mind, 17, 47
  and pragmatic impairment, 57, 59–61, 65
thought and language, 273
tone, in coarticulation studies, 577
transcription, 326–327
traumatic brain injury (TBI), 42, 76, 107, 219–220
  and cognition and language, 279
tremor, 563
Tulu, 414
Turkish, 410
turn-taking, 70, 117–118, 123
ultrasound, 496–497
  in coarticulation studies, 578
units of analysis, 7, 8, 9
usage-based linguistics, 365–368, 438, 447
  acquisition and, 368–370
  clinical application of, 371–373
utterance-level analysis of articulation, 515
variable, 81
varieties, non-prestige, 83
video tracking, 490–491
videokymography (VKG), 532–533
videolaryngostroboscopy (VLS), 532–533
Vietnamese, 266
vision impairment
  and coarticulation, 583
vocalization, 341
voice assessment, 523–533
  indirect assessment techniques: instrumental acoustic assessment, 523–529
  semi-direct assessment techniques: electroglottography, 529–531
  direct assessment techniques: visual examination, 531–533
voice onset time (VOT), 513–514
vowel errors (vowel error patterns)
  and intelligibility, 398
  assessment of, 400
  incidence of, 396
  independent analysis of, 394
  relational analysis of, 394
  treatment of, 401–403
vowels, 318, 322–323, 355–358
  accuracy of, 394, 397, 399
  acoustic realization of, 391
  acquisition (development) of, 393–396
  charts, 507
  diphthongs, 392, 397, 399, 401
  F2 slope, 510–511
  features, 391–392
  formants, 507
  General American English, 392
  in government phonology, 355–356
  percentage correct (PVC), 383
  rhotic, 392, 394, 397–398
  space area, 507–510
  stress, 395, 397, 399
  suprasegmental aspects of, 395, 398–399, 403
  syncope, 357–358
  timing of, 397, 399, 403
  transition, 510
weak syllable deletion, 337
whole word
  accuracy, 412
  complexity, 412
  proximity, 412
whole word match (WWM), 414
Williams syndrome, 589, 595
word fluency task, 275
word production, consistency of, 455, 457–458, 463
word semantics, see semantics, word
Xhosa, 410
x-rays, 495–496
Yorkshire English, 397, 620
Yupik, 423
Yurok, 422
Zapotec, 429–430
!Xu, 421