
Handbook of Research on Perspectives in Foreign Language Assessment

Dinçay Köksal, Çanakkale 18 Mart University, Turkey
Nurdan Kavaklı Ulutaş, Izmir Demokrasi University, Turkey
Sezen Arslan, Bandırma 17 Eylül University, Turkey

A volume in the Advances in Educational Technologies and Instructional Design (AETID) Book Series

Published in the United States of America by
IGI Global
Information Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue, Hershey, PA 17033, USA
Tel: 717-533-8845 • Fax: 717-533-8661
E-mail: [email protected] • Web site: http://www.igi-global.com

Copyright © 2023 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

Names: Köksal, Dinçay, editor. | Kavaklı Ulutaş, Nurdan, 1988- editor. | Arslan, Sezen, 1987- editor.
Title: Handbook of research on perspectives in foreign language assessment / Dinçay Köksal, Nurdan Kavaklı Ulutaş, and Sezen Arslan, editors.
Description: Hershey PA : Information Science Reference, [2023] | Includes bibliographical references and index. | Summary: “This book takes an informed look at researching perspectives on foreign language assessment through reflections on classroom applications and makes recommendations to strengthen quality language assessments by drawing on a variety of research methodologies” -- Provided by publisher.
Identifiers: LCCN 2022034096 (print) | LCCN 2022034097 (ebook) | ISBN 9781668456606 (hardcover) | ISBN 9781668456613 (ebook)
Subjects: LCSH: Language and languages--Ability testing. | Language and languages--Study and teaching. | LCGFT: Essays.
Classification: LCC P53.4 .H37 2023 (print) | LCC P53.4 (ebook) | DDC 418.0076--dc23/eng/20220831
LC record available at https://lccn.loc.gov/2022034096
LC ebook record available at https://lccn.loc.gov/2022034097

This book is published in the IGI Global book series Advances in Educational Technologies and Instructional Design (AETID) (ISSN: 2326-8905; eISSN: 2326-8913).

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

For electronic access to this publication, please contact: [email protected].

Advances in Educational Technologies and Instructional Design (AETID) Book Series

Lawrence A. Tomei, Robert Morris University, USA

ISSN: 2326-8905  EISSN: 2326-8913

Mission

Education has undergone, and continues to undergo, immense changes in the way it is enacted and distributed to both child and adult learners. In modern education, the traditional classroom learning experience has evolved to include technological resources and to provide online classroom opportunities to students of all ages regardless of their geographical locations. From distance education and Massive Open Online Courses (MOOCs) to electronic tablets in the classroom, technology is now an integral part of learning and is also affecting the way educators communicate information to students. The Advances in Educational Technologies & Instructional Design (AETID) Book Series explores new research and theories for facilitating learning and improving educational performance utilizing technological processes and resources. The series examines technologies that can be integrated into K-12 classrooms to improve skills and learning abilities in all subjects, including STEM education and language learning. Additionally, it studies the emergence of fully online classrooms for young and adult learners alike, and the communication and accountability challenges that can arise. Trending topics that are covered include adaptive learning, game-based learning, virtual school environments, and social media effects. School administrators, educators, academicians, researchers, and students will find this series to be an excellent resource for the effective design and implementation of learning technologies in their classes.

Coverage
• Instructional Design Models
• E-Learning
• Hybrid Learning
• Higher Education Technologies
• K-12 Educational Technologies
• Social Media Effects on Education
• Online Media in Classrooms
• Virtual School Environments
• Web 2.0 and Education
• Digital Divide in Education

IGI Global is currently accepting manuscripts for publication within this series. To submit a proposal for a volume in this series, please contact our Acquisition Editors at [email protected] or visit: http://www.igi-global.com/publish/.

The Advances in Educational Technologies and Instructional Design (AETID) Book Series (ISSN 2326-8905) is published by IGI Global, 701 E. Chocolate Avenue, Hershey, PA 17033-1240, USA, www.igi-global.com. This series is composed of titles available for purchase individually; each title is edited to be contextually exclusive from any other title within the series. For pricing and ordering information please visit http://www.igi-global.com/book-series/advances-educational-technologies-instructional-design/73678. Postmaster: Send all address changes to above address. Copyright © 2023 IGI Global. All rights, including translation into other languages, are reserved by the publisher. No part of this series may be reproduced or used in any form or by any means – graphic, electronic, or mechanical, including photocopying, recording, taping, or information and retrieval systems – without written permission from the publisher, except for non-commercial, educational use, including classroom teaching purposes. The views expressed in this series are those of the authors, but not necessarily of IGI Global.

Titles in this Series

For a list of additional titles in this series, please visit: http://www.igi-global.com/book-series/advances-educational-technologies-instructional-design/73678

Multifaceted Analysis of Sustainable Strategies and Tactics in Education
Theresa Dell Neimann (Oregon State University, USA), Lynne L. Hindman (Oregon State University, USA), Elena Shliakhovchuk (The Polytechnic University of Valencia, Spain), Marian Moore (Austin Community College, USA), and Jonathan J. Felix (RMIT University, Vietnam)
Information Science Reference • © 2023 • 300pp • H/C (ISBN: 9781668460351) • US $215.00

Learning With Escape Rooms in Higher Education Online Environments
Alexandra Santamaría Urbieta (Universidad Internacional de La Rioja, Spain) and Elena Alcalde Peñalver (Universidad de Alcalá, Spain)
Information Science Reference • © 2023 • 356pp • H/C (ISBN: 9781668460818) • US $215.00

Handbook of Research on Learning in Language Classrooms Through ICT-Based Digital Technology
Rajest S. Suman (Bharath Institute of Higher Education and Research, India), Salvatore Moccia (EIT Digital Master School, Spain), Karthikeyan Chinnusamy (Veritas, USA), Bhopendra Singh (Amity University, Dubai, UAE), and R. Regin (SRM Institute of Science and Technology, India)
Information Science Reference • © 2023 • 359pp • H/C (ISBN: 9781668466827) • US $270.00

Handbook of Research on Shifting Paradigms of Disabilities in the Schooling System
Hlabathi Rebecca Maapola-Thobejane (University of South Africa, South Africa) and Mbulaheni Obert Maguvhe (University of South Africa, South Africa)
Information Science Reference • © 2023 • 435pp • H/C (ISBN: 9781668458006) • US $270.00

Promoting Diversity, Equity, and Inclusion in Language Learning Environments
Karina Becerra-Murillo (Jurupa Unified School District, USA & American College of Education, USA) and Josefina F. Gámez (Jurupa Unified School District, USA)
Information Science Reference • © 2023 • 300pp • H/C (ISBN: 9781668436325) • US $215.00

Engaging Students With Disabilities in Remote Learning Environments
Manina Urgolo Huckvale (William Paterson University, USA) and Kelly McNeal (School of Education, Georgian Court University, Lakewood, USA)
Information Science Reference • © 2023 • 305pp • H/C (ISBN: 9781668455036) • US $215.00

701 East Chocolate Avenue, Hershey, PA 17033, USA
Tel: 717-533-8845 x100 • Fax: 717-533-8661
E-Mail: [email protected] • www.igi-global.com

Editorial Advisory Board
Hung Phu Bui, University of Economics Ho Chi Minh City, Vietnam
Samantha M. Curle, University of Bath, UK
Howard Giles, Department of Communication, University of California, Santa Barbara, USA & School of Psychology, The University of Queensland, Brisbane, Australia
Elena Parubochaya, Volgograd State University, Russia
Salim Razı, Çanakkale Onsekiz Mart University, Turkey
Elena P. Schmitt, Southern Connecticut State University, USA

List of Reviewers
Ferzan Atay, Hakkari University, Turkey
Zülal Ayar, Izmir Katip Celebi University, Turkey
Selami Aydın, Istanbul Medeniyet University, Turkey
Mehmet Bardakçı, Gaziantep University, Turkey
Ayfer Su Bergil, Amasya University, Turkey
Moritz Brüstle, Cooperative State University Baden-Wuerttemberg Mosbach, Germany
Sevcan Bayraktar Çepni, Trabzon University, Turkey
Kürşat Cesur, Çanakkale Onsekiz Mart University, Turkey
Emrah Cinkara, Gaziantep University, Turkey
Nilüfer Can Daşkın, Hacettepe University, Turkey
Sinem Doğruer, Trakya University, Turkey
Eda Duruk, Pamukkale University, Turkey
Sibel Ergün Elverici, Yildiz Technical University, Turkey
Ali Erarslan, Alanya Alaaddin Keykubat University, Turkey
Vasfiye Geçkin, Izmir Democracy University, Turkey
Gülten Genç, Inonu University, Turkey
İpek Kuru Gönen, Anadolu University, Turkey
Aydan Irgatoğlu, Ankara Hacı Bayram Veli University, Turkey
Ayşegül Liman Kaban, Bahcesehir University, Turkey
Işıl Günseli Kaçar, Middle East Technical University, Turkey
Dilşah Kalay, Dumlupinar University, Turkey
Pınar Karahan, Pamukkale University, Turkey
Ali Karakaş, Burdur Mehmet Akif Ersoy University, Turkey
Süleyman Kasap, Yuzuncu Yil University, Turkey
Ufuk Keleş, Bahçeşehir University, Turkey
Eylem Perihan Kibar, Cukurova University, Turkey
Özkan Kırmızı, Karabük University, Turkey
Muhlise Coşgun Ögeyik, Trakya University, Turkey
Zekiye Özer, Niğde Ömer Halisdemir University, Turkey
Cem Özışık, Istanbul Kultur University, Turkey
Turan Paker, Pamukkale University, Turkey
Hilal Peker, University of Central Florida, USA
Sevgi Şahin, Başkent University, Turkey
Umut Salihoğlu, Uludağ University, Turkey
Aylin Sevimel-Sahin, Anadolu University, Turkey
Osman Solmaz, Dicle University, Turkey
Sinem Hergüner Son, Gazi University, Turkey
Mehmet Takkaç, Ataturk University, Turkey
Achu Tante, University of Buea, Cameroon
Ayşegül Takkaç Tulgar, Ataturk University, Turkey
Kutay Uzun, Trakya University, Turkey
Aylin Yardımcı, Kahramanmaras Sutcu Imam University, Turkey
Demet Yaylı, Pamukkale University, Turkey
Ceyhun Yükselir, Osmaniye Korkut Ata University, Turkey
Nurcihan Yürük, Selcuk University, Turkey
Gülin Zeybek, Isparta Suleyman Demirel University, Turkey

List of Contributors

Abang, Lovelyn Chu / University of Buea, Cameroon .... 204
Ahmad, Muhammad / Government College University, Pakistan .... 232
Akman, Ezgi / Pamukkale University, Turkey .... 329
Atsan, Begum / İzmir Democracy University, Turkey .... 284
Brüstle, Moritz / Cooperative State University Baden-Wuerttemberg Mosbach, Germany .... 11
Can, Esma / Kütahya Dumlupinar University, Turkey .... 254
Geçkin, Vasfiye / Izmir Democracy University, Turkey .... 71
Hatipoğlu, Çiler / Middle East Technical University, Turkey .... 104
Haznedar, Belma / Bogazici University, Turkey .... 136
Höl, Devrim / Pamukkale University, Turkey .... 329
Isaoglu, Lubana / Istanbul University-Cerrahpasa, Turkey .... 48
Kalay, Dilşah / Kütahya Dumlupinar University, Turkey .... 254
Kavaklı Ulutaş, Nurdan / Izmir Demokrasi University, Turkey .... 1
Keleş, Ufuk / Faculty of Educational Sciences, Bahçeşehir University, Turkey .... 156
Köksal, Dinçay / Çanakkale Onsekiz Mart University, Turkey .... 36
Orman, Zeynep / Istanbul University-Cerrahpasa, Turkey .... 48
Özturan, Tuba / Erzincan Binali Yıldırım University, Turkey .... 89
Ozturk, Nesrin / İzmir Democracy University, Turkey .... 284
Peker, Hilal / University of Central Florida, USA .... 181
Sevimel-Sahin, Aylin / Anadolu University, Eskisehir, Turkey .... 306
Shakir, Aleem / Government College University, Pakistan .... 232
Siddique, Ali Raza / Government College University, Pakistan .... 232
Tante, Achu Charles / University of Buea, Cameroon .... 204
Ulum, Ömer Gökhan / Mersin University, Turkey .... 36
Uysal Gürdal, Hacer Hande / Hacettepe University, Turkey .... 89
Vogt, Karin / University of Education Heidelberg, Germany .... 11



Table of Contents

Foreword .... xviii
Preface .... xx
Acknowledgment .... xxvi

Section 1: Language Domains

Chapter 1
Revisiting the Past to Shape the Future: Assessment of Foreign Language Abilities .... 1
Nurdan Kavaklı Ulutaş, Izmir Demokrasi University, Turkey

Chapter 2
The Challenge of Assessing Intercultural Competence: A Review of Existing Approaches .... 11
Moritz Brüstle, Cooperative State University Baden-Wuerttemberg Mosbach, Germany
Karin Vogt, University of Education Heidelberg, Germany

Chapter 3
Culturally-Biased Language Assessment: Collectivism and Individualism .... 36
Ömer Gökhan Ulum, Mersin University, Turkey
Dinçay Köksal, Çanakkale Onsekiz Mart University, Turkey

Section 2: Methods of Language Assessment

Chapter 4
Mispronunciation Detection Using Neural Networks for Second Language Learners .... 48
Lubana Isaoglu, Istanbul University-Cerrahpasa, Turkey
Zeynep Orman, Istanbul University-Cerrahpasa, Turkey

Chapter 5
Assessing the Learning Outcomes of Robot-Assisted Second Language Learning .... 71
Vasfiye Geçkin, Izmir Democracy University, Turkey

Chapter 6
Dynamic Assessment as a Learning-Oriented Assessment Approach .... 89
Tuba Özturan, Erzincan Binali Yıldırım University, Turkey
Hacer Hande Uysal Gürdal, Hacettepe University, Turkey

Chapter 7
Flipped Spiral Foreign Language Assessment Literacy Model (FLISLALM) for Developing Preservice English Language Teachers’ Language Assessment Literacy .... 104
Çiler Hatipoğlu, Middle East Technical University, Turkey

Section 3: Language Assessment in Education

Chapter 8
Teaching and Assessment in Young Learners’ Classrooms .... 136
Belma Haznedar, Bogazici University, Turkey

Chapter 9
The Long-Term Washback Effect of University Entrance Exams: An EFL Learner and Teacher’s Critical Autoethnography of Socialization .... 156
Ufuk Keleş, Faculty of Educational Sciences, Bahçeşehir University, Turkey

Chapter 10
Dynamic Assessment in an Inclusive Pre-K FLEX Program Within Universal Design for Learning (UDL) Framework .... 181
Hilal Peker, University of Central Florida, USA

Chapter 11
An Analysis of the General Certificate Examination Ordinary Level English Language Paper and Students’ Performance .... 204
Achu Charles Tante, University of Buea, Cameroon
Lovelyn Chu Abang, University of Buea, Cameroon

Chapter 12
Evaluation of ESL Teaching Materials in Accordance With CLT Principles through Content Analysis Approach .... 232
Muhammad Ahmad, Government College University, Pakistan
Aleem Shakir, Government College University, Pakistan
Ali Raza Siddique, Government College University, Pakistan

Section 4: Perspectives in Language Assessment

Chapter 13
Language Assessment: What Do EFL Instructors Know? What Do EFL Instructors Do? .... 254
Dilşah Kalay, Kütahya Dumlupinar University, Turkey
Esma Can, Kütahya Dumlupinar University, Turkey

Chapter 14
Parents’ Perceptions of Foreign Language Assessment and Parental Involvement .... 284
Nesrin Ozturk, İzmir Democracy University, Turkey
Begum Atsan, İzmir Democracy University, Turkey

Chapter 15
Academic Integrity in Online Foreign Language Assessment: What Does Current Research Tell Us? .... 306
Aylin Sevimel-Sahin, Anadolu University, Eskisehir, Turkey

Chapter 16
A Bibliometric Analysis on “E-Assessment in Teaching English as a Foreign Language” Publications in Web of Science (WoS) .... 329
Devrim Höl, Pamukkale University, Turkey
Ezgi Akman, Pamukkale University, Turkey

Compilation of References .... 356
About the Contributors .... 410
Index .... 415

Detailed Table of Contents

Foreword .... xviii
Preface .... xx
Acknowledgment .... xxvi

Section 1: Language Domains

Chapter 1
Revisiting the Past to Shape the Future: Assessment of Foreign Language Abilities .... 1
Nurdan Kavaklı Ulutaş, Izmir Demokrasi University, Turkey

Hailing the value of foreign language assessment, this chapter embarks on reflections from classroom practices in order to forecast the future of foreign language assessment, which is molded by a historical perspective. In doing so, it provides a recent contribution to the field of foreign language assessment by demonstrating to practitioners how they can make the best out of their assessment practices by addressing both theoretical and practical issues and listing recommendations in order to empower quality language assessment.

Chapter 2
The Challenge of Assessing Intercultural Competence: A Review of Existing Approaches .... 11
Moritz Brüstle, Cooperative State University Baden-Wuerttemberg Mosbach, Germany
Karin Vogt, University of Education Heidelberg, Germany

An ever-faster developing globalization of our world brings many changes with it and poses a variety of challenges for human society. Accordingly, living and working in today’s times also bring new demands for institutionalized education all around the world. This chapter will give a detailed insight into the undeniable significance of intercultural competence and its place within foreign language education due to these developments. Following this, the discordant state of defining intercultural competence and agreeing on a common model will be explained, while also providing viable solutions to these challenges. Finally, ways of assessing this complex construct are discussed, aiming for a pragmatic approach usable especially in the foreign language classroom.





Chapter 3
Culturally-Biased Language Assessment: Collectivism and Individualism .... 36
Ömer Gökhan Ulum, Mersin University, Turkey
Dinçay Köksal, Çanakkale Onsekiz Mart University, Turkey

This chapter suggests that culture and evaluation are inextricably linked. Therefore, culture should not be regarded as a phenomenon that needs to be controlled for in assessments; rather, it should be regarded as a fundamental component of assessment, beginning with its conceptualization and continuing through its design, construction, and interpretation of student performance. This study aims to discuss the relationship of culture to language assessment, teachers’ awareness of cultural and linguistic bias in testing, and the negative effect of test bias on learners’ motivation and performance; the ways of minimizing linguistic and cultural bias in language tests and maximizing cultural validity in classroom-based language assessment; and to find out how learners and teachers from different cultures view success from both a language learning and a cultural perspective.

Section 2: Methods of Language Assessment

Chapter 4
Mispronunciation Detection Using Neural Networks for Second Language Learners .... 48
Lubana Isaoglu, Istanbul University-Cerrahpasa, Turkey
Zeynep Orman, Istanbul University-Cerrahpasa, Turkey

Speaking a second language fluently is the aim of any language learner. Computer-aided language learning (CALL) systems help learners achieve this goal, and mispronunciation detection can be considered the most helpful component in CALL systems. For this reason, much current research focuses on mispronunciation detection systems. There are different methods for mispronunciation detection, such as posterior probability-based methods and classifier-based methods. Recently, deep-learning-based methods have also attracted great interest and are being studied. This chapter reviews the research proposing neural network methods for mispronunciation detection for second language learners conducted between 2014 and 2021. The results obtained from studies in the literature and comparisons between different techniques are also discussed.

Chapter 5
Assessing the Learning Outcomes of Robot-Assisted Second Language Learning .... 71
Vasfiye Geçkin, Izmir Democracy University, Turkey

Robot-assisted language learning (RALL) explores the role of educational humanoid robots in the learning of first and second language(s). Today, there is no definitive answer as to its effectiveness in the long run. Some studies report that adult L2 learners benefit from RALL in learning words, while children enjoy only small gains from vocabulary instruction. The kind of feedback humanoid robots can provide in an ongoing conversation rarely goes beyond facial expressions or words of encouragement. The need to upgrade the skills of educational robots and concerns with data privacy and abusive behavior towards robots are some challenges faced in RALL today. Moreover, not a single study has examined the role of RALL in assessing the reading abilities and pragmatic knowledge of L2 learners. This chapter focuses on the effectiveness of RALL in assessing word learning, pragmatics, grammar, listening, speaking, and reading skills in an L2 and discusses its reflection for future classroom applications.

Chapter 6
Dynamic Assessment as a Learning-Oriented Assessment Approach .... 89
Tuba Özturan, Erzincan Binali Yıldırım University, Turkey
Hacer Hande Uysal Gürdal, Hacettepe University, Turkey

This chapter presents the theoretical background of dynamic assessment (DA) and its praxis with pedagogical suggestions for foreign language writing instructional settings. Resting on Vygotsky’s sociocultural theory (1978), DA asserts that there is a need for blending instruction with assessment because of the salience of social interaction in cognition modification. Thus, DA adopts learning-and-learner-based feedback approaches and a present-to-future model of assessment, which rests on reciprocal teacher-learner interaction. Grounded in the need to shed light on DA in an EFL setting, this chapter presents reciprocal interactions between a teacher and four students. The interaction analyses unveil that the teacher adopted a variety of mediational moves to finely instruct the students and diagnose their microgenesis, and that the students displayed various reciprocity acts towards the mediational moves provided to them, which unpacks each student’s zone of proximal development. Based on these findings, the chapter ends with suggestions for EFL writing teachers.

Chapter 7
Flipped Spiral Foreign Language Assessment Literacy Model (FLISLALM) for Developing Preservice English Language Teachers’ Language Assessment Literacy .... 104
Çiler Hatipoğlu, Middle East Technical University, Turkey

Foreign language (FL) assessment is one of the most critical and challenging areas for pre-service FL teachers to develop. It is essential since various studies have shown that typical teachers spend up to half of their professional time in assessment-related activities. The area is difficult because its theoretical concepts are highly abstract. For these reasons, experts have for several years emphasised the necessity of developing the language assessment literacy (LAL) of pre-service FL teachers and encouraged academics to research this area. In response to these calls, this chapter describes the developmental stages of the flipped spiral language assessment literacy model (FLISLALM) used to teach the undergraduate English Language Testing and Evaluation course in an FL teacher training program in Turkey. The model was developed using Brindley’s and Giraldo’s LAL frameworks and data from student questionnaires, product outputs, and self-assessment presentations collected between 2009 and 2020. The model aims to maximise prospective Turkish FL teachers’ LAL growth.



Section 3: Language Assessment in Education

Chapter 8
Teaching and Assessment in Young Learners’ Classrooms .... 136
Belma Haznedar, Bogazici University, Turkey

Language learning in early childhood has been the subject of great interest both in first language (L1) and second language (L2) acquisition research. For the past 40 years, we have witnessed significant advances in the study of child language, with particular references to the cognitive, linguistic, psychological, pedagogical, and social aspects of child language. This chapter aims to shed light on some of the theoretical paradigms and their implications on language learning and assessment in young children whose exposure to another language begins early in life. In view of the diversity facing pedagogical practices worldwide, the authors aim to show the connection between classroom practices and assessment tools appropriate for young language learners, with special reference to formative and ongoing assessment.

Chapter 9
The Long-Term Washback Effect of University Entrance Exams: An EFL Learner and Teacher’s Critical Autoethnography of Socialization .... 156
Ufuk Keleş, Faculty of Educational Sciences, Bahçeşehir University, Turkey

In this chapter, the author explores the long-term washback effects of taking the nationwide university entrance exam (UEE) on his L2 socialization. He scrutinizes how he used his agency to break away from such effects in his later life. His theoretical framework incorporates L2 socialization theory and the concept of “desire” in TESOL. Methodologically, he employs critical autoethnography of socialization. The findings reveal that his L2 socialization was shaped by studying for and being taught to the test (aka the UEE), which greatly helped him earn a place at a top-notch university yet created many obstacles in his undergraduate studies and professional life. The findings further showed that the UEE’s format was, to some extent, egalitarian in that it provided high schoolers from low socio-economic status families with the opportunity to study in prestigious universities.

Chapter 10
Dynamic Assessment in an Inclusive Pre-K FLEX Program Within Universal Design for Learning (UDL) Framework .... 181
Hilal Peker, University of Central Florida, USA

This chapter discusses how dynamic assessment (DA) is utilized for both instruction and assessment by using the universal design for learning (UDL) framework to support inclusive education of young learners with special needs in a program offering French as a foreign language (FFL). The author focuses on incorporating DA in order to better understand student learning in this inclusive, prekindergarten FFL program. There have been some studies conducted with foreign language programs at the elementary level and higher, or with typical young learners in an English as a second or foreign language setting; however, there are not enough studies focusing on foreign language programs with special needs students (SNSs), because these programs are not often available to many SNSs due to the practice of exemption. Thus, it is crucial to use DA as a tool for both instruction and assessment to be able to understand SNSs’ needs and learning gains. In this chapter, DA is examined and implications for inclusive education are provided.



Chapter 11
An Analysis of the General Certificate Examination Ordinary Level English Language Paper and Students’ Performance .... 204
Achu Charles Tante, University of Buea, Cameroon
Lovelyn Chu Abang, University of Buea, Cameroon

This chapter sets out to analyse the Ordinary Level English Language Paper at the General Certificate of Examination from 2012–2015 within the English-speaking sub-system in Cameroon. Five specific research objectives were formulated to guide the study, which used the survey research design. The population of the study comprised 45 English language teachers/examiners and 260 Forms Four and Five students (approximately 14–15 years old). Qualitative and quantitative data were collected: two sets of questionnaires were developed for teachers and students, and an interview guide for heads of department and examiners. Documentation was also employed, such as past GCE questions from 2012–2015, end-of-marking subject reports, and the O/L English language syllabus. Data analysed using the Pearson product-moment correlation showed that there was a correlation between assessment objectives, test content, test item development, assessment rubrics, and students’ performance in English language. Based on the findings, recommendations were made.

Chapter 12
Evaluation of ESL Teaching Materials in Accordance With CLT Principles through Content Analysis Approach .... 232
Muhammad Ahmad, Government College University, Pakistan
Aleem Shakir, Government College University, Pakistan
Ali Raza Siddique, Government College University, Pakistan

Owing to the rising need for English for communication at a global level, experts have stressed the significance of teaching English supported by materials based on communicative language teaching (CLT) principles to facilitate the development of communicative competence. This study therefore aims to evaluate ESL teaching materials to check their suitability for developing learners’ communicative competence. For this purpose, the study employs a content analysis approach to analyse the text of an English textbook designed for Class Two in the light of a checklist devised on CLT principles. The results reveal that the content of the said textbook does not conform to CLT principles and is therefore not suitable for facilitating the development of communicative competence in the learners. The study suggests either improving/revising the textbook or replacing it with a more suitable one.



Section 4: Perspectives in Language Assessment

Chapter 13
Language Assessment: What Do EFL Instructors Know? What Do EFL Instructors Do? .... 254
Dilşah Kalay, Kütahya Dumlupinar University, Turkey
Esma Can, Kütahya Dumlupinar University, Turkey

Teachers/instructors have the critical role of bridging teaching and assessment: the more knowledgeable the teachers/instructors are, the more effective the assessment becomes. It follows that language instructors should integrate various assessment strategies into their teaching to make better decisions about learners’ progress, which highlights the term “assessment literacy.” Besides language instructors’ knowledge, what they do in classrooms deserves attention. Language assessment practices are the strategies/methods instructors use in classrooms to reach precise and objective evaluations of students’ language development. Within this scope, the purpose of the current study is two-fold: first, to investigate the language assessment knowledge of language instructors and, second, to identify their language assessment practices in classrooms. Based on the findings, it is critical to understand not only what language instructors know but also what they do in classes. As a result, the ultimate goal of standardization in language assessment could be attained.

Chapter 14
Parents’ Perceptions of Foreign Language Assessment and Parental Involvement .... 284
Nesrin Ozturk, İzmir Democracy University, Turkey
Begum Atsan, İzmir Democracy University, Turkey

International and national foreign language education policies recognize the invaluable role of parents. Because parents’ perceptions of foreign language assessment may initiate parental involvement behavior, a qualitative descriptive study was conducted to investigate the phenomenon. Data were collected from 25 parents via semi-structured interviews and analyzed thematically. Findings confirmed that parents’ understanding of foreign language proficiency pertains to communicative use of the language. However, assessment practices at schools are test-driven, and they may not be authentic, valid, and criterion-referenced practices. Parents, moreover, highlighted a need for assessment literacy; nevertheless, they do not get any support from any stakeholders. Also, the outcomes of assessment practices may initiate parental involvement behaviors that pertain to parenting, helping, communicating effectively, and learning at home. This study highlights an urgent need to improve parents’ foreign language assessment literacy and parental involvement behaviors in order to enrich learners’ development.



Chapter 15
Academic Integrity in Online Foreign Language Assessment: What Does Current Research Tell Us? .... 306
Aylin Sevimel-Sahin, Anadolu University, Eskisehir, Turkey

The immediate transition to online teaching due to the pandemic has required institutions to employ online assessment more frequently than ever. However, most teachers, students, and schools were not ready for that, and because of the challenges they faced, they have not planned and practiced their assessment methods effectively in online settings. One of these challenges is the difficulty of sustaining academic integrity in digital environments, and many studies have already concluded that there is a huge increase in dishonest behaviors in online assessment tasks. Yet academic integrity is an indispensable concept for improving teaching and learning by performing reliable, valid, and secure assessments, especially on online platforms. The purpose of this chapter, then, is to discuss academic integrity in relation to online foreign language assessment practiced during the pandemic by presenting the background to online assessment, academic integrity, and their relationship, and by reporting the recent research studies within this scope.

Chapter 16
A Bibliometric Analysis on “E-Assessment in Teaching English as a Foreign Language” Publications in Web of Science (WoS) .... 329
Devrim Höl, Pamukkale University, Turkey
Ezgi Akman, Pamukkale University, Turkey

This chapter examines international publications on e-assessment in second/foreign language teaching from the Web of Science (WoS) using the bibliometric method, one of the literature review tools. In particular, the most prolific countries, annual scientific production, the most globally cited documents, authors, institutions, keywords, and changing research trends were analyzed. A total of 3352 research documents from the WoS Core Collection database, covering publications until June 2022, were included in the analysis. The data were analyzed with the open-source RStudio environment and “biblioshiny”, the web interface of the bibliometrix R package. Based on the data analysis and discussions of these documents, this study reveals some important results that will contribute to the study of trends in e-assessment in second/foreign language teaching. (An illustrative sketch of this kind of descriptive analysis follows this listing.)

Compilation of References .... 356
About the Contributors .... 410
Index .... 415
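Chapter 16's own analysis was carried out in R through biblioshiny, the web interface of the bibliometrix package. Purely as a hedged companion illustration of the descriptive indicators the chapter reports, the Python sketch below derives annual scientific production and the most-cited documents from a hypothetical WoS export; the file name and the WoS field tags used (PY = publication year, TC = times cited, AU = authors) are assumptions to verify against an actual export.

import pandas as pd

# Hypothetical Web of Science export; the file name and the field tags
# below (PY = year, TC = times cited, AU = authors) are assumptions and
# must be checked against the actual export format.
records = pd.read_csv("savedrecs.csv")

annual_production = records.groupby("PY").size()             # papers per year
most_cited = records.nlargest(10, "TC")[["AU", "PY", "TC"]]  # top-cited documents

print(annual_production.tail())
print(most_cited)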


Foreword

LANGUAGE AND EDUCATION – HEARTS, ACTION, AND IMAGINATION

Across the world many learn, think, write, do research and publish in languages which are not the language of their hearts and homes. As a means to advance opportunities, additional (foreign) language learning and assessment is a global phenomenon. From a very young age already, of necessity we exit a place of pride and esteem that is our home language – be it Turkish, Afrikaans, Urdu – and enter the world of learning and writing in an additional language as a currency towards the imagined futures we dream of. In this sometimes uncomfortable, unknown space we aim as best we can, in a language which is not our own, to offer the education space we work in an alternative glimpse of the socio-culturally and linguistically diverse world we occupy: as researchers, educators, students, office bearers, community partners, policymakers.

This inevitable need to contribute to knowledge and advance learning in an additional language is unequal. Pluralism (Odora-Hoppers, 2004; Santos, 2007) is not easily merged with additional (foreign) language learning and assessment. Not being (fully) allowed to be present and write in the language that beats to the rhythms of your heart aligns with views of cognitive injustice (Cossa, 2013; Veintie, 2013; Santos, 2007; Mignolo, 2000), and Fraser’s (2009) structural notions of distributive (in-)justice. She pleads that decision-makers – here, in the education research sphere – make decisions that will lead to ideal states of justice. Here the purpose of research (on assessment of additional language) would be to generate knowledge that will assure that available resources for language-in-education are (i) re-distributed (rather than maldistributed so that some are privileged in their access to publications, quality research, and standards of language and assessment practices) to be (ii) representative (rather than misrepresentative of only some, often Global North, Eurocentric views on the topic) of varied perspectives in order to ensure (iii) the recognition of a multiplicity of sociocultural and epistemological lenses.

The editors and authors of this volume respond to the unevenness of the education-and-language playing field and a need for cognitive justice. Their contributions do not dwell on Fraser’s meritorious aim of crafting structural opportunities to (re-)distribute resources for linguistic diversity in the education research (and social mobility) spaces. Rather, the careful compilation of chapters in this volume aligns more with Sen’s (2011) agentic social justice view. The focus here is to include those who experience the injustice (here the cognitive injustice of not being able to advance as researcher – or global citizen? – in the absence of an additional, dominant language) as agents who act to work towards forms of life they value – where their hearts are. Arguably, for non-dominant language speakers who aspire in this space themselves to support peers, students (and themselves) to teach and learn and assess in a foreign language, an ideal form of life they value could be to include their evidence into the center of knowledge creation on additional language assessment.

And herein lies the power of this volume on additional (foreign) language assessment. The chapters show evidence from a collective of those living with the practice of education in an additional language – figuratively sitting around a (new) table to think of this stepchild language of our hearts. The chapters draw on evidence from non-typical Global North spaces, with voices from Cameroon, Pakistan, Turkey, and the United States, to make sense of what is needed by a multiplicity of education role-players (be they researchers, students, teachers, parents, or powerful journal editors and reviewers) for high quality ‘foreign language assessment.’ It is evidence generated on topics by those in the know. The chapters are by researchers whose hearts are involved every day, who are most touched by the need to learn and teach and do research in an additional language, and to assess that instructional practices afford quality learning. The volume is a global act of minimal standards and universal provision of conditions, culture, relationships and resources that may inform future foreign language assessment practices, thereby promoting the imagined lives researchers and students may have when evidence and the languages of their hearts merge.

Liesel Ebersöhn
Department of Educational Psychology, University of Pretoria, South Africa

REFERENCES

Cossa, J. (2013). Power dynamics in international negotiations towards equitable policies, partnerships, and practices: Why it matters for Africa, the developing world, and their higher education systems. African and Asian Studies, 12(1-2), 100–117. doi:10.1163/15692108-12341253

Fraser, N. (2009). Scales of justice: Reimagining political space in a globalizing world. Columbia University Press.

Mignolo, W. D. (2000). Local histories/global designs: Coloniality, subaltern knowledges, and border thinking. Princeton University Press.

Odora-Hoppers, C. (2004). Culture, indigenous knowledge and development: The role of the university. Centre for Education Policy Development.

Santos, B. S. (2007). Epistemologies of the south: Justice against epistemicide. Routledge.

Sen, A. (2011). The idea of justice. Harvard University Press.

Veintie, T. (2013). Coloniality and cognitive justice: Reinterpreting formal education for the indigenous peoples in Ecuador. International Journal of Multicultural Education, 15(3), 45–60. doi:10.18251/ijme.v15i3.708


Preface

As a predominant teaching paradigm, teaching English as a foreign language (EFL) has increasingly become one of the crucial elements that lead to career accomplishment for students, and thus foreign language assessment has emerged as a major topic in foreign language learning. In the same vein, the use of English as a lingua franca (ELF) has gained momentum and yielded nascent pedagogical implications for EFL and foreign language assessment. Inevitably, assessment is no longer synonymous with conventional testing but now stands alongside alternative forms, and teachers of foreign languages cannot disregard the pedagogical implications that have blossomed in this changing educational context.

In the growing body of literature, our knowledge of foreign language assessment has recently been shaped by skills assessment. Here, language assessment is regarded as the verification of learners’ language proficiency, especially in formal contexts. Learners’ language proficiency is determined by their actual performance in each foreign language; thus, the cognitive, affective, linguistic, and socio-cultural meanings of language forms are closely scrutinized, with a principal focus on communication and creativity in language use. At the same time, rapid technological and socio-cultural developments have a considerable effect on language use, prompting a revisiting of the assessment of foreign language abilities. This also reveals a potential mismatch between language teachers’ theoretical knowledge and their practical considerations for classroom practice, an issue discussed as teachers’ assessment literacy. Accordingly, qualitative techniques such as observation checklists, verbal protocol analysis, and discourse analysis, and quantitative techniques such as measurement models, data mining, and statistical methods have emerged as mainstream components of foreign language assessment.

The present volume therefore presents a series of research studies on foreign language assessment from different perspectives in order to provide a foundation as to why foreign language assessment as a discipline should be refocused with caution, what sort of theoretical and practical implications should be in place for foreign language teachers, and in what ways it can be possible to provide futuristic perspectives on foreign language assessment for test developers and users involved in the process of language assessment. More specifically, this handbook takes an informed look at research perspectives on foreign language assessment through reflections on classroom applications and gives recommendations to strengthen quality language assessments by drawing on a variety of research methodologies.

The target audience of this coedited book is composed of professionals, educators, and researchers working in the field of foreign language education, together with foreign language learners. In addition, foreign language education undergraduates, test designers and/or test-item developers, practicing foreign language teachers (especially those involved in assessment processes), stakeholders in high-stakes testing, and test administrators are also targeted. From a global viewpoint, this coedited volume can be used in graduate- and undergraduate-level courses as a principal textbook, or as supplemental reading material for researchers, test developers, and/or educators from both within and beyond Europe.

To elaborate, this handbook is organized into four main sections: language domains (i.e., culture and language assessment, language skills assessment, assessment of teaching materials); methods of language assessment (i.e., flipped spiral foreign language assessment, robot-assisted language assessment, dynamic assessment); language assessment in education (i.e., language assessment in K-12 schools, assessment in higher education, assessing pre-school students with special needs, washback); and perspectives in language assessment (i.e., linguistic perspectives, historical perspectives, parent-driven perspectives, research-driven perspectives). These are explored with contributions from prominent scholars in the field and are summarized in the organization of the handbook below.

ORGANIZATION OF THE HANDBOOK

Section 1: Language Domains

Chapter 1

Prof. Kavaklı Ulutaş provides a recent contribution to the field of foreign language assessment by demonstrating to practitioners how they can make the best out of their assessment practices by addressing both theoretical and practical issues, and listing recommendations in order to empower quality language assessment, as summarized below:

Hailing the value of foreign language assessment, this chapter embarks on reflections from classroom practices in order to forecast the future of foreign language assessment, which is molded by a historical perspective.

Chapter 2

Mr. Brüstle and Prof. Vogt introduce the challenge of assessing intercultural competence through existing approaches within the scope of the language domain nestled by culture and language assessment, in order to aim for a pragmatic approach implemented in language classrooms, as summarized below:

This chapter will give a detailed insight into the undeniable significance of intercultural competence and its place within foreign language education due to these developments. Following this, the discordant state of defining intercultural competence and agreeing on a common model will be explained, while also providing viable solutions to these challenges. Finally, ways of assessing this complex construct are discussed, aiming for a pragmatic approach usable especially in the foreign language classroom.

Chapter 3

Prof. Ulum and Prof. Köksal introduce the language domain molded by culture and language through the concepts of collectivism and individualism, in order to minimize cultural and linguistic bias in classroom-based foreign language assessment, and thereby show how to embed cultural tidbits without throwing off the learners, as summarized below:

This chapter aims to discuss the relationship of culture to language assessment; teachers’ awareness of cultural and linguistic bias in testing and the negative effect of test bias on learners’ motivation and performance; the ways of minimizing linguistic and cultural bias in language tests and maximizing cultural validity in classroom-based language assessment; and to find out how learners and teachers from different cultures view success from the point of collectivism and individualism.

Section 2: Methods of Language Assessment Chapter 4 Mrs. Isaoglu and Prof. Orman introduce the methods by means of a longitudinal approach in order to provide a basis for the implementation of neural network methods for the detection of mispronunciation as summarized below: This chapter reviews the research that proposed neural network methods for mispronunciation detection conducted between 2014 and 2021 for second language learners. The results obtained from studies in the literature and comparisons between different techniques are also discussed.

Chapter 5

Prof. Geçkin introduces the kind of feedback robots can provide in the learning of first and second language(s) to gauge their effectiveness in the long run, as summarized below:

This chapter focuses on the effectiveness of robot-assisted language learning (RALL) in assessing word learning, pragmatics, grammar, listening, speaking, and reading skills in an L2 and discusses its reflection for future classroom applications.

Chapter 6

Prof. Özturan and Prof. Uysal Gürdal introduce the theoretical background of dynamic assessment together with its praxis to provide pedagogical implications and suggestions for teaching writing in language classrooms, as summarized below:

This chapter presents the theoretical background of dynamic assessment (DA) and its praxis with pedagogical suggestions for foreign language writing instructional settings. Given the salience of social interaction in cognition modification and the resulting need to blend instruction with assessment, DA adopts learning-and-learner-based feedback approaches and a present-to-future model of assessment to provide a basis for reciprocal teacher-learner interaction, and suggestions for EFL writing teachers.


Chapter 7

Prof. Hatipoğlu introduces the significance of developing foreign language teachers' Language Assessment Literacy (LAL) by means of the 'Flipped Spiral Language Assessment Literacy Model' (FLISLALM), in response to their ongoing efforts in assessment-related activities, which involve highly critical and abstract concepts, as summarized below: This chapter describes the developmental stages of the "Flipped Spiral Language Assessment Literacy Model" (FLISLALM) used to teach the undergraduate "English Language Testing and Evaluation" course in an FL teacher training program in Turkey to maximize prospective Turkish FL teachers' LAL growth.

Section 3: Language Assessment in Education

Chapter 8

Prof. Haznedar introduces the theoretical paradigms together with their pedagogical implications on language learning and assessment in the development of child language with references to cognitive, linguistic, psychological, pedagogical, and social aspects, as summarized below: This chapter aims to shed light on some of the theoretical paradigms and their implications on language learning and assessment in young children whose exposure to another language begins early in life. In view of the diversity facing pedagogical practices across the world, we hope to show the connection between classroom practices and assessment tools appropriate for young language learners, with special reference to formative and ongoing assessment.

Chapter 9

Prof. Keleş introduces the long-term washback effect of a high-stakes examination on L2 socialization by incorporating L2 socialization theory and the concept of 'desire' in TESOL as a theoretical framework, and critical autoethnography as a methodology to discuss the findings, as summarized below: This chapter explores the long-term washback effect of taking the nationwide university entrance exam on L2 socialization by scrutinizing the researcher's use of agency to break away from them in the academic and professional life.

Chapter 10

Prof. Peker introduces the significance of incorporating Dynamic Assessment (DA) in order to comprehend student learning(s) and gain(s) in an inclusive, prekindergarten French as a foreign language (FFL) program, as summarized below: This chapter discusses how Dynamic Assessment (DA) is utilized for both instruction and assessment by using the Universal Design for Learning (UDL) framework to support inclusive education of young learners with special needs in a program offering French as a foreign language (FFL).


Chapter 11

Prof. Tante and Mrs. Abang introduce the correlation among assessment objectives, test content, item development, assessment rubrics, and students' performance in the English language in order to provide a basis for both teachers and students, together with heads of departments and examiners, by means of quantitative and qualitative research findings, as summarized below: This chapter sets out to analyze the Ordinary Level English Language Paper at the General Certificate of Examination from 2012–2015 within the English-speaking sub-system in Cameroon through the utilization of five objectives formulated to guide the research elaborated within.

Chapter 12

Mr. Ahmad, Prof. Shakir, and Mr. Siddique introduce the assessment of ESL teaching materials, specifically an English textbook used to develop learners' communicative competence, by employing a content analysis approach, as summarized below: This chapter aims to evaluate ESL teaching materials to check their suitability for developing learners' communicative competence. The chapter, for this purpose, employs a content analysis approach for the analysis of a textbook of English designed for class two in the light of a checklist devised on communicative language teaching (CLT) principles.

Section 4: Perspectives in Language Assessment

Chapter 13

Prof. Kalay and Ms. Can introduce the critical role(s) of foreign language teachers in order to integrate various assessment strategies into their teaching to provide meaningful decisions on foreign language learners' progress within the perspective of assessment literacy, as summarized below: This chapter is constituted around two main purposes: first, to investigate the language assessment knowledge of language instructors, and second, to identify their language assessment practices in classrooms.

Chapter 14

Prof. Ozturk and Ms. Atsan introduce international and national foreign language education policies to recognize the invaluable role of parents in parenting, effective communication, and learning at home in order to understand learners' foreign language learning outcomes, and to enhance parents' foreign language assessment literacy, as summarized below: Because parents' perceptions of foreign language assessment may initiate any parental involvement behavior, this chapter entails a qualitative descriptive study conducted to investigate the phenomenon in order to highlight the significance of parents' foreign language assessment literacy development and parental involvement behaviors to enrich learners' development.


Chapter 15

Prof. Sevimel-Sahin introduces the perceived difficulty and challenges faced in sustaining academic integrity, specifically in online language learning environments, while performing reliable, valid, and secure assessment practices as a result of the immediate transition to online teaching, as summarized below: The purpose of this chapter is to discuss the academic integrity concept in relation to online foreign language assessment practiced during the pandemic by presenting the background to online assessment, academic integrity, and their relationship, and reporting the recent research studies within this scope.

Chapter 16

Prof. Höl and Ms. Akman introduce the trends in e-assessment in second/foreign language teaching with the utilization of the open-source RStudio program and "biblioshiny for bibliometrix" as an R program tool to analyze data comprised of thousands of research documents from the Web of Science (WoS) Core Collection database, as summarized below: This chapter aims to examine the e-assessment in second/foreign language teaching themed international publications until June 2022 from Web of Science (WoS) using the bibliometric method, one of the literature review tools.

Dinçay Köksal
Çanakkale 18 Mart University, Turkey

Nurdan Kavaklı Ulutaş
Izmir Demokrasi University, Turkey

Sezen Arslan
Bandırma 17 Eylül University, Turkey



Acknowledgment

This co-edited book would not have existed without the amazing efforts of its authors, unified through foreign language testing and assessment although geographically apart. We are extremely excited to have authors from Türkiye, Germany, Cameroon, Pakistan, and the United States. Their studies, journeys, and commitment to the field of language testing and assessment to promote foreign language education, together with foreign language teacher education, are remarkable. We hope that this handbook will engage us all in future local and international collaborations with each other.

We are grateful to our Editorial Advisory Board for their diligent work and support throughout the process of this book, specifically for their valuable editing and feedback on the chapters submitted. We would also like to thank all the Chapter Reviewers, for without their attention to detail, expertise, and thoughtful care for topics, this project would not have been complete. Despite their hard work on their own chapters, these authors took the extra time to review the assigned chapters with extreme care for quality. Many thanks to: Moritz Brüstle, Ufuk Keleş, Achu Tante, Hilal Peker, Vasfiye Geçkin, Devrim Höl, Dilşah Kalay, and Aylin Sevimel-Sahin.

We extend special thanks to Prof. Liesel Ebersöhn (Director of the Centre for the Study of Resilience; President Elect of the World Education Research Association, WERA) for supporting our effort by writing the Foreword to this handbook, and for her relentless advocacy in the understanding of language testing and assessment in the scholarly traditions embedded in language education.

We are deeply touched by IGI Global's commitment to both local and international endeavors and inquiries. Special thanks to the supportive staff at IGI Global, in particular, to Jan Travers, Sierra Miron, Melissa Wagner, Nina Eddinger, Katelyn McLoughlin, and Emma Baronak, among others on the Development and Marketing Teams.

We would also like to thank our families, partners, and friends for their love, support, and understanding throughout this project.


Section 1

Language Domains


Chapter 1

Revisiting the Past to Shape the Future:

Assessment of Foreign Language Abilities

Nurdan Kavaklı Ulutaş
https://orcid.org/0000-0001-9572-9491
Izmir Demokrasi University, Turkey

ABSTRACT

Hailing the value of foreign language assessment, this chapter embarks on reflections from classroom practices in order to forecast the future of foreign language assessment, which is molded by a historical perspective. In doing so, it provides a recent contribution to the field of foreign language assessment by demonstrating to practitioners how they can make the best out of their assessment practices by addressing both theoretical and practical issues and listing recommendations in order to empower quality language assessment.

INTRODUCTION

As an integral part of instructional endeavors, assessment is basically defined as the activities assigned to learners by teachers in order to diagnose learning proficiency or achievement, which are directly influenced by their learning experiences (Cheng, Rogers, & Wang, 2007). In doing so, assessment is not only important for ascertaining the achievement of educational objectives but also for the continuity of improvement and learning progress. However, assessment is not an unequivocal process because assessing learner performance cannot be isolated from the socio-historical frame of reference in which it prevails (McNamara, 2000). Accordingly, assessment has undergone a change over time, emanating from one paradigm to another. In order to discuss the current trends in foreign language assessment, it will be helpful to review a brief history of the development of language testing from both theoretical and practical perspectives.

Assessment of language constructs started with discrete items, which, in the case of foreign language assessment, targeted individual language skills. It was followed by integrative assessment, which was later furnished by communicative testing. Since testing trends evolved around the major theme of 'language learning as a dynamic process', assessment of language abilities was remarkably shaped by the notion of a robustly defined criterion, wavering between criterion-referenced and norm-referenced testing. Therefore, different facets of language constructs and intertwined variables could then be assessed in order to have a more comprehensive view of foreign language abilities. Traditional assessment was concerned with the language product in order to identify learning weaknesses; it was conducted at the end of the language course and thus labelled summative assessment. However, the shifting paradigm has shown that language learning is regarded as a process; therefore, a need for more dynamic and authentic assessment methods has mushroomed. This situation has shifted assessment concerns towards formative and/or continuous assessment. What is more, some alternative methods are also favored by educators, practitioners, and researchers in the field, such as peer-assessment, portfolios, and self-assessment (Poehner & Inbar-Lourie, 2020). With this in mind, the assessment of foreign language abilities will be revisited below from a diachronic perspective by responding to some basic questions:

1. Has foreign language assessment been shaped by a new conceptualization in a recently changing educational landscape? If so, should educators reshape their foreign language assessment practices by means of futuristic methods?
2. How should educators select the best method for assessing their students' language abilities? And then, how should educators revise their assessment procedures in the classroom environment in line with various methodological frameworks and epistemologies?

Valuing the paramountcy of foreign language assessment, this chapter specifically embarks on reflections from classroom practices in order to improve teachers' (language) assessment literacy by demonstrating different assessment types and methods, and forecasts the future of foreign language assessment, which is mostly molded by new language teaching and learning methods, so as to show practitioners how they can make the best out of their assessment practices. Finally, addressing both theoretical and practical issues in foreign language assessment, recommendations for future directions will be noted in order to empower quality language assessment, as well.

LAY OF THE LAND: WHERE WERE WE?

Assessment of language learning has been the focal point of a myriad of researchers, syllabus designers, test developers, and teacher-testers. Therefore, it is a vital component of the educational process, serving different purposes such as achievement, progress, and diagnosis, among others. As language assessment refers to "the act of collecting information and making judgments on a language learner's understanding of a language and his ability to use it" (Chapelle & Brindley, 2002, p. 267), it is an interpretation of the test taker's ability to utilize some aspects of language. And it has long been assumed that assessing foreign language abilities has been doomed to the notion of an "idealized" native speaker, although a broader angle on the language construct has been dealt with by assessment experts without any preliminary concern for who the language user is. The language construct, herein, is remarked as "a proficiency, ability, or characteristic of an individual that has been inferred from observed behavioral consistencies and that can be meaningfully interpreted" (Chapelle, Enright, & Jamieson, 2009, p. 3). It also carries paramount weight with high-stakes test developers and users, since determining the factors used to decide on the language skills and proficiency of language user(s) to be admitted to a course, enrolled for a language program, accepted as a citizen in a country, admitted as an immigrant, and/or diagnosed as a proficient user of a language is far from trivial (Shohamy & McNamara, 2009).

Since choosing appropriate assessment methods is of critical importance, the concepts of reliability, validity, test usefulness, impact, interactiveness, and practicality have been influential in testing and assessment practices. In this regard, Bachman (2004) states that "language tests thus have the potential for helping us collect useful information that will benefit a wide variety of individuals. However, to realize this potential, we need to be able to demonstrate that scores we obtain from language tests are reliable, and that the ways in which we interpret and use language test scores are valid. If the language tests we use did not provide reliable information, and if the uses we make of these test scores cannot be supported with credible evidence, then we risk making incorrect and unfair decisions that will be potentially harmful to the very individuals we hope to benefit" (p. 3). To address reliability and validity, psychometric development in language assessment evolved with the emergent notion of "discrete elements" (Lado, 1961), today known as norm-referenced (NR) testing, a kind of testing in which students' performances are compared to one another. As another paradigm pertaining to language assessment, criterion-referenced (CR) testing blossomed in contrast to norm-referenced testing. While NR analyzes learners' performance relative to that of other learners rather than to language constructs, CR measures external tenets, such as language ability, learning outcomes, and learning objectives. However, modern theories of testing have mostly set NR aside in favor of CR on the grounds that the measurement of language skills is to be carried out by means of well-defined criteria. Shortly afterwards, item response theory (IRT) was introduced, through which a student's ability is assessed along a "sole development scale" (Masters, 1990, p. 58). As different statistical software is in use, more comprehensive definitions of the language constructs and abilities can then be envisioned. Since the actual language abilities of the learners can be measured therein, possible distractors and confounding variables may be accounted for, so that students are assumed to obtain the "true score" in a real-world context. Thus, language may be viewed in both integrative and communicative alternates. With the "true" definition of the language construct and ability, the significance of the language discourse and related patterns could be understood comprehensively by means of "contextualized interactions" (Chalhoub-Deville, 2003; de Jong, 1990; Messick, 1994). For instance, according to Messick (1994), there are two frameworks exploited to interpret the test results and item design, namely task-centered and competency-centered, where the former draws on interpretations of the test results in order to specify the language construct(s) theoretically, while the latter calls for a meaningful interpretation of the types of tasks in a real-world context in order to develop test (items) as simulations of real-world performance.
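To make the psychometric notions above more tangible, a minimal illustrative sketch follows. It is not drawn from the works cited in this chapter; it merely instantiates the simplest IRT formulation, the one-parameter (Rasch) model, in which the probability of a correct response depends only on the distance between a test taker's ability and an item's difficulty, both placed on one common scale. The function name and the numbers are invented for illustration.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the one-parameter (Rasch) model.

    Ability (theta) and item difficulty (b) sit on the same logit
    scale, i.e., a single development scale in the IRT sense.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A hypothetical learner (theta = 0.5) facing three items of increasing
# difficulty: the easier the item, the higher the probability of success.
theta = 0.5
for b in (-1.0, 0.5, 2.0):
    print(f"item difficulty {b:+.1f}: P(correct) = {rasch_probability(theta, b):.2f}")
```

The sketch illustrates why IRT departs from norm-referenced comparison: learners and items are located on a shared continuum, so an estimate of the "true score" does not depend on the particular group that happens to sit the test.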
Messick's distinction is confirmed by the theory of Bachman and Palmer (1996) with the notion of the "target language use" (TLU) domain, which stands for ensuring construct validity in assessment for educational purposes. Moreover, interpreting test results is a significant point in the aforementioned theories, since these interpretations, together with their justifiable arguments, are fundamental to addressing the concept of test bias, which is linked to the socio-cognitive context. In this vein, Chapelle et al. (2009) assert the premise that proficiency in a foreign language is composed of two folds: knowledge and use of the language. In that, pure knowledge of grammar and/or vocabulary could only lead to a fuzzy vision of the language construct, which, in turn, gives wrong interpretations about the test takers' performance. This is also confirmed by previous research (see Bachman & Palmer, 1996; McNamara, 1996). For instance, Bachman and Palmer (1996) previously noted that an individual could manifest his/her language use through four aspects: language knowledge, personal characteristics, topical knowledge, and strategic competence, all of which are influential in defining the construct(s) of a foreign language ability.

Though assessment is undertaken for various purposes (i.e., from formative to summative assessment), the major purpose is to support learning, which occurs if students are (Cameron et al., 1998): "thinking, problem-solving, constructing, transforming, investigating, creating, analyzing, making choices, organizing, deciding, explaining, talking and communicating, sharing, representing, predicting, interpreting, assessing, reflecting, taking responsibility, exploring, asking, answering, recording, gaining new knowledge, and applying that knowledge to new situations" (p. 6). To do so, what constitutes or sounds like 'good' language assessment should be remarked cautiously: language assessment should serve clear purposes in order to reflect appropriate targets of achievement; language assessment should depend on an appropriate assessment method within the scope of the purpose and target noted within; and language assessment should reflect the achievement of the students appropriately in order to control all other relevant sources of distortion and bias (Stiggins, 2007). In sum, effective language assessment is purposive, with a clear target and relevant objectives, in order to enhance students' language abilities. What's more, new challenges have also arisen in order to provide a more comprehensive and appropriate definition of language abilities together with language constructs. To note herein, the introduction of a variety of assessment methods, also labeled as alternative assessment, and the advances in technology have paved the way towards an understanding of test authenticity from a different perspective. To exemplify, computers are used to administer language tests, which raises the question of whether test takers employ related strategies to show their language abilities as prima facie evidence of their nestled language abilities; these are elaborated below as the directions for the future of (foreign) language assessment.

DIRECTIONS FOR THE FUTURE: WHERE TO GO?

What is apropos about technology is that it contributes a great deal to the assessment of students' performance (Buck, 2001). The devices range from phones to other digital devices, albeit mostly in the form of computers. Accordingly, computer-adaptive testing (CAT) is carried out in order to ensure fairness, validity, and reliability both for administering and scoring, together with the employment of "true" statistical software, such as SPSS, AMOS, FACETS, etc. Thanks to CAT, test items are "tailored to the apparent ability level of the testee" (Alderson, 1990, p. 21), since different test items can be applied in order to test the language construct of language abilities, like "matching techniques, language transformational tasks, reordering tasks, information transfer techniques, and other objectively scored item types" (p. 23). In that, scrutiny of the interpretations has helped to define test takers' futures in relation to their performance, so much so that language programs, textbooks in use, the frequency of language exams, the nature of language exams, and even the future of language teachers, together with the other stakeholders who are responsible for preparing the test takers for the language test to be taken, can be defined as future acts to be molded.
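Purely as an illustration, the sketch below shows the kind of selection-and-update loop that produces the "tailoring" quality of CAT described above. The function name, the item bank, the step-halving update rule, and the stopping criterion are simplifying assumptions made for this example, not features of any operational testing system.

```python
def run_adaptive_test(item_bank, answer, theta=0.0, step=1.0, max_items=5):
    """A deliberately simplified computer-adaptive testing loop.

    item_bank: list of item difficulties (logits), e.g. [-2.0, -1.0, ...]
    answer:    callable taking an item difficulty and returning True/False
    theta:     provisional ability estimate, updated after each response
    """
    remaining = sorted(item_bank)
    administered = []
    for _ in range(min(max_items, len(remaining))):
        # Pick the unused item whose difficulty is closest to the
        # current ability estimate -- the "tailoring" step of CAT.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        correct = answer(item)
        # Crude update: move the estimate up after a success, down
        # after a failure, with a shrinking step size.
        theta += step if correct else -step
        step /= 2
        administered.append((item, correct, round(theta, 2)))
    return theta, administered

# Hypothetical test taker who succeeds on items easier than 0.8 logits.
final_theta, log = run_adaptive_test(
    [-2.0, -1.0, 0.0, 1.0, 2.0], answer=lambda b: b < 0.8)
print(final_theta, log)
```

Operational CAT engines replace the crude step-halving update with maximum-likelihood or Bayesian ability estimation, but the underlying convergence idea is the same: each response narrows the region in which the test taker's ability is assumed to lie.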

Beyond question, the foreign language teaching and learning paradigm shift from the early Audio-Lingual Method (ALM), passing by Communicative Language Teaching (CLT), and coming up to the era of Post-Method, has extended the understanding of language constructs, paving the way towards an alteration in the facets of second/foreign language abilities, due to the fact that language assessment is also informed by the variety of learning styles and second language acquisition (SLA) processes that reshape the learners' language abilities (Ellis, 1990). Thus, the dichotomy between knowledge of grammar as norm-oriented learning and language use for effective communication creates an obstacle for test item writers in the design process. Such variables are also noted by Davies (1990): "the five major variables involved in language proficiency are ill-defined and subject to unreliability. These variables are: the native speaker, the cutoff, the criterion score, the test, and the language itself" (p. 179); thus, "testing should be concerned with evidence-based validity" (Weir, 2005, p. 1). Fairness is also affected negatively, since test bias may emerge from race, gender, and other characteristics of those sitting for the same tests, which is reported as "differential item functioning" (DIF) in measures of language proficiency. There are numerous facets that might be intertwined with test bias, such as students' backgrounds as test takers, their attitudes, previous courses taken, exam anxiety, and/or motivation. This is also affected by teachers' conceptions and practices in relation to language assessment, and by their educational background together with their experience(s) with scoring, rating, and/or assessing.
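As a hedged illustration of how such bias is typically screened for, the sketch below compares two groups' success rates on a single item after matching test takers on their total score, which is the core intuition behind DIF analysis. The data, the function name, and the flagging threshold are invented for this example; operational analyses would rely on established procedures such as the Mantel-Haenszel statistic or logistic regression models rather than this bare comparison.

```python
from collections import defaultdict

def dif_screen(responses, flag_gap=0.15):
    """Naive DIF screen: compare two groups' success rates on one item
    after matching test takers on their total test score.

    responses: list of (group, total_score, item_correct) tuples,
               with group being "A" or "B".
    Returns per-score-band gaps; large gaps on many bands suggest DIF.
    """
    bands = defaultdict(lambda: {"A": [0, 0], "B": [0, 0]})  # [correct, n]
    for group, total, correct in responses:
        cell = bands[total][group]
        cell[0] += int(correct)
        cell[1] += 1
    report = {}
    for total, cells in sorted(bands.items()):
        (ca, na), (cb, nb) = cells["A"], cells["B"]
        if na and nb:  # only compare bands where both groups appear
            gap = ca / na - cb / nb
            report[total] = (round(gap, 2), abs(gap) >= flag_gap)
    return report

# Invented data: at equal total scores, group B succeeds less often on
# this item -- the pattern a DIF analysis is designed to catch.
data = [("A", 5, True), ("A", 5, True), ("B", 5, False), ("B", 5, True),
        ("A", 8, True), ("A", 8, True), ("B", 8, True), ("B", 8, False)]
print(dif_screen(data))
```

If an item shows a persistent gap across score bands, item writers would revisit its content for the cultural or background assumptions noted above.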

This intricacy can be resolved with the positive effects of washback in the language assessment process through assessment information, the consequences of teaching and learning for assessment, and decisions for the overall evaluation, which is more convincing for teachers as a response to unremitting changes. This convincing purpose of assessing students' performance lies behind the use of appropriate methods that will reflect the learners' actual language abilities together with learning objectives. As a supportive meritocracy, this is helpful for language teachers to provide meaningful feedback, since assessment should entail relevant purpose(s) in order to aid test designers and other policy makers in taking felicitous decisions. As noted by Bachman and Palmer (2010): "the reason why (original italics) we use an assessment is to collect information in order to make decisions. This information may be about our students' achievement of the learning objectives, about their perceptions, feelings, and attitudes towards language learning and the course of instruction" (p. 26). Therefore, students should be given formative feedback which is tailored to their abilities together with learning objectives, which paves the way towards effective classroom-based assessment both a priori and a posteriori. In doing so, language assessment literacy (LAL) is relevant not solely for students as language learners, but for language teachers, too.
In a broadened concept, assessment literacy is recognized as "the capacity to ask and answer critical questions about the purpose for assessment, about the fitness of the tool being used, about testing conditions, and about what is going to happen on the basis of the results" (Inbar-Lourie, 2008, cited in Watanabe, 2011, p. 29). In that, assessment literacy equips teachers with the required knowledge and tools to help them comprehend what and how they are assessing, so as to take decisions that maximize learning and assess students' performance effectively. To this end, alternative assessment (or alternate assessment) is promulgated as a nascent paradigm, which is featured as having an open-ended, untimed, and free-response format (Brown, 2004). The prevailing tendency is to direct students to learn how they learn so that they can cope with new technologies and environments for learning and, in fact, become autonomous. As stated by Gibbs (2006), "assessment frames learning, creates learning activity and orients all aspects of learning behaviors" (p. 23). Henceforth, "a move away from outcome-based assessment and towards more holistic, process-based assessment, such as portfolios and personal development planning" (Clegg & Bryan, 2006, pp. 218-219) is triggered in order to advocate performance-based assessment to assess higher-order thinking skills for a deeper understanding of students' utilization of language abilities. Performance-based assessment (PBA) is a type of alternative assessment which engages students with the task and task requirement(s) by means of their knowledge, skills, and learning strategies in order to create a product or generate a novel response. In this respect, this type of assessment is considered an authentic way of assessing students' knowledge and skills, since it provides a two-way interaction between the learning environment and the learners themselves (Yawkey, Gonzalez, & Juan, 1994). This type of assessment is mostly known through presentations, case-based scenarios, reflective pieces, concept maps, reports, portfolios, and the like. What's more, as a scion of an interventionist approach, dynamic assessment (DA) has developed from the theory of Vygotsky; it does not only measure the final product, but focuses on the process of development of the product (Poehner, 2008). Accordingly, the idea of "assessment of learning" has been changed to the view of "assessment for learning" (AfL) (Lidz & Gindis, 2003), and thus transferred to the field of second/foreign language education by providing continuous insight into students' development in lieu of learning outcomes, so as to fix any problem that may mushroom. Currently underway, the changing priorities in the field of education trigger the notion that reliance on discrete language constructs to define grammatical competence should be redirected towards the development and utilization of performance objectives and pragmatics of language, such as negotiation of meaning in the second/foreign language, language awareness, and communicative repertoire (Canagarajah, 2006). Moreover, we also see a growing interest in assessment practices that go beyond subject-based assessment towards 21st century skills assessment, which is embellished by soft skills and non-cognitive 21st century competencies. This gives us an understanding of students' global competence development; thus, critical considerations are noted in order to employ more analytical methodologies even for the assessment of life skills. Becoming more prominent, an awareness of such methodological demands is now more apparent with the changes in the educational landscape. Since our educational landscape has been transforming over time, teachers are expected to reshape their foreign language assessment practices by means of futuristic methods; otherwise, assessment will "remain strikingly similar to those that prevailed [a] century or more ago" (Broadfoot, 2009, p. vii). Besides, in order to revise their assessment procedures in the classroom environment in line with various methodological frameworks and epistemologies, it is also surmised that teachers are to develop their knowledge of assessment theories and practices.
Herein, training, especially in the early part of their careers, is needed (Department for Education, 2015, 2019) in order to enable them to feel confident with their existing knowledge of assessment approaches, strategies of questioning, critical consumption of (test) results, and performing interventions, if needed, with standardized (test) data. In that, classroom-based assessment is mostly nestled by classroom teachers, also known as the undeniable role of teacher agency; therefore, they should "be provoked and supported in trying to establish new practices in formative assessment" as "there being extensive evidence to show that the present levels of practice in this aspect of teaching are low" (Black & Wiliam, 1998, p. 61). Ultimately, not only are teachers expected to display their knowledge in language assessment, but they should also fill a practical lacuna by using their skills to design equitable (test) items in order to take reasonable decisions which will not influence the future of other stakeholders negatively in the long run. Thus, it is a crystal-clear fact that what good assessment means should be understood while recognizing different angles at different levels with various approaches and methods (Coombe, Al-Hamly, & Troudi, 2009). And, to use language assessment most effectively, assessment should be regarded as an important element to motivate students to learn; therefore, assessment methods and tools should not merely be used for assessing students' achievement and reporting the results, but also for enhancing the quality of language learning and teaching, together with that of language assessment, in turn.

REFERENCES

Alderson, C. (1990). Learner-centered testing through computers: Institutional issues in individual assessment. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 20–37). Multilingual Matters.

Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge University Press. doi:10.1017/CBO9780511667350

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.

Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford University Press.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. doi:10.1080/0969595980050102

Broadfoot, P. (2009). Signs of change: Assessment past, present and future. In C. Wyatt-Smith & J. Cummings (Eds.), Educational assessment in the 21st century: Connecting theory and practice (pp. v–xi). Springer.

Brown, H. D. (2004). Language assessment: Principles and classroom practices. Pearson Education.

Buck, G. (2001). Assessing listening. Cambridge University Press. doi:10.1017/CBO9780511732959

Cameron, C., Tate, B., Macnaughton, D., & Politano, C. (1998). Recognition without rewards. Peguis Publishers.

Canagarajah, S. (2006). Changing communicative needs, revised assessment objectives: Testing English as an international language. Language Assessment Quarterly, 3(3), 229–242. doi:10.1207/s15434311laq0303_1

Chalhoub-Deville, M. (2003). Second language interaction: Current perspectives and future trends. Language Testing, 20(4), 369–383. doi:10.1191/0265532203lt264oa

Chapelle, C., & Brindley, G. (2002). Assessment. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 267–286). Arnold.

Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2009). Building a validity argument for the Test of English as a Foreign Language. Routledge.


Cheng, L., Rogers, W. T., & Wang, X. (2007). Assessment purposes and procedures in ESL/EFL classrooms. Assessment & Evaluation in Higher Education, 33(1), 9–32. doi:10.1080/02602930601122555

Clegg, K., & Bryan, C. (2006). Reflections, rationales and realities. In C. Bryan & K. Clegg (Eds.), Innovative assessment in higher education (pp. 216–227). Routledge.

Coombe, C., Al-Hamly, M., & Troudi, S. (2009). Foreign and second language teacher assessment literacy: Issues, challenges and recommendations. Research Notes, 38, 14–18.

Davies, A. (1990). Operationalizing uncertainty in language testing: An argument in favor of content validity. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 179–195). Multilingual Matters.

de Jong, J. A. L. (Ed.). (1990). Individualizing the assessment of language abilities. Multilingual Matters.

Department for Education. (2015). Carter review of initial teacher training. Retrieved on August 25, 2022 from https://www.gov.uk/government/publications/carter-review-of-initial-teacher-training

Department for Education. (2019). Early career framework. Retrieved on August 25, 2022 from https://www.gov.uk/government/publications/early-career-framework

Ellis, R. (1990). Individual learning styles in classroom second language development. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 83–96). Multilingual Matters.

Gibbs, G. (2006). How assessment frames student learning. In C. Bryan & K. Clegg (Eds.), Innovative assessment in higher education (pp. 23–36). Routledge.

Inbar-Lourie, O. (2008). Constructing a language assessment knowledge base: A focus on language assessment courses. Language Testing, 25(3), 385–402. doi:10.1177/0265532208090158

Lado, R. (1961). Language testing. Longmans, Green and Co.

Lidz, C. S., & Gindis, B. (2003). Dynamic assessment of the evolving cognitive functions in children. In A. Kozulin, B. Gindis, V. S. Ageyev, & S. M. Miller (Eds.), Vygotsky's educational theory in cultural context (pp. 99–116). Cambridge University Press. doi:10.1017/CBO9780511840975.007

Masters, G. N. (1990). Psychometric aspects of individual assessment. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 56–70). Multilingual Matters.

McNamara, T. (1996). Measuring second language performance. Longman.

McNamara, T. (2000). Language testing. Oxford University Press.

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23. doi:10.3102/0013189X023002013

Poehner, M. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting second language development. Springer. doi:10.1007/978-0-387-75775-9


Poehner, M. E., & Inbar-Lourie, O. (Eds.). (2020). Toward a reconceptualization of second language classroom assessment: Praxis and researcher-teacher partnership. Springer. doi:10.1007/978-3-030-35081-9

Shohamy, E., & McNamara, T. (2009). Language tests for citizenship, immigration, and asylum. Language Assessment Quarterly, 6(1), 1–5. doi:10.1080/15434300802606440

Stiggins, R. J. (2007). Classroom assessment for student learning. Pearson Education.

Watanabe, Y. (2011). Teaching a course in assessment literacy to test takers: Its rationale, procedure, content and effectiveness. Research Notes, 46, 29–34.

Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Palgrave Macmillan. doi:10.1057/9780230514577

Yawkey, T. D., Gonzalez, V., & Juan, Y. (1994). Literacy and biliteracy strategies and approaches for young culturally and linguistically diverse children: Academic excellence P.I.A.G.E.T. comes alive. Journal of Reading Improvement, 31(3), 130–141.

ADDITIONAL READING

Alderson, J. C. (2005). Diagnosing foreign language proficiency: The interface between assessment and learning. Continuum.

Hidri, S. (Ed.). (2018). Revisiting the assessment of second language abilities: From theory to practice. Springer. doi:10.1007/978-3-319-62884-4

KEY TERMS AND DEFINITIONS

Alternative Assessment (or Alternate Assessment): It is promulgated as a nascent paradigm, which is featured as having an open-ended, untimed, and free-response format (Brown, 2004) through different modes of assessment.

Assessment: It is basically defined as the activities assigned to learners by teachers in order to diagnose learning proficiency or achievement, which are directly influenced by their learning experiences (Cheng, Rogers, & Wang, 2007).

Computer-Adaptive Testing (CAT): A kind of testing, administered and scored by means of computer technology, that pursues fairness, validity, and reliability by presenting items that are "tailored to the apparent ability level of the testee" (Alderson, 1990, p. 21).

Differential Item Functioning (DIF): It occurs if groups defined according to different variables, such as gender, age, educational background, and/or ethnicity, have different probabilities of endorsing a given item on a multi-item scale when controlled for overall scale scores.

Dynamic Assessment (DA): It is developed from the theory of Vygotsky; it does not only measure the final product, but focuses on the process of development of the product (Poehner, 2008).


Language Assessment Literacy (LAL): It is the general repertoire of one's knowledge, skills, and competences for using assessment methods at appropriate times with proper tools in order to comprehend, assess, and build language tests, and to analyze the scores.

Performance-Based Assessment (PBA): It is a type of alternative assessment which engages students with the task and task requirement(s) by means of their knowledge, skills, and learning strategies in order to create a product or generate a novel response.

Target Language Use (TLU): It is the domain which stands for ensuring construct validity in assessment for educational purposes.



Chapter 2

The Challenge of Assessing Intercultural Competence:

A Review of Existing Approaches

Moritz Brüstle
Cooperative State University Baden-Wuerttemberg Mosbach, Germany

Karin Vogt
https://orcid.org/0000-0001-6019-2655
University of Education Heidelberg, Germany

ABSTRACT

An ever-faster developing globalization of our world brings many changes with it and poses a variety of challenges for human society. Accordingly, living and working in today's times also bring new demands for institutionalized education all around the world. This chapter will give a detailed insight into the undeniable significance of intercultural competence and its place within foreign language education due to these developments. Following this, the discordant state of defining intercultural competence and agreeing on a common model will be explained, while also providing viable solutions to these challenges. Finally, ways of assessing this complex construct are discussed, aiming for a pragmatic approach usable especially in the foreign language classroom.

INTRODUCTION

The world is more globalized than it has ever been before and still, tomorrow it will be more so than today. This process of global interconnectedness seems to be driven and accelerated mainly by technological advancements of the last decades, such as the world wide web or easily accessible global transportation. Thus, the global interconnectedness of all aspects of contemporary social life (i.e., globalization) can for example be seen in an increased mobility and availability of goods, knowledge, capital, and, of course, people (Held et al., 1999). Although the debate on these developments often seems to be focused on economic chances and challenges, it is people - and their ability to interact - who are the cornerstone of successful globalization (Souto-Otero, 2020). More and more aspects of human life are adversely affected by globalization, and the ability to interact successfully and efficiently with culturally different others is becoming increasingly important. Not least, a labor market that is trying to keep up with this rapid global development also demands graduates with a competence profile that naturally takes the abilities, knowledge, and attitudes for successful international and intercultural teamwork into account (Lustig & Koester, 2010). Particularly the inevitable relevance for the success of multinational companies is repeatedly highlighted in the literature (overview in Matveev, 2017). Future employees are expected to act effectively and successfully in a multinational working environment (Committee for Economic Development, 2006; Lustig & Koester, 2010), and intercultural competence and its efficient development have to become a focus of educational institutions and the labor market all over the world (Spitzberg & Changnon, 2009). Though there seems to be agreement across disciplines on the significance of intercultural competence, defining the construct and finding a uniform model and assessment approach has traditionally proven to be rather difficult. In fact, even deciding on a common terminology for this capacity to engage in efficient intercultural interaction has yet to happen. This chapter will provide a comprehensible account of the origin of and rationale for the existence of this ambiguity, as well as an approach to finding consensus and commonalities of different definitions and models across disciplines. Furthermore, the equally discordant status of assessing this complex construct, especially in a classroom setting, is presented by giving an overview of a distinct selection of existing approaches.

THE SIGNIFICANCE OF INTERCULTURAL COMPETENCE

The globalization of the world constantly advances the potential of humanity. It has never been easier nor faster for information, goods or people to circle the globe (Matveev, 2017; Vogt, 2018). The world wide web can be a great indicator as well as amplifier for this global interconnectedness. However, most recently the global Covid pandemic has given an impressive glance at the many levels on which the world is (and sometimes is relying to be) interconnected: The virus spread in a matter of weeks due to easy worldwide travelling, global supply chains were disrupted and caused shortages of essential goods, and temporary travel bans prohibited people from moving and affected families, organizations, and companies all over the world. The list of impacts of the pandemic is obviously much longer and still growing; still, these are three authentic examples of the steadily increasing globalization. Precisely because of the multifaceted effects globalization can have on all aspects of contemporary social life, appropriate and effective communication and behavior in the international space have inevitably become one of the most central issues of the 21st century (Lustig & Koester, 2010). In order to properly portray the significance of intercultural competence in this paragraph, a working definition is needed. As such, intercultural competence is to be understood as the capacity to interact appropriately and effectively with culturally different others (based on Fantini, 2009). The key terms of this definition will be elaborated later in a dedicated paragraph of this chapter. Intercultural competence is consequently considered a key qualification for the 21st century (Bensel & Weiler, 2000; Brislin, 2010; Dodrige, 1999; Rott et al., 2003) and a decisive factor contributing to the employability of graduates (British Council, 2013; Crossman & Clarke, 2010; European Commission, 2014; Suarta et al., 2017). Today's globalized labor market poses an ever-growing range of diverse challenges of intercultural interactions as workplaces become more diverse in themselves as well as interconnected worldwide. Solving complex problems efficiently in multinational teams has become an essential qualification employers expect around the world (OECD, 2020). This has been confirmed by several studies in which employers have been asked to identify key qualifications or, sometimes more specifically, to evaluate the importance of intercultural competence. One of these was commissioned by the Association of American Colleges & Universities (AACU) and conducted in the form of an online survey by Hart Research Associates in 2015. In this survey, 400 employers and 613 college students in the USA were asked which tertiary learning outcomes they deem most important on a scale from 0 to 10. "The ability to analyze and solve problems with people from different backgrounds and cultures" was rated 8 or higher by 56% of the employers and 71% of the students. Strikingly, the top three learning outcomes were "the ability to effectively communicate orally", "the ability to work effectively with others in teams" and "the ability to effectively communicate in writing" (Hart Research Associates, 2015). In an attempt to gain a more global image, the British Council commissioned a survey of HR managers with 367 large enterprises in 9 different countries (at least N=40 per country): Brazil, China, India, Indonesia, Jordan, South Africa, the United Arab Emirates (UAE), the United Kingdom (UK), and the United States (US). The goal was to find out what employers mean by "intercultural skills", which of them they deem most important, and why (British Council, 2013). Apart from giving valuable insight into what employers recognize as intercultural skills and which benefits/risks they suspect, this study showed the relevance of intercultural skills for these employers: On a three-point Likert scale (unimportant to very important), a maximum of 15% of the surveyed HR managers said it was unimportant (Jordan and Indonesia had 0% saying unimportant) while at least 40% said it was very important. The only outlier was China, where 31% of the HR managers said intercultural skills are unimportant and only 25% said they are very important (British Council, 2013). In addition to highlighting the demand for these competences, a comprehensive study conducted by Tung (1987) has shown the consequences to which a lack of intercultural competence can lead for international companies. She found that considerable proportions of expatriate assignments by 80 U.S. American, 29 Western European, and 35 Japanese multinational employers failed (inability to perform effectively abroad and, hence, the need for the employee to be fired or recalled home), mainly due to "the manager's [and their spouse's] inability to adapt to a different physical or cultural environment" (Tung, 1987, p. 117). Considering the aforementioned pace of globalization, especially during the last two decades, this study surely has to be considered to be from another time. Yet Tung's findings are as relevant as ever, especially her insight that these failure rates can be traced back to a lack of intercultural competence and can therefore be addressed by offering respective training programs for managers (Tung, 1987). Tung's study also shows the necessity to broaden the focus to all employees, not only graduates that are currently trying to enter the workforce. It becomes clear that in order to ensure success and to thrive in an ever faster evolving economy, companies need to develop and deploy a core set of key qualifications within their staff, intercultural competence being one of them (Whittemore, 2018).
In addition to such research, large international companies such as Nike or Meta naturally include cultural diversity as a top priority in their company policies, thus creating an imperative for their workforce to develop the ability to cooperate successfully in diverse, multicultural teams (Spitzberg & Changnon, 2009). Globalization influences all aspects of human life, not only the working life of employers and employees all over the world. As such, the everyday life of people is evolving just as quickly and demanding a comparable set of key qualifications. Technological advancements make the world more interconnected, for example: The internet allows immediate communication across the globe, and low-threshold, easily accessible air travel enables and amplifies worldwide migration. According to the world migration report 2022 (McAuliffe & Triandafyllidou, 2021), almost 281 million people (3.6% of the world's population) had migrated to a country other than their country of birth. The report also shows that this number rapidly increased during the last 20 years, as it was only 2.8% of the world's population in the year 2000. This becomes even more striking considering that the number of international migrants increased only by 0.5% in the 30 years prior to that (2.3% in 1970 to 2.8% in 2000). Knowledge, skills, and attitudes to interact efficiently and appropriately with others that are culturally different from oneself (Fantini, 2009) are more important than ever before. According to the aforementioned migration report (McAuliffe & Triandafyllidou, 2021), this is particularly true for developed countries, as the USA and Germany are the two most popular destinations for migrants in 2022. This furthermore allows the assumption that everyday life in these countries becomes more multicultural due to migration. At the same time, tourist and excursionist mobility of people has been steadily increasing for the last 20 years, excluding the considerable dip in the last three years due to the pandemic (World Tourism Organization, 2022). Authentic interaction with culturally different people is becoming more and more common, and the demand for according competences is ever growing. Myron W. Lustig (2005) put it very strikingly in his presidential address over 15 years ago: "The current generation has a relational and a national task, an obligation, a requirement that, whether it prefers it or not, it must undertake. The task of the current generation can be summarized very succinctly: to create a well-functioning intercultural nation" (p. 2). Though Lustig is addressing the US American society in his speech, the many arguments above underline a global relevance of his demands. Because of its relevance across all disciplines and numerous aspects of everyday life, intercultural competence is considered to be transversal (Hecker, 2015), meaning that it transcends strictly disciplinary knowledge, skills, and attitudes (OECD, 2020). Not only does that mean that it is beneficial for many different professions, but it also means that study and training programs have to take intercultural competence into account, regardless of their specific discipline. The urgent demand for intercultural competence for successful human interaction in workplaces and everyday life therefore also arises at the educational institutions of this world. Their purposeful internationalization is a prevalent topic of 21st century education, and developments in this regard have even been underway since the 1990s (De Haan, 2014; Jones & de Wit, 2012). In Europe, for example, this development has been successfully driven by the Bologna Process, in which originally 29 European states agreed on common goals for the unification and centralization of the European Higher Education Area (EHEA, 1999). The main focus was to create comparable structures in European higher education and thus allow students to obtain internationally comparable degrees. One major step in this process was the introduction of the European Credit Transfer and Accumulation System (ECTS) to compare the workload and learning outcomes of curricula within the EHEA (European Commission, 2015). This suggests a very important outcome of the aforementioned internationalization that goes beyond structures and is concerned with the very content of curricula and a distinct focus on developing intercultural competence in the course of institutionalized education.
This is furthermore highlighted by the fact that the OECD has included "global competency" for the first time ever in their 2018 PISA assessment, because "coming to terms with globalization, this generation requires new capacities" (OECD, 2020, p. 5). Global competency is defined as "a multidimensional, life-long learning goal. Globally competent individuals can examine local, global and intercultural issues, understand and appreciate different perspectives and worldviews, interact successfully and respectfully with others, and take responsible action toward sustainability and collective well-being" (OECD, 2019, p. 166). Graduates of secondary and tertiary education are expected to act effectively and successfully in a multinational working environment (Committee for Economic Development, 2006; Lustig & Koester, 2010), which is why intercultural competence and its development have to be included in curricula on both levels. In many cases, this is done in the context of foreign language education. In the United States, for example, "knowledge and understanding of other cultures" was added to the National Standards for Foreign Language Learning more than 20 years ago and has since been further elaborated to "interacting with cultural competence and understanding" as well as "communicate and interact with cultural competence in order to participate in multilingual communities at home and around the world" (ACTFL, 2011). In Germany, too, intercultural competence has been one of three competence areas of first foreign language education (English or French) in lower secondary schools since 2003 (Standing Conference of the Ministers of Education of the Federal States, 2004). In higher secondary schools for advanced foreign language education (English or French), it was implemented as intercultural communicative competence in 2012 (Standing Conference of the Ministers of Education of the Federal States, 2014). For European education institutions, intercultural communicative competence was included in the well-established Common European Framework of Reference for Languages: Learning, teaching, assessment (CEFR) from the beginning (CEFR, 2001). In 2016 a "guide for the development and implementation of curricula for plurilingual and intercultural education" was published by the Council of Europe to further support institutions and curriculum planners in implementing this complex task and therefore realizing the potential of the CEFR (Beacco et al., 2016). The theoretical implementation in curricula world-wide of some form of intercultural competence begs the question of the practical outcomes, meaning the actual competence level of graduates. The aforementioned study commissioned by the AACU (Hart Research Associates, 2015) also asked employers whether recent college graduates were well prepared (on a zero-to-ten scale) in the 17 most important learning outcomes as identified previously in the same study. Interestingly, the 5 learning outcomes connected to intercultural competence had the lowest scores, with a mere 15-21% of surveyed employers giving an 8-10 rating and therefore agreeing that recent college graduates were well prepared in these areas (Hart Research Associates, 2015). The more global study commissioned by the British Council (British Council, 2013) also asked employers how the education system meets their intercultural skills needs (on a three-point scale from "not at all" to "a great deal"). In six of the nine surveyed countries (see above), at least 30% of the employers said "not at all". This sheds light on a sobering reality and underlines the imperative for foreign language educators to aspire to these learning outcomes in their teaching.

Language and Culture
Developing intercultural competence can be seen as a domain of foreign language education, because language and culture have a special connection. Language as a symbolic system must be seen as an integral part of human culture, not least because it enables humans to construct and understand themselves and their culture, as will be elaborated below (Kramsch, 2012; Fantini, 2012). Still, the disciplines in which intercultural competence is researched, as well as teacher education, often seem to disregard or miss this special connection. This can be assumed because most models of intercultural competence do not include a linguistic or communication element (Garrett-Rucks, 2016). Furthermore, surveys of university students who study a language or a related discipline have shown that cultural knowledge and understanding (cf. Ad Hoc Committee on Foreign Languages report, 2007) is never seen as "the main point of language learning" (Magnan et al., 2014, p. 84). This study showed that over a quarter (28%) of the surveyed students said their priority when learning a foreign language lies elsewhere (Magnan et al., 2014). Some students even regarded the MLA report's standard of cultural knowledge and understanding to be unteachable in language class (Magnan et al., 2014; Chavez, 2002).
As demonstrated above, however, the development and assessment of intercultural competence is widely agreed to be a part of foreign language education around the world. While the relationship of language and culture is presented as the foundation of many discussions in the context of cultural studies and linguistics, this relationship and its impact lie much deeper. Considering language as a system of symbols that can convey any meaning and is also the main way we describe our thoughts, it becomes apparent why Alvino E. Fantini (2012) argues "that language makes the anthropoid 'human'" (p. 263). This so-called linguistic relativity means that language "construct[s] the meaning we give to objects, people, and events" (Kramsch, 2012, p. 21). Its most extreme form (linguistic determinism) holds that human perception of the world is made possible solely by language (Baker & Ishikawa, 2021). In this sense, language and culture seem to be inextricably linked on several levels. However, this should by no means lead to the assumption that culture or even intercultural competence is automatically learned along with learning a foreign language (Byram & Wagner, 2018). Thus, the relationship of culture and language seems to be somewhat of a dilemma. Karen Risager's (2006) so-called language-culture nexus resolves this, though, by showing how language and culture can be both connected and separated at the same time (Baker & Ishikawa, 2021). Risager's complex understanding of the relationship between culture and language can only roughly be touched on in the context of this chapter; a more detailed insight can be found in Risager (2015). Essentially, she noted that there are three dimensions of language, whose respective relationships to culture can be looked at individually: psychological, sociological, and system-oriented. In the psychological dimension, which refers to an individual's cognitive comprehension and representation of the language and specifically the cultural context in which it is learned, the two are actually inseparable. In the sociological dimension, which refers to how individuals actually use language to interact, however, language and culture have to be separated in several ways (Risager, 2006). The complex relationship of language and culture obviously has many implications for language teaching and learners, which is why educators must be very aware of it. Learning a foreign language requires learners to act as interlocutors in intercultural space, which comes with responsibility. As contemporary foreign language education aims for students to be able to mediate between languages rather than to reach a perfect native-speaker level (cf. MLA report), Risager's findings (2006, 2007, 2015) are more relevant than ever. Knowledge and awareness of the potential connection between a specific language and the culture it might represent is thus imperative for language learners and naturally also teachers. How foreign language education that aims for intercultural communicative competence is particularly affected by these findings can be found in more detail in Byram and Wagner (2018, pp. 143-145).

INTERCULTURAL COMPETENCE: CONTESTED DEFINITIONS
Although there is agreement on the undeniable and increasing significance of intercultural competence across many disciplines, it is notoriously difficult to define this complex construct in a uniform way. Even finding a common designation for this capacity to interact with others who are culturally different from oneself (Fantini, 2009) seems impossible and has been identified as a persisting problem by scholars and practitioners in the field for decades. Brent D. Ruben (1989), for example, opens his analysis of then-contemporary issues in studying cross-cultural competence by citing a multitude of studies reaching back as far as 1955 and concluding: "The definitions of each were then, as many would argue they remain today [1989], relatively ambiguous and undifferentiated" (p. 229). This ambiguity is unfortunately still very much an issue over 30 years later.
Reviews of the conceptualization of intercultural competence generally begin with the observation that there seems to be no agreement between disciplines, and often even within a discipline, on either terminology or definition of the construct (Collier, 1989; Deardorff, 2006, 2015; Frawley et al., 2020; Griffith et al., 2016; Kroeber & Kluckhohn, 1952; Ruben, 1989; Sinicrope et al., 2007; Spitzberg & Changnon, 2009). It is impossible to tell how many (seemingly) different concepts and terms related to intercultural competence are currently out there; Spitzberg and Changnon (2009) suggested an overwhelming number exceeding 300 in their comprehensive review over ten years ago. Amongst many others, commonly found terminology includes cultural intelligence (e.g., Van Dyne et al., 2012), global competence (e.g., OECD, 2018), cross-cultural competence (e.g., Magala, 2005), transcultural competence (e.g., Kramsch, 2012), or intercultural sensitivity (Hammer et al., 2003). Although this lack of agreement can give the impression that the field is stagnating, or as Spitzberg and Changnon (2009) put it, "many conceptual wheels are being reinvented at the expense of legitimate progress" (p. 45), there can be very good reasons for deliberately differentiating certain terminology. For example, there are considerable and well-researched differences between the concepts of a competence and an intelligence. The same goes for the adjectives: cross-cultural, intercultural, transcultural, or just cultural can mean very different things, as will be alluded to later in this chapter. Very often though, the components of the models behind the wide range of terms, and their relationships, are very comparable, which makes it seem as though the terms are used synonymously. In this sense, the lack of a common terminology and definition can be addressed by finding consensus among existing approaches and thereby describing the concept as well as possible, as will be demonstrated later in this chapter. Though this terminological ambiguity is a very real issue concerning the overarching term intercultural competence, most scholars agree that a workable solution can only be derived by first defining its single components: culture and competence (e.g., Deardorff, 2020).

Defining Culture
Spencer-Oatey and Franklin (2009) base their effort of "unpacking culture" (p. 13) on the immediate observation that "culture is notoriously difficult to define" (p. 13). To the disappointment of the intrigued researcher, there seems to be no single all-purpose definition of the term culture either, and "there are almost as many meanings of 'culture' as people using the term" (Ajiferuke & Boddewyn, 1970, p. 154). Again, it is important to highlight that the lack of a common definition is not necessarily the result of inadequacy of previous research efforts. Over the course of the last century, many different perceptions of the concept of culture emerged and were subsequently replaced by others, each representing the current concerns of that era, and, in this same sense, it can be assumed that this adaptation will continue in the future (Ingold, 2002). Instead of forming a concrete definition of the term, past research across disciplines makes it seem most beneficial to abstractly construe culture as the different ways in which, and reasons for which, humans live their lives, and, based on this, to specify the concept according to a given context. The complexity of the concept of culture, which is why it defies a single all-purpose definition, becomes further evident when considering the distinction between etics and emics, as coined by Pike (1954). Roughly speaking, the two describe two very distinct ways of analyzing culture which, instead of being seen separately, should be used complementarily. Etics describe attributes (e.g., ideas or behavior) that are culture-general, while emics, on the other hand, describe such attributes of a specific culture, often unique to that one (Spencer-Oatey & Franklin, 2009). This distinction sheds light on the level of abstraction and the many layers on which the term culture is used. In order to arrive at a shared understanding of what the concept entails, it is again helpful to summarize the consensus of the last decades' research.
Spencer-Oatey and Franklin (2009) reviewed eight seemingly different definitions of culture, developed between 1952 and 2008, and derived four common characteristics (p. 15):

1. Culture is manifested through different types of regularities, some of which are more explicit than others.
2. Culture is associated with social groups, but no two individuals within a group share exactly the same cultural characteristics.
3. Culture affects people's behavior and interpretations of behavior.
4. Culture is acquired and/or constructed through interaction with others.

These regularities occur in no finite number; they are all elements through which a culture is more or less projected to the outside. In a more implicit way, these can be values, norms, or basic assumptions. More explicit and thus easier to detect, such regularities could be language, music, religion, or food. The aforementioned regularities can, if recognized from the outside, describe a certain social group (i.e., their culture, cf. etics). It is very important to realize that these groups do not necessarily take the form of nations or regions, as has long been assumed for cross-cultural comparisons. Instead, these can be religious groups, organizations, professional groups, or communities of practice that share a set of regularities. It is also possible for people to be considered members of several groups, one of which can still be a nationality. Describing such groups is a very integral part of understanding intercultural collaboration and ensuring successful multinational teamwork. Still, it is incredibly difficult to describe them, and one has to be aware of the risk of over-generalizing on the basis of minimal evidence, the risk of inappropriate stereotyping, and the risk of applying excessive essentialism and reductionism (Spencer-Oatey & Franklin, 2009).

Emergence of the Prefix Trans
Considering the multitude of effects globalization has on society, the idea of viewing each culture as a neat and describable entity to which one might or might not belong seems outdated. Especially a clear distinction as is made in terms of etics (culture-general) and emics (culture-specific) suggests an understanding of cultures and the intercultural space (culture A interacts with culture B) that is no longer fitting. As such, the prefix trans (e.g., transcultural or translingual) has appeared more and more often in several contexts in the recent past (Witte, 2012). The idea of the transcultural focuses on abandoning the classical self vs. other binary, since global networking (e.g., increased migration) leads to generally more diverse societies that are no longer adequately described by the means explained above. Instead, one draws on inclusive ways of thinking about cultural encounters and development, and thus might speak of "self-as-part-of-others" (Bach, 2005). In much the same way, translingual communication does not assume individual languages that can be clearly separated from each other (Canagarajah, 2013). The Modern Language Association (MLA) Ad Hoc Committee (2007) formulates "translingual and transcultural competence" as a specific goal or requirement for foreign language teaching (p. 237). It thus turns away from the achievement of a "native speaker level" as a desirable goal of foreign language teaching and from the use of the prefix inter. Instead, the aim is to enable foreign language learners to mediate between languages and to emphasize their special role as interlocutors in the target language (Ad Hoc Committee on Foreign Languages, 2007). Cultures should not be seen as rooted in the nation-state, which suggests an indispensable belonging of individuals and thus automatically a shared history, language, values, norms, etc. (Kramsch, 2011).
Rather, one adopts a more constructivist way of thinking that emphasizes the individual character of one's culture and thus of the encounter of different cultures, moving from inter (between) to trans (through). This has previously been emphasized in the context of the second common characteristic of culture found by Spencer-Oatey and Franklin (2009, p. 15): "…no two individuals within a group share exactly the same cultural characteristics." The anthropologist Tim Ingold (2002) puts this very impressively in his attempt to answer the question What is Culture?: "The idea that humanity as a whole can be parceled up into a multitude of discrete cultural capsules, each the potential object of disinterested anthropological scrutiny, has been laid to rest at the same time as we have come to recognize the fact of the interconnectedness of the world's peoples, not just in the era of modern transport and communications, but throughout history. The isolated culture has been revealed as a figment of the Western anthropological imagination. It might be more realistic, then, to say that people live culturally rather than that they live in cultures" (p. 330). Still, in order to understand a culture and its relationship to, encounters with, or intersections with another culture, comparison seems to be not only a very fruitful approach but also a rather instinctive one (Baldwin & Mussweiler, 2018). In assessing or developing intercultural competence, it can therefore be helpful to use this instinctive stance and build on a comparative approach, while carefully avoiding overgeneralizing or stereotyping.

Defining Competence
In this volume, the capacity to interact effectively and appropriately with humans culturally different from oneself (Fantini, 2009) is consistently called intercultural competence, purposefully so. As previously elaborated, agreeing on a terminology has proven to be rather difficult. Still, or maybe precisely because of that, it is absolutely essential to make a conscious decision for a precise terminology, as both parts (adjective and noun) have distinct individual meanings and should not be chosen arbitrarily. As such, the term competence was consciously chosen over competing concepts like intelligence, skill, or sensitivity, each of which was briefly mentioned above. Once again, though, it has to be recognized preliminarily that the term competence is also contested when it comes to a clear and pragmatic definition (Koch & Straßer, 2008). It is not rarely claimed that the term has become a "buzzword" and is used excessively, which results in a fuzzy concept. Still, there are valuable findings of past research that can be consulted in order to gain a feasible understanding of the concept. Though many different disciplines have contributed to a diverse body of insights on competence as a concept, in the context of assessment it seems most beneficial to turn towards the educational sciences, specifically educational assessment. In this context, the concept has gained particular attention through PISA (Programme for International Student Assessment, first conducted in 2000), subsequently leading to an increased use of the term as well as to education reform. The concept, then deemed "innovative", was described as "the capacity of students to analyze, reason and communicate effectively as they pose, solve and interpret problems in a variety of subject matter areas" (OECD, 2005, p. 3). Meanwhile, this has been further refined into three distinct components of competencies: knowledge, skills, and attitudes and values, each consisting of several further facets (OECD, 2018, p. 4). This triad is one particularly unique factor of a competence and a specific reason for it being preferred over the other concepts named above. In the German context, Franz E. Weinert, for example, defines competence as "the cognitive capacities available in or learned by individuals to solve specific problems, as well as related motivational, volitional, and social dispositions and abilities to apply the problem solutions successfully and responsibly in variable, comparable situations" (Weinert, 2001, p. 21, translated from German).
A competence therefore goes beyond knowledge of or the ability to do something, which naturally also has fundamental effects on the assessment of the concept. This has been recognized in the context of intercultural competence, as will be discussed in detail below. To sum up, a competence comprises an individual's cognitive (e.g., knowledge), action-oriented (e.g., skills), and affective (e.g., motivations, attitudes, and values) capacities, which are used to solve authentic and diverse problems. This raises the question of how the three elements of a competence are specified when it comes to intercultural competence. This point will be addressed in the next section.

Defining Intercultural Competence
As elaborated above, the consensus of previous research can be condensed into a common understanding of the concept, even though a uniform definition of intercultural competence is still not agreed upon. Similar to the concept of culture, intercultural competence is a highly individual capacity that defies a pragmatic, all-purpose definition. Instead, it again seems most fruitful to identify characterizing features of the concept, not necessarily how and under which conditions it can be applied to individuals. In the previous section it was found that a competence comprises cognitive, action-oriented, and affective elements. Accordingly, this was also found to be the main consensus on intercultural competence amongst scholars and practitioners over the last decades. Extensive reviews of the many existing models (e.g., Deardorff, 2004; Spencer-Oatey & Franklin, 2009; Spitzberg & Changnon, 2009) allow the three aforementioned elements to be identified as common features across existing conceptualizations of intercultural competence. These elements comprise a large number of sub-facets, the number, name, and nature of which depend on the individual model and vary considerably (Schnabel et al., 2015). Still, there are also some common elements across models and definitions (e.g., the ability to interact), as comprehensive reviews show (e.g., Spitzberg & Changnon, 2009, pp. 10-34). The following working definition can be derived from these elements and shall serve as the basis for further proceedings: Intercultural competence is to be understood as a complex conglomerate of several elements (see above) that are needed to interact effectively and appropriately with others who are culturally different from oneself (based on Deardorff, 2006; Fantini, 2009). In this definition, effective refers to how successful an interaction is from the learners' own perspective, meaning whether or not they reached a desired outcome. Appropriate, on the other hand, refers to the interlocutor's perspective on the interaction and how they perceive the learner's behavior. Lastly, different is another key term in the definition that needs careful attention. As previously elaborated, assuming different cultures to be fixed and comparable entities is a paradigm of the past. Still, assuming an other, which in turn depends on the self-perception and interpretation of the individual itself, is an important concept of intercultural competence. These three adjectives already suggest certain methods and limitations of assessing intercultural competence as defined above. Deardorff (2004), in a comprehensive and often cited study, was the first to attempt to establish a consensus among leading intercultural scholars and practitioners on how intercultural competence could be defined. In accordance with the definition stated above, the study found "the ability to communicate effectively and appropriately in intercultural situations based on one's intercultural knowledge, skills, and attitudes" (Deardorff, 2004, p. 194) to be the most agreed-upon description of what intercultural competence should be understood as.
In addition to that, some sub-facets of these elements were agreed upon by the surveyed scholars. These included affective elements "such as curiosity, general openness, and respect for other cultures" as well as cognitive elements such as "cultural awareness, various adaptive traits, and cultural knowledge". Furthermore, action-oriented elements such as "skills to analyze, interpret, and relate, as well as skills to listen and observe" were also considered to be a part of intercultural competence (Deardorff, 2006, p. 248). In this study, the scholars were given predesigned definitions based on relevant intercultural literature from different disciplines. The one agreed upon (see above) is based on Michael Byram's (1997, 2021) seminal work on "teaching and assessing intercultural communicative competence", which will be further highlighted in the next section.

Models of Intercultural Competence
Over several decades, a variety of different models of intercultural competence (or comparable concepts with different names), in which specific components are named, have been developed. These can be assigned to the respective discipline they were developed in, such as anthropology (e.g., Hofstede, 2001), psychology (e.g., Bennett, 1993), communication sciences (e.g., Gudykunst, 2004), applied linguistics (e.g., Byram, 1997), or international business and management studies (e.g., Stahl, 2001). Comprehensive lists of existing models, sorted by discipline, can for example be found in Deardorff (2004), Spencer-Oatey and Franklin (2009), or Vogt (2018). These models can also be compared beyond their specific discipline and sorted by model type. Spitzberg and Changnon (2009, p. 10) have analyzed and compared over 20 different models and derived five different model types that must be distinguished: compositional, co-orientational, developmental, adaptational, and causal process models:

• Compositional: Focuses on identifying the particular components of the concept, usually in the form of lists, without specifying any relationships.
• Co-orientational: Focuses on a distinct aspect of communicative mutuality or shared meanings within the development of intercultural competence.
• Developmental: Focuses on the development of the concept within an individual, usually across particular stages over the course of time.
• Adaptational: Focuses on the mutual adjustment of two or more individuals in an intercultural interaction.
• Causal process: Focuses on the interrelationship among components, usually providing a set of outcomes at the end of a process. (p. 10)

These model types are not mutually exclusive, as Deardorff's (2004) model illustrates. In her seminal work, she first developed a so-called pyramid model (compositional type) in which she illustrates the components of intercultural competence, the lower tiers of the pyramid being hypothesized as prerequisites for the upper ones. Consisting of four tiers that ultimately culminate in desired external outcomes, the very base of this pyramid model consists of the three requisite attitudes: respect, openness, and curiosity (p. 196). In the same work, she then presents a causal process model which takes the very same components from the pyramid model, now illustrated in a path that includes internal and external outcomes. Although there is a vast number of different models available, a clear choice for one is recommended. The model will give a definite number of components as well as their relationship and operationalization.
These aspects are fundamental to understanding the concept and attempting its assessment, as will be shown below. It has been illustrated by now that these aspects certainly differ within the available contemporary models, and there can be very good reasons for that. Thus, the choice of one model can hardly be made at random; instead, the model should fit the specific discipline and context in which it is to be applied.

Rationales for Byram's Model
In the context of this book, it is certainly most beneficial to consult the work of Michael Byram (2021), as his is one of the few approaches that particularly includes a communicative competence (linguistic, sociolinguistic, discourse). Furthermore, Byram pursued the assessment of intercultural communicative competence (ICC) and thus put a particular focus on the operationalization of, and appropriate assessment types for, the different components he identified. Byram's model of ICC can be considered a co-orientational model (see above).

Figure 1. Byram's model of intercultural communicative competence (Spitzberg & Changnon, 2009, p. 17)

Byram focused on creating a comprehensible understanding of the complex construct as well as suggesting implications for curriculum design and assessment approaches, specifically in the context of foreign language education. He thus designed a model around the acquisition of ICC in such educational settings, assuming a teacher and a learner role. In accordance with the previously mentioned MLA Ad Hoc Committee on Foreign Languages (2007), Byram also assumes an attainable ideal for foreign language learning that is not the native speaker but the intercultural speaker. As such, the components of his model, as illustrated in Figure 1, show the so-called savoirs of the intercultural speaker and their relationship to communicative competences (Byram, 1997, 2021). The use of French terminology can be traced back to the origin of this model, a research project commissioned by the Council of Europe in Strasbourg (Byram & Zarate, 1994, 1996). It suits the model very well though, as savoir overarches the facets of ICC and describes attitudes, knowledge, skills, and even critical cultural awareness, which is rather unique to this model. In the same way that the term capacity was used in this chapter, savoir does not necessarily commit to a specific facet of the competence but addresses all three. Michael Byram's model of ICC is very well established and practically indispensable for foreign language education. As such, it is the basis of the aforementioned curricula in which intercultural competence has long been included as an outcome of foreign language classes (e.g., in the American context ACTFL, 2011, or the German context Standing Conference of the Ministers of Education of the Federal States, 2004, 2014). Furthermore, Michael Byram had a considerable influence on the development of the CEFR and on the fact that ICC was included from the beginning (CEFR, 2001). Based on the many changes that globalization brings to everyday life, as extensively elaborated above, Byram (2008, 2014) too has argued that a national identity is no longer adequate, though he promptly adds that an international identity is just as undesirable, and he coins the alternative phrase intercultural citizenship. In accordance with his original work and the resulting model shown in Figure 1, Byram (2008) also suggests leaving the desire to label identities behind and putting a distinct focus on competences. In doing so, his concept of intercultural citizenship combines political education with the intercultural and communicative competences from his original model (Figure 1). In the context of foreign language education, this assumes a competence profile that once again goes beyond mere language education.

ASSESSING INTERCULTURAL COMPETENCE
The assessment of any concept relies on a clear understanding of its construct, the relationship of its components, and their operationalization. The same is true for intercultural competence, which makes the elaborate derivation above all the more fundamental. Still, the ambiguity of the construct naturally leads to difficulties in assessing intercultural competence (Fantini, 2009). In this context, the question is often raised as to whether the development and proficiency of intercultural competence can be adequately assessed or measured at all (Vogt, 2016). In Deardorff's (2004, 2006) study, this question was answered by using the Delphi method to survey a total of 24 U.S. postsecondary institutions on viable assessment methods and purposes for intercultural competence. There was agreement across the board that intercultural competence is important and should be assessed. On a four-point scale (not important – extremely important), none of the 24 institutions said not important, 54% said extremely important, 42% important, and 4% somewhat important (Deardorff, 2004, p. 120). Amongst the most frequently used methods were student interviews, student presentations, and observation of students by representatives of the host culture (each used by more than five institutions).
Additionally, it was found that most institutions use a combination of methods, with an average of five different approaches per institution. In accordance with this, scholars and practitioners in the field agree that intercultural competence can be assessed, though it can be a very resource-intensive undertaking, as will be discussed below. Over the last two decades, a large number of assessment instruments has emerged, which, with very few exceptions, can be categorized according to indirect and direct approaches (Sinicrope et al., 2007).

Indirect assessment approaches do not deal with an authentic student performance in, for example, an intercultural interaction directly. Instead, students' intercultural competence is assessed by abstracting their behavior, knowledge, and attitudes. Most commonly, this is done with standardized scales administered in the form of a questionnaire. Their items are designed to yield a quantitative result by operationalizing the underlying construct and its components. As such, indirect assessment approaches very often require students to complete a self-assessment of their skills, knowledge, and attitudes. For example, in the commonly used Expanded Cultural Intelligence Scale (E-CQS) (Van Dyne et al., 2012), students are asked whether or not they agree with certain statements that are meant to represent cognitive, metacognitive, behavioral, and motivational aspects of intercultural interaction. Knowledge (cognitive), for example, is assessed by several different statements such as "I can describe the different cultural value frameworks that explain behaviors around the world" (p. 301). Such instruments are rather common, as they are relatively easy to conduct with large groups and quickly provide a comprehensible result. Though indirect approaches are more commonly used than direct ones, they are often regarded as rather insufficient and over-simplifying, since usually not all factors of intercultural competence can be surveyed or taken into account, and the complexity of the construct explained above cannot be satisfied (Deardorff, 2006; Vogt, 2016). Furthermore, self-assessment is seen as rather controversial, as some studies found a substantial discrepancy between students' self-assessment and their actual intercultural competence (e.g., Altshuler et al., 2003). Whether students are actually unable to assess themselves accurately or whether they assume some answers to be more desirable and thus purposely respond inaccurately is still being discussed (Sinicrope et al., 2007; Fantini, 2018).

Direct assessment approaches, on the other hand, often analyze the behavior of an individual in authentic intercultural situations, for example through direct observation of student performance, analysis of student reflection reports (often in the context of portfolio assessment), or interviews with an expert (Sinicrope et al., 2007). These approaches aim to directly assess the students' knowledge, skills, and attitudes, which still requires a proper operationalization of these elements. As such, students' desired behavior in a particular intercultural interaction has to be precisely defined in order to assess, for example, some aspects of action-oriented elements (e.g., communicative behavior). As a result, these procedures are significantly more resource-intensive and occur less frequently overall.
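To make the two types more concrete, the short sketch below shows how an indirect instrument turns Likert self-ratings into a quantitative competence profile; a corresponding sketch for a direct approach follows the INCA example below. It is a minimal illustration only: the item IDs, the item-to-component mapping, and the sample responses are invented for the purpose, not taken from the E-CQS or any other published scale.

# Minimal sketch of indirect assessment scoring: Likert responses are
# grouped by the component each item operationalizes and then averaged.
# The items and the mapping below are hypothetical, not the E-CQS items.
from statistics import mean

ITEM_COMPONENTS = {
    "q1": "cognitive", "q2": "cognitive",
    "q3": "metacognitive", "q4": "behavioral",
    "q5": "motivational", "q6": "behavioral",
}

def score_self_assessment(responses: dict[str, int]) -> dict[str, float]:
    """Aggregate 1-7 Likert self-ratings into a mean score per component."""
    by_component: dict[str, list[int]] = {}
    for item, rating in responses.items():
        if not 1 <= rating <= 7:
            raise ValueError(f"{item}: rating {rating} outside the 1-7 scale")
        by_component.setdefault(ITEM_COMPONENTS[item], []).append(rating)
    return {comp: round(mean(vals), 2) for comp, vals in by_component.items()}

# One (hypothetical) student's responses:
profile = score_self_assessment({"q1": 6, "q2": 5, "q3": 4, "q4": 7, "q5": 3, "q6": 6})
print(profile)  # {'cognitive': 5.5, 'metacognitive': 4.0, 'behavioral': 6.5, 'motivational': 3.0}

Even this toy version makes the limitation discussed above tangible: however carefully the items operationalize the construct, the resulting profile is only as trustworthy as the self-ratings that feed it.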
Of particular note among the direct approaches is Michael Byram, who through his participation in numerous projects has helped to develop assessment tools of this nature (e.g., the Intercultural Competence Assessment (INCA) project (Prechtl & Lund, 2007)). The assessment approach of the INCA project is rather complex, as it includes several different instruments such as a questionnaire and role-play as well as written scenarios with questions on how students would behave or what they make of the particular situation. As such, it also includes a comprehensive direct part, which can be used as a case in point. The element knowledge is operationalized not solely as a cognitive factor but as a combination of motivational, metacognitive, cognitive, and action-oriented factors, and is overall called "knowledge discovery" (INCA Assessor Manual, 2004, p. 9). As such, it comprises "the ability to acquire new knowledge of a culture and cultural practices and the ability to act using that knowledge, those attitudes and those skills under the constraints of real-time communication and interaction" (INCA Assessor Manual, 2004, p. 6).
Students are prompted with authentic video or written scenarios in which intercultural communicative competence is required. Students then have to answer several pre-designed questions and present a solution to the scenario. These answers are then analyzed by an expert using a framework of six different aspects of ICC, each operationalized in a way similar to the example of knowledge discovery above (INCA Assessor Manual, 2004). In terms of assessment methods, direct assessment of intercultural competence can mean real-time observation, an assessment of reflection reports, or conversation-like interviews. While these approaches are considerably more time-consuming, they can also offer a higher-quality, more comprehensive assessment of intercultural competence, especially when an authentic interaction can be created for students (Sinicrope et al., 2007).
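A direct approach of the INCA type, by contrast, ends in assessor judgments rather than self-ratings. The sketch below shows one plausible way of combining several assessors' ratings on the six INCA dimensions into a single profile. The 1-3 bands loosely mirror the manual's three levels (basic, intermediate, full), but the data layout and the median rule are illustrative assumptions, not the INCA project's prescribed scoring procedure.

# Combining several assessors' ratings of one observed performance.
# The six dimensions are those of the INCA framework; the 1-3 bands
# (1 = basic, 2 = intermediate, 3 = full) and the median aggregation
# are assumptions for illustration, not the project's own procedure.
from statistics import median

INCA_DIMENSIONS = (
    "tolerance of ambiguity", "behavioral flexibility",
    "communicative awareness", "knowledge discovery",
    "respect for otherness", "empathy",
)

def aggregate_ratings(ratings_by_assessor: list[dict[str, int]]) -> dict[str, float]:
    """Combine each assessor's band per dimension into a median band."""
    return {dim: median(r[dim] for r in ratings_by_assessor) for dim in INCA_DIMENSIONS}

# Three hypothetical assessors rating the same role play:
ratings = [
    {dim: 2 for dim in INCA_DIMENSIONS},
    {dim: 3 for dim in INCA_DIMENSIONS},
    {**{dim: 2 for dim in INCA_DIMENSIONS}, "empathy": 1},
]
print(aggregate_ratings(ratings))  # median band 2 on every dimension

Using the median rather than the mean keeps a single outlying assessor from shifting a band, which matters when, as here, the bands are coarse ordinal levels rather than interval scores.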

Assessment Instruments
This collection of contemporary assessment approaches does not claim to be complete by any means, as that would far exceed the scope of this chapter. Instead, Table 1 purposefully contains a distinct selection of available approaches with the intention of illustrating the range of assessment tools available to educators. While focusing on classroom-based assessment and particularly on foreign language education, the collection also comprises instruments that are rather different in their approach. There are much more comprehensive reviews of approaches to intercultural competence assessment, e.g., Sinicrope et al. (2007), Griffith et al. (2016), Matsumoto and Hwang (2013), or Fantini (2009).

Table 1. Overview of contemporary assessment approaches for intercultural competence

Name: Expanded Cultural Intelligence Scale (E-CQS)
Author(s): Van Dyne, L.; Ang, S.
Approach: Indirect
Description: 39-item self-assessment on a 7-point Likert scale (1 = strongly disagree; 7 = strongly agree)
Components: Motivational, cognitive, metacognitive, and behavioral cultural intelligence
Cost: Free for academic research purposes only
Comment: Commonly used, empirically sound, based on a compositional model. Less suitable for CBA. Self-assessment holds the risk of students giving socially desirable answers.

Name: The Intercultural Development Inventory (IDI)
Author(s): Hammer, M.; Bennett, M.; Wiseman, R.
Approach: Indirect
Description: 50-item self-assessment on a 5-point Likert scale (1 = disagree; 5 = agree)
Components: Denial/defense, reversal, minimization, acceptance/adaptation, encapsulated marginality
Cost: Price depends on version and sample size
Comment: Widely used, empirically sound, based on a developmental model. Can be used in CBA, though it is rather difficult to acquire due to costs. Self-assessment holds the risk of students giving socially desirable answers.

Name: Assessment of Intercultural Competence (AIC)
Author(s): Fantini, A.
Approach: Indirect
Description: 97-item self-assessment or assessment by others on a 6-point Likert scale (0 = no competence; 5 = very high competence)
Components: Awareness, attitudes, skills, knowledge, language proficiency
Cost: Free after getting permission from the author
Comment: Includes a communicative factor, used in YOGA form, can be used as normative, formative, and summative assessment. Suitable for CBA. The risk of socially desirable answers can be lowered by including an assessor.

Name: Intercultural Competence Assessment (INCA) Project
Author(s): www.incaproject.org
Approach: Direct
Description: Intercultural scenarios with questions, and student role plays. An assessor evaluates answers and role playing with an assessment framework.
Components: Tolerance of ambiguity, behavioral flexibility, communicative awareness, knowledge discovery, respect for otherness, empathy
Cost: Free
Comment: Includes a very detailed assessor manual. Framework and approaches usable in, or adaptable to, one's own context. Includes a communicative factor. Includes suitable exercises and tasks for the classroom. Very suitable for CBA.

Name: The Intercultura Assessment Protocol (IAP)
Author(s): Baiutti, M.
Approach: Direct
Description: A framework of 8 dimensions of IC, used by students during and by teachers after student mobility to assess IC, based on several data collection methods, e.g., pupils' logbooks (journals)
Components: Respect, openness, curiosity, flexibility, culture-specific knowledge, sociolinguistic awareness, ability to speak the language(s) of the host country, listening for understanding
Cost: Free
Comment: Focuses on student mobility. Includes comprehensive assessment from several data sources and assessors. For secondary school students. Includes a communicative factor.

Name: Critical Incidents (part of the DESI Study)
Author(s): Hesse, H.-G.
Approach: Direct
Description: Critical incident exercises for the students. Students are prompted with questions about the incident to determine their cognitive, affective, and action-oriented analysis of the situation as well as their ability to transfer these capacities to new situations.
Components: Based on M. Bennett's intercultural sensitivity: denial, defense, minimization, acceptance, adaptation
Cost: Free
Comment: Very suitable for CBA in the foreign language classroom. Was developed based on foreign language education standards (in Germany). For secondary school students. Does not include a communicative factor. There are additional incidents that focus on socio-pragmatic language awareness rather than intercultural competence; adding these allows an assessment of communicative factors.

ALIGNMENT OF THE ASSESSMENT AND A MODEL OF IC
As illustrated in Table 1, many different approaches, direct and indirect, are available for the assessment of intercultural competence. It has to be noted that these different approaches are usually based on different models and components. As such, the choice of a particular model, as suggested above, goes hand in hand with the choice of a viable assessment approach and is thus equally based on the specific discipline and context in which it is to be applied. In the context of foreign language teaching, intercultural competence assessment is often considered classroom-based language assessment (CBLA), as it is carried out by teachers to give feedback on a certain level or performance of students. The majority of available assessment approaches are not primarily designed for CBLA though. Instead, they are to be used as an empirical assessment (EA) on a randomly sampled group, for example to research hypotheses on intercultural development during different types of stays abroad. The distinction between CBLA and EA is fundamental, especially for language educators who are aiming to assess students' intercultural competence.
Practical guidance for classroom assessment is found neither in the CEFR nor in the aforementioned educational standards, although intercultural competence is clearly stated as an objective of foreign language teaching (see above). Deardorff (2004) has formulated some guiding questions which can be a great starting point for designing one's own assessment approach for intercultural competence. These guiding questions have since been elaborated by other scholars in the field (e.g., Fantini, 2009). Vogt (2016) has further developed these guiding questions with a particular focus on foreign language teaching (p. 84, translated from German; see the sketch after this list for one way of making them operational):

• Which theoretical model of intercultural competence is the assessment based on?
• Which area or aspects of the model are prioritized or primarily assessed?
• Which learning objectives are primarily assessed?
• What task format is used to assess performance?
• Does the task adequately represent the learning objectives being assessed?
• How is the assessment performed and evaluated? Is the assessment objective and transparent?
  - What learner performance is expected?
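One way of making these questions operational, sketched below under stated assumptions, is to treat them as required fields of an assessment plan that must be answered before any task is administered. The field names and the example answers are purely illustrative readings of the guiding questions, not a prescribed format.

# Each field corresponds to one of the guiding questions above; the names
# and the sample answers are hypothetical, chosen only for illustration.
from dataclasses import dataclass, fields

@dataclass
class AssessmentPlan:
    model: str                  # which theoretical model of IC?
    prioritized_aspects: str    # which areas or aspects are primarily assessed?
    learning_objectives: str    # which learning objectives are primarily assessed?
    task_format: str            # what task format is used, and does it fit the objectives?
    evaluation_procedure: str   # how is the assessment performed and evaluated?
    expected_performance: str   # what learner performance is expected?

def validate(plan: AssessmentPlan) -> None:
    """Reject a plan that leaves any guiding question unanswered."""
    for f in fields(plan):
        if not getattr(plan, f.name).strip():
            raise ValueError(f"guiding question unanswered: {f.name}")

plan = AssessmentPlan(
    model="Byram (1997, 2021) model of ICC",
    prioritized_aspects="savoirs: attitudes, knowledge, skills of interpreting and relating",
    learning_objectives="interpret a critical incident from two cultural perspectives",
    task_format="written analysis of a critical incident",
    evaluation_procedure="analytic rubric, two raters, criteria shared with learners",
    expected_performance="identifies and relativizes culturally grounded readings",
)
validate(plan)  # passes: every question has an explicit answer

Trivial as it is, the sketch enforces the alignment argued for above: no task is given before the model, the assessed aspects, and the expected performance have been made explicit.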

Furthermore, Caspari and Schinschke (2009) have developed a collection of task types that can be used for CBA of intercultural competence, certainly suitable for foreign language classes. Amongst many others, the authors suggest tasks that put one's own perceptions into perspective, analyses of critical incidents, or virtual encounters with authentic target-culture interlocutors. Still, these are just types of tasks which are suitable for CBA. Exemplary task designs can for example be found in Vogt (2016, p. 88ff.). Finally, it has to be acknowledged that the assessment of intercultural competence remains a considerable challenge in foreign language teaching. In particular, the operationalization of affective elements is very difficult and has sometimes even been described as a complete failure (Reimann, 2018). Nevertheless, there are promising approaches that can be drawn upon here as well. These seem to focus especially on formative assessment, whereby learning and assessment move closely together (Vogt, 2016; formative approaches e.g., in Fellmann, 2016; Byram, 1997, p. 91ff.). The relationship of the individual elements in the model also plays a supporting role in assessment. For example, it must be assumed that the presence of some affective elements cannot be assessed without also eliciting cognitive and/or action-oriented elements. This connection is instrumental for the formulation of suitable tasks and accordingly for the design of an assessment plan.

CONCLUSION
Intercultural competence is a very complex and quite contested construct, making its development and assessment an equally complex task, especially in a classroom context. Still, the undeniable significance of intercultural competence was illustrated extensively. Not only is there evidence that an increasingly globalized labor market demands such capacities from graduates entering the workforce and from employees already in it; the documented shortcomings of graduates in this area also suggest a pressing need for the internationalization of educational institutions and their curricula. Accordingly, intercultural competence can be found as a distinct standard for learning and teaching a foreign language.

Defining intercultural competence and its components has traditionally proven difficult, yet a feasible approach, and even a rationale for the construct being contested, could be presented. As such, working definitions for intercultural competence and culture could be formulated by finding consensus in previous research. Furthermore, the term intercultural competence was chosen very deliberately. Several alternatives for the prefix inter, the term intercultural, as well as the term competence were discussed in order to justify the distinct choice of intercultural competence. Matching the wide range of different definitions of intercultural competence, there is an equally wide choice of models available, each having a rationale for existing in its own right. After illustrating different ways of categorizing the models, Byram's (1997, 2021) model was further explained, being determined the most fitting for foreign language teaching. It is fundamental to understand the discordant state of defining the term and come to a conclusion on what is to be understood as intercultural competence, as this is the very basis on which important decisions on the assessment of the construct are made. This way, though many different approaches were mentioned, a principled way of constructing a classroom-based assessment approach could be presented. Though it can be a significant challenge that comes on top of the duties and responsibilities of a foreign language teacher, intercultural competence is an inherent part of language learning that cannot be disregarded. For the same reason, teaching a foreign language is always influenced by culture, be it the students' own or a foreign one.

REFERENCES
ACTFL. (2011). World-readiness standards for learning languages. https://www.actfl.org/sites/default/files/publications/standards/World-ReadinessStandardsforLearningLanguages.pdf
Ad Hoc Committee on Foreign Languages. (2007). Foreign languages and higher education: New structures for a changed world. Profession, 2007(1), 234–245. doi:10.1632/prof.2007.2007.1.234
Ajiferuke, M., & Boddewyn, J. (1970). "Culture" and other explanatory variables in comparative management studies. Academy of Management Journal, 13(2), 153–163. doi:10.2307/255102
Altshuler, L., Sussman, N. M., & Kachur, E. (2003). Assessing changes in intercultural sensitivity among physician trainees using the intercultural development inventory. International Journal of Intercultural Relations, 27(4), 387–401. doi:10.1016/S0147-1767(03)00029-4
Bach, G. (2005). Will the real Madonna please reveal herself?! Mediating "self" and "other" in intercultural learning. In G. Hermann-Brennecke (Ed.), Anglo-American awareness: Arpeggios in aesthetics (pp. 15–28). LIT Verlag Münster.
Baker, W., & Ishikawa, T. (2021). Transcultural communication through global Englishes: An advanced textbook for students. Routledge. doi:10.4324/9780367809973
Baldwin, M., & Mussweiler, T. (2018). The culture of social comparison. Proceedings of the National Academy of Sciences of the United States of America, 115(39). Advance online publication. doi:10.1073/pnas.1721555115 PMID:30201717
Beacco, J.-C., Byram, M., Cavalli, M., Coste, D., Egli Cuenat, M., Goullier, F., & Panthier, J. (2016). Guide for the development and implementation of curricula for plurilingual and intercultural education. Council of Europe Publishing.
Bennett, M. J. (1993). Towards ethnorelativism: A developmental model of intercultural sensitivity. In R. Paige (Ed.), Education for the intercultural experience (pp. 21–71). Intercultural Press.
Bensel, N., & Weiler, H. N. (2000). Hochschulen für das 21. Jahrhundert zwischen Staat, Markt und Eigenverantwortung: Ein hochschulpolitisches Memorandum im Rahmen der „Initiative D21" unter Federführung der DaimlerChrysler Services (debis). DaimlerChrysler Services (debis) AG. http://www.hochschul-management.de/HS-Politisches_Memorandum.pdf
Black, P., & Wiliam, D. (2004). Assessment for learning in the classroom. In Assessment and learning (pp. 9–21). SAGE Publications Ltd.
Brislin, R. W. (2010). The undreaded job: Learning to thrive in a less-than-perfect workplace. Praeger.
British Council. (2013). Culture at work: The value of intercultural skills in the workplace. British Council. https://www.britishcouncil.org/sites/default/files/culture-at-work-report-v2.pdf
Byram, M. (1997). Teaching and assessing intercultural communicative competence. Multilingual Matters.
Byram, M. (2008). From foreign language education to education for intercultural citizenship: Essays and reflections. Multilingual Matters. doi:10.21832/9781847690807
Byram, M. (2014). Twenty-five years on – from cultural studies to intercultural citizenship. Language, Culture and Curriculum, 27(3), 209–225. doi:10.1080/07908318.2014.974329
Byram, M. (2021). Teaching and assessing intercultural communicative competence: Revisited (2nd ed.). Multilingual Matters. doi:10.21832/9781800410251
Byram, M., & Wagner, M. (2018). Making a difference: Language teaching for intercultural and international dialogue. Foreign Language Annals, 51(1), 140–151. doi:10.1111/flan.12319
Byram, M., & Zarate, G. (1996). Defining and assessing intercultural competence: Some principles and proposals for the European context. Language Teaching, 29(4), 239–243. doi:10.1017/S0261444800008557
Canagarajah, A. S. (2013). Translingual practice: Global Englishes and cosmopolitan relations. Routledge. doi:10.4324/9780203120293
Caspari, D., & Schinschke, A. (2009). Aufgaben zur Feststellung und Überprüfung interkultureller Kompetenzen im Fremdsprachenunterricht – Entwurf einer Typologie. In M. Byram & A. Hu (Eds.), Interkulturelle Kompetenz und fremdsprachliches Lernen. Modelle, Empirie, Evaluation (pp. 299–315). Gunter Narr Verlag.
Chavez, M. (2002). We say "culture" and students ask "what?": University students' definitions of foreign language culture. Die Unterrichtspraxis / Teaching German, 35(2), 129.
Collier, M. J. (1989). Cultural and intercultural communication competence: Current approaches and directions for future research. International Journal of Intercultural Relations, 13(3), 287–302. doi:10.1016/0147-1767(89)90014-X
Committee for Economic Development. (2006). Education for global leadership: The importance of international studies and foreign language education for U.S. economic and national security. Committee for Economic Development.
Council of Europe. (Ed.). (2001). Common European framework of reference for languages: Learning, teaching, assessment (10th print). Cambridge Univ. Press.
Crossman, J. E., & Clarke, M. (2010). International experience and graduate employability: Stakeholder perceptions on the connection. Higher Education, 59(5), 599–613. doi:10.1007/s10734-009-9268-z
De Haan, H. (2014). Can internationalisation really lead to institutional competitive advantage? – A study of 16 Dutch public higher education institutions. European Journal of Higher Education, 4(2), 135–152. doi:10.1080/21568235.2013.860359
Deardorff, D. K. (2004). The identification and assessment of intercultural competence as a student outcome of internationalization at institutions of higher education in the United States [Dissertation]. North Carolina State University.
Deardorff, D. K. (2006). Identification and assessment of intercultural competence as a student outcome of internationalization. Journal of Studies in International Education, 10(3), 241–266. doi:10.1177/1028315306287002
Deardorff, D. K. (2015). Intercultural competence: Mapping the future research agenda. International Journal of Intercultural Relations, 48, 3–5. doi:10.1016/j.ijintrel.2015.03.002
Deardorff, D. K. (2020). Manual for developing intercultural competencies: Story circles. Routledge, Taylor & Francis Group.
Dodrige, M. (1999). Generic skill requirements for engineers in the 21st century. Academic Press.
EHEA. (1999). The Bologna declaration of 19 June 1999: Joint declaration of the European Ministers of Education. https://www.eurashe.eu/library/modernising-phe/Bologna_1999_Bologna-Declaration.pdf
European Commission. (2014). The Erasmus impact study: Effects of mobility on the skills and employability of students and the internationalisation of higher education institutions. European Commission: Education and Culture. https://ec.europa.eu/assets/eac/education/library/study/2014/erasmus-impact-summary_en.pdf
European Commission. (2015). ECTS users' guide 2015. https://ec.europa.eu/education/library/publications/2015/ects-users-guide_en.pdf
Fantini, A. E. (2000). A central concern: Developing intercultural competence. SIT Occasional Papers Series, Inaugural Issue, 25–43.
Fantini, A. E. (2009). Assessing intercultural competence: Issues and tools. In D. K. Deardorff (Ed.), The SAGE handbook of intercultural competence (pp. 456–476). SAGE Publications. doi:10.4135/9781071872987.n27
Fantini, A. E. (2012). Language: An essential component of intercultural communicative competence. In The Routledge handbook of language and intercultural communication (pp. 263–278). Routledge.
Fantini, A. E. (2018). Intercultural communicative competence in educational exchange: A multinational perspective (1st ed.). Routledge. doi:10.4324/9781351251747
Fellmann, G. (2016). Interkulturelles Lernen sichtbar machen. Lernertagebücher. Praxis Fremdsprachenunterricht, 5, 26–33.
Frawley, J., Nguyen, T., & Sarian, E. (Eds.). (2020). Transforming lives and systems: Cultural competence and the higher education interface. Springer Singapore. doi:10.1007/978-981-15-5351-6
Garrett-Rucks, P. (2016). Intercultural competence in instructed language learning: Bridging theory and practice. IAP.
Griffith, R. L., Wolfeld, L., Armon, B. K., Rios, J., & Liu, O. L. (2016). Assessing intercultural competence in higher education: Existing research and future directions. ETS Research Report Series, 2016(2), 1–44. doi:10.1002/ets2.12112
Gudykunst, W. B. (2004). Bridging differences: Effective intergroup communication. Sage Publications.
Hammer, M. R., Bennett, M. J., & Wiseman, R. (2003). Measuring intercultural sensitivity: The intercultural development inventory. International Journal of Intercultural Relations, 27(4), 421–443. doi:10.1016/S0147-1767(03)00032-4
Hart Research Associates. (2015). Falling short? College learning and career success. Selected findings from online surveys of employers and college students conducted on behalf of the Association of American Colleges & Universities. Hart Research Associates. https://www.aacu.org/sites/default/files/files/LEAP/2015employerstudentsurvey.pdf
Hecker, K. (2015). Kompetenzkonzepte des Bildungspersonals im Übergangssystem. Springer Fachmedien Wiesbaden. doi:10.1007/978-3-658-07655-9
Held, D., Goldblatt, D., Perraton, J., & McGrew, A. G. (1999). Global transformations: Politics, economics, and culture. Stanford University Press.
Hesse, H.-G. (2008). Interkulturelle Kompetenz: Vom theoretischen Konzept über die Operationalisierung bis zum Messinstrument. In N. Jude, J. Hartig, & E. Klieme (Eds.), Kompetenzerfassung in pädagogischen Handlungsfeldern. Theorien, Konzepte und Methoden (Vol. 26). Bundesministerium für Bildung und Forschung.
Hofstede, G. H. (2001). Culture's consequences: Comparing values, behaviors, institutions, and organizations across nations (2nd ed.). Sage Publications.
INCA Assessor Manual. (2004). https://ec.europa.eu/migrant-integration/library-document/inca-project-intercultural-competence-assessment_en
Ingold, T. (2002). Companion encyclopedia of anthropology. Taylor & Francis. https://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=169490
Jones, E., & de Wit, H. (2012). Globalization of internationalization: Thematic and regional reflections on a traditional concept. AUDEM: The International Journal of Higher Education and Democracy, 3(1), 35–54.
Koch, M., & Straßer, P. (2008). Der Kompetenzbegriff: Kritik einer neuen Bildungsleitsemantik. In M. Koch & P. Straßer (Eds.), In der Tat kompetent: Zum Verständnis von Kompetenz und Tätigkeit in der beruflichen Benachteiligtenförderung (pp. 25–52). wbv Media.
Kramsch, C. (2011). The symbolic dimensions of the intercultural. Language Teaching, 44(3), 354–367. doi:10.1017/S0261444810000431
Kramsch, C. (2012). Theorizing translingual/transcultural competence. In G. Levine & A. M. Phipps (Eds.), Critical and intercultural theory and language pedagogy (pp. 15–31). Heinle Cengage Learning.
Kroeber, A. L., & Kluckhohn, C. (1952). Culture: A critical review of concepts and definitions. Museum of American Archaeology and Ethnology. https://www.pseudology.org/Psyhology/CultureCriticalReview1952a.pdf
Lustig, M. W. (2005). WSCA 2005 presidential address: Toward a well-functioning intercultural nation. Western Journal of Communication, 69(4), 377–379. doi:10.1080/10570310500305612
Lustig, M. W., & Koester, J. (2010). Intercultural competence: Interpersonal communication across cultures (6th ed.). Allyn & Bacon.
Magala, S. (2005). Cross-cultural competence. Routledge. doi:10.4324/9780203695494
Magnan, S. S., Murphy, D., & Sahakyan, N. (2014). Goals of collegiate learners and the standards for foreign language learning. Modern Language Journal, 98(S1), 1–11. doi:10.1111/j.1540-4781.2013.12056_3.x
Matsumoto, D., & Hwang, H. C. (2013). Assessing cross-cultural competence: A review of available tests. Journal of Cross-Cultural Psychology, 44(6), 849–873. doi:10.1177/0022022113492891
Matveev, A. (2017). Intercultural competence in organizations. Springer International Publishing. doi:10.1007/978-3-319-45701-7
McAuliffe, M., & Triandafyllidou, A. (Eds.). (2021). World migration report 2022. International Organization for Migration (IOM). http://hdl.handle.net/1814/74322
OECD. (2005). The definition and selection of key competencies: Executive summary. OECD Publishing. https://www.oecd.org/pisa/35070367.pdf
OECD. (2018). The future of education and skills: Education 2030. OECD Publishing. https://www.oecd.org/education/2030-project/contact/E2030%20Position%20Paper%20(05.04.2018).pdf
OECD. (2019). PISA 2018 assessment and analytical framework. OECD.


OECD. (2020). PISA 2018 results (Volume 4): Are students ready to thrive in an interconnected world? OECD.
Pike, K. L. (1954). Language in relation to a unified theory of the structure of human behavior, part 1 (Preliminary ed.). Summer Institute of Linguistics.
Prechtl, E., & Lund, A. D. (2007). Intercultural competence and assessment: Perspectives from the INCA project. In H. Kotthoff & H. Spencer-Oatey (Eds.), Handbook of intercultural communication. Mouton de Gruyter. doi:10.1515/9783110198584.5.467
Reimann, D. (2018). Inter- und transkulturelle kommunikative Kompetenz. In D. Reimann & S. Melo-Pfeifer (Eds.), Plurale Ansätze im Fremdsprachenunterricht in Deutschland: State of the art, Implementierung des REPA und Perspektiven (pp. 247–296). Narr Francke Attempto.
Risager, K. (2006). Language and culture: Global flows and local complexity. Multilingual Matters. doi:10.21832/9781853598609
Risager, K. (2007). Language and culture pedagogy: From a national to a transnational paradigm. Multilingual Matters. doi:10.21832/9781853599613
Risager, K. (2015). Linguaculture: The language–culture nexus in transnational perspective. In F. Sharifian (Ed.), The Routledge handbook of language and culture (pp. 87–99). Routledge.
Rott, G., Diefenbach, B., Vogel-Heuser, B., & Neuland, E. (2003). The challenge of inter- and transdisciplinary knowledge: Results of the WISA project. European Conference of Educational Research, University of Hamburg. https://www.leeds.ac.uk/educol/documents/00003520.htm
Ruben, B. D. (1989). The study of cross-cultural competence: Traditions and contemporary issues. International Journal of Intercultural Relations, 13(3), 229–240. doi:10.1016/0147-1767(89)90011-4
Sinicrope, C., Norris, J., & Watanabe, Y. (2007). Understanding and assessing intercultural competence: A summary of theory, research, and practice (technical report for the foreign language program evaluation project). Second Language Studies, 26(1), 58.
Souto-Otero, M. (2020). Globalization of higher education, critical views. In P. N. Teixeira & J. C. Shin (Eds.), The international encyclopedia of higher education systems and institutions (pp. 568–572). Springer Netherlands. doi:10.1007/978-94-017-8905-9_215
Spencer-Oatey, H., & Franklin, P. (2009). Intercultural interaction: A multidisciplinary approach to intercultural communication. Palgrave Macmillan. doi:10.1057/9780230244511
Spitzberg, B. H., & Changnon, G. (2009). Conceptualizing intercultural competence: Issues and tools. In D. K. Deardorff (Ed.), The SAGE handbook of intercultural competence (pp. 2–52). SAGE. doi:10.4135/9781071872987.n1
Stahl, G. K. (2001). Using assessment centers as tools for global leadership development: An exploratory study. In M. E. Mendenhall, G. K. Stahl, & T. M. Kühlmann (Eds.), Developing global business leaders: Policies, processes, and innovations (pp. 197–210). Quorum Books.


Standing Conference of the Ministers of Education of the Federal States. (Ed.). (2004). Bildungsstandards für die erste Fremdsprache (Englisch/Französisch) für den Mittleren Schulabschluss. Beschlüsse der Kultusministerkonferenz vom 04.12.2003, Art.-Nr. 05966. https://www.kmk.org/fileadmin/veroeffentlichungen_beschluesse/2003/2003_12_04-BS-erste-Fremdsprache.pdf
Standing Conference of the Ministers of Education of the Federal States. (Ed.). (2014). Bildungsstandards für die fortgeführte Fremdsprache (Englisch/Französisch) für die Allgemeine Hochschulreife. Beschlüsse der Kultusministerkonferenz vom 18.10.2012. https://www.kmk.org/fileadmin/veroeffentlichungen_beschluesse/2012/2012_10_18-Bildungsstandards-Fortgef-FS-Abi.pdf
Suarta, I. M., Suwintana, I. K., Sudhana, I. G. P. F. P., & Hariyanti, N. K. D. (2017). Employability skills required by the 21st-century workplace: A literature review of labour market demand. Advances in Social Science, Education and Humanities Research, 102, 337–342. doi:10.2991/ictvt-17.2017.58
Tung, R. L. (1987). Expatriate assignments: Enhancing success and minimizing failure. The Academy of Management Perspectives, 1(2), 117–125. doi:10.5465/ame.1987.4275826
Van Dyne, L., Ang, S., Ng, K. Y., Rockstuhl, T., Tan, M. L., & Koh, C. (2012). Sub-dimensions of the four factor model of cultural intelligence: Expanding the conceptualization and measurement of cultural intelligence. Social and Personality Psychology Compass, 6(4), 295–313. doi:10.1111/j.1751-9004.2012.00429.x
Vogt, K. (2016). Teaching practice abroad for developing intercultural competence in foreign language teachers. Canadian Journal of Applied Linguistics, 19(2), 85–106.
Vogt, K. (2018). Interkulturelle kommunikative Kompetenz fördern. In Basiswissen Lehrerbildung: Englisch unterrichten (pp. 80–95). Klett/Kallmeyer.
Weinert, F. E. (Ed.). (2001). Leistungsmessungen in Schulen (1st ed.). Beltz.
Whittemore, S. (2018). Transversal competencies essential for future proofing the workforce. Skilla.
Witte, A. E. (2012). Making the case for a post-national cultural analysis of organizations. Journal of Management Inquiry, 21(2), 141–159. doi:10.1177/1056492611415279
World Tourism Organization. (Ed.). (2022). Yearbook of tourism statistics, data 2016–2020 (2022 ed.). World Tourism Organization (UNWTO).

ADDITIONAL READING

Black, P., & Wiliam, D. (2012). Assessment for learning in the classroom. In Assessment and learning (pp. 11–32). SAGE Publications Ltd. doi:10.4135/9781446250808.n2
Deardorff, D. K., & Arasaratnam-Smith, L. A. (Eds.). (2017). Intercultural competence in higher education: International approaches, assessment and application. Routledge. doi:10.4324/9781315529257


Frey, N., & Fisher, D. (2007). Checking for understanding: Formative assessment techniques for your classroom. ASCD.
Griffin, P., & Care, E. (Eds.). (2015). Assessment and teaching of 21st century skills. Springer Netherlands. doi:10.1007/978-94-017-9395-7
Leung, C. (2013). Classroom-based assessment issues for language teacher education. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1510–1519). John Wiley & Sons. doi:10.1002/9781118411360.wbcla064
Steger, M. B. (2013). Globalization: A very short introduction. Oxford University Press. doi:10.1093/actrade/9780199662661.001.0001
Vogt, K., & Quetz, J. (2018). Assessment im Englischunterricht: Kompetenzorientiert beurteilen und bewerten (1st ed.). Helbling.

KEY TERMS AND DEFINITIONS

Assessment for Learning: "Any assessment for which the first priority in its design and practice is to serve the purpose of promoting students' learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence" (Black & Wiliam, 2004, p. 10).

Culture: A very complex construct that has traditionally been hard to define. It defies a single all-purpose definition and is best described by taking stock of the previous decades' findings. According to an extensive analysis by Helen Spencer-Oatey and Peter Franklin (2009), culture is manifested through different types of regularities, some more explicit than others; it is associated with social groups, although no two individuals within a group share exactly the same cultural characteristics; it affects people's behaviour and interpretations of behaviour; and it is acquired and/or constructed through interaction with others.

Globalization: The growing interconnectedness of the world. This interconnectedness can be observed in many different dimensions; although popular discussions of the topic often focus on economic chances and challenges, globalization equally concerns political and social issues.

Intercultural Competence: Over the last decades, intercultural competence has proven unsuitable for a single all-purpose definition. Instead, scholars agree to conceptualize intercultural competence as the capacity to interact effectively and appropriately with culturally different others. Many also agree that this capacity consists of certain attitudes, knowledge, and skills.


Chapter 3

Culturally-Biased Language Assessment: Collectivism and Individualism

Ömer Gökhan Ulum
Mersin University, Turkey

Dinçay Köksal
Çanakkale Onsekiz Mart University, Turkey

ABSTRACT

This chapter suggests that culture and evaluation are inextricably linked. Culture should therefore not be regarded as a phenomenon that needs to be controlled for in assessments; rather, it should be regarded as a fundamental component of assessment, beginning with its conceptualization and continuing through its design, construction, and interpretation of student performance. This study aims to discuss the relationship of culture to language assessment, teachers' awareness of cultural and linguistic bias in testing, and the negative effect of test bias on learners' motivation and performance; to identify ways of minimizing linguistic and cultural bias in language tests and maximizing cultural validity in classroom-based language assessment; and to find out how learners and teachers from different cultures view success from both a language learning and a cultural perspective.

INTRODUCTION

Several factors affecting learners' test performance are reported in the related literature (Birjandi & Alemi, 2010; Giannakos, 2013; Janebi Enayat & Babaii, 2018). In the field of language assessment, a large number of studies have also examined the impact of test-taker background features on language competence (Kunnan, 2017; Kendik-Gut, 2019; Nasrul, Alberth, & Ino, 2019). Bachman (1990), in his communicative language competence framework, put forward that test-taker background features constitute one of three principal aspects that influence performance on language test content. Bachman's


suggested background features cover background knowledge, cultural background, learning styles (e.g., field dependence), cognitive ability, native language, age, and gender (Kunnan, 1998; Bachman & Palmer, 2010; Ariyanti, 2016; Kasap, 2021). Other relevant factors include diversities in culture and background knowledge that influence how test-takers interpret test questions (Suzuki & DeKeyser, 2017); the language register used in the test and lack of familiarity with its vocabulary (Volodina, Weinert, & Mursin, 2020); limited English language proficiency (Erdodi et al., 2017); and issues of language dominance (Solano-Flores, 2006; Treffers-Daller & Silva-Corvalán, 2016). Content knowledge and familiarity with the culture may also be added here. As Goodwin and Macdonald (1997) state, "knowledge is personal, contextual, and cultural," and so is learning.

Culture is ingrained not only in the context of assessment but also in every other dimension of assessment (Harrison et al., 2017; Köksal & Ulum, 2018; Kasap, 2020). Our way of acquiring and sharing knowledge differs from culture to culture (Köksal & Ulum, 2016). Learners from Mexican American culture, for instance, tend to be less competitive than their dominant-culture peers and more inclined to cooperate with peers to learn (Raeff, Greenfield, & Quiroz, 2000). Collectivistic cultures, such as Alaska Native and American Indian, Polynesian and Micronesian, Central American and Mexican, and African and Asian cultures, are more oriented towards team achievement (Torelli et al., 2020). In contrast, individualistic cultures, such as those of Western Europe (Swader, 2019), the United States (Mennell, 2020), and Australia (Chambers et al., 2019), give prominence to individual success. Though these orientations are presented here as dichotomous, no group or person is totally collectivistic or individualistic in reality (Heu, Van Zomeren, & Hansen, 2019). It could also be noted that the naturally competitive orientation of assessment in American education already puts such students at a disadvantage (Egalite & Mills, 2019). Thus, our principle in classrooms should be "not competition but cooperation."

Because of this, culture and evaluation are inextricably linked. Therefore, culture should not be regarded as a phenomenon that needs to be controlled for in assessments; rather, it should be regarded as a fundamental component of assessment, beginning with its conceptualization and continuing through its design, construction, and interpretation of student performance. Assessment plays a critical part in the learning process anywhere in the globe, and maintaining its cultural validity is one of the most important challenges that educators face today. In general, much emphasis is put on construct validity, content validity, and face validity, while cultural and linguistic validity tends to be ignored. Culturally biased testing refers to a scenario in which a certain exam is inappropriate for a particular audience because it does not evaluate the learner's real understanding of a taught topic or because it includes information connected to a culture with which the student is unfamiliar. For instance, questions such as "When did Coca-Cola enter the East European countries?" were previously included in language proficiency tests administered in Turkey. An exam that is not designed to cover the subject of culture should not contain cultural morsels that might confuse any of the students taking it.
In the related literature, several studies document the negative impact of biased exam content on learners' success (Deeks, Macaskill, & Irwig, 2005; Jacob, 2001; Mengel, Sauermann, & Zölitz, 2019; Popham, 2006; Schleicher, et al., 2017; Schwerdt & Woessmann, 2017; Warne, Yoon, & Price, 2014). Besides, since Turkish culture is a conventionalist one and English comes from a Western individualistic base, it may be hard for Turkish teachers and prospective teachers exposed to the English language, as they may naturally transfer cultural elements; language is the bridge that transmits culture across ages and continents. Further, the difference between the natures of the two


languages and cultures, which could not possibly converge over time, may result in failure for learners (Joshanloo, 2014; Hassan, Jamaludin, Sulaiman, & Baki, 2010; Niu & Sternberg, 2006; Ourfali, 2015). This study aims to discuss the relationship of culture to language assessment, teachers' awareness of cultural and linguistic bias in testing, and the negative effect of test bias on learners' motivation and performance; to identify ways of minimizing linguistic and cultural bias in language tests and maximizing cultural validity in classroom-based language assessment; and to find out how learners and teachers from different cultures view success from both a language learning and a cultural perspective.

Research Questions

• What do the Turkish ELT students think about the questions asked in EFL exams that represent individualism?
• What do the Turkish ELT students think about the representation of Turkish culture in EFL exams?
• What kind of conflicts do the Turkish ELT students experience in EFL exams in terms of cultural perspectives?
• To what extent do the EFL teachers in Turkey represent collectivism as a cultural issue?
• To what extent do the EFL teachers in Turkey consider collectivism in EFL exams?

METHOD

This research is composed of a descriptive single case study and critical reflection based on a convenience sample. Case studies deal with "a spatially bounded and real-life phenomenon that takes place in a limited time" (Gerring, 2004, p. 342). Yin (2009, p. 14) describes a case study as "an empirical inquiry that investigates a contemporary phenomenon in depth and within its real-life context, especially when the boundaries between phenomenon and context are not evident". Thus, case studies denote well-thought-out, elaborate, and planned methods (Gerring, 2004). In a single case study, a specific group is chosen to elicit data and information through surveys, questionnaires, diaries, and interviews. The nature of this case study entails the selection of a specific group and context whose data are analyzed inductively (Maoz, 2002).

In addition to the descriptive single case study, the critical reflection approach was employed as an extension of critical pedagogy and in order to better understand the participants' reflections. Critical reflection refers to the process of humans generating meaning in relation to a particular circumstance or environment in which they find themselves. Instead of dealing with learning issues in a superficial manner, without questioning, or forming prejudiced stereotypes, critical reflection enables learners to ask critical questions, understand relationships of causality, deconstruct bias, and notice differences and problems between theory and practice (Ash & Clayton, 2009). A person's beliefs about a social setting are called into question via the process of critical reflection, which involves gaining an awareness of the connections that exist between power, knowledge, and behaviors that exclude people (Fook & Askeland, 2006). As a result, everything that we do, everything that we know, and how we know it is called into question. There are three key steps that help people apply critical reflection: the introduction of subjects, deconstruction via dialogic conversation, and rebuilding of practices and


new experiences (Fook & Gardner, 2007). This research’s methodological technique was based on this three-stage process, and it was enlarged and adapted depending on the findings of the investigation. An open-ended questionnaire with ten questions was constructed in accordance with this design in order to elicit the participants’ thoughts and opinions. The questions included a variety of subjects, including cultural and linguistic backgrounds, among others. The participants were told that they would express their ideas about the assessment from a cultural perspective.

Participants

The participants were composed of 40 junior undergraduates majoring in English language teaching at a Turkish university and 20 EFL state schoolteachers. The EFL teachers had at least six years of experience.

Procedure

The study was mainly composed of three stages. In each stage, the participants were given questions that guided them so that they could become familiar with the procedure of the study. Different materials were used to orient the participants to the content of the study. In the first stage, the participants were informed about the nature of the study, and the main tenets of collectivism and individualism were introduced. As a warm-up activity, we made a list of cultural and linguistic topics in exams; some of these topics were debated in the classroom for nine hours over three weeks. After the researcher presented constructive feedback to the participants, they were asked to tell what they knew about some of the cultural and linguistic contexts in the exams and to make a list of individualistic, collectivistic, and linguistic issues.

Data Collection and Analyses

The analyzed data and the related results are illustrated under the titles: conceptions of Turkish ELT students towards the questions asked in EFL exams that represent individualism; conceptions of Turkish ELT students towards the representation of Turkish culture in EFL exams; conflicts experienced by Turkish ELT students in EFL exams in terms of cultural perspectives; the extent of collectivism as a cultural issue represented by EFL teachers in Turkey; and the extent of collectivism in EFL exams applied by EFL teachers in Turkey. The related frequencies and percentages of each item are displayed in the tables as well.

Table 1. Conceptions of Turkish ELT Students towards the Questions Asked in EFL Exams that Represent Individualism

Item                                               f       %
More individualistic compounds                    22     33.33
Less collectivist compounds                       20     30.30
The need for more collectivist compounds          16     24.24
Compounds fostering selfishness                    6      9.10
Both individualistic and collectivist compounds    2      3.03
Total                                             66    100.00


It is observed in Table 1 that the highest emerging items are more individualistic compounds (33.33%) and less collectivist compounds (30.30%). Furthermore, the item the need for more collectivist compounds occurred with a percentage of 24.24, just ahead of the slightly emerging themes compounds fostering selfishness (9.10%) and both individualistic and collectivist compounds (3.03%).

Table 2. Conceptions of Turkish ELT Students towards the Representation of Turkish Culture in EFL Exams

Item                                               f       %
More individualistic western compounds            24     41.38
Less collectivist Turkish cultural compounds      18     31.03
Ignored collectivist culture                      12     20.69
Insufficient cultural diversity                    4      6.90
Total                                             58    100.00

As is clearly understood from Table 2, the item more individualistic western compounds (41.38%) emerged with the highest rate. The item less collectivist Turkish cultural compounds (31.03%) came second. Further, with a percentage of 20.69, the item ignored collectivist culture followed the initial two items. Lastly, the item insufficient cultural diversity (6.90%) occurred with the weakest frequency.

Table 3. Conflicts Experienced by Turkish ELT Students in EFL Exams in terms of Cultural Perspectives

Item                                                     f       %
Gap between individual and collectivist compounds       19     34.54
Being exposed to the individualistic exam type          18     32.73
The dominance of an individualism-centric perspective   16     29.10
Disregarding cultural diversity                          2      3.63
Total                                                   55    100.00

One can easily see from Table 3 that the item gap between individual and collectivist compounds (34.54%) emerged with the biggest percentage. Besides, the item being exposed to the individualistic exam type emerged with a percentage of 32.73. Moreover, with a percentage of 29.10, the dominance of an individualism-centric perspective occurred with the third highest rate. Finally, the item disregarding cultural diversity (3.63%) emerged with the least frequency within this category.


Table 4. The Extent of Collectivism as a Cultural Issue Represented by EFL Teachers in Turkey

Item                                                                 f       %
An individualism-centric perspective in exams                       16     37.24
Hard to transfer individualistic compounds into collectivistic ones  8     18.60
A standard approach disregarding collectivism                        7     16.27
No balance between collectivism and individualism                    7     16.27
Non-inclusive practices involved                                     5     11.62
Total                                                               43    100.00

Table 4 illustrates that the first item, an individualism-centric perspective in exams (37.24%), emerged with the highest rate. The second item, hard to transfer individualistic compounds into collectivistic ones, occurred with a percentage of 18.60. Furthermore, the items a standard approach disregarding collectivism and no balance between collectivism and individualism appeared with identical percentages (16.27%). With a percentage of 11.62, the last item, non-inclusive practices involved, was observed to emerge with the weakest frequency.

Table 5. The Extent of Collectivism in EFL Exams Applied by EFL Teachers in Turkey

Item                                                          f       %
Non-collectivist perspective in exams                        17     45.95
Difficult to change individualistic culture into collectivistic one   9     24.33
Trying to build a more collectivist perspective               6     16.21
No equity between collectivism and individualism              5     13.51
Total                                                        37    100.00

It is clear from Table 5 that the highest emerging item is non-collectivist perspective in exams, with a percentage of 45.95. In addition, the item difficult to change individualistic culture into a collectivistic one (24.33%) emerged with the second highest rate. Lastly, the items trying to build a more collectivist perspective (16.21%) and no equity between collectivism and individualism (13.51%) emerged with the lowest frequencies.
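For readers who wish to replicate the tabulation, the frequencies and percentages in Tables 1 through 5 follow from a simple count of the coded questionnaire responses. The following is a minimal illustrative sketch in Python; the response codes shown are invented for the example and do not reproduce the study's actual data.

```python
from collections import Counter

# Hypothetical codes assigned to participants' open-ended answers;
# one answer may yield several codes, so the totals in the tables
# need not equal the number of participants.
codes = [
    "individualism-centric perspective in exams",
    "individualism-centric perspective in exams",
    "no balance between collectivism and individualism",
    "non-inclusive practices involved",
    "individualism-centric perspective in exams",
]

counts = Counter(codes)
total = sum(counts.values())
for item, f in counts.most_common():
    print(f"{item}: f = {f}, % = {100 * f / total:.2f}")
```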

DISCUSSION

It may be challenging for Turkish teachers and prospective teachers exposed to the English language, since Turkish culture is a conventionalist one while English derives from a Western individualistic base; they may naturally transfer cultural elements, because language is the bridge that transmits culture across ages and continents. In addition, the differences in the origins of the two cultures and languages, which could not reasonably converge over time, may result in failure once students have acquired


the information (Joshanloo, 2014; Hassan, Jamaludin, Sulaiman, & Baki, 2010; Niu & Sternberg, 2006; Ourfali, 2015). Therefore, teachers and prospective teachers need to develop a sense of cultural awareness to ensure that language transfer does not carry cultural elements that may be inappropriate in a particular context. They must also be aware of the unique qualities and properties of each language so that the transfer is meaningful and respectful. This can be difficult to achieve, as the difference in cultural orientation between Turkish and English is considerable and can lead to miscommunication and misinterpretation when teaching English to Turkish students. Consequently, it is essential for teachers to be aware of the cultural differences between Turkish and English and to allow time for a gradual process of communication between the two languages. In order to bridge the cultural divide and address misunderstandings, teachers must emphasize effective communication and mutual understanding between both languages. It is also important for teachers to create an environment that values and respects the differences between the two languages.

CONCLUSION AND RECOMMENDATIONS

This study aimed to elicit the views of students and teachers in English language teaching on the representation of two different cultural paradigms. It has been found that individualistic culture is emphasized in the exams: individualism, as a central part of Western culture, has been brought to the fore, while collectivistic culture has remained secondary and largely ignored. A huge chasm between the two cultural types has thus been observed. The Turkish students were forced to keep up with an individualistic exam type from a cultural perspective; therefore, their failure was ascribed to their individual weaknesses rather than to cultural elements. The teachers also mentioned that it was time-consuming to transform individualistic culture into collectivistic culture. Exams have been observed to test learners from an individualism-centric perspective.

A critical cultural perspective can alter the minds of learners and teachers in ELT. If culture matters in applied linguistics, then collectivist culture should find room in exams so that learners from this culture can increase their scores. Unless collectivistic culture is represented in exams, individualistic culture dominates the ELT field (Al-Zahrani & Kaplowitz, 1993). Exposing learners and teachers to only individualistic cultures in exams in a standard manner causes the discipline to disregard cultural diversity (Barry & Lechner, 1995). Exams act as disciplinary practices which control learners and teachers (Zipp, 2007). Therefore, exams need to be designed with culture-specific features that open space to represent different cultures. Success in exams can be encouraged through inclusive practices that prioritize diversity, and collectivism and individualism can be balanced in exams (Schimmack et al., 2002).

REFERENCES

Al-Zahrani, S. S. A., & Kaplowitz, S. A. (1993). Attributional biases in individualistic and collectivistic cultures: A comparison of Americans with Saudis. Social Psychology Quarterly, 56(3), 223–233. doi:10.2307/2786780


Ariyanti, A. (2016). Psychological factors affecting EFL students' speaking performance. Asian TEFL Journal of Language Teaching and Applied Linguistics, 1(1), 77–88. doi:10.21462/asiantefl.v1i1.14
Ash, S. L., & Clayton, P. H. (2009). Generating, deepening, and documenting learning: The power of critical reflection in applied learning. Journal of Applied Learning in Higher Education, 1, 25–48.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F., & Palmer, A. (2010). Language assessment in practice. Oxford University Press.
Barry, N. H., & Lechner, J. V. (1995). Preservice teachers' attitudes about and awareness of multicultural teaching and learning. Teaching and Teacher Education, 11(2), 149–161. doi:10.1016/0742-051X(94)00018-2
Birjandi, P., & Alemi, M. (2010). The impact of test anxiety on test performance among Iranian EFL learners. BRAIN: Broad Research in Artificial Intelligence and Neuroscience, 1(4), 44–58.
Chambers, I., Costanza, R., Zingus, L., Cork, S., Hernandez, M., Sofiullah, A., & Kubiszewski, I. (2019). A public opinion survey of four future scenarios for Australia in 2050. Futures, 107, 119–132. doi:10.1016/j.futures.2018.12.002
Deeks, J. J., Macaskill, P., & Irwig, L. (2005). The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. Journal of Clinical Epidemiology, 58(9), 882–893. doi:10.1016/j.jclinepi.2005.01.016 PMID:16085191
Egalite, A. J., & Mills, J. N. (2019). Competitive impacts of means-tested vouchers on public school performance: Evidence from Louisiana. Education Finance and Policy, 1–45. Retrieved on the 29th of September, 2020 from https://www.mitpressjournals.org/doi/abs/10.1162/edfp_a_00286
Erdodi, L. A., Nussbaum, S., Sagar, S., Abeare, C. A., & Schwartz, E. S. (2017). Limited English proficiency increases failure rates on performance validity tests with high verbal mediation. Psychological Injury and Law, 10(1), 96–103. doi:10.1007/s12207-017-9282-x
Fook, J., & Askeland, G. A. (2006). The 'critical' in critical reflection. In S. White, J. Fook, & F. Gardner (Eds.), Critical reflection in health and social care. Open University Press/McGraw-Hill Education.
Fook, J., & Gardner, F. (2007). Practicing critical reflection: A resource handbook. McGraw-Hill Education.
Gerring, J. (2004). What is a case study and what is it good for? The American Political Science Review, 98(2), 341–354. doi:10.1017/S0003055404001182
Giannakos, M. N. (2013). Enjoy and learn with educational games: Examining factors affecting learning performance. Computers & Education, 68, 429–439. doi:10.1016/j.compedu.2013.06.005
Goodwin, A. L., & Macdonald, M. (1997). Educating the rainbow: Authentic assessment and authentic practice for diverse classrooms. In A. L. Goodwin (Ed.), Assessment for equity and inclusion: Embracing all our children (pp. 221–228). Routledge.


Harrison, C. J., Könings, K. D., Schuwirth, L. W., Wass, V., & van der Vleuten, C. P. (2017). Changing the culture of assessment: The dominance of the summative assessment paradigm. BMC Medical Education, 17(1), 73. doi:10.1186/s12909-017-0912-5 PMID:28454581
Hassan, A., Jamaludin, N. S., Sulaiman, T., & Baki, R. (2010). Western and Eastern educational philosophies. 40th Philosophy of Education Society of Australasia Conference, Murdoch University.
Heu, L. C., van Zomeren, M., & Hansen, N. (2019). Lonely alone or lonely together? A cultural-psychological examination of individualism–collectivism and loneliness in five European countries. Personality and Social Psychology Bulletin, 45(5), 780–793. doi:10.1177/0146167218796793 PMID:30264659
Jacob, B. A. (2001). Getting tough? The impact of high school graduation exams. Educational Evaluation and Policy Analysis, 23(2), 99–121. doi:10.3102/01623737023002099
Janebi Enayat, M., & Babaii, E. (2018). Reliable predictors of reduced redundancy test performance: The interaction between lexical bonds and test takers' depth and breadth of vocabulary knowledge. Language Testing, 35(1), 121–144. doi:10.1177/0265532216683223
Joshanloo, M. (2014). Eastern conceptualizations of happiness: Fundamental differences with Western views. Journal of Happiness Studies, 15(2), 475–493. doi:10.1007/s10902-013-9431-1
Kasap, S. (2020). Sosyodilbilim ve dil eğitimi. In F. Tanhan & H. İ. Özok (Eds.), Eğitim Ortamlarında Nitelik. Anı Yayıncılık.
Kasap, S. (2021). Sosyodilbilim. Akademisyen Kitabevi.
Kendik-Gut, J. (2019). Influence of background knowledge and language proficiency on comprehension of domain-specific texts by university students. Theory and Practice of Second Language Acquisition, 5(2), 59–74. doi:10.31261/tapsla.7519
Köksal, D., & Ulum, Ö. G. (2016). Language learning strategies of Turkish and Arabic students: A cross-cultural study. European Journal of Foreign Language Teaching, 1(1), 122–143.
Köksal, D., & Ulum, Ö. G. (2018). Language assessment through Bloom's taxonomy. Journal of Language and Linguistic Studies, 14(2), 76–88.
Kunnan, A. J. (2017). Evaluating language assessments. Routledge. doi:10.4324/9780203803554
Maoz, Z. (2002). Case study methodology in international studies: From storytelling to hypothesis testing. Evaluating Methodology in International Studies, 163.
Mengel, F., Sauermann, J., & Zölitz, U. (2019). Gender bias in teaching evaluations. Journal of the European Economic Association, 17(2), 535–566. doi:10.1093/jeea/jvx057
Mennell, S. (2020). Power, individualism, and collective self perception in the USA. Historical Social Research / Historische Sozialforschung, 45(171), 309–329.
Nasrul, E. S., Alberth, A., & Ino, L. (2019). Students' background knowledge, vocabulary competence, and motivation as predictors of reading comprehension at grade 11 of SMA Kartika Kendari. Journal of Language Education and Educational Technology, 4(1), 1–12.


Niu, W., & Sternberg, R. J. (2006). The philosophical roots of Western and Eastern conceptions of creativity. Journal of Theoretical and Philosophical Psychology, 26(1–2), 18–38. doi:10.1037/h0091265
Ourfali, E. (2015). Comparison between Western and Middle Eastern cultures: Research on why American expatriates struggle in the Middle East. Otago Management Graduate Review, 13, 33–43.
Popham, W. J. (2006). Assessment bias: How to banish it. Routledge.
Raeff, C., Greenfield, P. M., & Quiroz, B. (2000). Developing interpersonal relationships in the cultural contexts of individualism and collectivism. In S. Harkness, C. Raeff, & C. R. Super (Eds.), Variability in the social construction of the child: New directions in child development (pp. 59–74). Jossey-Bass.
Schimmack, U., Radhakrishnan, P., Oishi, S., Dzokoto, V., & Ahadi, S. (2002). Culture, personality, and subjective well-being: Integrating process models of life satisfaction. Journal of Personality and Social Psychology, 82(4), 582–593. doi:10.1037/0022-3514.82.4.582 PMID:11999925
Schleicher, I., Leitner, K., Juenger, J., Moeltner, A., Ruesseler, M., Bender, B., Sterz, J., Schuettler, K.-F., Koenig, S., & Kreuder, J. G. (2017). Examiner effect on the objective structured clinical exam: A study at five medical schools. BMC Medical Education, 17(1), 1–7. doi:10.1186/s12909-017-0908-1 PMID:28056975
Schwerdt, G., & Woessmann, L. (2017). The information value of central school exams. Economics of Education Review, 56, 65–79. doi:10.1016/j.econedurev.2016.11.005
Solano-Flores, G. (2006). Language, dialect, and register: Sociolinguistics and the estimation of measurement error in the testing of English language learners. Teachers College Record, 108(11), 2354–2379. doi:10.1111/j.1467-9620.2006.00785.x
Suzuki, Y., & DeKeyser, R. (2017). The interface of explicit and implicit knowledge in a second language: Insights from individual differences in cognitive aptitudes. Language Learning, 67(4), 747–790. doi:10.1111/lang.12241
Swader, C. S. (2019). Loneliness in Europe: Personal and societal individualism-collectivism and their connection to social isolation. Social Forces, 97(3), 1307–1336. doi:10.1093/sf/soy088
Torelli, C. J., Leslie, L. M., To, C., & Kim, S. (2020). Power and status across cultures. Current Opinion in Psychology, 33, 12–17. doi:10.1016/j.copsyc.2019.05.005 PMID:31336191
Treffers-Daller, J., & Silva-Corvalán, C. (Eds.). (2016). Language dominance in bilinguals: Issues of measurement and operationalization. Cambridge University Press.
Volodina, A., Weinert, S., & Mursin, K. (2020). Development of academic vocabulary across primary school age: Differential growth and influential factors for German monolinguals and language minority learners. Developmental Psychology, 56(5), 922–936. doi:10.1037/dev0000910 PMID:32162935
Warne, R. T., Yoon, M., & Price, C. J. (2014). Exploring the various interpretations of "test bias". Cultural Diversity & Ethnic Minority Psychology, 20(4), 570–582. doi:10.1037/a0036503 PMID:25313435
Yin, R. K. (2009). How to do better case studies. The SAGE Handbook of Applied Social Research Methods, 2, 254–282.


Zipp, J. F. (2007). Learning by exams: The impact of two-stage cooperative tests. Teaching Sociology, 35(1), 62–76. doi:10.1177/0092055X0703500105

ADDITIONAL READING

Bruner, J. (1996). The culture of education. Harvard University Press. doi:10.4159/9780674251083
Fernández, A. L., & Abe, J. (2018). Bias in cross-cultural neuropsychological testing: Problems and possible solutions. Culture and Brain, 6(1), 1–35. doi:10.1007/s40167-017-0050-2
Miller-Jones, D. (1989). Culture and testing. The American Psychologist, 44(2), 360–366. doi:10.1037/0003-066X.44.2.360
Ostrosky-Solís, F., Ramirez, M., & Ardila, A. (2004). Effects of culture and education on neuropsychological testing: A preliminary study with indigenous and nonindigenous population. Applied Neuropsychology, 11(4), 186–193. doi:10.1207/s15324826an1104_3 PMID:15673490
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14. doi:10.3102/0013189X029007004

KEY TERMS AND DEFINITIONS

Assessment: The act of judging or deciding the value, quality, or importance of something.

Collectivism: Emphasis on collective rather than individual action or identity.

Cultural Bias: The phenomenon of interpreting or judging phenomena by particular standards inherent to one's own culture.

Individualism: A theory maintaining the independence of the individual and stressing individual initiative, action, and interests.


Section 2

Methods of Language Assessment


Chapter 4

Mispronunciation Detection Using Neural Networks for Second Language Learners

Lubana Isaoglu
https://orcid.org/0000-0001-5193-1380
Istanbul University-Cerrahpasa, Turkey

Zeynep Orman
https://orcid.org/0000-0002-0205-4198
Istanbul University-Cerrahpasa, Turkey

ABSTRACT

Speaking a second language fluently is the aim of any language learner. Computer-aided language learning (CALL) systems help learners achieve this goal. Mispronunciation detection can be considered the most helpful component in CALL systems. For this reason, the focus is currently on research in mispronunciation detection systems. There are different methods for mispronunciation detection, such as posterior probability-based methods and classifier-based methods. Recently, deep-learning-based methods have also attracted great interest and are being studied. This chapter reviews the research that proposed neural network methods for mispronunciation detection conducted between 2014 and 2021 for second language learners. The results obtained from studies in the literature and comparisons between different techniques are also discussed.

INTRODUCTION

Language is the primary approach to communication, and knowing more than one language is almost vital in today's world, as people live in the era of technology and communication. The considerable skills that a foreign language learner needs to learn are listening, reading, speaking, and writing. Speaking is probably the most important one, since it is the primary communication tool. Developing the learner's oral


fluency and accuracy is essential for the success of second language communication. For this reason, many studies have focused on mispronunciation detection systems. The Oxford English Dictionary defines mispronunciation as "incorrect or inaccurate pronunciation." Mispronunciation detection can be considered the most helpful component in Computer-Aided Language Learning (CALL) systems. CALL systems, powered by the advancement of speech technology, are effective learning tools on handy smartphones, tablets, laptops, and computers. Because these systems help learners improve their speaking skills without the presence of a language teacher, research on mispronunciation detection systems is of great importance.

Mispronunciation detection systems can be categorized into three main groups: posterior probability-based methods, classifier-based methods, and deep-learning-based methods. In the posterior probability-based methods, log-likelihood and posterior probability scores have been commonly used for mispronunciation detection. Witt and Young (2000) proposed the goodness of pronunciation (GOP) score. Also, Strik et al. (2009) suggested acoustic-phonetic features. Furthermore, Zhang et al. (2008) proposed the Scaled Log Posterior Probability (SLPP) and weighted phone SLPP in their study.

In the classifier-based methods, Ito et al. (2005) developed clusters of pronunciation rules. Flege et al. (1992) extracted features using a discrete wavelet transform, a distinctive signal processing method, and used a support vector machine (SVM) as the classification algorithm to detect mispronunciation. Strik et al. (2009) presented a comparative study of four approaches (GOP, decision tree, LDA_APF, and LDA_MFCC) for mispronunciation detection. Amdal et al. (2009) used acoustic-phonetic features and trained a linear discriminant analysis (LDA) classifier. Li et al. (2009) presented an effective Generalized Linear Discriminant Sequence-Support Vector Machine (GLDS-SVM) based detection technique.

In this chapter, the studies on mispronunciation detection using deep neural networks are reviewed. Since it is a new field of research, the papers published from 2014 to 2021 (both inclusive) have been examined. This research presented novel neural network models applied to different languages for mispronunciation detection. Most studies focused mainly on English, Mandarin Chinese, and Arabic, since these are among the most spoken and learned languages worldwide (Most spoken languages in the world, 2022).

The rest of this chapter is organized as follows. The literature review presents the architecture of the neural networks used in the research papers in detail. The results section shows the results of these studies and a comparison between them, which is then analyzed in the discussion section. After that, future research directions are discussed. Finally, some conclusions are drawn.
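Since several of the reviewed systems build on the goodness of pronunciation (GOP) score of Witt and Young (2000), a minimal sketch of one common GOP formulation may be helpful. The NumPy example below is purely illustrative and is not the implementation used in any of the reviewed studies; the posterior matrix, phone inventory, segment boundaries, and decision threshold are all invented for the example.

```python
import numpy as np

def goodness_of_pronunciation(posteriors, phone_idx, start, end):
    """Approximate GOP for one force-aligned phone segment.

    posteriors: (num_frames, num_phones) frame-level phone posteriors
                from an acoustic model (rows sum to 1).
    phone_idx:  index of the canonical (expected) phone.
    start, end: frame boundaries of the segment from a forced alignment.

    GOP here is the duration-normalized log ratio between the posterior
    of the expected phone and the best competing phone; values near 0
    suggest a good pronunciation, large negative values a likely error.
    """
    segment = posteriors[start:end]                       # frames of this phone
    target = np.log(segment[:, phone_idx] + 1e-10).sum()  # log p(expected phone)
    best = np.log(segment.max(axis=1) + 1e-10).sum()      # log p(best phone)
    return (target - best) / (end - start)                # normalize by duration

# Toy example: 5 frames, 3 phones; the expected phone has index 0.
post = np.array([[0.70, 0.20, 0.10],
                 [0.60, 0.30, 0.10],
                 [0.20, 0.70, 0.10],   # a frame where a competitor wins
                 [0.80, 0.10, 0.10],
                 [0.90, 0.05, 0.05]])
score = goodness_of_pronunciation(post, phone_idx=0, start=0, end=5)
print(f"GOP = {score:.3f}")
```

In practice, a phone is flagged as mispronounced when its GOP falls below a threshold, typically a phone-specific value tuned on annotated non-native speech.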

LITERATURE REVIEW

In this section, the datasets used in the studies are discussed, since choosing a good dataset for training and testing neural network models is very important and directly affects performance quality.

The Corpus

The datasets used in mispronunciation research are called corpora. A corpus is a collection of natural language (text and transcriptions of speech or signs) constructed with a specific purpose. Here, the different corpora used contain speech collected from native and non-native speakers.


Corpora in different languages usually belong to Arabic, English, and Mandarin. Figure 1 shows the number of research papers that used each corpus. These corpora are used in the training and evaluation stages for the neural networks proposed in the studies. Corpus selection is essential for adequate research results.

Figure 1. Number of research papers that used each corpus

Arabic Corpus

Arabic is the fifth-largest native language worldwide, but there is still no state-of-the-art corpus available for this language. That is why Maqsood et al. (2019) manually constructed an Arabic corpus. Two hundred speakers, including males, females, and children of different ages (ranging from 15 to 50 years old), were asked to record the data. They included speakers who were highly proficient in speaking Arabic and speakers who had just started learning Arabic. The data obtained in this corpus were used in their research.

Furthermore, Akhtar et al. (2020) created a corpus consisting of Arabic words. They chose multiple words that covered all the Arabic letters. Each word was recorded 30 times, and each recording was from a different Pakistani speaker who had been learning the Arabic language. They chose people of different ages (ranging from 8 to 60 years old) and genders.

Nazir et al. (2019) collected data from 400 Pakistani speakers learning Arabic as a second language. The dataset was made by considering an equivalent number of male and female speakers of different ages (10 to 50 years old) and with distinctive first languages.

Moreover, Almekhlafi et al. (2022) created a novel dataset for phonemes of the Arabic alphabet with diacritics, called the Arabic alphabet phonemes dataset (AAPD). In Arabic, there are twenty-eight (28) letters, each of which has three diacritics, and each Arabic letter with each diacritic is pronounced differently.


English Corpus

There are many standard English language corpora that can be used in research on native and non-native speakers. One of these corpora is "LDC95S27", a native, phonetically rich, isolated-word, telephone speech corpus. It is used in the studies by Hu, Qian, et al. (2014) and Hu et al. (2015) to train the native acoustic model for phone mispronunciation detection.

Another one is TIMIT, a corpus of phonemically and lexically transcribed speech of male and female American English speakers with different dialects. This corpus is used for training in the works of Leung et al. (2019), Lo et al. (2021), and Li et al. (2017). It is also used by Wang et al. (2021) and Yan et al. (2021) to train, develop, and test the datasets.

Another is the CU-CHLOE (Chinese University Chinese Learners of English), a non-native speaker corpus. It contains 110 Mandarin speakers (60 males and 50 females) and 100 Cantonese speakers (50 males and 50 females). In Leung et al. (2019), this corpus was split into a training set, a development set, and a testing set, and in Li et al. (2017), it was used as a training and development set. Another variant of this corpus, used in Li et al. (2017), is called the Supra-CHLOE (Suprasegmental Chinese Learners of English) corpus.

Another one is a benchmark corpus called L2-ARCTIC, an open-access L2-English speech corpus compiled for research on CAPT, accent conversion, and other tasks. L2-ARCTIC contains correctly pronounced utterances and mispronounced utterances of 24 non-native speakers (12 males and 12 females) whose mother tongues include Hindi, Korean, Mandarin, Spanish, Arabic, and Vietnamese. This corpus has been split into training, development, and testing sets in the works of Wang et al. (2021), Lo et al. (2021), and Yan et al. (2021).

Furthermore, there is the Interactive Spoken Language Education (ISLE) dataset (Arora et al., 2017), which consists of 23 German and 23 Italian native speakers speaking English as a second language.

Mandarin Corpus

Like English, Mandarin has several standard corpora. Mandarin is the language spoken in northern and southwestern China. In Hu et al. (2014), a speaker-independent, large-vocabulary, continuous Mandarin speech database (BJ2003) is used for acoustic modeling.

Another corpus is the Chinese National Hi-Tech Project 863 corpus for Mandarin large vocabulary continuous speech recognition (LVCSR) systems. This native corpus is used for training in Guo et al. (2019), Wana et al. (2020), Dong and Xie (2019), Li et al. (2018), Li, Wei & Chen et al. (2017), Li et al. (2019), Li et al. (2016), and Wang et al. (2022). In Wana et al. (2020), the development and test data are from a Chinese L2 speech database called the BLCU inter-Chinese speech corpus. The BLCU corpus is also used in Dong and Xie (2019), and it serves as a non-native corpus in Wang et al. (2022).

One more large-scale Mandarin learning corpus is the iCALL corpus, a non-native corpus recorded by 300 beginning learners of Mandarin whose mother tongues are mainly of European origin. It was used to evaluate the performance of the proposed pronunciation measure in Hu et al. (2015). In Li et al. (2018) and Li, Wei & Chen et al. (2017), a subset of iCALL was mixed with the Chinese National Hi-Tech Project 863 corpus to train the speech attribute and phone classifiers.


Another corpus is the Aishell corpus, an open-source Mandarin speech corpus published by Beijing Shell Technology Co., Ltd. Four hundred people from different accent areas in China were invited to participate in the recording. This corpus is used in Guo et al. (2019) as the testing set.

In Li et al. (2019), three speech corpora were used to train acoustic tonal models and mispronunciation verifiers. The native speech corpora are the 863 LVCSR corpus and the THCHS-30 corpus; the non-native corpus is iCALL. Also, in Li et al. (2016), two corpora were used: the native one is the 863 LVCSR corpus, and the non-native one is iCALL.

In Zhang et al. (2020), different corpora were used. The standard speech corpus was the China Central Television (CCTV) news speech corpus, used to train a standard acoustic model, and the non-standard speech corpus was PSC-1176, a spot speech corpus.

The Neural Network Architectures

This section reviews 26 novel neural networks used in mispronunciation detection systems. These networks were proposed in research papers published between 2014 and 2022. Figure 2 shows the distribution of papers over the years.

Figure 2. Number of papers in each year

Almekhlafi et al. (2021) proposed a VGG-based network model for classifying the proposed AAPD corpus. In their work, they performed three experiments. The first experiment aimed to choose the most efficient technique for extracting features from the proposed dataset. The MFCC, Mel-Spectrogram, and FilterBank techniques were applied for extracting features with 128 Mel bands. Then, the outputs of each method were used as training inputs to the proposed VGG-based model. As a result, the MFCC technique proved its efficiency in extracting important and correct features, which is why they selected it for the rest of their work (Almekhlafi et al., 2021). The second experiment aimed to choose an appropriate number of Mel bands. The third experiment compared and evaluated the proposed VGG-based model against three models: AlexNet, RNN, and ResNet.

Another deep neural network, proposed by Hu et al. (2014), is a DNN-based tone-embedded model. Their work employed a DNN-based framework for embedded tone modeling, where tone features are appended to the spectral characteristics for modeling Mandarin speech. To evaluate the performance of


the proposed model, they compared it with a baseline DNN system using 39 MFCC features. As a result, they obtained a 2% reduction in the EER of mispronunciation detection when comparing the two DNN-based systems without and with embedded tone modeling.

In another study, a multi-view approach was proposed for a mispronunciation verification task (Wana et al., 2020). Acoustic phone embeddings and multi-source information embeddings were jointly learned in the training process. Bottleneck features and speech attribute patterns were used as the multi-source information input views. The embedding models in the multi-view network structures for each view were consistent with the single-view setup.

Hu et al. (2014) built a neural network-based logistic regression classifier for phone-level mispronunciation detection, where phone mispronunciation detection is formulated as a 2-class classification task. They first grouped the whole data set into M subsets by phoneme label. For each subgroup, a 2-class logistic regression classifier was trained for the classification decisions. Instead of training each phoneme-specific classifier separately, they modeled all M classifiers jointly in a neural network-based framework. The proposed method in Hu, Qian, et al. (2014) was compared with conventional GOP and SVM-based methods on an isolated-word corpus recorded by non-native English learners. The proposed NN-based LR classifier outperformed the GOP-based approach by a large margin, as well as the SVM-based classifiers.

Another Deep Neural Network (DNN) trained acoustic model with a transfer learning-based Logistic Regression (LR) classifier was presented by Hu et al. (2015). Firstly, the baseline acoustic model was trained as context-dependent GMM-HMM models (CD-GMM-HMM) in the Maximum Likelihood (ML) sense, with the acoustic features extracted as MFCCs.

M. Guo et al. (2019) proposed a modeling method based on fine-grained speech attributes (FSA). An HMM/TDNN framework is used to design attribute detectors in the proposed approach. A large-scale Chinese corpus was then used to train time-delay neural network (TDNN) based speech attribute models, which were tested on a Russian learner data set. Expanded frames of input speech (MFCC) and i-Vector features were fed into each front-end classifier, which generated current-frame likelihoods for each possible attribute within its category. A group of frame attribute posteriors was used to evaluate the cross-language ability of the FSA modeling methods on Chinese and English test sets and was fed into the back-end module for mispronunciation detection, which generated phoneme-level posterior probabilities for sub-segmental mispronunciation detection. Moreover, they performed pronunciation error detection on the Russian L2 learner corpus. The performance of three modeling methods was compared, namely Monophone HMM+TDNN, Triphone HMM+TDNN, and Monophone HMM+Context-independent DNN. Experimental results have shown that this approach extracts the frame-level accuracy rate of speech attributes and achieves better detection results than segment-based approaches. Besides, they also tested the trained FSA-based method on a native English corpus, where it achieved about a 50% accuracy rate.

A further multiple-layer ANN was proposed by Maqsood et al. (2019). A multiple-layer ANN with various hidden layers was used in their work, and a backpropagation algorithm was used to train the classifier.
They started with pre-processing for noise removal and segmentation of Arabic consonants. Then, in the feature extraction, a large set of low-level descriptors were calculated, comprising the first 14 coefficients MFCCs and its first and second delta, RMS Energy, Pitch, Entropy, Spectral features, Cepstrum features, low energy, and zero-cross. A separate ANN classifier was trained to detect mispronunciation for each group. An artificial neural network classifier was used to create nodes for each target label sepa-

53

 Mispronunciation Detection Using Neural Networks for Second Language Learners
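To make the shape of this family of systems concrete, the following is a minimal, hypothetical sketch (not the authors' code): per-segment acoustic features feed a small feed-forward network trained with backpropagation, in the spirit of Maqsood et al. (2019). The feature matrix, labels, and layer sizes are illustrative assumptions.

```python
# Illustrative sketch of a backpropagation-trained ANN mispronunciation
# classifier over per-segment acoustic features (all data here is synthetic).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 60))    # stand-in for MFCC/energy/pitch descriptors
y = rng.integers(0, 2, size=1000)  # 1 = mispronounced, 0 = correct

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Multiple hidden layers, optimized with a backpropagation-based solver.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```

In a per-phoneme-group setup such as the one described above, one such classifier would be trained for each consonant group rather than a single global model.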

Hu et al. (2015) extended the Goodness of Pronunciation (GOP) algorithm from the conventional GMM-HMM to the DNN-HMM framework and further optimized the GOP measure toward L2 learners' accented speech. In DNN model training, multi-layer neural networks were trained as nonlinear basis functions to represent speech, while the top layer of the network was trained discriminatively to output the posterior probabilities of sub-phones ("senones"). Two types of databases were used in these experiments: a native database to train the native acoustic model, and a non-native database to evaluate the performance of the different GOP approaches. To assess the effectiveness of their approach to mispronunciation diagnosis in other languages, they also tested it on a continuously read L2 Mandarin learners' corpus. The results showed that the EER is reduced by the proposed method.
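Because GOP-style scoring recurs throughout this literature, a minimal sketch may be useful. Assuming a DNN acoustic model that outputs frame-level phone posteriors and a forced alignment that delimits each phone segment, a common DNN-based variant of the GOP score is the average log posterior of the canonical phone over its segment; the threshold below is a made-up placeholder, since real systems tune it on development data.

```python
# Minimal sketch of DNN-based Goodness of Pronunciation (GOP) scoring.
import numpy as np

def gop_score(frame_posteriors: np.ndarray, phone_idx: int) -> float:
    """frame_posteriors: (T, num_phones) softmax outputs over one aligned segment."""
    eps = 1e-10  # avoid log(0)
    return float(np.mean(np.log(frame_posteriors[:, phone_idx] + eps)))

def is_mispronounced(frame_posteriors, phone_idx, threshold=-2.0):
    # Flag the phone when its GOP score falls below the tuned threshold.
    return gop_score(frame_posteriors, phone_idx) < threshold
```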
A network called the CNN-RNN-CTC model was proposed by Leung et al. (2019). This architecture consists of five parts. The first part is the input layer, which accepts the framewise acoustic features and is followed by a batch normalization layer and a zero-padding layer. The second part is the convolutional block, which contains four CNN layers and two max-pooling layers followed by a batch normalization layer; this block captures high-level acoustic features from the input. The third part is a bi-directional RNN, which captures the temporal acoustic features; a Gated Recurrent Unit (GRU) was used instead of Long Short-Term Memory (LSTM) since it is simpler than LSTM and can speed up the training process. The fourth part is a stack of MLP layers (time-distributed dense layers) ending with a SoftMax layer for the classification output. The last part is the CTC output layer used to generate the predicted phoneme sequence. They computed the accuracy and the F-score and compared them with other approaches to evaluate the performance. The experimental results in Leung et al. (2019) showed that the proposed method significantly outperforms previous approaches such as APM, AGM, and APGM.

Dong and Xie (2019) proposed a feature adaptation method using the Correlational Neural Network (CorrNet), a multi-view model that aims to learn a common representation from two data views. In the experiment, the input feature was Fbank with CMVN applied, using ten frames of context, i.e., an 11-frame input. The common and output layers had 50 nodes each, while the input layer and the single hidden layer had 500 and 300 nodes, respectively. A BNF extractor was also trained, with the same input feature as CorrNet; it has six layers of 625 nodes each, the last hidden layer serving as a bottleneck layer with 27 nodes. The Kaldi toolkit was used to train the Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and TDNN; the TDNN has six hidden layers of 850 nodes each, and the GMM-HMM model generated the alignments. With Fbank, CorrNet, and BNF as input features, the experimental results showed that the CorrNet method could reduce the influence of the mismatch.

Convolutional (CNN) and recurrent (RNN) neural networks, as well as their combination (CRNN), were proposed by Diment et al. (2019) for detecting typical pronunciation errors. The proposed architectures included either convolutional layers, recurrent layers (more specifically, gated recurrent units, GRU), or their combination (a convolutional recurrent neural network, CRNN), with possible additional fully connected (FC) layers. The CRNN architecture, in which a set of convolutional layers is followed by the recurrent ones, aims to combine the benefits of both. They used the bidirectional variant of the architecture (BiGRU), which increases the amount of information available to the network at each step. The proposed architecture consists of a variable number of CNN, BiGRU, and FC layers and a one-unit output layer. The performance of the proposed system, evaluated on a subset of the dataset, shows that it can detect pronunciation errors with high accuracy.
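A minimal PyTorch sketch of the CRNN pattern just described (convolutional layers followed by a bidirectional GRU and a fully connected output unit) is given below. The layer counts and sizes are assumptions for illustration, not the configurations reported by Leung et al. (2019) or Diment et al. (2019).

```python
# Illustrative CRNN for clip-level pronunciation error detection.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_mels=40):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),  # pool over frequency only, keep time
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.rnn = nn.GRU(32 * (n_mels // 4), 64,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, 1)  # one-unit output layer

    def forward(self, x):                     # x: (batch, 1, n_mels, time)
        h = self.conv(x)                      # (batch, 32, n_mels // 4, time)
        h = h.permute(0, 3, 1, 2).flatten(2)  # (batch, time, features)
        h, _ = self.rnn(h)                    # BiGRU over time
        return torch.sigmoid(self.fc(h[:, -1]))  # P(error) per clip

model = CRNN()
probs = model(torch.randn(8, 1, 40, 200))  # 8 clips, 200 frames each
```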

A new end-to-end ASR system based on an improved hybrid CTC/attention architecture to detect pronunciation errors was proposed by Zhang et al. (2020). This method has five main steps. The first step is data preparation. The second is acoustic feature extraction, in which Mel-scale filter bank coefficients and fundamental frequency features are extracted from the speech waveforms. The third is encoder and decoder network training using the hybrid CTC/attention end-to-end architecture, in which the advantages of CTC and the attention mechanism are fully utilized in encoding and decoding. Bidirectional long short-term memory projection (BLSTMP) was selected to reduce the error rate and accelerate the training. The encoder network was trained by the CTC criterion and the attention mechanism, and the CTC probability was considered to find more consistent inputs: it enforces monotonic alignment in the decoding process and does not allow large jumps or cycling over the same frame. At the same time, CTC and attention-based probability scores were calculated jointly to obtain robust decoding results. The fourth step is speech recognition, in which recognition results are obtained using the end-to-end network models from step three. The final step is sequence comparison. To evaluate the performance of this method, they calculated the accuracy and F-score and compared the system with the Gaussian mixture model-hidden Markov model (GMM-HMM) and the deep neural network-hidden Markov model (DNN-HMM).
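The training objective behind such hybrid systems can be sketched compactly as a weighted combination of the CTC loss and the attention decoder's cross-entropy loss. The sketch below is a generic illustration; the interpolation weight lam is an assumed hyperparameter, not a value taken from Zhang et al. (2020).

```python
# Generic hybrid CTC/attention multi-task objective.
import torch
import torch.nn as nn

ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
ce_loss = nn.CrossEntropyLoss(ignore_index=-1)  # -1 pads the decoder targets

def hybrid_loss(ctc_log_probs, input_lens, targets, target_lens,
                att_logits, att_targets, lam=0.3):
    # ctc_log_probs: (T, batch, vocab) log-softmax encoder outputs
    # att_logits:    (batch, L, vocab) decoder outputs for teacher forcing
    l_ctc = ctc_loss(ctc_log_probs, targets, input_lens, target_lens)
    l_att = ce_loss(att_logits.transpose(1, 2), att_targets)
    return lam * l_ctc + (1.0 - lam) * l_att
```

The same interpolation idea is typically reused at decoding time, where the CTC score prunes hypotheses that violate monotonic alignment.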
Wang et al. (2021) presented a novel E2E neural method for mispronunciation detection and diagnosis consisting of two components: a dictation model in tandem with a pronunciation model. The former dictates the phone sequence of an L2 learner's utterance, while the latter judges whether each dictated phone is a correct pronunciation or not, given its confidence score and the prompt text corresponding to the utterance. The Mask-CTC dictation model, which consists of three components, is trained with a joint CTC and mask-predict objective. In the inference stage, the dictated phone sequence of an L2 learner is initialized with the CTC's output phone sequence: a Conformer-based encoder component first converts frame-wise acoustic feature vectors into intermediate frame-wise phonetic representations, followed by the frame-wise latent alignment between these intermediate phonetic representations and the output phone sequence. The dictated phones with low confidence scores are subsequently masked. In turn, each of these masked phones is predicted by conditioning on the high-confidence phones in its both-side context using a Transformer-based conditional masked language model. The augmented phone embedding sequence is fed into the Transformer encoder part to obtain an intermediate phone embedding sequence. Finally, an additional single- or multi-layer feedforward neural network (FFN) with SoftMax normalization, stacked on top of the PMG component and taking the modulated intermediate phone embedding sequence as input, is employed to make the ultimate MD&D decision.

Lo et al. (2021) proposed a mechanism to overcome the enormous number of parameters that often need to be estimated for an E2E neural method and the possible lack of sufficient labeled training data. They performed two simple augmentation mechanisms: input augmentation and label augmentation. Input augmentation (IA) extends a connectionist temporal classification, attention-based mispronunciation detection model (CTC-ATT) by concatenating the original input spectral feature vector of each speech frame with a phonetic posteriorgram (PPG) vector produced by a pre-trained deep neural network-hidden Markov model (DNN-HMM) acoustic model. In their work, the hybrid DNN-HMM acoustic model was trained on both the native TIMIT and the non-native L2-ARCTIC datasets and was used to extract the PPG vector corresponding to each speech frame of an L2 learner's utterance. For label augmentation, the common practice is to train the baseline E2E MD model with label smoothing, instantiated here with the unigram smoothing mechanism. Unigram smoothing is employed mainly because it can be exploited to capture the phonological patterns in the training data transcripts. They conducted sets of experiments on the L2-ARCTIC benchmark dataset, which show the effectiveness and practical feasibility of the modeling paradigm in comparison with some top-of-the-line models such as Goodness of Pronunciation (GOP) and CNN-RNN-CTC (Lo et al., 2021).

Li et al. (2018) proposed three techniques to improve mispronunciation detection. The first technique extended the model from a deep neural network (DNN) to a bidirectional long short-term memory (BLSTM) network to model tone-level co-articulation influenced by a broader time-related context; the BLSTM model has two hidden layers with 320 memory cells per layer. Second, they relaxed the labels to characterize situations in which a single tone class label is insufficient; thus, soft targets were proposed for acoustic model training instead of conventional hard targets. Third, they averaged the tone scores produced by BLSTM models trained with hard and soft targets to exploit the complementarity of modeling at the two tone-target levels. Compared with their previous system based on DNN-trained ERNs, the BLSTM-trained system with soft targets reduces the equal error rate (EER), and the system combination decreases the EER further.

To improve the detection performance on phonetic mispronunciations produced by second language learners, Li, Wei and Chen et al. (2017) suggested, and showed in their paper, that speech attribute features can complement the conventional phone features when merged into pronunciation representations as inputs to LSTM-based classifiers. The model is prepared as follows. First, frame-level speech attribute and phone posteriors are concatenated to form a new feature vector. Then a phone-dependent LSTM mispronunciation detector is trained on this vector according to the phone-level labels (correct/incorrect). The LSTM has two hidden layers, each with 128 memory cells. Before training the LSTM parameters, two data pre-processing steps were executed to deal with the variable length of the input sequences and the data imbalance problem. The false acceptance rate (FAR) and false rejection rate (FRR) were used to measure system performance. By modeling dynamic changes in frame-level pronunciation scores, the proposed framework significantly reduces both the FAR and the FRR.
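A hedged sketch of such a phone-dependent LSTM detector is shown below: it consumes frame-level speech-attribute posteriors concatenated with phone posteriors and emits a binary correct/mispronounced decision per phone segment. The two 128-cell hidden layers follow the description above, while the feature dimensions are assumptions for illustration.

```python
# Illustrative phone-dependent LSTM mispronunciation detector.
import torch
import torch.nn as nn

class PhoneLSTMDetector(nn.Module):
    def __init__(self, attr_dim=21, phone_dim=40, hidden=128):
        super().__init__()
        # Two hidden LSTM layers with 128 memory cells each.
        self.lstm = nn.LSTM(attr_dim + phone_dim, hidden,
                            num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, attr_post, phone_post):
        # attr_post: (batch, T, attr_dim); phone_post: (batch, T, phone_dim)
        x = torch.cat([attr_post, phone_post], dim=-1)
        h, _ = self.lstm(x)
        return torch.sigmoid(self.out(h[:, -1]))  # P(mispronounced)

detector = PhoneLSTMDetector()
p = detector(torch.rand(4, 50, 21), torch.rand(4, 50, 40))  # 4 segments
```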
Akhtar et al. (2020) developed two models to detect mispronunciation of Arabic words. The first is a CNN features-based model, in which deep features were extracted using the pre-trained AlexNet model to systematize the feature extraction procedure. To extract features with AlexNet, they converted the audio signals into spectrograms and then passed them to the CNN. The features were extracted from the fully connected layers of AlexNet, i.e., layer 6, layer 7, and layer 8. A correlation-based feature selection technique was then applied to these features, because these layers produce high-dimensional features that need a lot of time to process. Three different classifiers, KNN, SVM, and RF, were applied to estimate the performance of those features. The second is a transfer learning-based model, in which the spectrogram dataset is fed to AlexNet, the features are extracted automatically, and classification is performed on these features; the features are transferred to a second task to train on the target dataset. In transfer learning, the feature extraction and classification tasks are performed automatically: the data are input to the model in the form of a spectrogram, and a mispronunciation score is obtained as output. Within the transfer learning-based model, fine-tuning is performed through different parameters such as the learning rate and bias. The performance of the proposed framework was evaluated in terms of accuracy.

One more model, called CNN-GRU-CTC (a convolutional neural network (CNN) with a Gated Recurrent Unit (GRU) and Connectionist Temporal Classification (CTC)), was established by Gan et al. (2021). This model consists of five parts. The first part is the input layer, which accepts the framewise acoustic features; a zero-padding layer was added to ensure that all utterances in a batch have the same length. The second part is the convolutional block, which contains six CNN layers and three max-pooling layers, followed by a batch normalization layer; this block captures high-level acoustic features from the input. The third part is a bi-directional RNN that captures the temporal acoustic features; since LSTM is more complicated than GRU, using GRU can speed up the training process. The fourth part is a stack of MLP layers ending with a SoftMax layer for the classification output. The last part is the CTC output layer used to generate the predicted phoneme sequence.

Li et al. (2019) used a two-stage tone mispronunciation detection framework. In the first stage, the entire spoken utterance spanning multiple tonal phonemes is employed to train the non-native tonal acoustic model, similarly to the standard DNN-HMM ASR training framework; individual tone segments are then obtained by forced alignment using the tonal acoustic model. In the second stage, the trained tonal acoustic model is used to extract frame-level features within each tone segment, and a GOP-, DNN-, or BLSTM-based verifier is designed to detect whether the current tone is mispronounced. In the testing phase, word or phone sequences are assumed to be known; therefore, the acoustic tonal model and the word-/phone-level transcriptions were used to generate tone segments of non-native learners' utterances by forced alignment. The feature set employed in this study contains 43-dimensional log Mel filter bank (LMFB) coefficients, F0, the probability of voicing (POV), and the corresponding derived velocity and acceleration features. In this work, they also extend the tone model from a deep neural network (DNN) to a bidirectional long short-term memory (BLSTM) network to model the high variability of non-native tone productions more accurately, and they characterize ambiguous pronunciations, where L2 learners' tone realizations fall between two canonical tone categories, by relaxing hard target labels to soft targets with probabilistic transcriptions. Additionally, to enhance mispronunciation detection, the segmental tone features fed into the verifiers are extracted by a BLSTM to exploit sequential context information (Li et al., 2019).

The automated speech attribute transcription (ASAT) paradigm was adopted by Li et al. (2016) to build their mispronunciation detection framework, focusing on mispronunciations that occur at the segmental and sub-segmental levels to enhance performance. In their work, speech attribute scores were used to measure pronunciation quality at a sub-segmental level, and neural network classifiers integrated these speech attribute scores to generate segmental pronunciation scores. The feature extraction module consists of a bank of speech attribute classifiers; a context-dependent DNN-based attribute classifier is built separately for each category. Expanded input speech frames are fed into each detector, generating current-frame posteriors for each possible attribute within that category, and the resulting frame attribute posteriors are provided to the following module. The goodness of pronunciation (GOP) is then used, with a threshold, to verify whether a unit is correctly pronounced. The proposed framework reduces the equal error rate compared with the conventional phone-based GOP (Goodness of Pronunciation) system.
Li et al. (2017) investigated the employment of multi-distribution deep neural networks (MD-DNNs) for automatic intonation classification in second-language English speech. They used a pitch accent detector, which is an MD-DNN. The syllable-based prosodic features include syllable nucleus duration, maximum loudness, and a pair of dynamic pitches (fm1 and fm2); these are used as input features. At the bottom of the MD-DNN there are 15 binary units for the lexical and syntactic features and 20 linear units with Gaussian noise for the syllable-based prosodic features. Above the bottom layer there are three hidden layers of 128 units each, and two units in the top output layer generate the posterior probabilities of being accented or unaccented. To transcribe speech data for intonation classification, they proposed the RULF labels, which transcribe an intonation as rising, upper, lower, or falling; these four forms of labels are further merged into two groups, rising and falling. The MD-DNN for intonation classification includes NFAET linear units with Gaussian noise to characterize the pitch contour over the FAET.

Wang et al. (2018) proposed an approach to evaluating L2 learners' goodness of pronunciation based on acoustic phone embeddings and Siamese networks to address mispronunciation verification and pronunciation evaluation. A GOP-based evaluation system was implemented by training context-dependent (CD)-DNN-HMM acoustic models with a SoftMax output layer, employing the Mel Frequency Cepstrum Coefficients (MFCC) and pitch with delta and delta-delta coefficients as the 48-dimensional acoustic features. The DNN contained six hidden layers, each consisting of 1024 sigmoid units; the input of the DNN was an augmented 11-frame vector composed of the five preceding frames, the current frame, and the five succeeding frames. The GOP-based system was developed as a baseline, taking in the augmented frame-level feature vectors to generate frame log posteriors. The Siamese network structures described in their work are as follows. DNN SIA: 1024-unit fully connected ReLU; 512-unit fully connected ReLU; 256-unit fully connected linear; terminating in the contrastive loss. CNN SIA: 1-D convolution with 96 filters over nine frames; ReLU; max-pooling over three units; 1-D convolution with 96 filters over eight frames; ReLU; max-pooling over three units; 1024-unit fully connected ReLU; 256-unit fully connected linear; terminating in a Euclidean contrastive loss or a cosine similarity loss. CNN TRI: the same structure as CNN SIA, with a hinge cosine loss applied as the loss function. They calculated the accuracy and compared it with the GOP-based and DNN-HMM methods to evaluate the performance (Wang et al., 2018).
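The Siamese training idea can be sketched as follows: a shared embedding network maps two phone segments to vectors, and a contrastive loss pulls matched pronunciations together while pushing mismatched ones apart. The hidden-layer sizes below mirror the DNN SIA structure quoted above, but the margin value and the flattening of segment features into a single input vector are illustrative assumptions.

```python
# Illustrative Siamese phone-embedding model with a contrastive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Embedder(nn.Module):
    def __init__(self, in_dim=48, emb_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, emb_dim),  # final linear embedding layer
        )

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, same, margin=1.0):
    # same: 1.0 for matched-pronunciation pairs, 0.0 for mismatched pairs
    d = F.pairwise_distance(z1, z2)
    return torch.mean(same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2))

emb = Embedder()
a, b = torch.randn(16, 48), torch.randn(16, 48)
loss = contrastive_loss(emb(a), emb(b), torch.randint(0, 2, (16,)).float())
```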
Yan et al. (2021) explored the use of a discriminative objective function for training E2E MDD models, aiming to directly maximize the expected F1 score. First, they trained a hybrid CTC-Attention model, whose training objective is to maximize a logarithmic linear combination of the posterior probabilities predicted by CTC and the attention-based model. The baseline E2E MDD method was implemented with the CTC-Attention neural architecture, pre-trained on the training utterances of the TIMIT corpus and then fine-tuned on the training utterances of the L2-ARCTIC dataset. The model was then trained with the proposed maximum F1-score objective on the training utterances of the L2-ARCTIC dataset to enhance the discriminative power of the MDD method. The encoder modules comprised a VGG-based deep CNN component and a bidirectional LSTM component with 1,024 hidden units; the input to the network is 80-dimensional Mel-filter-bank feature vectors extracted every 10 ms with a Hanning window size of 25 ms. The decoder modules of the experimental models consist of a single-layer LSTM component with 300 hidden units. In the experiments, the results of the proposed model were compared with the GOP-based method, and the E2E models were found to yield better F1-score results.

The use of multi-distribution deep neural networks (DNNs) for mispronunciation detection and diagnosis (MDD) was investigated by Li et al. (2017) to overcome the difficulties encountered in an existing approach based on extended recognition networks (ERNs). They combined two models, the acoustic-phonemic model (APM) and the acoustic-graphemic model (AGM), to propose the acoustic-graphemic-phonemic model (AGPM). In the bottom layer of the AGPM there are 273 linear units with Gaussian noise for the MFCC features (21 frames, with 13 MFCCs representing each frame), as well as 77 binary units standing for seven graphemes (each grapheme described by 5 bits) and seven canonical phonemes (each phoneme encoded with 6 bits). Above the bottom layer there are four hidden layers and one output layer, the same as in the APM and AGM. The AGPM uses acoustic features and the corresponding graphemes and canonical phonemes as its input features, attempting to implicitly model error patterns through phonological transduction or letter-to-sound conversion integrated into the acoustic modeling. The proposed AGPM method gains a significant performance improvement overall.

Nazir et al. (2019) developed three models for mispronunciation detection. The first is the HandCrafted_Features model. In this model, they first extracted important features to distinguish the phonemes, using spectral features (MFCC, chroma, Mel spectrogram, spectral contrast, tonnetz, pitch, root mean square energy, and zero-crossing rate) and statistical features (mean, standard deviation, and slope). Second, the data were pre-processed to handle the sparsity problem; third, the feature space was reduced to obtain the best features for classification; and finally, the features were passed to classification algorithms such as KNN, SVM, and NN for mispronunciation detection. The second model is the CNN_Features model, which converted the audio data set into a spectrogram dataset and passed it to a convolutional neural network for feature extraction. The features were extracted from the convolutional layers (Conv4 and Conv5) and the fully connected layers (FCL6, FCL7, FCL8) of the pre-trained model, and KNN, SVM, and NN were then applied to the extracted features for mispronunciation detection. The third model is the transfer learning model, in which the spectrogram data were passed to the pre-trained convolutional neural network AlexNet, and feature extraction and classification were performed automatically by the network to detect mispronunciation. Accuracy was used as the evaluation parameter to assess the performance of the proposed methods.

Arora et al. (2017) proposed an acoustic model consisting of a multi-task deep neural network, which uses a shared representation to estimate the phonological features and HMM state probabilities. The system uses 18 phonological features that characterize all the phonemes of English, and the proposed method is based on the DNN-HMM framework.

Table 1 shows a summary of the literature review. The research papers are organized into four groups according to the application language (Arabic, English, English & Mandarin, and Mandarin) and are ordered by publishing year within each group.

Table 1. Summary of the literature review

| Reference | Title | Proposed Method | Corpus (Dataset) | Application Language | Published Year | Library |
| --- | --- | --- | --- | --- | --- | --- |
| (Maqsood et al., 2019) | An Efficient Mispronunciation Detection System Using Discriminative Acoustic Phonetic Features for Arabic Consonants | Acoustic Phonetic Feature (APF) based technique | Manually constructed Arabic phoneme dataset | Arabic | 2019 | The International Arab Journal of Information Technology |
| (Nazir et al., 2019) | Mispronunciation Detection Using Deep Convolutional Neural Network Features and Transfer Learning-Based Model for Arabic Phonemes | Convolutional neural network features (CNN_Features)-based technique; transfer learning-based technique | Manually constructed Arabic dataset | Arabic | 2019 | IEEE Access |
| (Akhtar et al., 2020) | Improving Mispronunciation Detection of Arabic Words for Non-Native Learners Using Deep Convolutional Neural Network Features | CNN features-based model; transfer learning-based model | Manually constructed 27 Arabic words | Arabic | 2020 | MDPI journals |
| (Almekhlafi et al., 2022) | A Classification Benchmark for Arabic Alphabet Phonemes with Diacritics in Deep Neural Networks | Visual Geometry Group (VGG)-based model | Arabic alphabet phonetics dataset (AAPD) | Arabic | 2021 | Science Direct |
| (Hu, Qian, et al., 2014) | A New Neural Network-Based Logistic Regression Classifier for Improving Mispronunciation Detection of L2 Language Learners | Neural network-based logistic regression classifier | Telephone speech corpus "LDC95S27" | English | 2014 | IEEE |
| (Li et al., 2017) | Intonation Classification for L2 English Speech Using Multi-Distribution Deep Neural Networks | Multi-distribution deep neural networks (MD-DNNs) | Supra-CHLOE | English | 2017 | Science Direct |
| (Li & Qian et al., 2017) | Mispronunciation Detection and Diagnosis in L2 English Speech Using Multi-Distribution Deep Neural Networks | Acoustic-graphemic-phonemic model (AGPM) using a multi-distribution DNN | TIMIT, CU-CHLOE | English | 2017 | IEEE |
| (Arora et al., 2017) | Phonological Feature-Based Mispronunciation Detection and Diagnosis Using Multi-Task DNNs and Active Learning | Multi-task deep neural network and HMM state probabilities | Interactive Spoken Language Education (ISLE) dataset | English | 2017 | ora.ox.ac.uk |
| (Leung et al., 2019) | CNN-RNN-CTC Based End-to-End Mispronunciation Detection and Diagnosis | End-to-end speech recognition (CNN-RNN-CTC model) | TIMIT and CU-CHLOE | English | 2019 | IEEE |
| (Diment et al., 2019) | Detection of Typical Pronunciation Errors in Non-Native English Speech Using Convolutional Recurrent Neural Networks | Convolutional (CNN) and recurrent (RNN) neural networks and their combination (CRNN) | Collected dataset of recordings of 120 speakers pronouncing 80 different words | English | 2019 | IEEE |
| (Wang et al., 2021) | Exploring Non-Autoregressive End-to-End Neural Modeling for English Mispronunciation Detection and Diagnosis | Novel use of a non-autoregressive (NAR) E2E modeling framework | The L2-ARCTIC benchmark | English | 2021 | arXiv.org |
| (Lo et al., 2021) | Improving End-to-End Modeling for Mispronunciation Detection with Effective Augmentation Mechanisms | Hybrid DNN-HMM acoustic model | TIMIT dataset; L2-ARCTIC dataset | English | 2021 | arXiv.org |
| (Yan et al., 2021) | Maximum F1-Score Training for End-to-End Mispronunciation Detection and Diagnosis of L2 English Speech | Hybrid CTC-Attention model | The L2-ARCTIC dataset | English | 2021 | arXiv.org |
| (Hu et al., 2015) | An Improved DNN-Based Approach to Mispronunciation Detection and Diagnosis of L2 Learners' Speech | Extension of the Goodness of Pronunciation (GOP) algorithm from the conventional GMM-HMM to DNN-HMM | Native database: NYNEX; non-native database: "LDC95S27" | English and Mandarin | 2015 | International Speech Communication Association |
| (Hu, Qian et al., 2015) | Improved Mispronunciation Detection with Deep Neural Network Trained Acoustic Models and Transfer Learning-Based Logistic Regression Classifiers | DNN-trained acoustic model and transfer learning-based Logistic Regression (LR) classifier | For English: "LDC95S27"; for Mandarin: iCALL | English and Mandarin | 2015 | Science Direct |
| (Guo et al., 2019) | A Study on Mispronunciation Detection Based on Fine-Grained Speech Attribute | Fine-grained speech attribute (FSA) modeling method | Training corpus: Hi-Tech Project 863 and Aishell 178; test sets: 7000 utterances from the Aishell corpus and 6000 utterances from Timit | English and Mandarin | 2019 | IEEE |
| (Hu et al., 2014) | DNN-Based Acoustic Modeling of Tonal Language and Its Application to Mandarin Pronunciation Training | Deep Neural Network (DNN) based approach to acoustic modeling of tonal language | BJ2003 | Mandarin | 2014 | IEEE |
| (Li et al., 2016) | Improving Non-Native Mispronunciation Detection and Enriching Diagnostic Feedback with DNN-Based Speech Attribute Modeling | Attribute-based classifier system | Hi-Tech Project 863 | Mandarin | 2016 | IEEE Access |
| (Li, Wei & Chen et al., 2017) | Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models | Deep neural network models (DNNs) with long short-term memory (LSTM) | Native speech corpus: Hi-Tech Project 863; subset of the non-native speech corpus iCALL | Mandarin | 2017 | research gate |
| (Li et al., 2018) | Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models | Three techniques to improve mispronunciation detection of Mandarin tones of second language (L2) learners using a tone-based extended recognition network (ERN) | Native speech corpus: National Hi-Tech Project 863; non-native speech corpus: iCALL | Mandarin | 2018 | IEEE Access |
| (Wang et al., 2018) | L2 Mispronunciation Verification Based on Acoustic Phone Embedding and Siamese Networks | Approach to evaluating L2 learners' goodness of pronunciation based on phone embedding and Siamese networks | Hi-Tech Project 863 | Mandarin | 2018 | IEEE Access |
| (Dong et al., 2019) | Correlational Neural Network-Based Feature Adaptation in L2 Mispronunciation Detection | Feature adaptation method using the Correlational Neural Network (CorrNet) | National Hi-Tech Project and BLCU | Mandarin | 2019 | IEEE |
| (Li et al., 2019) | Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Tone Models | BLSTM-GOP framework trained with soft targets | 863 LVCSR corpus, THCHS-30 corpus, and iCALL | Mandarin | 2019 | IEEE |
| (Wana et al., 2020) | A Multi-View Approach for Mandarin Non-Native Mispronunciation Verification | Multi-view approach | Hi-Tech Project 863 for (LVCSR) system development; development and test data: BLCU | Mandarin | 2020 | IEEE |
| (Zhang et al., 2020) | End-to-End Automatic Pronunciation Error Detection Based on Improved Hybrid CTC/Attention Architecture | End-to-end ASR system based on an improved hybrid CTC/attention architecture | CCTV and PSC1176 | Mandarin | 2020 | MDPI journals |
| (Gan et al., 2021) | Improving Mispronunciation Detection of Mandarin for Tibetan Students Based on the End-to-End Speech Recognition Model | CNN-GRU-CTC acoustic model with Gated Recurrent Unit (GRU), convolutional neural network (CNN), and Connectionist Temporal Classification (CTC) technologies | The standard Mandarin pronunciation corpus | Mandarin | 2021 | IEEE Access |

RESULTS

Several evaluation metrics are used to evaluate the performance of any proposed method for mispronunciation detection. Figure 3 shows the evaluation metrics used in the discussed research papers and the number of papers that used each of these metrics.

Figure 3. Number of papers that used the evaluation metrics
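All of the metrics that recur in these papers can be derived from the four confusion counts of a detector. The helper below is a generic illustration, not taken from any one paper; note that which class counts as "positive" (mispronounced vs. correct) varies across studies, so the FAR and FRR definitions should be read against each paper's convention.

```python
# Generic computation of common mispronunciation detection metrics.
def detection_metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # also reported as detection rate
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    far = fp / (fp + tn)                 # false acceptance (alarm) rate
    frr = fn / (fn + tp)                 # false rejection rate
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "far": far, "frr": frr}

# The EER reported by several studies is the operating point at which
# FAR equals FRR as the decision threshold is swept.
print(detection_metrics(tp=80, fp=20, tn=70, fn=30))
```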

Exactly 26 papers proposing neural network methods for mispronunciation detection were reviewed. The performance of each was tested using different evaluation metrics, and the results were compared with those of other methods. Table 2 shows the detailed performance results and comparisons.

Table 2. Performance results and comparisons

| Reference | Evaluation Metrics | Proposed Method | Proposed Method Results (%) | Compared With | Compared Method Result (%) |
| --- | --- | --- | --- | --- | --- |
| (Maqsood et al., 2019) | Accuracy; Recall | Acoustic Phonetic Feature (APF) based technique | 82.27; 82.1 | Computer-assisted pronunciation training (CAPT) | 52.2; - |
| (Akhtar et al., 2020) | Accuracy | CNN features-based model; transfer learning-based model | 93.2 | Confidence scoring; global average log-likelihood; ANN; MFCC with GMM-UBM; MFCC + ANN | 62; 76.66; 82.7; 75; 92.42 |
| (Nazir et al., 2019) | Accuracy | CNN_Features-based technique; transfer learning-based technique | 91; 92 | Baseline handcrafted features-based method; state-of-the-art techniques | 82.9; 82 |
| (Almekhlafi et al., 2022) | Precision; Recall; F1-score; Accuracy | Visual Geometry Group (VGG)-based model | 95.71; 95.02; 95.36; 95.26 | AlexNet; ResNet; LSTM | 87.5, 84.04, 85.7, 85.8; 96.98, 96.8, 96.99, 96.9; 85.03, 83.7, 84.3, 84.3 |
| (Hu, Qian, et al., 2014) | Precision; Recall; Accuracy | Neural network-based logistic regression classifier | 78.67; 59.84; 92 | Conventional GOP-based approach; phoneme-specific SVM-based approaches | 41.57, 48.13, 88.27; 78.38, 49.23, 90.88 |
| (Leung et al., 2019) | F-measure; Accuracy | End-to-end speech recognition (CNN-RNN-CTC model) | 74.62; 89.38 | Extended Recognition Network (ERN); state-level acoustic model (S-AM); APM; AGM; APGM | 51.55, 84.07; 56.41, 78.66; 68.10, 89.75; 71.04, 90.18; 72.61, 90.94 |
| (Diment et al., 2019) | F1-score; Accuracy | Convolutional (CNN) and recurrent (RNN) neural networks and their combination (CRNN) for detecting typical pronunciation errors | 90.03; 91.57 | - | - |
| (Wang et al., 2021) | F1-score; Precision; Recall | Non-autoregressive (NAR) E2E modeling framework | 46.77; 47.87; 45.73 | GOP; CNN-RNN-CTC | 48.52, 50.15, 46.99; 45.94, 67.29, 34.88 |
| (Lo et al., 2021) | Precision; Recall; F-score | Hybrid DNN-HMM acoustic model | 43.80; 61.23; 51.07 | GOP; CNN-RNN-CTC | 46.99, 50.15, 48.52; 34.88, 67.29, 45.94 |
| (Li et al., 2017) | Accuracy | Multi-distribution deep neural networks (MD-DNNs) | 93.0 | - | - |
| (Yan et al., 2021) | Recall; Precision; F-score | Hybrid CTC-Attention model | 72.92; 34.33; 46.68 | GOP; CNN-RNN-CTC | 52.88, 35.42, 42.42; 76.91, 32.21, 45.41 |
| (Li & Qian et al., 2017) | Accuracy; Precision; Recall; F-measure | Acoustic-graphemic-phonemic model (AGPM) using a multi-distribution DNN | 90.94; 76.05; 69.47; 72.61 | Extended recognition networks (ERNs) | 84.07; 47.47; 56.41; 51.55 |
| (Arora et al., 2017) | EER | Multi-task deep neural network and HMM state probabilities | 28.3 | Goodness of pronunciation (GOP) classifier-based method | 39.0; 31.8 |
| (Guo et al., 2019) | F-score; DA | Fine-grained speech attribute (FSA) modeling method | 71.5; 86.5 | Segment-based approaches | 63.5; 84.3 |
| (Hu et al., 2015) | EER | GOP algorithm extended from the conventional GMM-HMM to DNN-HMM | 27.0 | Conventional GMM-HMM system | 32.9 |
| (Hu, Qian et al., 2015) | Precision; Recall; Accuracy | DNN-trained acoustic model and transfer learning-based LR classifier | 69.23; 69.2; 86.7 | GOP; Support Vector Machine (SVM) | 44.13, 44.08, 73.5; 66.84, 69.2, 84.5 |
| (Hu et al., 2014) | EER | DNN-based approach to acoustic modeling of tonal language | 25.5 | Baseline DNN system of 39 MFCC features | 27.4 |
| (Wana et al., 2020) | FRR; FAR; DA | Multi-view approach | 5.2; 24.07; 91.43 | GOP-based method; single-view method | 21.86, 31.36, 80.93; 5.3, 30.29, 90.69 |
| (Dong et al., 2019) | Accuracy | Feature adaptation method using the Correlational Neural Network (CorrNet) | 87.31 | Traditional Goodness of Pronunciation (GOP) | 84.12 |
| (Zhang et al., 2020) | Precision; Recall; F-score; Accuracy | End-to-end ASR system based on improved hybrid CTC/attention architecture | 57.47; 81.45; 67.39; 90.14 | GMM-HMM; DNN-HMM | 25.07, 68.11, 36.65, 70.55; 46.09, 81.14, 58.79, 85.77 |
| (Li et al., 2018) | FAR; FRR | BLSTM (SOFT+HARD) | 4.34; 4.34 | DNN-trained ERNs; BLSTM | 5.38; 4.86 |
| (Li, Wei & Chen et al., 2017) | FAR; FRR | Deep neural network models (DNNs) with long short-term memory (LSTM) | 8.65; 3.09 | - | - |
| (Gan et al., 2021) | CER; FAR; FRR; Accuracy; Recall | CNN-GRU-CTC acoustic model with GRU, CNN, and CTC technologies | 14.91; 22.56; 7.26; 88.35; 77.44 | Conventional methods: DNN-HMM-MFCC; DNN-HMM-Fbank | 56.76, 52.92, 60.59, 42.42, 47.08; 44.82, 42.66, 46.98, 47.50, 57.34 |
| (Li et al., 2019) | EER | BLSTM-GOP framework trained with soft targets | 11.03 | DNN-GOP trained with hard targets | 13.10 |
| (Li et al., 2016) | EER; DEA; DA | Attribute-based classifier system | 79.75; 91.57; 87.01 | Phone-based GOP system; phone-based classifier system | 77.80, 90.84, 86.08; 79.20, 91.32, 86.99 |
| (Wang et al., 2018) | DA; FAR; FRR | Phone embedding and Siamese networks for evaluating L2 learners' goodness of pronunciation | 87.13; 34.88; 7.31 | GOP-based method; DNN-HMM method | 74.67, 32.81, 23.86; 82.61, 43.29, 10.48 |

DISCUSSION

Millions of people around the world learn foreign languages, yet many do not receive one-to-one instruction to acquire correct pronunciation. To solve this problem, automated computer tools are being developed to help foreign language learners. Pronunciation error detection systems based on computer-aided language learning attempt to detect the errors made by language learners at the phoneme, word, or sentence level and inform the learner of those errors. One of the newest and most promising solutions is neural network models, and this field still needs more research and improvement. That is what motivated us to collect the research papers published between 2014 and 2022, in the hope of helping researchers proceed in this research area. The papers discussed here were chosen according to three criteria: first, publication within the chosen time period; second, presentation of a new model for mispronunciation detection using a neural network; and third, an application language among the most used languages in the world (Arabic, English, and Mandarin). Additionally, the datasets used in each paper were considered. As can be noticed from the literature, there is a lack of Arabic datasets; furthermore, there is no standard dataset for the Arabic language, which causes a lack of research on this language. As the results section shows, using neural network models increased the systems' accuracy compared with other mispronunciation detection systems. Improvements were also observed in the evaluation metrics used to assess the performance of the proposed mispronunciation detection methods, such as the F1-score, precision, and recall.

FUTURE RESEARCH DIRECTION

Even though the models mentioned above have achieved satisfactory mispronunciation detection results, there is still room for further improvement, mainly in some languages like Arabic and Turkish. Most of the studies were conducted on English and Mandarin, since these are the most spoken and learned languages worldwide, and numerous standard datasets for native and non-native speakers exist for these two languages. Given the lack of available Arabic datasets, creating a common native and non-native dataset for Arabic would be a good contribution and would help develop more research in the mispronunciation detection field for the Arabic language. Also, most of the studies proposed a neural network model that works for a single language; expanding the models so that one model can detect mispronunciation errors in more than one language remains a challenging research area.

CONCLUSION

This paper reviewed mispronunciation detection models using neural networks for second language learners. Through the series of experiments conducted in these studies, it has been shown that the performance of mispronunciation detection systems can be improved significantly by choosing the neural architecture and training scheme that best fit the characteristics of the corpus for each language. It has also been shown that neural methods have recently had some success as a promising alternative to the classic GOP-based method and its variants for mispronunciation detection. In addition, some of the neural network-based mispronunciation detection methods obtained high accuracy on some datasets but, at the same time, obtained low accuracy compared with other strategies on other datasets.

REFERENCES

Akhtar, S., Hussain, F., Raja, F., Ehatisham-ul-haq, M., Baloch, N., Ishmanov, F., & Zikria, Y. (2020). Improving mispronunciation detection of Arabic words for non-native learners using deep convolutional neural network features. Electronics (Basel), 9(6), 963. doi:10.3390/electronics9060963

Almekhlafi, E., AL-Makhlafi, M., Zhang, E., Wang, J., & Peng, J. (2022). A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks. Computer Speech & Language, 71, 101274. doi:10.1016/j.csl.2021.101274

Amdal, I., Johnsen, M. H., & Versvik, E. (2009). Automatic evaluation of quantity contrast in non-native Norwegian speech. Proc. Int. Workshop Speech Lang. Technol. Educ., 21–24.

Arora, V., Lahiri, A., & Reetz, H. (2017). Phonological feature-based mispronunciation detection and diagnosis using multi-task DNNs and active learning. Proc. INTERSPEECH 2017, 1432–1436.

Diment, A., Fagerlund, E., Benfield, A., & Virtanen, T. (2019). Detection of typical pronunciation errors in non-native English speech using convolutional recurrent neural networks. Proc. International Joint Conference on Neural Networks (IJCNN), 1–8. 10.1109/IJCNN.2019.8851963

Dong, W., & Xie, Y. (2019). Correlational neural network-based feature adaptation in L2 mispronunciation detection. Proc. International Conference on Asian Language Processing (IALP), 121–125. 10.1109/IALP48816.2019.9037719

Flege, J. E., & Fletcher, K. L. (1992). Talker and listener effects on the degree of perceived foreign accent. The Journal of the Acoustical Society of America, 91(1), 370–389. doi:10.1121/1.402780 PMID:1737886

Gan, Z., Zhao, X., Zhou, S., & Wang, R. (2021). Improving mispronunciation detection of Mandarin for Tibetan students based on the end-to-end speech recognition model. Proc. International Symposium on Artificial Intelligence and its Application on Media (ISAIAM), 151–154. 10.1109/ISAIAM53259.2021.00039

Guo, M., Rui, C., Wang, W., Lin, B., Zhang, J., & Xie, Y. (2019). A study on mispronunciation detection based on fine-grained speech attribute. Proc. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1197–1201. 10.1109/APSIPAASC47483.2019.9023156

Hu, W., Qian, Y., Soong, F., & Wang, Y. (2015). Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning-based logistic regression classifiers. Speech Communication, 67, 154–166. doi:10.1016/j.specom.2014.12.008

Hu, W., Qian, Y., & Soong, F. K. (2014). A DNN-based acoustic modeling of tonal language and its application to Mandarin pronunciation training. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 3206–3210. 10.1109/ICASSP.2014.6854192

Hu, W., Qian, Y., & Soong, F. K. (2014). A new neural network-based logistic regression classifier for improving mispronunciation detection of L2 language learners. Proc. 9th International Symposium on Chinese Spoken Language Processing, 245–249. 10.1109/ISCSLP.2014.6936712

Hu, W., Qian, Y., & Soong, F. K. (2015). An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech. Proc. Speech and Language Technology in Education (SLaTE 2015), 71–76.

Ito, A., Lim, Y.-L., Suzuki, M., & Makino, S. (2005). Pronunciation error detection method based on error rule clustering using a decision tree. Proc. 9th Eur. Conf. Speech Commun. Technol., 173–176.

Leung, W., Liu, X., & Meng, H. (2019). CNN-RNN-CTC based end-to-end mispronunciation detection and diagnosis. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 8132–8136. 10.1109/ICASSP.2019.8682654

Li, H., Liang, J., Wang, S., & Xu, B. (2009). An efficient mispronunciation detection method using GLDS-SVM and formant enhanced features. Proc. IEEE Int. Conf. Acoust., Speech Signal Process (ICASSP), 4845–4848.

Li, K., Qian, X., & Meng, H. (2017). Mispronunciation detection and diagnosis in L2 English speech using multidistribution deep neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(1), 193–207. 10.1109/TASLP.2016.2621675

Li, K., Wu, X., & Meng, H. (2017). Intonation classification for L2 English speech using multi-distribution deep neural networks. Computer Speech & Language, 43, 18–33. doi:10.1016/j.csl.2016.11.006

Li, W., Chen, N., Siniscalchi, M., & Lee, C.-H. (2017). Improving mispronunciation detection for non-native learners with multisource information and LSTM-based deep models. Proc. INTERSPEECH 2017, 2759–2763. 10.21437/Interspeech.2017-464

Li, W., Chen, N. F., Siniscalchi, S. M., & Lee, C.-H. (2018). Improving Mandarin tone mispronunciation detection for non-native learners with soft-target tone labels and BLSTM-based deep models. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 6249–6253. 10.1109/ICASSP.2018.8461629

Li, W., Chen, N. F., Siniscalchi, S. M., & Lee, C.-H. (2019). Improving mispronunciation detection of Mandarin tones for non-native learners with soft-target tone labels and BLSTM-based deep tone models. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(12), 2012–2024. 10.1109/TASLP.2019.2936755

Li, W., Siniscalchi, S. M., Chen, N. F., & Lee, C.-H. (2016). Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 6135–6139. 10.1109/ICASSP.2016.7472856

Lo, T., Sung, Y., & Chen, B. (2021). Improving end-to-end modeling for mispronunciation detection with effective augmentation mechanisms. ArXiv, abs/2110.08731.

Maqsood, M., Habib, H. A., & Nawaz, T. (2019). An efficient mispronunciation detection system using discriminative acoustic-phonetic features for Arabic consonants. The International Arab Journal of Information Technology, 16, 242–250.

Most spoken languages in the world. (2022). Statista. Retrieved 3 December 2021, from https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/

Nazir, F., Majeed, M. N., Ghazanfar, M. A., & Maqsood, M. (2019). Mispronunciation detection using deep convolutional neural network features and transfer learning-based model for Arabic phonemes. IEEE Access: Practical Innovations, Open Solutions, 7, 52589–52608. doi:10.1109/ACCESS.2019.2912648

Strik, H., Truong, K., De Wet, F., & Cucchiarini, C. (2009). Comparing different approaches for automatic pronunciation error detection. Speech Communication, 51(10), 845–852. doi:10.1016/j.specom.2009.05.007

Wana, Z., Hansen, J. H. L., & Xie, Y. (2020). A multi-view approach for Mandarin non-native mispronunciation verification. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 8079–8083. 10.1109/ICASSP40776.2020.9053981

Wang, H., Yan, B., Chiu, H., Hsu, Y., & Chen, B. (2021). Exploring non-autoregressive end-to-end neural modeling for English mispronunciation detection and diagnosis. ArXiv, abs/2111.00844.

Wang, Z., Zhang, J., & Xie, Y. (2018). L2 mispronunciation verification based on acoustic phone embedding and Siamese networks. Proc. 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), 444–448. 10.1109/ISCSLP.2018.8706597

Witt, S. M., & Young, S. J. (2000). Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication, 30(2–3), 95–108. doi:10.1016/S0167-6393(99)00044-8

Yan, B.-C., Wang, H.-W., Jiang, S.-W. F., Chao, F.-A., & Chen, B. (2022). Maximum F1-score training for end-to-end mispronunciation detection and diagnosis of L2 English speech. Proc. IEEE International Conference on Multimedia and Expo 2022. 10.1109/ICME52920.2022.9858931

Zhang, F., Huang, C., Soong, F. K., Chu, M., & Wang, R. (2008). Automatic mispronunciation detection for Mandarin. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 5077–5080.

Zhang, L., Ziping, Z., Chunmei, M., Linlin, S., Huazhi, S., Lifen, J., Shiwen, D., & Chang, G. (2020). End-to-end automatic pronunciation error detection based on improved hybrid CTC/attention architecture. Sensors (Basel), 20(7), 1809. doi:10.3390/s20071809 PMID:32218379

ADDITIONAL READING

Lin, B., & Wang, L. (2022). Phoneme mispronunciation detection by jointly learning to align. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 10.1109/ICASSP43922.2022.9746727

Phung, T., Vu, D.-Q., Mai-Tan, H., & Nhung, L. T. (2022). Deep models for mispronounce prediction for Vietnamese learners of English. Proc. International Conference on Future Data and Security Engineering, 682–689. 10.1007/978-981-19-8069-5_48

Yang, L., Zhang, J., & Shinozaki, T. (2022). Self-supervised learning with multi-target contrastive coding for non-native acoustic modeling of mispronunciation verification. Interspeech, 4312–4316. Advance online publication. doi:10.21437/Interspeech.2022-207

KEY TERMS AND DEFINITIONS

Computer-Assisted Language Learning: An approach to teaching and learning in which the computer and computer-based resources such as the web are used to present and assess the material to be learned. It includes considerable interactive elements.

Corpora: Plural of corpus.

Corpus: A language dataset selected systematically and stored as an electronic database.

Deep Neural Network: A neural network with more than two layers.

Goodness of Pronunciation: A method used for automatic mispronunciation detection.

Mispronunciation: The incorrect pronunciation of a specific word or sound.

Neural Network: A series of algorithms that aims to recognize underlying relationships in a set of data through a process that mimics how the human brain operates.

Chapter 5

Assessing the Learning Outcomes of Robot-Assisted Second Language Learning

Vasfiye Geçkin
https://orcid.org/0000-0001-8532-8627
Izmir Democracy University, Turkey

DOI: 10.4018/978-1-6684-5660-6.ch005

ABSTRACT

Robot-assisted language learning (RALL) explores the role of educational humanoid robots in the learning of first and second language(s). Today, there is no definitive answer as to its effectiveness in the long run. Some studies report that adult L2 learners benefit from RALL in learning words, while children enjoy only small gains from vocabulary instruction. The kind of feedback humanoid robots can provide in an ongoing conversation barely goes beyond facial expressions or words of encouragement. The need to upgrade the skills of educational robots and concerns over data privacy and abusive behavior towards robots are among the challenges faced in RALL today. Moreover, not a single study has examined the role of RALL in assessing the reading abilities and pragmatic knowledge of L2 learners. This chapter focuses on the effectiveness of RALL in assessing word learning, pragmatics, grammar, listening, speaking, and reading skills in an L2 and discusses its reflections for future classroom applications.

INTRODUCTION

Robot-Assisted Language Learning (RALL) can be defined as using educational humanoid social robots in instructional settings to teach verbal and non-verbal components of first and second language(s) (Randall, 2019). Educational robots have been an integral part of second language (L2) instruction in many preschools (Tanaka & Matsuzoe, 2012; Mazzoni & Benvenuti, 2015), primary schools (Chang, Lee, Chao, Wang, & Chen, 2010; In & Han, 2015; Kennedy, Baxter, Senft, & Belpaeme, 2016) and colleges (Rosenthal-von der Pütten, Straßmann, & Krämer, 2016; Iio, Maeda, Ogawa, Ishiguro, Suzuki, Aoki, Maesaki, & Hama, 2019), especially within the last two decades (see van den Berghe, Verhagen, Oudgenoeg-Paz, van der Ven, & Leseman, 2019; Kanero, Geçkin, Oranç, Mamus, Küntay & Göksun, 2018; Lee & Lee, 2022a, 2022b; Randall, 2019 for a thorough review).

To test the effectiveness of these educational robots, research on RALL has mostly focused on a particular aspect of second language learning, such as learning words or improving speaking skills. Such robots may assume the role of a peer, a needy communication partner, a teacher's assistant, or a tutor (Mubin, Stevens, Shahid, Al Mahmud, & Dong, 2013). Regardless of the role they assume, they are reported to offer certain benefits to the learning and teaching processes in second language classrooms. First, educational robots can use both verbal and non-verbal cues to foster communication (Kidd & Breazeal, 2004). They triumph over other 2D technologies such as tablets or virtual agents since they can manipulate objects and produce verbal and nonverbal expressions. Programming robots with humanlike characteristics and behaviors leads child and adult learners to anthropomorphize them (Bartneck, Kulić, Croft, & Zoghbi, 2009). Owing to their physical embodiment, young children especially see them as familiar conversational partners (Beran, Ramirez-Serrano, Kuzyk, Fior, & Nugent, 2011). The learners are provided with a real physical context while interacting with the robot (Wellsby & Pexman, 2014), which offers an invaluable learning environment. That is why they are viewed as having greater pedagogical potential in the learning and teaching of language(s). Second, since they can store and save learner utterances, it is possible to receive linguistic feedback from such robots (Randall, 2019). Furthermore, they increase learner concentration, motivation, and engagement with the task (Hsiao, Chang, Lin, & Hsu, 2015; Keanea, Williams, Chalmers, & Boden, 2017). Robots can also be individualized to cater for learner difficulties, differences, and preferences (Bahari, Zhang, & Ardasheva, 2021; Hrastinski, Stenbom, Benjaminsson, & Jansson, 2019). Among several other review papers on RALL, this chapter specifically reports research on the effectiveness of educational robots in assessing word learning, grammar, speaking, listening, and reading skills and knowledge of pragmatics in second language classrooms that host a variety of learners of different ages and levels of proficiency.

WORD LEARNING

The crucial components of mastering a new word include learning its receptive and productive aspects (Webb, 2005). Adding a new word to the vocabulary core involves transferring it from short-term to long-term memory and using it as a means to negotiate meaning (Cowan, 2005; Gass, 1997) and to receive feedback (Long, 1996). Repetition and recycling the word through various channels (Cameron, 2001; Thornbury, 2002), especially when the learner has a lower affective filter with a manageable level of anxiety (Krashen, 1982), contribute to the word learning process. With respect to the effectiveness of humanoid social robots on word learning, research reports mixed results. Some robot tutors have been reported to reinforce the word learning process in a second language. Eimler, von der Pütten, and Schächtle (2010), for instance, introduced a robot rabbit, Nabaztag, to German fifth graders who were to learn twenty English vocabulary items. Half of these children were aided by the speaking robot rabbit, and the other half were instructed on the same vocabulary items through the traditional pen-and-paper method. After a week, the children who worked with Nabaztag showed higher recall of the vocabulary items and a more positive attitude toward learning than those who were instructed through the traditional method.

In one experimental study, Iranian seventh graders were divided into two groups. The RALL group was taught new words in L2 English by a human teacher who was accompanied by the robot, Nima, and the non-RALL group was given traditional instruction (Alemi, Meghdari, & Ghazisaedy, 2014). A sample conversation between Nima, the teacher, and the students is given in (1) below:

1. 'Teacher: Nima, who is this?
Nima: He's my uncle Peyman.
Teacher: What's his job?
Nima: He's a police officer.
Teacher: Wow! A police officer! Can you show us what police officers do?
Nima: Yes, why not? (Nima acts out being a police officer by taking up his gun and shooting. Then he turns to the class and says)
Nima: Everyone, please repeat; Police officer (The students repeat after Nima).
Teacher: Fatemeh, why don't you ask Nima about the next picture?
Student: Who is this, Nima? (The student comes to the screen and points to the picture).
Nima: She's my aunt.
Student: What's her job?
Nima: She's a teacher (Nima acts out being a teacher by taking a piece of chalk and writing on the board).
Nima: Everyone, please repeat; Teacher.
Class: Teacher.
Nima: What's her job?
Student: Teacher.
Nima: Great Job! (He will be clapping at the same time).' (p. 14)

Nima asked the learners to point to and repeat the phrases and checked on what they had learnt over a period of five weeks. It also provided positive and encouraging feedback. The findings of the study suggest that the RALL group retained more words and learnt faster than the non-RALL group. De Wit, Schodde, Willemsen, Bergmann, de Haas, Kopp, Kramer, and Vogt (2018) taught five-year-old Dutch children six animal names in English: ladybug, monkey, horse, hippo, bird, and chicken. The children watched the humanoid social robot, NAO, gesture each animal, imitated the gesture after the robot, and listened to the English pronunciation of these words. The task was a comprehension task due to the difficulties the robot had in recognizing child speech. First, the children were asked to tap the correct picture of the animal on a tablet. Next, they played the game I spy with my little eye with the robot and picked one of the animal pictures that popped up on the screen. The children recalled the words better when an adaptive tutoring strategy accompanied the gesture. NAO facilitated the learning and recall of a considerable number of new words in the second language, especially when the children taught the robot these words through games. Similarly, Tanaka and Matsuzoe (2012) utilized NAO as a younger peer to teach four English words to three-to-four-year-old Japanese children. The children who taught the words to the robot learnt them better than those who did not teach the robot the same set of words. Kanda, Hirano, Eaton, and Ishiguro (2004) used the robot, Robovie, to teach Japanese elementary school students six English words and phrases over two weeks. After some time, the robot lost its novelty effect. Yet, the children who continued to play with the robot learnt more words than those who stopped playing with it. Mazzoni and Benvenuti (2015) taught four-to-six-year-old Italian children six English words in two conditions. In one condition, the children worked with a child peer, and in the other condition, they worked with a robot peer, MecWilly.
The results showed that the children learnt as much from a robot peer as from a child peer. Meiirbekov, Balkibekov, Jalankuzov, and Sandygulova (2016) used NAO as a learning peer to teach ten English words to Kazakh elementary school children. Girls were reported to learn more from an ever-winning robot, whereas boys learnt more from an ever-losing robot. Not all studies were restricted to teaching English as a second language. For example, Hemminki and Erkinheimo-Kyllönen (2017) investigated the role of the humanoid robot NAO in self-paced and self-guided learning of Finnish as a second language. A group of adult learners was given simple dialogues that contained vocabulary items pertaining to medicine and healthcare. The next day, the participants were asked to write down the words they could remember. The interaction with the robot came afterwards. The learners were reported to ask for help from the humanoid robot more often than from the human teacher in the learning context. The study concludes that the robot supported self-paced and self-guided learning by reducing the anxiety levels of the learners. Yet, certain problems emerged due to the poor speech recognition function of the robot since the learners were all non-native speakers. Overall, the robot acted as a buffer that helped the learners overcome their fear of making mistakes. Some studies report small or no significant contribution of humanoid social robots to learning new words in a second language. For example, Schodde, Bergmann, and Kopp (2017) assigned NAO the role of a tutor. The task was to teach German adults ten words from an artificial language, Vimmi. The teaching condition did not make a significant difference in mastering the words in the artificial language. In one experimental study, first-year college students pursuing their undergraduate studies in an English language teaching program were divided into RALL and non-RALL groups. The non-RALL group received instruction from a human tutor, and the RALL group had four sessions with the humanoid social robot, NAO, to learn new vocabulary items (Banaeian & Gilanlioglu, 2021). Below is a typical interaction with NAO while discussing the word integral:

2. 'Student: What is the meaning of the word integral?
NAO: Integral is an adjective. Integral means important and necessary as a part of whole. For instance, he's an integral part of the team and we can't do without him. [first definition and example] Is it clear? [comprehension check]
Student: No, it is not. Can you explain more? [negotiation for meaning]
NAO: Okay, let me explain it in another way. Integral means essential. Here is another example: Bars and cafes are integral to the social life of the city. [second definition and example] Is it clear now? [comprehension check]
Student: Yes, thank you.
NAO: Great!' (p. 79)

As given in (2) above, NAO followed certain steps while interacting with the participants of the RALL group. First, the definition of the word and an example sentence were given. Then, when the learner asked for negotiation of meaning, a second or alternative definition was provided with an example, always followed by a comprehension check. The conversation ended with the robot giving positive and encouraging feedback such as 'Great!'. The participants reported that the robot was attractive, with an appearance suitable for learning, and that it did not put them under stress.
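The stepwise routine exemplified in (2), namely definition and example, comprehension check, alternative definition on request, and encouraging feedback, can be read as a simple dialogue policy. The sketch below only illustrates that loop; the lexicon entries, function names, and the learner-response callable are hypothetical and are not drawn from Banaeian and Gilanlioglu (2021) or from NAO's actual programming interface.

```python
# A minimal sketch of the tutoring routine in (2): give a definition and an
# example, check comprehension, fall back to an alternative definition on
# request, and close with encouraging feedback. All names are hypothetical.

LEXICON = {
    "integral": [
        ("important and necessary as a part of a whole",
         "He's an integral part of the team and we can't do without him."),
        ("essential",
         "Bars and cafes are integral to the social life of the city."),
    ],
}

def teach_word(word: str, learner_says_clear) -> str:
    """Walk through stored definitions until the learner signals comprehension.

    learner_says_clear: callable standing in for the robot's speech interface;
    it returns True when the learner answers 'yes' to 'Is it clear?'.
    """
    for attempt, (definition, example) in enumerate(LEXICON[word]):
        if attempt > 0:
            print("Okay, let me explain it in another way.")  # negotiation of meaning
        print(f"{word.capitalize()} means {definition}. For instance: {example}")
        if learner_says_clear():                # comprehension check
            return "Great!"                     # positive, encouraging feedback
    return f"Let's come back to '{word}' later."  # fallback after all definitions

# Simulate a learner who asks for one clarification before understanding.
answers = iter([False, True])
print(teach_word("integral", lambda: next(answers)))
```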
However, when the learning outcomes were considered, the non-RALL group performed better than the RALL group. Similar findings were reported for young second language learners. In Gordon et al.'s (2016) study, three-to-five-year-old American children were taught eight Spanish words by the robot, Tega, which assumed the role of a peer. Although the children liked the robot, they did not learn more from it. Van den Berghe et al. (2018) used NAO to teach Dutch kindergarteners six English words. Learning words from a robot peer or a child peer did not make a difference in their vocabulary recall. However, the children who learnt the words alone outperformed both groups. To wrap up, research on vocabulary learning with humanoid social robots has revealed mixed results. Some studies, especially those examining adult learners, report considerable gains in vocabulary learning, whereas others focusing on children found small or no gains. There also seems to be an effect of gender on learning outcomes with RALL. It should further be noted that adults had lower anxiety levels and asked for more help from the robot instructor. The children, on the other hand, retained more vocabulary items when instruction was accompanied by gesturing and when they taught words to the robot, which acted as a slow-learning peer. The following section discusses findings with respect to the effectiveness of robot tutors in assessing speaking and listening skills in a second language.

LISTENING AND SPEAKING

One challenge that educational robots face in studies concerning the teaching of speaking and listening skills in a non-native language is automatic speech recognition (ASR). ASR aims at mapping the acoustic signal onto a string of words. Acoustic models are developed to predict the probability that a given string of words would be present in a sentence in the target language. To this end, pronunciation dictionaries are formed by recording the conversational speech of learners communicating in the target language in order to capture the dialect that the learners speak. However, developing acoustic model adaptation techniques based on the recognition of non-native speech is limited by the small number of non-native data recordings (Yoon & Kim, 2007). ASR systems expect words to be pronounced in a certain way. If there is a difference in pronunciation, as is often the case in non-native speech, the word is considered mispronounced. Thus, confusable phonemes need to be added to ever-developing pronunciation dictionaries. For instance, if one is focusing on teaching fruit names to a group of second language learners, a large vocabulary corpus needs to be transcribed for the speech recognizer to evaluate during an ongoing conversational context. To achieve this, a database needs to be formed for different interactional contexts, such as small talk or exchanges of information around the same topic.
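To make the dictionary-matching idea concrete, the sketch below flags a word as mispronounced whenever the recognized phoneme string matches none of the stored variants. The transcriptions are invented and the logic is deliberately simplified: real ASR scores probabilistic acoustic models over audio rather than comparing symbol strings, which is precisely why attested non-native variants must keep being added to the dictionary.

```python
# Toy pronunciation dictionary: each word maps to the phoneme strings
# accepted as correct, including known non-native variants.
PRONUNCIATIONS = {
    "apple":  {"AE P AH L", "AE P L"},
    "orange": {"AO R AH N JH", "AO R IH N JH"},
}

def check_pronunciation(word: str, recognized_phonemes: str) -> bool:
    """Return True if the recognized phoneme string matches a stored variant."""
    return recognized_phonemes in PRONUNCIATIONS.get(word, set())

# A learner's rendition with a confusable vowel is flagged as a mismatch
# unless that variant has been added to the dictionary.
print(check_pronunciation("apple", "AE P AH L"))   # True
print(check_pronunciation("apple", "EH P AH L"))   # False -> flagged
```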


When it comes to assessing speaking and listening skills in RALL classrooms, educational robots can be excellent companions since they can act as anxiety reducers and confidence builders by motivating second language learners (Alemi, Meghdari, & Haeri, 2017), and these robot tutors never tire of constant repetition in conversational interactions (Shin & Shin, 2015). In practice, educational robots have proved effective in improving the speaking skills rather than the listening skills of second language learners. One study explored the effectiveness of RALL on beginner- and intermediate-level third, fourth, and fifth graders learning English as a foreign language in Korea (Lee, Noh, Lee, Lee, Lee, Sagong, & Kim, 2011). Prior to the study, a small number of Korean children's speech samples (17 hours) were transcribed to adapt the acoustic models. The speech samples included short answers that led to misunderstandings, confusion, communication breakdowns, and errors of grammar. The classroom was divided into rooms in which each child was required to spend a minimum of ten minutes, and English instruction was carried out only by the educational robots. In the PC room, the students received some instruction by watching digital content. In the pronunciation training room, the educational robot Mero scored the pronunciation quality of the children's speech and gave feedback. The data comprised speech recordings of these children interacting at fruit, vegetable, and stationery stores. The children were assigned the role of customers and had to interact with the robot Engkey, who was assigned the role of a salesperson. The children rehearsed possible dialogues for buying items. The second robot gave feedback and scored the speaking proficiency of the children performing the task on a rubric composed of pronunciation, vocabulary, grammar, and communicative ability. Overall, the training lasted 68 lessons. The targeted speech acts included greeting, requesting, ordering, confirming, rejecting, thanking, and saying goodbye. The children were tested before and after the treatment. The group of children who worked with the robot did not differ statistically significantly in their listening skills from the group of children who received traditional instruction. This was attributed to the poor quality of the text-to-speech mode of the robots, whose various sound effects distracted the children's attention. However, the results revealed a significant improvement in the speaking skills of the children in the post-test. Iio et al. (2019) tested the effectiveness of a robot in improving the speaking skills of nine female Japanese college students. The students participated in role-play activities where they interacted with the educational robot in a series of set-up scenarios in L2 English. The scenarios included setting up a date at a café, apologizing for being late, and eating a muffin at a café. The turn taking included the robot's speech, the student's response to the speech, and the robot's response to the student's response. Following Ellis (2008), speaking proficiency was assessed through measures of accuracy, complexity, and fluency in speech. Complexity in speech was calculated as the number of words per clause. Accuracy measures came from the rate of grammatical/lexical errors, and measures of fluency were calculated through the number of words per minute and the length of silent pauses. The pronunciation of the learners was rated as good, so-so, or not good by native speakers of English. Special attention was given to segmental (i.e., how natural the phonemes sounded) and suprasegmental features (i.e., intonation and rhythm) in speech. Task appropriateness was measured by native speakers via the appropriateness of conversational turns when the learners held the floor. The study concluded that educational robots contributed to improvements in the speaking accuracy, fluency, and pronunciation of these adult second language learners.
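Because the complexity, accuracy, and fluency measures that Iio et al. (2019) adopted are simple ratios, they can be stated compactly, as in the sketch below. The function name and the pre-counted inputs are simplifying assumptions of ours; an actual analysis would first require clause segmentation, human error coding, and pause measurement.

```python
def speaking_measures(n_words: int, n_clauses: int, n_errors: int,
                      speaking_time_min: float) -> dict:
    """Compute the ratio-based measures described by Iio et al. (2019).

    Complexity: mean words per clause.
    Accuracy:   grammatical/lexical errors per word (lower is better).
    Fluency:    words per minute of speaking time.
    """
    return {
        "complexity": n_words / n_clauses,
        "accuracy":   n_errors / n_words,
        "fluency":    n_words / speaking_time_min,
    }

# Hypothetical values for one role-play turn: 120 words in 14 clauses,
# 6 coded errors, 2.5 minutes of speech.
print(speaking_measures(120, 14, 6, 2.5))
# {'complexity': 8.57..., 'accuracy': 0.05, 'fluency': 48.0}
```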
The role of educational robots was also tested on more controlled tasks. Nevertheless, the results were not as promising. In one study, In and Han (2015) asked Korean fourth graders to repeat declarative and interrogative sentences uttered by a tele-present robot. The aim was to test how much of the prosodic correlates of a native-like accent in English, a stress-timed language, can be acquired by Korean elementary school children who come from a syllable-timed first language. The children did not manifest a change in intonation, suggesting that the linguistic input provided by the robot was not sufficient. In another study, Rosenthal-von der Pütten, Straßmann, and Krämer (2016) explored whether pre-recorded natural speech or linguistically aligned speech would affect the listening skills of one hundred and thirty adult learners of German as a second language. Half of the participants interacted with a virtual robot and the other half with an embodied robot. The participants first introduced themselves. Later, they described a picture, played a guessing game and a search game, and finally described another picture. The guessing game, 'Who am I?', which required asking only yes and no questions, targeted lexical alignment. The lexical alignment ratio was calculated by dividing the occurrences of the target lexical choice (e.g., mustache) by the occurrences of the concept (e.g., the number of linguistic expressions that refer to a beard). The search game required taking turns with the robot to describe picture cards and addressed syntactic alignment. After listening to the descriptions, the participant or the robot had to find the depicted card and put it away. The syntactic alignment ratio was calculated by dividing the number of uses of syntactic structures, such as the passive voice or case marking, by the number of sentences uttered.
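Both alignment ratios are straightforward proportions, restated in the sketch below with hypothetical counts. In the actual study, deciding which expressions refer to the same concept and which structures count as aligned required human coding, so the inputs here stand in for coded data.

```python
def lexical_alignment(target_choice_uses: int, concept_mentions: int) -> float:
    """Share of concept mentions realized with the robot's lexical choice,
    e.g., uses of 'mustache' over all expressions referring to facial hair."""
    return target_choice_uses / concept_mentions

def syntactic_alignment(aligned_structures: int, sentences_uttered: int) -> float:
    """Share of uttered sentences using the primed structure (e.g., passive)."""
    return aligned_structures / sentences_uttered

# Hypothetical counts from one guessing-and-search session.
print(lexical_alignment(7, 10))    # 0.7
print(syntactic_alignment(4, 20))  # 0.2
```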

The results of the guessing and search games revealed no statistically meaningful difference between the participants in either mode of interaction. Contrary to expectations, the group which interacted with the embodied robot had lower test scores than the one which interacted with the virtual agent. The learners' level of proficiency also seems to determine the effectiveness of robot tutors. Wang, Young, and Jang (2013) explored the role of tangible learner companions in the L2 English speaking skills of Taiwanese fifth graders. Half of the children conversed with a tangible learner companion, and the other half were exposed to traditional teaching. Children in each group were divided into high, medium, and low achievers. Learner performance and interaction were scored by five teaching assistants. The children reported that receiving immediate responses from their companions made them feel more confident in speaking. The high achievers in the experimental group performed even better, and the low achievers showed a statistically significant improvement in their speech. The study suggests that speech recognition technology may take the burden off teachers' shoulders and provide opportunities for interaction in a second language both in and out of language classrooms. Again, the findings in this section are mixed. Different studies yielded different results even when the first languages of the learners were similar (e.g., Kanda et al., 2004; Lee et al., 2011). To better understand the effectiveness of educational robots on target language speaking skills, first, programming challenges must be overcome, and educational robots need access to a wide variety of native and non-native speech samples to improve their speech recognition function. Second, the level of learner proficiency needs to be taken into consideration while designing testing materials in order to gain a clear insight into when and how often educational robots could be integrated into second language classrooms.

GRAMMAR

Much of the research on the effectiveness of humanoid social robots in second language settings has centered around learning new words. To the best of our knowledge, only two studies have focused on the role of robots in learning the grammatical system of a second language. These studies required the learners to pick the accurate grammatical unit and to translate sentences from the native into the second language. In one study, Herberg, Feller, Yengin, and Saerbeck (2015) reported data from English fifth graders who interacted with the robot, NAO, to learn rules of French and Latin. The participants were asked to complete worksheet items showing how many of the learnt rules they could apply in two conditions: (i) when they were watched by the robot and (ii) when the robot looked away from them. For each language, the robot explained the rules on a card, summarized the rules, and waited for the child to translate the sentences on the worksheet. For Latin, the teaching points were negation, conjunction use, word order, and noun and verb endings. For French, the robot taught rules for article use, pluralizing nouns, conjugating verbs, and negation. The children were awarded points for every unit of accurate translation and penalized for misspelt words. The worksheets were marked by two raters. The results revealed that the children performed better on difficult items especially when the robot looked away from them. In the second study, Kennedy, Baxter, Senft, and Belpaeme (2016) examined the role of NAO in teaching English elementary school children French gender marking on nouns. NAO explained to the children the rules of article assignment depending on the gender of the noun. The instruction was conducted on a touchscreen where the robot dragged and dropped the correct gender marking onto the blank space preceding the noun. The child's task was to match the noun with the article. While the robot was providing the child with verbal feedback, the robot's text-to-speech mode would switch to French so that the child would hear the correct pronunciation. Despite the difficulty of the test items and the short period of instruction, most children were reported to perform better in assigning articles to nouns of different genders. Overall, these two studies suggest a positive influence of humanoid educational robots on the grammar learning of school-aged second language learners. However, two concerns need to be pointed out. First, human beings do not communicate through isolated grammatical units. That is, meaningful and purposeful contextualized discourse needs to be utilized to test the grammatical knowledge of second language learners. Moreover, it is difficult to tease apart how much of the improvement in grammatical knowledge could be attributed to drills or practice and how much of it could be linked to the effectiveness of the educational robot. The next section is dedicated to the benefits that robots may offer to the learning of literacy skills.

READING

Very little is known about the effect of the RALL paradigm on improving reading skills even in the mother tongue. Since no studies have looked into the role of educational robots in reading in a second language, the findings in this section focus on the development of reading skills in the native language. More specifically, the studies reviewed here assess the reading abilities of children in three modes: (i) children reading aloud to the robot, (ii) children being read to by the robot, and (iii) children correcting the robot's reading mistakes. These three modes of interaction with robots are given special attention since they can easily be adapted to second language learning contexts. Among the very few studies, Yadollahi et al. (2018) designed an experiment considering task difficulty, the reader's skills, and purpose (Kirby, 1988). Sixteen typically developing six-year-old Swiss children took part in a joint reading activity in their mother tongue with the clumsy robot NAO. Children with lower- and higher-level reading skills were given graded readers of different stages. The teacher was also present in the session. The robot made certain types of mistakes while reading, and the task of the child was to provide feedback and correct the robot's mistakes. The first type of mistake emerged out of a mismatch between the wording and the illustration in the book; for instance, the robot read elephant instead of penguin when the illustration was one of a penguin. The second group of mistakes corresponded to a change in meaning when the robot replaced one word with another, that is, reading start instead of stop in the text. The third group of mistakes included mispronouncing verbal and nominal markers, such as reading jumped as /dʒʌmpɪd/. Each reading event can be classified by whether the robot actually made a mistake and whether the child flagged one. True positives, where the robot made a mistake and the child corrected it, were the most frequent outcome. False alarms occurred when the robot did not make a mistake, but the child still took its reading as a mistake. Misses, where the robot made a mistake but the child failed to spot it, were the hardest ones to recognize. Not all children perceived pointing as helpful in recognizing mistakes while reading. The high reading ability group benefited more from the pointing gesture, while the children in the low reading ability group took it as a disturbance.
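Each joint-reading event thus falls into one of four signal-detection cells, depending on whether the robot actually erred and whether the child flagged a mistake. The sketch below spells out that mapping; the function is purely illustrative, and the labels follow standard terminology rather than the wording of the original study.

```python
def classify_reading_event(robot_erred: bool, child_flagged: bool) -> str:
    """Classify one event in the joint child-robot reading task."""
    if robot_erred and child_flagged:
        return "true positive: mistake spotted and corrected"
    if robot_erred and not child_flagged:
        return "miss (false negative): mistake overlooked"
    if not robot_erred and child_flagged:
        return "false alarm (false positive): correct reading flagged"
    return "true negative: correct reading accepted"

print(classify_reading_event(True, False))  # the hardest case for the children
```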


Of course, educational robots have also been used to observe and improve the autonomous reading habits of children. Michaelis and Mutlu (2017) developed an in-home learning companion humanoid desktop robot, Minnie, to understand the role of the robot in developing the reading habits of eleven-year-old children. The robot used facial recognition to track the child and was able to make eye contact with the child. After greeting the child, the robot asked the child to pick a genre of interest to read. Three book suggestions were made to the child by filtering for level of difficulty and interest. In the twenty-minute reading session, the robot would remind the child of the page of the book and the time left to reach the day's reading goal while the child read out loud to the robot. To appear engaged in the reading, the robot would blink, move its eyes, and turn its head. The child scanned certain tags to signal to the robot that she wanted to pause or quit. When the reading session was completed, the robot congratulated the child and shut down. At the end of the session, the robot stored information about which books the child liked or disliked. The study suggests that non-verbal behavior such as the robot's gazing affected the child's perception of the robot. One critical precursor of reading ability can be developed by symbolically representing and sharing language with others through activities such as storytelling. For example, five-year-old American children were asked to retell stories that were told by a projected robot companion, Sam, who watched the child retelling the story and gave verbal and non-verbal feedback in the form of nodding, smiling, and asking questions such as 'And then what happens?'. The study found that the children who played with the virtual peer, Sam, modeled his linguistically advanced stories by making more use of quoted speech and temporal and spatial expressions (Ryokai, Vaucelle, & Cassell, 2003). In another study, preschoolers listened to stories told by the robot, DragonBot, which acted as an older peer and functioned in two modes. In the non-adaptive mode, the robot told simple stories to children with higher language ability, and in the adaptive mode, the robot told more complex versions of the stories, slightly above the children's level. In addition, the study assessed the possibility of adapting the robot's linguistic level to the child's learning and the complexity of the stories (Kory & Breazeal, 2014). One last study explored the effectiveness of the robot's reading with four-year-old Korean children who were placed either in a robot-assisted reading program or a traditional media-assisted one (Hyun, Kim, & Jang, 2008). The results confirmed that the children in the robot-assisted reading program outperformed those in the media-assisted program on the linguistic abilities involved in story-making, story comprehension, and word recognition, thanks to the bi-directional interaction that the robot provided. With respect to reading skills in the mother tongue, children are reported to benefit from the educational robot either as a peer or as a tutor. Still, assessing second language reading skills when learners read to the robot, correct the robot's mistakes, or retell what was read remains a promising area for future work. Now, I turn to the final section, which can guide research on the effectiveness of robots in developing the knowledge and assessment of pragmatics in the second language.

PRAGMATICS

Studies in the field of second language pragmatics are basically concerned with meaning as it is communicated by speakers and interpreted by listeners. Pragmatics, in this context, is the knowledge of "how-to-say-what-to-whom-when" in a second language (Bardovi-Harlig, 2013, p. 68). The knowledge of pragmatics comes not only from the social context in which communication events occur, but also from the language choices made that would trigger some force or effect on the interlocutors (Crystal, 1997, p. 301). The focus is on intended meanings and assumptions about patterns and routines of language use, especially when performing certain speech acts such as requesting, apologizing, refusing, complaining, or complimenting (Austin, 1962; Searle, 1969). These acts can be performed with varying degrees of directness or indirectness, also referred to as politeness.
The extent of politeness may change depending on the social context, the power difference between the interlocutors, the communicative goals, and the social distance between the conversers (Brown & Levinson, 1978). When it comes to assessing the knowledge of pragmatics in a non-native language, theoretical conceptions ranging from discourse completion to discourse production tasks are well operationalized in the field (Bachman & Palmer, 2010). Both children and adults are known to attribute human characteristics to educational robots by assigning them certain roles in the conversation. Mazzoni and Benvenuti (2015) summarized the interactions addressed in RALL studies. First, in human-to-robot and robot-to-robot interaction, the human or one of the robots acts as the tutor. In robot-to-human interaction, the robot assumes the role of a partner, which provides opportunities for the human to try out a particular set of abilities or social skills. Finally, in human-to-robot interaction, the human acts as a tutor for the robot (p. 475). That is why certain pragmatic exchanges in the form of speech acts are almost always observed in every experimental setting involving educational robots. However, not a single study has investigated the contribution of educational robots to the knowledge of pragmatics in the second language. In one study, Mazzoni and Benvenuti (2015) introduced the robot, MecWilly, to four-to-six-year-old Italian children as a visitor from England who came to Italy with a list of fruits which the English children had asked for. As the robot could not speak Italian, he showed pictures of the fruits and asked the Italian children for help in collecting them. The aim was to explore whether negotiation of meaning through collaborating with other children or with the robot resulted in retaining more vocabulary items and maintaining a partnership to solve a problem. Even though the study targeted word learning in L2 English, the results demonstrated that the children who collaborated with the robot benefited in terms of vocabulary learning and increased partnership. Future research could be geared towards learning pragmatically appropriate language use in a second language since educational robots can provide learners with real-life interaction opportunities. When designing experiments to assess the effectiveness of educational robots on pragmatic knowledge in a non-native language, communicative goal, social distance, and power difference could be taken as variables to observe how certain speech acts are performed in diverse social contexts, as sketched below.
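One concrete way to operationalize this is to cross these variables factorially when drafting discourse completion or role-play scenarios for a robot to deliver. The sketch below merely enumerates such conditions; the speech acts, variable levels, and wording are invented for illustration.

```python
from itertools import product

# Cross the sociopragmatic variables of Brown and Levinson (1978) to
# enumerate conditions for discourse completion or role-play tasks.
SPEECH_ACTS = ["request", "apology", "refusal"]
POWER = ["higher status", "equal status"]      # power difference
DISTANCE = ["a stranger", "a close friend"]    # social distance

for act, power, distance in product(SPEECH_ACTS, POWER, DISTANCE):
    print(f"Scenario: {act} directed at {distance} of {power}.")
```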

DISCUSSION AND FUTURE RESEARCH DIRECTIONS

This chapter has reported experimental research findings on the effectiveness of educational robots on the learning outcomes of second language learners of varying ages and levels of proficiency. The contribution of educational robots to word learning and to improvement in speaking skills appears stronger than findings related to other linguistic abilities. Yet, these results are quite limited, and there is a need for long-term studies (Benitti, 2012). The role of robots in developing literacy skills and the knowledge of pragmatics is quite an understudied area. For the time being, educational robots are far from replacing human teachers, but they may outperform other traditional or digital platforms (Kennedy et al., 2016). Future work needs to focus on the possible effectiveness of educational robots, especially in assessing the learning outcomes of young second language learners. Teachers' and learners' beliefs and attitudes need to be considered before integrating RALL technologies into second language classrooms. Language teachers may refrain from making use of educational robots due to some concerns. The use of robots may require an organized and structured lesson flow (Majgaard, 2015).
The teachers may be worried about technical difficulties and how to deal with the disruptions in class caused by the robot (Serholt et al., 2014), as well as data security, ethical, and privacy issues arising from learner-robot interaction (Serholt et al., 2017). Having access to resources and being empowered with the necessary pedagogical competence are some of the prerequisites for overcoming technical problems while running an educational robot in a second language classroom (Causo, Win, Guo, & Chen, 2017). Adult learners with high levels of anxiety and a negative attitude towards tutor robots are reported not to benefit much from robot-led vocabulary instruction. Furthermore, personality traits such as extraversion did not contribute much to vocabulary development when the instruction was given by human tutors (Kanero, Oranç, Koşkulu, Kumkale, Göksun, & Küntay, 2022). One reason why pragmatics is an understudied area within the RALL paradigm is that algorithms that produce scenarios, such as dialogue systems for robots that could give feedback to human interlocutors, hardly exist (Ondas, Pleva, & Juhar, 2022). These algorithms need to be designed by taking the turn-taking strategies of the interlocutors into consideration. In human-human dialogue interactions, for instance, turn taking is based on anticipatory behavior and prediction. That is, humanoid artificial intelligence needs to possess the capability of rapid turn taking and of predicting when one turn starts and another ends. So far, we know that a human listener can predict approximately the five remaining words in ongoing speech (Gisladottir, Bögels, & Levinson, 2018). To investigate the anticipatory behavior of children in human-robot interaction, Ondas et al. (2022) designed a scenario where the robot NAO gave children riddles to guess in Slovak. The script that NAO followed included a basic question, a next/repeat question, and a continue question. The feedback given by NAO included head gestures and expressions of agreement, acknowledgment, and continuation. The time needed to guess the riddle varied depending on its difficulty level and syntactic form as well as the previous knowledge, tiredness, and mood of the children. Another line of research related to the knowledge of pragmatics concerns strategies pursued to resolve problems. At times of communication breakdown, primary school children are reported to adapt their behavior, distance themselves from the robot, and seek help from the adults present while playing a mathematics game with the robot (Serholt, Pareto, Ekström, & Ljungblad, 2020). Overall, studies testing pragmatic knowledge in the mother tongue can pave the way for further research on the mastery of pragmatic competence in a second language. In addition, further pedagogical research is needed to combine the use of RALL technologies with classroom teaching. Final words can be spared for some ethical concerns about using educational robots. In order to effectively manage the learners, especially children, in class, the robot needs to employ remarks of praise, encouragement, and rewards, some of which are difficult even for expert human teachers. Personal information gathered by the robot can be accessed by other people (Sharkey & Sharkey, 2010). If the robot tutor is utilized for direct surveillance, the privacy of the learners could be invaded.
Children may suffer from the loss of a robot if they have already built an emotional bond with it or discovered that the robot is a programmed entity (Sharkey, 2016). An anthropomorphic entity that provides appropriate verbal and non-verbal feedback may easily convince (Yamaoka, Kanda, Ishiguro, & Hagita, 2007) or deceive (Sharkey & Sharkey, 2012) individuals. What is more, learners may exhibit cruel, unpleasant, and abusive behavior towards the robot (Bartneck & Hu, 2008). Children especially tend at times to abuse robots by pinching, punching, or kicking them. Children's persistent verbal or non-verbal offensive actions or physical violence can violate the human- or animal-like nature of the robot (Kanda et al., 2004). For example, Nomura, Kanda, Kidokoro, Suehiro, and Yamada (2016) conducted a field study in a shopping mall in Japan. In this setting, the humanoid robot asked for help because its path was blocked. Some children (3%) persisted in aggressive behavior towards the robot even when the robot shouted for help.
When interviewed, half of these children reported that they believed the robot was capable of perceiving the abusive behavior but still rationalized their aggression by stating that they were curious about the robot's reactions and enjoyed abusing a human-like entity. For further research, the use of RALL technologies needs to embrace all the components required to master a second language. Robot-learner interactional designs could expand interaction types to embrace sociocultural factors. Longitudinal and experimental work needs to be conducted to explore the effectiveness of educational robots, especially on the development and assessment of pragmatic knowledge and literacy skills in a second language. Linguists, psychologists, teachers, and software developers need to work hand in hand, taking into consideration the ethical concerns and the cognitive, linguistic, social, and emotional development of the learners.

CONCLUSION

To conclude, this chapter has summarized recent work and offered insights for future research by reviewing the effectiveness of educational robots in learning a second language. The implications offered for L2 settings are diverse. First, human-robot interaction refrains from a one-size-fits-all approach and deploys a student-centered pedagogy. Individual learner differences and needs, as well as intercultural similarities and differences, could be integrated into verbal and non-verbal human-robot interaction. Second, the voice of the robot has to be as naturalistic as possible, with an acceptable accent and appropriate extralinguistic vocal features in the target language. Third, robots work best when accompanied by human tutors who can give appropriate verbal and non-verbal feedback and guidance. Fourth, humanoid robots are reported to positively affect the affective states of learners of varying ages and levels of proficiency. Lastly, the automatic speech recognition systems of robots need to be improved to detect errors resulting from the learners' level of proficiency, accent, pauses, and hesitations. For the time being, the jobs of human teachers are secure. However, second language teachers will have to integrate robot companions and constantly upgrade and adapt their technical and pedagogical skills to guide their students in becoming autonomous life-long learners.

ACKNOWLEDGMENT

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

REFERENCES

Alemi, M., Meghdari, A., & Ghazisaedy, M. (2014). Employing humanoid robots for teaching English language in Iranian junior high schools. International Journal of Humanoid Robotics, 11(3), 1450022-1–1450022-25.

Alemi, M., Meghdari, A., & Haeri, N. S. (2017). Young EFL learners' attitude towards RALL: An observational study focusing on motivation, anxiety, and interaction. In A. Kheddar, E. Yoshida, S. S. Ge, K. Suzuki, J.-J. Cabibihan, F. Eyssel, & H. He (Eds.), Proceedings of the International Conference on Social Robotics (pp. 252–261). Springer. doi:10.1007/978-3-319-70022-9_25

Austin, J. L. (1962). How to do things with words. Harvard University Press.

Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice. Oxford University Press.

Bahari, A., Zhang, X., & Ardasheva, Y. (2021). Establishing a non-linear dynamic individual-centered language assessment model: A dynamic systems theory approach. Interactive Learning Environments, 29(7), 1–15. doi:10.1080/10494820.2021.1950769

Banaeian, H., & Gilanlioglu, I. (2021). Influence of the NAO as teaching assistant on university students' vocabulary learning and attitudes. Australasian Journal of Educational Technology, 37(3), 71–87. doi:10.14742/ajet.6130

Bardovi-Harlig, K. (2013). Developing L2 pragmatics. Language Learning, 63(1), 68–86. doi:10.1111/j.1467-9922.2012.00738.x

Bartneck, C., & Hu, J. (2008). Exploring the abuse of robots. Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 9(3), 415–433. doi:10.1075/is.9.3.04bar

Bartneck, C., Kulić, D., Croft, E., & Zoghbi, S. (2009). Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics, 1(1), 71–81. doi:10.1007/s12369-008-0001-3

Benitti, F. B. V. (2012). Exploring the educational potential of robotics in schools: A systematic review. Computers & Education, 58(3), 978–988. doi:10.1016/j.compedu.2011.10.006

Beran, T. N., Ramirez-Serrano, A., Kuzyk, R., Fior, M., & Nugent, S. (2011). Understanding how children understand robots: Perceived animism in child–robot interaction. International Journal of Human-Computer Studies, 69(7-8), 539–550. doi:10.1016/j.ijhcs.2011.04.003

Brown, P., & Levinson, S. C. (1978). Politeness: Some universals in language usage. Cambridge University Press.

Cameron, L. (2001). Teaching languages to young learners. Cambridge University Press. doi:10.1017/CBO9780511733109

Causo, A., Win, P. Z., Guo, P. S., & Chen, I.-M. (2017). Deploying social robots as teaching aid in preschool K2 classes: A proof-of-concept study. In Proceedings of the International Conference on Robotics and Automation (ICRA) (pp. 4264–4269). IEEE. doi:10.1109/ICRA.2017.7989490

Chang, C. W., Lee, J. H., Chao, P. Y., Wang, C. Y., & Chen, G. D. (2010). Exploring the possibility of using humanoid robots as instructional tools for teaching a second language in primary school. Journal of Educational Technology & Society, 13, 13–24.

Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 62–101). Cambridge University Press. doi:10.1017/CBO9781139174909.006

Crystal, D. (1997). The Cambridge encyclopedia of language. Cambridge University Press.

de Wit, J., Schodde, T., Willemsen, B., Bergmann, K., de Haas, M., Kopp, S., Kramer, E., & Vogt, P. (2018). The effect of a robot's gestures and adaptive tutoring on children's acquisition of second language vocabularies. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (pp. 50–58). New York, NY: ACM. doi:10.1145/3171221.3171277

Eimler, S., von der Pütten, A., & Schächtle, U. (2010). Following the white rabbit—a robot rabbit as vocabulary trainer for beginners of English. In G. Leitner, M. Hitz, & A. Holzinger (Eds.), Lecture Notes in Computer Science: Vol. 6389. HCI in Work and Learning, Life and Leisure. USAB 2010. Springer. doi:10.1007/978-3-642-16607-5_22

Ellis, R. (2008). The study of second language acquisition (2nd ed.). Oxford University Press.

Gass, S. M. (1997). Input, interaction, and the second language learner. Erlbaum.

Gisladottir, R. S., Bögels, S., & Levinson, S. C. (2018). Oscillatory brain responses reflect anticipation during comprehension of speech acts in spoken dialog. Frontiers in Human Neuroscience, 12, 34. doi:10.3389/fnhum.2018.00034 PMID:29467635

Gordon, G., Spaulding, S., Westlund, J. K., Lee, J. J., Plummer, L., Martinez, M., Das, M., & Breazeal, C. (2016). Affective personalization of a social robot tutor for children's second language skills. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) (pp. 3951–3957). doi:10.1609/aaai.v30i1.9914

Hemminki, J., & Erkinheimo-Kyllönen, A. (2017). A humanoid robot as a language tutor: A case study from Helsinki Skills Center. In Proceedings of R4L@HRI2017.

Herberg, J. S., Feller, S., Yengin, I., & Saerbeck, M. (2015). Robot watchfulness hinders learning performance. In Proceedings of the 24th IEEE International Symposium on Robot and Human Interactive Communication (pp. 153–160). Los Alamitos, CA: IEEE.

Hrastinski, S., Stenbom, S., Benjaminsson, S., & Jansson, M. (2019). Identifying and exploring the effects of different types of tutor questions in individual online synchronous tutoring in mathematics. Interactive Learning Environments, 1–13.

Hsiao, H. S., Chang, C. S., Lin, C. Y., & Hsu, H. L. (2015). Irobiq: The influence of bidirectional interaction on kindergarteners' reading motivation, literacy, and behavior. Interactive Learning Environments, 23(3), 269–292. doi:10.1080/10494820.2012.745435

Hyun, E. J., Kim, S. Y., & Jang, S. K. (2008). Effects of a language activity using an 'intelligent' robot on the linguistic abilities of young children. Korean Journal of Early Childhood Education, 28(5), 175–197. doi:10.18023/kjece.2008.28.5.009

Iio, T., Maeda, R., Ogawa, K., Ishiguro, H., Suzuki, K., Aoki, T., Maesaki, M., & Hama, M. (2019). Improvement of Japanese adults' English-speaking skills via experiences speaking to a robot. Journal of Computer Assisted Learning, 35(2), 228–245. doi:10.1111/jcal.12325

In, J.-Y., & Han, J.-H. (2015). The acoustic-phonetic change of English learners in robot assisted learning. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (pp. 39–40). New York, NY: ACM. doi:10.1145/2701973.2702003

Kanda, T., Hirano, T., Eaton, D., & Ishiguro, H. (2004). Interactive robots as social partners and peer tutors for children: A field trial. Human-Computer Interaction, 19(1), 61–84. doi:10.1207/s15327051hci1901&2_4

Kanero, J., Geçkin, V., Oranç, C., Mamus, E., Küntay, A. C., & Göksun, T. (2018). Social robots for early language learning: Current evidence and future directions. Child Development Perspectives, 12(3), 146–151. doi:10.1111/cdep.12277

Kanero, J., Oranç, C., Koşkulu, S., Kumkale, G. C., Göksun, T., & Küntay, A. (2022). Are tutor robots for everyone? The influence of attitudes, anxiety, and personality on robot-led language learning. International Journal of Social Robotics, 14(2), 297–312. doi:10.1007/s12369-021-00789-3

Keane, T., Williams, M., Chalmers, C., & Boden, M. (2017). Humanoid robots awaken ancient language. Australian Educational Leader, 39(4), 58–61.

Kennedy, J., Baxter, P., Senft, E., & Belpaeme, T. (2016). Social robot tutoring for child second language learning. In Proceedings of the Eleventh ACM/IEEE International Conference on Human-Robot Interaction (pp. 231–238). New York, NY: ACM. doi:10.1109/HRI.2016.7451757

Kidd, C. D., & Breazeal, C. (2004). Effect of a robot on user perceptions. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 3559–3564). Los Alamitos, CA: IEEE. doi:10.1109/IROS.2004.1389967

Kirby, J. R. (1988). Style, strategy, and skill in reading. In R. R. Schmeck (Ed.), Learning strategies and learning styles (pp. 229–274). Plenum Press. doi:10.1007/978-1-4899-2118-5_9

Kory, J., & Breazeal, C. (2014). Storytelling with robots: Learning companions for preschool children's language development. In RO-MAN 2014: The 23rd IEEE International Symposium on Robot and Human Interactive Communication (pp. 643–648). IEEE.

Krashen, S. (1982). Principles and practice in second language acquisition. Pergamon Press.

Lee, H., & Lee, J. H. (2022a). Social robots for English language teaching. ELT Journal, 76(1), 119–124. doi:10.1093/elt/ccab041

Lee, H., & Lee, J. H. (2022b). The effects of robot-assisted language learning: A meta-analysis. Educational Research Review, 35, 1–13. doi:10.1016/j.edurev.2021.100425

Lee, S., Noh, H., Lee, J., Lee, K., Lee, G., Sagong, S., & Kim, M. (2011). On the effectiveness of robot-assisted language learning. ReCALL, 23(1), 25–58. doi:10.1017/S0958344010000273

Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–468). Elsevier.

Majgaard, G. (2015). Multimodal robots as educational tools in primary and lower secondary education. In Proceedings of the International Conferences on Interfaces and Human Computer Interaction (pp. 27–34).

Mazzoni, E., & Benvenuti, M. (2015). A robot-partner for preschool children learning English using socio-cognitive conflict. Journal of Educational Technology & Society, 18, 474–485.

Meiirbekov, S., Balkibekov, K., Jalankuzov, Z., & Sandygulova, A. (2016). "You win, I lose": Towards adapting robot's teaching strategy. In Proceedings of the Eleventh ACM/IEEE International Conference on Human-Robot Interaction (pp. 475–476). New York, NY: ACM.

Michaelis, J. E., & Mutlu, B. (2017). Someone to read with: Design of and experiences with an in-home learning companion robot for reading. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 301–312). ACM. doi:10.1145/3025453.3025499

Mubin, O., Stevens, C. J., Shahid, S., Al Mahmud, A., & Dong, J. J. (2013). A review of the applicability of robots in education. Technology for Education and Learning, 1(1), 1–7. doi:10.2316/Journal.209.2013.1.209-0015

Nomura, T., Kanda, T., Kidokoro, H., Suehiro, Y., & Yamada, S. (2016). Why do children abuse robots? Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 17(3), 348–370.

Ondas, S., Pleva, M., & Juhar, J. (2022). Child-robot spoken interaction in selected educational scenarios. In 20th International Conference on Emerging eLearning Technologies and Applications (ICETA) (pp. 478–483). IEEE. doi:10.1109/ICETA57911.2022.9974859

Randall, N. (2019). A survey of robot-assisted language learning (RALL). ACM Transactions on Human-Robot Interaction, 9(1), 1–36. doi:10.1145/3345506

Rosenthal-von der Pütten, A. M., Straßmann, C., & Krämer, N. C. (2016). Robots or agents—Neither helps you more or less during second language acquisition: Experimental study on the effects of embodiment and type of speech output on evaluation and alignment. In Proceedings of the International Conference on Intelligent Virtual Agents (pp. 256–268). New York, NY: Springer. doi:10.1007/978-3-319-47665-0_23

Ryokai, K., Vaucelle, C., & Cassell, J. (2003). Virtual peers as partners in storytelling and literacy learning. Journal of Computer Assisted Learning, 19(2), 195–208. doi:10.1046/j.0266-4909.2003.00020.x

Schodde, T., Bergmann, K., & Kopp, S. (2017). Adaptive robot language tutoring based on Bayesian knowledge tracing and predictive decision-making. In Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction (pp. 128–136). Vienna, Austria. doi:10.1145/2909824.3020222

Searle, J. R. (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press. doi:10.1017/CBO9781139173438

Serholt, S., Barendregt, W., Leite, I., Hastie, H., Jones, A., Paiva, A., ... Castellano, G. (2014). Teachers' views on the use of empathic robotic tutors in the classroom. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication (pp. 955–960). IEEE. doi:10.1109/ROMAN.2014.6926376

Serholt, S., Barendregt, W., Vasalou, A., Alves-Oliveira, P., Jones, A., Petisca, S., & Paiva, A. (2017). The case of classroom robots: Teachers' deliberations on the ethical tensions. AI & Society, 32(4), 613–631. doi:10.1007/s00146-016-0667-2

Serholt, S., Pareto, L., Ekström, S., & Ljungblad, S. (2020). Trouble and repair in child-robot interaction: A study of complex interactions with a robot tutee in a primary school classroom. Frontiers in Robotics and AI, 7, 46. doi:10.3389/frobt.2020.00046 PMID:33501214

Sharkey, A. J. C. (2016). Should we welcome robot teachers? Ethics and Information Technology, 18(4), 283–297. doi:10.1007/s10676-016-9387-z

Sharkey, A. J. C., & Sharkey, N. E. (2012). Granny and the robots: Ethical issues in robot care for the elderly. Ethics and Information Technology, 14(1), 27–40. doi:10.1007/s10676-010-9234-6

Sharkey, N. E., & Sharkey, A. J. C. (2010). The crying shame of robot nannies: An ethical appraisal. Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 11(2), 161–190. doi:10.1075/is.11.2.01sha

Shin, J., & Shin, D. H. (2015). Robots as a facilitator in language conversation class. In Proceedings of the 10th Annual ACM/IEEE International Conference on Human-Robot Interaction Extended Abstracts (pp. 11–12). Association for Computing Machinery. doi:10.1145/2701973.2702062

Tanaka, F., & Matsuzoe, S. (2012). Children teach a care-receiving robot to promote their learning: Field experiments in a classroom for vocabulary learning. Journal of Human-Robot Interaction, 1, 78–95. doi:10.5898/JHRI.1.1.Tanaka

Thornbury, S. (2002). How to teach vocabulary. Longman.

van den Berghe, R., Verhagen, J., Oudgenoeg-Paz, O., van der Ven, S., & Leseman, P. (2019). Social robots for language learning: A review. Review of Educational Research, 89(2), 259–295. doi:10.3102/0034654318821286

Wang, Y. H., Young, S. S., & Jang, J. S. R. (2013). Using tangible companions for enhancing learning English conversation. Journal of Educational Technology & Society, 16(2), 296–309.

Webb, S. (2005). Receptive and productive vocabulary learning. Studies in Second Language Acquisition, 27(1), 33–52. doi:10.1017/S0272263105050023

Wellsby, M., & Pexman, P. M. (2014). Developing embodied cognition: Insights from children's concepts and language processing. Frontiers in Psychology, 5, 506. doi:10.3389/fpsyg.2014.00506 PMID:24904513

Yadollahi, E., Johal, W., Paiva, A., & Dillenbourg, P. (2018). When deictic gestures in a robot can harm child-robot collaboration. In Proceedings of the 17th ACM Conference on Interaction Design and Children (pp. 195–206). doi:10.1145/3202185.3202743

Yamaoka, F., Kanda, T., Ishiguro, H., & Hagita, N. (2007). Interacting with a human or a humanoid robot? In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems.

Yoon, C., & Kim, S. (2007). Convenience and TAM in a ubiquitous computing environment: The case of wireless LAN. Electronic Commerce Research and Applications, 6(1), 102–112. doi:10.1016/j.elerap.2006.06.009

ADDITIONAL READING

Butler, Y. G. (2022). Learning through digital technologies among pre-primary school children: Implications for their additional language learning. Language Teaching for Young Learners, 4(1), 30–65. doi:10.1075/ltyl.21009.but

Engwall, O., Lopes, J., & Åhlund, A. (2020). Robot interaction styles for conversation practice in second language learning. International Journal of Social Robotics, 13(2), 251–276. doi:10.1007/s12369-020-00635-y

Hsu, H. L., Chen, H. H. J., & Todd, A. G. (2021). Investigating the impact of the Amazon Alexa on the development of L2 listening and speaking skills. Interactive Learning Environments, 1–14. doi:10.1080/10494820.2021.2016864

Mustar, M. Y., Hartanto, R., & Santosa, P. I. (2022). User interface for child-robot interaction in education: Perspective and challenges. In Proceedings of the 2nd International Conference on Electronic and Electrical Engineering and Intelligent System (ICE3IS) (pp. 360–364). doi:10.1109/ICE3IS56585.2022.10010156

Zinina, A., Kotov, A., Arinkin, N., & Zaidelman, L. (2023). Learning a foreign language vocabulary with a companion robot. Cognitive Systems Research, 77, 110–114. doi:10.1016/j.cogsys.2022.10.007

KEY TERMS AND DEFINITIONS

Automated Speech Recognition (ASR): The technology that allows humans to use their voices to interact with a computer interface.

Humanoid Social Robots (HSRs): Human-made technologies that resemble people digitally or physically and are designed to interact with people.

Pragmatics: A subfield of linguistics that deals with understanding language use and meaning in context.

Robot-Assisted Language Learning (RALL): The use of robots to support language learning.


Chapter 6

Dynamic Assessment as a Learning-Oriented Assessment Approach

Tuba Özturan
Erzincan Binali Yıldırım University, Turkey

Hacer Hande Uysal Gürdal
Hacettepe University, Turkey

ABSTRACT

This chapter presents the theoretical background of dynamic assessment (DA) and its praxis, with pedagogical suggestions for foreign language writing instruction. Resting on Vygotsky's sociocultural theory (1978), DA asserts that instruction needs to be blended with assessment because of the salience of social interaction in modifying cognition. Thus, DA adopts learning- and learner-based feedback approaches and a present-to-future model of assessment, which rests on reciprocal teacher-learner interaction. Given the need to illustrate DA in an EFL setting, this chapter presents reciprocal interactions between a teacher and four students. The interaction analyses reveal that the teacher adopted a variety of mediational moves to instruct the students and to diagnose their microgenesis, and that the students displayed various reciprocity acts in response to the mediational moves provided to them, which unpacks each student's zone of proximal development. Based on these findings, the chapter ends with suggestions for EFL writing teachers.

INTRODUCTION

People come from different socio-cultural backgrounds, and their interactions with society differ, which directly affects each person's opportunity to reach quality education and information. Today's students, especially those at the tertiary level, are increasingly mobile around the world, so students with diverse socio-cultural and educational backgrounds also shape classroom dynamics (Shrestha, 2020). Even though diversity inside the classroom may be an advantage, it may
also create some pitfalls. For example, in second/foreign language learning settings, adopting the same corrective feedback approach for all learners may not work effectively (Storch, 2018), and administering merely a past-to-present assessment model may not reveal each student's ability to achieve something (Poehner & Lantolf, 2013). Moreover, countries' different education policies during the pandemic and evolving digital technologies have widened gaps among students (OECD, 2020; Shrestha, 2020). These factors have led to calls for new instructional and assessment approaches and have highlighted the salience of dialogic interaction, individual learners' needs, and learning-based assessment (Poehner & Inbar-Lourie, 2020). Against this background, Dynamic Assessment (henceforth DA) has appeared as an alternative, and this chapter reports on DA as a learning-oriented assessment approach that relies on learners' needs (Poehner, 2008). DA's theoretical orientations are grounded in Vygotsky's Sociocultural Theory (SCT) (Vygotsky, 1978) and Feuerstein's Mediated Learning Experience (MLE) (Feuerstein et al., 2010), and DA posits that the social milieu and social interaction have a salient impact on the modification of cognition. That is, people's innate abilities and their direct exposure to stimuli may not be enough to reveal their exact potential, because social interaction leads people to think in new ways and to modify their higher-order cognitive skills (Haywood & Tzuriel, 2002; Vygotsky, 1978). Accordingly, parents, teachers, and more able peers play salient roles in children's and students' lives since they can support the learning process by mediating input (Feuerstein et al., 2010).

In a broad view, Dynamic Assessment refers to the fusion of instruction and assessment in a single collaborative effort between teachers and students (Poehner, 2008). Its instructional dimension relies on SCT-oriented feedback approaches (Aljaafreh & Lantolf, 1994). Although both language teachers and SLA (Second Language Acquisition) researchers have long been interested in how to deal with learners' errors effectively and have suggested a variety of feedback approaches to provide quality input for learners (Hyland & Hyland, 2006), no single best approach has been identified because of learner differences, learners' various needs, and the different natures of errors (Mao & Lee, 2021; Nassaji, 2017; Storch, 2018). Regarding the need to increase the effectiveness of feedback, negotiation over learner errors has recently been gaining importance (Erlam et al., 2013; Nassaji, 2017), since negotiation between two parties, such as teachers and students, paves the way to discuss, find solutions, and mitigate misunderstanding, which in turn decreases the gap between the interlanguage and the target language (Nassaji, 2016, 2017). Said another way, the salience of negotiation lies at the heart of dialogic reciprocal interaction and of tailoring feedback to each learner's needs so that quality input is provided through graduated feedback (Aljaafreh & Lantolf, 1994; Nassaji & Swain, 2000). DA also adopts a present-to-future assessment model to enhance a fair assessment setting (Poehner & Lantolf, 2013). In contrast to static assessment models, which are mostly past-to-present, DA aims to reveal each learner's matured abilities (ZAD, Zone of Actual Development) and maturing abilities (ZPD, Zone of Proximal Development) (Vygotsky, 1978).
What stands out in the underpinnings of DA is that it criticizes static assessment models because they lack interaction and focus merely on what each student learned in the past; however, students' past experiences may not reveal their exact potential. Therefore, their emerging abilities to reach new information, organize higher-order cognitive skills, and transfer information to new contexts should be taken into consideration (Poehner & Lantolf, 2005, 2013). Accordingly, DA blends instruction through SCT-oriented feedback into a present-to-future assessment model through dialogic reciprocal teacher-learner interactions (Poehner, 2008; Shrestha, 2020).


Dynamic Assessment can be implemented either with pre-defined hierarchical prompts (interventionist DA) or with graduated prompts provided in an ad-hoc manner according to each learner's ever-shifting needs (interactionist DA) (Poehner, 2008). In both forms, mediation is the core term (Feuerstein et al., 2010). Mediation broadly refers to using physical and symbolic tools, such as leading questions, verbal cues, and/or online tools, to determine precisely what a learner already knows and to what extent s/he can reach the target knowledge (Davin, 2016; Feuerstein et al., 2010; Lantolf, 2000). Then, the learner's ability to transfer that knowledge to new contexts is traced by the teacher. In this regard, teachers are the mediators of DA. For example, in a typical DA session, the teacher starts with the most implicit mediational move to trigger the learner's error treatment and verbalization of reasoning. If a student corrects the erroneous output and verbalizes the reasoning, no subsequent mediational moves are provided, since that student is assumed to be self-regulated. In some cases, the student can correct the error but cannot explain the reasoning behind the correction, or the student becomes aware of the problem but cannot produce the correct answer. In these cases, the student is assumed to have partial autonomy, and more explicit mediational moves are provided. Furthermore, some students may fail to respond to the mediational moves altogether, and these students are assumed to have no autonomy over their erroneous output (Poehner & Infante, 2016, 2019). Overall, teachers, as mediators, are responsible for carefully analyzing students' reciprocity acts towards mediational moves, because doing so is essential for revealing each student's ZAD and ZPD (Poehner & Lantolf, 2005). The flow of dialogic interaction between the teacher and students is represented in the figure below (Özturan & Uysal, 2022, p. 311).

Figure 1. The flow of dialogic interaction in a DA session

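The graduated sequence just described is, at its core, algorithmic: begin with the most implicit move and escalate only as the learner's responses require, recording how much mediation was needed. In computerized DA of the kind discussed in the next paragraph (e.g., Poehner et al., 2015), this logic is encoded in software. The short Python sketch below is a minimal illustration of that logic only, not the present chapter's instrument; the move wordings, the respond() stand-in for the learner, and the regulation labels are assumptions for demonstration.

# Minimal sketch of a graduated mediational sequence for one erroneous item.
# The four moves run from most implicit to most explicit; the wordings are
# illustrative assumptions, not taken from the chapter.
MEDIATIONAL_MOVES = [
    "indicate that the sentence contains a problem",
    "narrow down the location of the error",
    "give a metalinguistic cue (e.g., ask about the tense used)",
    "provide the correct form with an explanation",
]

def run_da_item(respond):
    """Offer progressively more explicit moves until the error is treated.

    `respond(move)` stands in for the learner and returns a pair
    (corrected, reasoning_verbalized). Needing fewer moves suggests the
    learner is closer to self-regulation for this feature.
    """
    partial = False
    for step, move in enumerate(MEDIATIONAL_MOVES, start=1):
        corrected, reasoned = respond(move)
        if corrected and reasoned:
            # Error corrected and reasoning verbalized: self-regulated here.
            return {"moves_needed": step, "regulation": "self-regulation"}
        if corrected or reasoned:
            # Corrects without explaining (or notices without correcting):
            # partial autonomy; keep escalating.
            partial = True
    # All moves exhausted without full success.
    return {
        "moves_needed": len(MEDIATIONAL_MOVES),
        "regulation": "partial autonomy" if partial else "other-regulation",
    }

# Example: a hypothetical learner who self-corrects, with reasoning, only
# after the metalinguistic cue (the third move).
responses = iter([(False, False), (False, False), (True, True)])
print(run_da_item(lambda move: next(responses)))
# {'moves_needed': 3, 'regulation': 'self-regulation'}

Tracked over repeated sessions, the number of moves a learner needs for the same feature offers one rough index of microgenesis; in the excerpts later in this chapter, Neomy's single implicit move versus Sue's full sequence would surface as different moves_needed values.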
In contrast to conventional assessment models, DA is systematic, it is based on an iterative process, and both teachers and students are active agents during DA sessions. Moreover, it does not divide students into successes and failures; rather, it asserts that each student can achieve, but at a different rate. Therefore, students' microgenesis, defined as rapid development upon a mediational move (Vygotsky, 1978), is as important as their autonomy levels (Poehner, 2008; Poehner & van Compernolle, 2020; Shrestha, 2020). To date, influential studies have shed light on DA applications in various settings, such as L2 speaking (Antes, 2017; Levi, 2015, 2017), L2 reading (Kozulin & Garb, 2002; Kozulin & Levi, 2018), computer-based DA (Poehner et al., 2015; Poehner & Lantolf, 2013; Tzuriel & Shamir, 2002), teachers' praxis (Beck et al., 2020; Herazo et al., 2019), and L2 writing
(Davin, 2016; Davin & Donato, 2013; Kushki et al., 2022; Infante & Poehner, 2019; Özturan & Uysal, 2022; Poehner & Leontjev, 2018; Shrestha, 2017, 2020; Zhang & van Compernolle, 2016). Despite these studies' insights, most have adopted interventionist DA; more studies are therefore warranted to illustrate interactionist DA in EFL writing settings in situ, because analyzing teacher-learner interactions during DA sessions in an EFL writing setting can provide valuable insights into learners' problems, difficulties, and the reasons for their poor performances. Accordingly, this chapter reports dialogic interactions between a teacher and four students taking L2 writing courses, and the main research question guiding the study is as follows:

1. What kinds of mediational moves were provided, and what reciprocity acts did the learners display during interactive DA sessions?

METHODOLOGY

Participants

This study is part of a larger project for which twenty-five students, twenty females and five males, were recruited. Even though the project collected data from all participants, this chapter is based on the data of four students who were selected purposively: they had similar problems in their writing, such as using the simple past tense accurately and choosing appropriate words, yet needed divergent mediational moves and displayed different reciprocity acts. As such, their data can illustrate interactive DA in detail in an ecological EFL writing setting. All of these students were learning English as a foreign language and were enrolled in the English Language Teaching department at a medium-sized university in Turkey. There was no age gap among them. Before the data collection process, the students had taken a proficiency exam to determine their English language level; according to the results, they started taking courses at the B1 level. All students volunteered to participate in the study.

Data Collection Tools and Procedure

The data were collected as part of the Writing Skills course, which is taught five hours a week. According to the course syllabus, the students practice writing paragraphs in different modes, such as descriptive, narrative, opinion, problem-solution, and compare-contrast. The participant students practiced writing paragraphs in these modes for ten weeks, and the data were collected during that period. The flow of the course content was as follows: Each week, the students read a sample text in their coursebooks, analyzed its genre-specific features, organization, and vocabulary, and completed the related activities in the book. Then, collaborative writing took place inside the classroom. The students were divided into groups and produced a sample paragraph based on the target genre requirements. When all groups had finished writing their paragraphs, group DA was conducted: the teacher randomly chose a written text and showed it to the whole class, and the students tried to correct erroneous output by verbalizing their reasoning in response to the teacher's mediational moves. As an individual writing activity, the teacher assigned the students to write a paragraph related to the target genre and a topic she determined. Upon this take-home activity, each student met the teacher
at the appointed time in her office, and individual interactive DA sessions were administered while the writing tasks were checked. During these sessions, the teacher carefully and systematically determined each student's ZAD and then tried to reconstruct the student's ZPD through graduated mediational moves. The sessions were audio-recorded for further analysis (see Table 1).

Table 1. Audio-recording chart

Week      Date                 Duration
Week 1    October 9, 2019      2:16:33
Week 2    October 15, 2019     1:58:08
Week 3    October 22, 2019     1:51:00
Week 4    November 5, 2019     1:55:50
Week 5    November 12, 2019    1:29:36
Week 6    November 27, 2019    50:22
Week 7    December 3, 2019     2:26:16
Week 8    December 11, 2019    1:36:33
Week 9    December 18, 2019    1:14:45
Week 10   December 25, 2019    1:17:42
Total                          16:56:35

Data Analysis

The data comprise around seventeen hours of audio-recorded teacher-learner dialogic interactions. To reveal what kinds of mediational moves were adopted during the DA sessions and what the learners' reciprocity acts toward these moves were, the audio recordings were first transcribed. Transana was used to capture temporal aspects of speech, such as pauses and overlaps, and prosodic aspects, such as intonation, through transcription conventions (Sert, 2015, p. 25), because these are salient factors in analyzing dialogic interactions (see Table 2).


Table 2. Transcription conventions

((comments))   Transcriber's comments, including non-verbal behavior
?              Rising intonation, a question
-              Truncated speech, self-correction
°yes°          Quieter than normal talk, whisper
(1.8)          Numbers enclosed in parentheses indicate a pause; the number represents the duration of the pause in seconds, to one decimal place. A pause of less than 0.2 s is marked by (.)
::             A colon after a vowel or a word shows that the sound is extended; the number of colons shows the length of the extension
(hm, hh)       Onomatopoetic representations of the audible exhalation of air
↑↓             Up or down arrows indicate a sharply rising or falling intonation; the arrow is placed just before the syllable in which the change in intonation occurs
Under          Underlining indicates speaker emphasis on the underlined portion of the word
[ ]            Overlapping talk by two speakers
#              Length of pauses in seconds

After the transcription process was completed, the researcher and an experienced language teacher coded the mediational moves and the learners' reciprocity acts, which ensured intercoder reliability. They first worked separately to analyze the teacher-learner interactions and code the mediational moves and reciprocity acts; then, they worked in tandem to compare and discuss the codes. Because the researcher and the teacher produced similar codes, a third party was not needed. For this chapter, four students, Sue, Neomy, Stefan, and Kate (pseudonyms), were purposively chosen to illustrate their ever-shifting needs for mediational moves and their divergent reciprocity acts despite the similar or identical erroneous output in their written texts.
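In this study, intercoder agreement was established through joint discussion of the two coders' independent codes. Where a numerical index is also wanted for such double-coding, Cohen's kappa is a common choice; the short Python sketch below shows the computation on hypothetical move labels. This is an illustration of the general technique, not the chapter's procedure, and the category names and values are invented.

from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels over the same segments."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance from each coder's marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to five mediational moves by two coders.
coder_a = ["implicit", "implicit", "narrowing", "explicit", "narrowing"]
coder_b = ["implicit", "narrowing", "narrowing", "explicit", "narrowing"]
print(round(cohens_kappa(coder_a, coder_b), 2))  # prints 0.69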

FINDINGS AND DISCUSSION

Case 1: Sue and Neomy

Sue and Neomy (pseudonyms) are two students who had problems using the simple past tense accurately in their paragraphs. Even though the nature of the error was the same, Neomy displayed more autonomous reciprocity acts than Sue, as the following excerpts illustrate in detail.

Excerpt 1, T: Teacher, S: Sue

1 T: Now let us look at this sentence (0:04:09.7)
2 S: ((inaudible)) (0:04:22.4)
3 T: no
4 S: ((no response 0.8)) (0:04:32.5)
5 T: specifically here ((the teacher narrows down the place which
6    needs correction)) (0:04:33.6)
7 S: bulamadım [I could not find it] (0:04:46.3)
8 T: OK. which tense did you use here? (0:04:50.1)
9 S: past (0:04:51.0)
10 T: so look at this one (0:04:53.0)
11 S: ((laughing)) (0:04:56.0)
12 T: what is wrong here? (0:04:56.6)
13 S: uhmm what 'was' (0:05:00.4)
14 T: yes 'what was good' because in this sentence again you are
15    talking about past

As the excerpt above displays, the teacher started to support Sue with the most implicit mediational move, implying that there was a problem in the sentence (Line 1). Even though Sue's first reaction could not be transcribed as it was inaudible, the teacher's response (Line 3) demonstrates that Sue could not find and/or correct the erroneous output. The teacher therefore continued the mediation with a more explicit move and narrowed down the erroneous part to direct Sue's attention there (Line 5). Sue again failed to find and correct the error and responded in Turkish, 'bulamadım (I could not find it)' (Line 7). Therefore, the teacher pointed to a sample sentence in Sue's text and provided a metalinguistic cue by asking about the tense used in that sentence (Line 8). Sue recognized that the sentence had been written in the simple past tense (Line 9). Because Sue was able to use the simple past tense accurately there, the teacher guided her to look at the erroneous part once more, to make her compare and contrast the two sentences and elicit error treatment (Lines 10-12). Upon these mediational moves, Sue succeeded in correcting the error (Line 13). As she needed many turns and mediational moves for error treatment and failed to show full autonomy over the use of the simple past tense on that occasion, the teacher ended with the most explicit mediational move, explaining why the correct form should have been 'was' (Lines 14-15).

Excerpt 2, T: Teacher, S: Sue

15 T: now this sentence can you please look at it
16    again?
17 S: took?
18 T: the problem is about grammar but 'took' is not a problem
19    here (0:05:35.6)
20 S: OK uhmm since var [there is 'since']
21 T: so which tense do you need there? (0:06:11.0)
22 S: present - no - past
23 T: no
24 S: present perfect
25 T: yes. now let us change the sentence (0:06:18.9)
26 S: taken?
27 T: you need something before taken
28 S: uhm has taken?
29 T: yes has taken good

The next part of the dialogic interaction between the teacher and Sue (Excerpt 2) reveals that Sue had further problems with tense forms. Although the target was again the accurate use of the simple past tense, Sue also experienced problems in choosing correct verb forms. As Excerpt 2 shows, the teacher guided her to revise the sentence through an implicit mediational move (Lines 15-16), and right after it, Sue hesitantly mentioned 'took', implying that there was a problem with it (Line 17). The teacher then clarified the nature of the error with reference to 'took' (Lines 18-19). Upon this, Sue recognized a keyword (Line 20), and the teacher asked which tense was needed on that occasion (Line 21). Even though Sue had found the salient keyword, 'since', she failed to find the accurate tense immediately (Lines 22-24). What stands out in these excerpts is that Sue did not have full autonomy in using accurate tense forms in the right places; she was at an other-regulation level because she needed more explicit mediational moves and the teacher's guidance for error treatment. Moreover, even in the cases in which she could find keywords and/or produce correct answers, she hesitated and could not verbalize the reasoning behind her answers. The teacher therefore provided explanations to clarify the issue in detail, which again suggests that Sue was not yet self-regulated in using tenses accurately (Poehner, 2008).

Excerpt 3, T: Teacher, N: Neomy

30 T: so can you please read these three parts again? (0:14:34.3)
31 N: there was, past (0:14:37.1)
32 T: yes there was…and look at here (0:14:41.5)
33 N: bu da was [this one is also 'was'] (0:14:45.5)
34 T: yes there was and also one more mistake you have (0:14:48.3)
35 N: an old man there was an old man (0:14:54.3)
36 T: yes. OK. What is wrong here? (0:14:59.7)
37 N: hm comma I need (0:15:08.5)
38 T: where?
39 N: ((she could show the place))
40 T: yes. Because this is subordinate part so you need a comma
      good

Neomy (pseudonym), on the other hand, displayed more autonomous reciprocity acts than Sue, even though the errors in their texts were similar in nature. As Excerpt 3 shows, the teacher started to assist Neomy through an implicit mediational move and guided her to read the problematic parts again (Line 30). In doing so, the teacher merely underlined the sentence(s) with errors, which was the most implicit mediational move. Neomy was thus expected to find the erroneous part and carry out the error treatment herself, and Lines 31 and 33 show that she could not only correct the error but also verbalize the reasoning behind the treatment. Furthermore, Lines 37 and 39 reveal her potential for self-initiated error treatment in another linguistic dimension, punctuation, merely after receiving the most implicit mediational move.


Overall, these three excerpts shed light on the learners' divergent reasons for their poor performances. Even though Neomy and Sue had the same erroneous parts in their written texts, the teacher assisted them through different graduated mediational moves, and they displayed different reciprocity acts: Neomy treated the erroneous part after receiving merely the implicit mediational moves and explained her reasons for the error treatment, whereas Sue had difficulties in doing so. These findings are in line with Poehner and Lantolf's (2013) assertion that product-oriented, static assessment models may not offer fair assessment settings, whereas process-oriented, dynamic assessment models, which are based on goal-oriented interaction, create opportunities to analyze each student's matured and maturing abilities and the reasons for her/his poor performance, and to unveil the student's mediated performance. Refuting the idea of past-to-present assessment, dynamic assessment thus makes it possible to see what a student might achieve in the future by analyzing his/her current performance (Poehner, 2008; Shrestha, 2020).

Graduated mediational moves may vary not only according to learner profiles but also according to error natures, such as local (grammar, vocabulary, punctuation) and more global (content, organization, cohesion) features (Nassaji, 2017). The sample excerpts above illustrate dialogic reciprocal interaction grounded in the grammatical problems (the use of the simple past tense) of two students. The following excerpts, on the other hand, shed light on the mediational moves provided for vocabulary problems.

Case 2: Stefan and Kate

This part presents sample excerpts from Stefan and Kate (pseudonyms), who experienced problems in choosing accurate vocabulary in their texts. As Excerpts 4 and 5 show, the dialogic interaction rested on two languages, the target language (English) and the students' mother tongue (Turkish), because using learners' L1 can sometimes be an effective tool to support them and diminish ambiguity in their minds while learning/teaching something (Vygotsky, 1978). As an example, Stefan used the adjective 'curly' in the wrong context. In the beginning, the teacher must have had problems reading Stefan's handwriting, so she asked him to clarify the word used. Upon Stefan's utterance of the word, the teacher provided the first mediational move by guiding him to revise 'curly sea' through a questioning tone of voice (Line 97). Stefan then replied in Turkish and said 'dalgalı deniz demek istedim (I wanted to say wavy sea)'. The teacher thus recognized that Stefan needed to differentiate two words: curly and wavy. In Turkish, there is only one word (dalgalı) that can be used for both hair and sea, but in English there are two different adjectives for these occasions (curly hair / wavy sea). The teacher therefore explained this to Stefan (Line 99), which constituted the instruction part of the dynamic assessment.

Excerpt 4, T: Teacher, S: Stefan

95 T: what is this?
96 S: curly
97 T: curly sea?
98 S: ((laughing)) dalgalı deniz demek istedim [I wanted to say wavy sea]
99 T: we say curly hair, not curly sea. we use wavy sea
100 S: uhm OK


Similarly, Kate needed assistance to repair an inaccurate verb choice in her paragraph, and the teacher began to guide her through the most implicit mediational move (asking for revising, Line 133). Kate merely reread the sentence (when I am cold, I usually see the doctor) instead of making any corrections or changes. Therefore, the teacher asked Kate to focus on the meaning and decide whether it was clear (Line 135). Upon that question, Kate replied in Turkish and said that she thought the meaning was clear (Line 136). At that moment, the teacher moved on with a more explicit mediational move, narrowed the location of the error, and questioned a specific part (hasta olmak? [to be ill?]) (Line 138). Kate's response (Line 139) revealed that she had problems differentiating 'to have a cold' and 'to be cold.' The teacher then continued diagnosing Kate's ZAD by asking how to express some illnesses in English: she uttered the expressions in Turkish and wanted Kate to respond in English. Lines 140-145 reveal that Kate could use the correct verb while describing illnesses (e.g., I have a headache, I have toothache, I have flu). Then, the teacher assisted Kate in saying 'soğuk algınlığım var (I have a cold)'; Kate was surprised and recognized that she should have used 'to have' rather than 'to be' in her sentence (Line 147). The teacher provided a further explanation in Turkish to support Kate and make the difference clear.

Excerpt 5, T: Teacher, K: Kate

133 T: can you please revise that sentence? (Mediational move 1 – Asking for revising)
134 K: when I am cold, I usually see the doctor
135 T: Do you think the meaning is clear? (Mediational move 2 – Asking for revising)
136 K: Yes, hasta olunca doktora giderim
137    [I see the doctor when I am ill]
138 T: hasta olmak? [to be ill?] (Mediational move 3 – Narrowing the location of the error)
139 K: soğuk algınlığı [a cold] when I am cold
140 T: OK…. Please say me "başım ağrıyor" [I have a headache] in English (Diagnosing learner ZAD)
141 K: I have headache
142 T: now say me "diş ağrım var" [I have a toothache] in English (Diagnosing learner ZAD)
143 K: I have toothache
144 T: "nezleyim" [I have flu] (Diagnosing learner ZAD)
145 K: I have flu
146 T: now "soğuk algınlığım var" [I have a cold] (Reconstruction of learner ZPD)
147 K: o da mı have? [is that also 'have'?]
148 T: yes, hastalık olarak bahsederken have a cold ama üşümek
149    anlamında
150    be cold [when referring to the illness it is 'have a cold', but in the sense of feeling cold it is 'be cold'] (Mediational move 4 – Providing the correct answer with an explanation in Turkish)
151 K: oh OK teacher

The samples and excerpts are in accord with the salience of goal-oriented interaction between teachers and students (Walsh & Sert, 2019) and with the need to adopt learning-oriented assessment approaches
(Poehner & Inbar-Lourie, 2020) because they pave the way to analyze each learner’s needs, instruct accordingly, and assess in an individualized space. Heeding these important aspects, the following section presents pedagogical implications of dynamic assessment in second/foreign language writing settings.

CONCLUSION

Dynamic Assessment relies on the fusion of instruction and assessment and appears as an alternative to conventional assessment approaches (Poehner, 2008; Shrestha, 2020). Because the topic is still in its infancy, more studies are warranted to illustrate DA, especially interactionist DA, in situ. Addressing this need, this chapter has shed light on the use of individual interactionist DA in an L2 writing setting through the analysis of dialogic interaction and has presented actual examples, implications, and recommendations. First, language teachers and language teacher candidates should be aware of the need to tailor feedback according to learners' ever-shifting needs and to the different natures of errors (Mao & Lee, 2021; Nassaji, 2011, 2017; Storch, 2018). In doing so, setting up goal-oriented and systematic interaction with the learners carries importance (Walsh & Sert, 2019). Furthermore, people's cognition can be modified through social interaction (Ellis et al., 2019), which should also inform effective teaching-learning and assessment processes. Second, language teachers might guide students to use Automated Writing Evaluation (AWE) tools and/or provide corrective feedback themselves as part of L2 writing courses, but how students engage with the feedback provided is still far from conclusive. Bearing in mind the salience of interaction, language teachers, especially L2 writing teachers, should support students through dialogic reciprocal interaction. Third, as people become increasingly mobile around the world, especially in higher education, gaps among learners may widen. Instead of focusing merely on past-to-present, static assessment approaches, adopting present-to-future, dynamic assessment might help teachers unpack each student's proximal growth and achievement in considerable detail. Despite its promising contributions and implications, dynamic assessment can be criticized because it might require extra time, energy, and systematic application from teachers, and, in some cases, it might be difficult to conduct in crowded classes. On such occasions, group dynamic assessment can be an alternative that saves time and diminishes the teachers' burden. Moreover, dynamic assessment can be integrated with computer-based technologies, which requires further investigation.

REFERENCES

Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second language learning in the Zone of Proximal Development. Modern Language Journal, 78(4), 465–483. doi:10.1111/j.1540-4781.1994.tb02064.x

Antes, T. A. (2017). Audio glosses as a participant in L2 dialogues: Evidence of mediation and microgenesis during information-gap activities. Language and Sociocultural Theory, 4(2), 101–123. doi:10.1558/lst.31234


Beck, S. W., Jones, K., Storm, S., & Smith, H. (2020). Scaffolding students' writing processes through dialogic assessment. Journal of Adolescent & Adult Literacy, 63(6), 651–660. doi:10.1002/jaal.1039

Cole, M., John-Steiner, V., Scribner, S., & Souberman, E. (Eds.). (1978). Mind in society: The development of higher psychological processes, by L. S. Vygotsky. Harvard University Press.

Davin, K. J. (2016). Classroom dynamic assessment: A critical examination of constructs and practices. Modern Language Journal, 100(4), 1–17. doi:10.1111/modl.12352

Davin, K. J., & Donato, R. (2013). Student collaboration and teacher-directed classroom dynamic assessment: A complementary pairing. Foreign Language Annals, 46(1), 5–22. doi:10.1111/flan.12012

Ellis, R., Skehan, P., Li, S., Shintani, N., & Lambert, C. (2019). Cognitive-interactionist perspectives. In Task-based language teaching: Theory and practice (pp. 29–63). Cambridge University Press. doi:10.1017/9781108643689.006

Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing: Two approaches compared. System, 41(2), 257–268. doi:10.1016/j.system.2013.03.004

Feuerstein, R., Feuerstein, R. S., & Falik, L. H. (2010). Beyond smarter: Mediated learning and the brain's capacity for change. Teachers College Press.

Haywood, H. C., & Tzuriel, D. (2002). Applications and challenges in dynamic assessment. Peabody Journal of Education, 77(2), 40–63. doi:10.1207/S15327930PJE7702_5

Herazo, J. D., Davin, K. D., & Sagre, A. (2019). L2 dynamic assessment: An activity theory perspective. Modern Language Journal, 103(2), 443–458. doi:10.1111/modl.12559

Hyland, K., & Hyland, F. (2006). Feedback on second language students' writing. Language Teaching, 39(2), 83–101. doi:10.1017/S0261444806003399

Infante, P., & Poehner, M. E. (2019). Realizing the ZPD in second language education: The complementary contributions of dynamic assessment and mediated development. Language and Sociocultural Theory, 6(1), 63–91. doi:10.1558/lst.38916

Kozulin, A., & Garb, E. (2002). Dynamic assessment of EFL text comprehension. School Psychology International, 23(1), 112–127. doi:10.1177/0143034302023001733

Kozulin, A., & Levi, T. (2018). EFL learning potential: General or modular? Journal of Cognitive Education and Psychology, 17(1), 16–27. doi:10.1891/1945-8959.17.1.16

Kushki, A., Nassaji, H., & Rahimi, M. (2022). Interventionist and interactionist dynamic assessment of argumentative writing in an EFL program. System, 107, 1–13. doi:10.1016/j.system.2022.102800

Lantolf, J. (2000). Introducing sociocultural theory. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 1–26). Oxford University Press.

Levi, T. (2015). Towards a framework for assessing foreign language oral proficiency in a large-scale test setting: Learning from DA mediation examinee verbalizations. Language and Sociocultural Theory, 2(1), 1–24. doi:10.1558/lst.v2i1.23968


Levi, T. (2017). Developing L2 oral language proficiency using concept-based dynamic assessment within a large-scale testing context. Language and Sociocultural Theory, 3(2), 77–100. doi:10.1558/lst.v3i2.32866

Mao, Z., & Lee, I. (2021). Researching L2 student engagement with written feedback: Insights from sociocultural theory. TESOL Quarterly. Advance online publication. doi:10.1002/tesq.3071

Nassaji, H. (2011). Correcting students' written grammatical errors: The effects of negotiated versus nonnegotiated feedback. Studies in Second Language Learning and Teaching, 1(3), 315–334. doi:10.14746/ssllt.2011.1.3.2

Nassaji, H. (2016). Anniversary article: Interactional feedback in second language teaching and learning: A synthesis and analysis of current research. Language Teaching Research, 20(4), 535–562. doi:10.1177/1362168816644940

Nassaji, H. (2017). Negotiated oral feedback in response to written errors. In H. Nassaji & E. Kartchava (Eds.), Corrective feedback in second language teaching and learning: Research, theory, applications, implications (pp. 114–128). Routledge. doi:10.4324/9781315621432-9

Nassaji, H., & Swain, M. (2000). A Vygotskian perspective on corrective feedback in L2: The effect of random versus negotiated help on the learning of English articles. Language Awareness, 9(1), 34–51. doi:10.1080/09658410008667135

OECD. (2020). Education at a glance 2020: OECD indicators. OECD Publishing. doi:10.1787/69096873-en

Özturan, T., & Uysal, H. H. (2022). Mediating multilingual immigrant learners' L2 writing through interactive dynamic assessment. Kuramsal Eğitimbilim Dergisi, 15(2), 307–326. doi:10.30831/akukeg.1004155

Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting L2 development. Springer. doi:10.1007/978-0-387-75775-9

Poehner, M. E., & Inbar-Lourie, O. (2020). An epistemology of action for understanding and change in L2 classroom assessment: The case for praxis. In M. E. Poehner & O. Inbar-Lourie (Eds.), Toward a reconceptualization of second language classroom assessment (1st ed., pp. 1–20). Springer. doi:10.1007/978-3-030-35081-9_1

Poehner, M. E., & Infante, P. (2016). Dynamic assessment in the language classroom. In D. Tsagari & J. Banerjee (Eds.), The handbook of second language assessment. De Gruyter. doi:10.1515/9781614513827-019

Poehner, M. E., & Infante, P. (2019). Mediated development and the internalization of psychological tools in second language (L2) education. Learning, Culture and Social Interaction, 22, 1–14. doi:10.1016/j.lcsi.2019.100322

Poehner, M. E., & Lantolf, J. P. (2005). Dynamic assessment in the language classroom. Language Teaching Research, 9(3), 233–265. doi:10.1191/1362168805lr166oa

Poehner, M. E., & Lantolf, J. P. (2013). Bringing the ZPD into the equation: Capturing L2 development during Computerized Dynamic Assessment (C-DA). Language Teaching Research, 17(3), 323–342. doi:10.1177/1362168813482935


Poehner, M. E., & Leontjev, D. (2018). To correct or to cooperate: Mediational processes and L2 development. Language Teaching Research, 1–22. doi:10.1177/1362168818783212

Poehner, M. E., & van Compernolle, R. A. (2020). Reconsidering time and process in L2 dynamic assessment. In M. E. Poehner & O. Inbar-Lourie (Eds.), Toward a reconceptualization of second language classroom assessment: Praxis and researcher-teacher partnership (pp. 173–195). Springer. doi:10.1007/978-3-030-35081-9_9

Poehner, M. E., Zhang, J., & Lu, X. (2015). Computerized dynamic assessment (C-DA): Diagnosing L2 development according to learner responsiveness to mediation. Language Testing, 32(3), 337–357. doi:10.1177/0265532214560390

Sert, O. (2015). Social interaction and L2 classroom discourse. Edinburgh University Press. doi:10.1515/9780748692651

Shrestha, P. N. (2017). Investigating the learner transfer of genre features and conceptual knowledge from an academic literacy course to business studies: Exploring the potential of dynamic assessment. Journal of English for Academic Purposes, 25, 1–17. doi:10.1016/j.jeap.2016.10.002

Shrestha, P. N. (2020). Dynamic assessment of students' academic writing: Vygotskian and systemic functional linguistic perspectives. Springer. doi:10.1007/978-3-030-55845-1

Storch, N. (2018). Written corrective feedback from sociocultural theoretical perspectives: A research agenda. Language Teaching, 51(2), 262–277. doi:10.1017/S0261444818000034

Tzuriel, D., & Shamir, A. (2002). The effects of mediation in computer assisted dynamic assessment. Journal of Computer Assisted Learning, 18(1), 21–32. doi:10.1046/j.0266-4909.2001.00204.x

Walsh, S., & Sert, O. (2019). Mediating L2 learning through classroom interaction. In X. Gao (Ed.), Second handbook of English language teaching (pp. 1–19). Springer. doi:10.1007/978-3-319-58542-0_35-1

Zhang, H., & van Compernolle, R. A. (2016). Learning potential and the dynamic assessment of L2 Chinese grammar through elicited imitation. Language and Sociocultural Theory, 3(1), 99–119. doi:10.1558/lst.v3i1.27549

ADDITIONAL READING

Alavi, S. M., & Taghizadeh, M. (2014). Dynamic assessment of writing: The impact of implicit/explicit mediations on L2 learners' internalization of writing skills and strategies. Educational Assessment, 19(1), 1–16. doi:10.1080/10627197.2014.869446

Davin, K. J. (2013). Integration of dynamic assessment and instructional conversations to promote development and improve assessment in the language classroom. Language Teaching Research, 17(3), 303–322. doi:10.1177/1362168813482934


Davin, K. J., & Herazo, J. D. (2020). Reconceptualizing classroom dynamic assessment: Lessons from teacher practice. In M. E. Poehner & O. Inbar-Lourie (Eds.), Toward a reconceptualization of second language classroom assessment (1st ed., pp. 197–217). Springer. doi:10.1007/978-3-030-35081-9_10

Davin, K. J., Herazo, J. D., & Sagre, A. (2016). Learning to mediate: Teacher appropriation of dynamic assessment. Language Teaching Research, 21(5), 632–651. doi:10.1177/1362168816654309

Frawley, W., & Lantolf, J. P. (1985). Second language discourse: A Vygotskyan perspective. Applied Linguistics, 6(1), 19–44. doi:10.1093/applin/6.1.19

Haywood, H. C., & Lidz, C. S. (2007). Dynamic assessment in practice: Clinical and educational applications. Cambridge University Press.

Karim, K., & Nassaji, H. (2020). The revision and transfer effects of direct and indirect comprehensive corrective feedback on ESL students' writing. Language Teaching Research, 24(4), 519–539. doi:10.1177/1362168818802469

Lantolf, J. P. (2012). Sociocultural theory: A dialectical approach to L2 research. In S. M. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 57–72). Routledge/Taylor & Francis.

Lantolf, J. P., & Poehner, M. E. (2004). Dynamic assessment of L2 development: Bringing the past into the future. Journal of Applied Linguistics, 1(1), 49–72. doi:10.1558/japl.1.1.49.55872

Lantolf, J. P., & Poehner, M. E. (2010). Dynamic assessment in the classroom: Vygotskian praxis for second language development. Language Teaching Research, 15(1), 11–33. doi:10.1177/1362168810383328

KEY TERMS AND DEFINITIONS

Dynamic Assessment: A learning-oriented assessment approach that integrates instruction and assessment in a single, collaborative activity between learners and mediators.

Mediation: A goal-oriented interaction between learners and teachers/mediators.

Regulation: Broadly divided into object-regulation, other-regulation, and self-regulation. During dynamic assessment sessions, mediators observe a learner's regulatory level after providing mediational tools to see whether s/he needs assistance (other-regulation) or can regulate the task alone (self-regulation).

Sociocultural Theory: Vygotsky's sociocultural theory highlights the salience of social interaction for individuals' cognitive development.

Transcendence: The learner's ability to transfer mediational tools to new tasks.

Zone of Actual Development: An individual's independent problem-solving skills.

Zone of Proximal Development: The distance between what an individual can achieve alone and what s/he can achieve under a mediator's assistance.


Chapter 7

Flipped Spiral Foreign Language Assessment Literacy Model (FLISLALM) for Developing Pre-service English Language Teachers' Language Assessment Literacy

Çiler Hatipoğlu
https://orcid.org/0000-0002-7171-1673
Middle East Technical University, Turkey

ABSTRACT

Foreign language (FL) assessment is one of the most critical and challenging areas for pre-service FL teachers to develop. It is essential since various studies have shown that typical teachers spend up to half of their professional time on assessment-related activities. The area is difficult because its theoretical concepts are highly abstract. For several years now, therefore, experts have emphasised the necessity of developing the language assessment literacy (LAL) of pre-service FL teachers and encouraged academics to research this area. In response to these calls, this chapter describes the developmental stages of the flipped spiral language assessment literacy model (FLISLALM) used to teach the undergraduate English Language Testing and Evaluation course in an FL teacher training program in Turkey. The model was developed using Brindley's and Giraldo's LAL frameworks and data from student questionnaires, product outputs, and self-assessment presentations collected between 2009 and 2020. The model aims to maximise prospective Turkish FL teachers' LAL growth.


INTRODUCTION

Foreign language assessment is one of the most important but also one of the most challenging areas (O'Loughlin, 2006) that pre-service foreign language teachers need to develop (Berry, Sheehan & Munro, 2019; Hatipoğlu, 2015, 2017; Levi & Inbar-Lourie, 2020; Stiggins, 2002). Knowledge of foreign language assessment is essential because studies conducted in different educational contexts show that a typical teacher spends between one-third and half of their professional time on assessment-related activities (Newfields, 2006; Stiggins, 1999). The difficulty of the field, on the other hand, arises from the fact that its basic theoretical concepts, such as validity, reliability and applicability, are highly abstract, and it is hard to establish harmony and balance between them while developing assessment tools. In addition, trends in foreign language education and its principles and methods constantly change, and so should language assessment. What is more, while preparing assessment tools, factors such as the age and proficiency level of the test takers, their reasons for learning the foreign language, the aims of the educational program, and the facilities and technological tools available in the institution should be taken into consideration. Finally, experts in foreign language testing and evaluation argue that "we are still far distant from our ideal of the classroom language teachers with a firm knowledge of at least the fundamental principles of language testing and assessment" (Stevenson, 1985, p. 112; also see Coombe et al., 2020; Giraldo, 2021; Tsagari & Vogt, 2017).

Due to this multilevel, complex nature of the assessment process and the insufficient training received by pre-service language teachers, there have been calls (especially since Davies' seminal 2008 article) for improving and expanding the assessment literacy of language teachers around the world (Berry et al., 2019; Jeong, 2013; Kremmel & Harding, 2020; Levi & Inbar-Lourie, 2020; Stiggins, 2002; Sultana, 2019; Vogt & Tsagari, 2014) and in Turkey (Can, 2020; Hatipoğlu, 2010, 2015, 2016, 2017a, 2021a; Haznedar, 2012; Köksal, 2004; Mede & Atay, 2017; Şahin, 2019) so that teachers are prepared to cope with the challenges in their local contexts as well as the ever-changing trends in foreign language education. In response to these calls, this chapter presents and describes the development stages and implementation of the "Flipped Spiral Language Assessment Literacy Model" (FLISPILALM) used while teaching an undergraduate "English Language Testing and Evaluation" (ELTE) course. The model aims to maximize the language assessment literacy (LAL) level of prospective language teachers in the Turkish context. It takes the LAL frameworks developed by Brindley (2001) and Giraldo (2018) as a basis and presents the stages and procedures followed in the ELTE course so that pre-service English language teachers benefit maximally from the training they receive. To check whether the model works and, if needed, to make the necessary changes, both quantitative and qualitative data were collected from the students taking the course.

BACKGROUND: THE ORIGINAL ELTE COURSE (2008-2010)

The teacher training system implemented since the establishment of the Republic of Turkey (1923-1981) changed in 1981 with the passing of Law No. 2574. With this law, the responsibility for training teachers was taken from the Ministry of National Education (MONE) and given to the Higher Education Council (YÖK). Since 1981, all teachers in Turkey have been trained in the Faculties of Education (EF) of different universities across the country (Çakıroğlu & Çakıroğlu, 2003; Hatipoğlu, 2017a, 2022; Hatipoğlu & Erçetin, 2016; Kavak et al., 2007). This change is considered one of the most critical reforms in teacher
training in the country, since teacher education, whose duration and quality varied from one institution to another before 1982 (for more details, see Hatipoğlu, 2017a, 2022; Hatipoğlu & Erçetin, 2016), has been standardized: since 1982, foreign language teachers in Turkey graduate from four-year-long programs prepared and closely monitored by YÖK. Graduates of these programs are allowed to teach at all levels (primary, secondary, high school and university) of the educational system in Turkey. They also take on the responsibility of measuring and evaluating the foreign language knowledge of students of all ages and proficiency levels. However, there is only one course in the English Language Teaching Department (ELTD) programs in Turkey where pre-service teachers learn how to assess their students: English Language Testing and Evaluation (ELTE). This course was included in the program in 1997 by YÖK, and its catalogue description was:

Types of tests; test preparation techniques for the purpose of measuring various English language skills; the practice of preparing various types of questions; evaluation and analysis techniques; statistical calculations. (METU General Catalogue, 2001, p. 406)

In the following 25 years (i.e., 1997-2022), no other course specifically related to the field of foreign language testing and evaluation was added to the ELT programs. Therefore, it is expected that all the required basic knowledge and skills of testing and evaluation in a foreign language will be taught and practised in the sole ELTE course in the programs. These basic skills and knowledge related to testing and evaluation are usually referred to as "language assessment literacy" (LAL) and have been studied and discussed much more frequently since 1990, when the American Federation of Teachers, National Council on Measurement in Education, and National Education Association (https://buros.org/standards-teacher-competence-educational-assessment-students) published the "Standards for Teacher Competence in Educational Assessment". According to this document, teachers must have knowledge and skills in the following areas so that they can design reliable and valid evaluation instruments. The in-service teachers should be:

1. ready to choose assessment methods appropriate for instructional decisions;
2. skilled in developing assessment methods appropriate for instructional decisions;
3. ready to administer, score and interpret the results of both externally produced and teacher-produced assessment methods;
4. able to use assessment results when making decisions about individual students, planning teaching, developing curriculum, and improving schools;
5. competent in developing valid pupil grading procedures which use pupil assessment;
6. skilled in communicating assessment results to students, parents, other lay audiences and other educators;
7. experienced in recognising unethical, illegal, and otherwise inappropriate assessment methods and uses of assessment information (Brindley, 2001, p. 128).

Based on these expectations, Brindley (2001, pp. 129-130) developed the "Professional Development Programs in Assessment" scheme, which is considered one of the first attempts to define LAL. As shown in Figure 1, there are five units in this training program, the first two of which are compulsory/core while the remaining three are supplementary: (1) The social context of assessment (SCA) (core unit),
(2) Defining and describing proficiency (DDP) (core unit); (3) Constructing and evaluating language tests (CELT); (4) Assessment in the language curriculum (ALC); and (5) Putting assessment into practice (PAP). The core of Brindley's (2001) LAL, then, is the acquisition of information and awareness about the political, educational, and social dimensions of evaluation in society, together with knowledge of theories of assessment (see Figure 1). These are also what Davies (2008) calls principles of ethics and the impact of language testing, and knowledge of theories and models of language proficiency. The idea is that if teachers have well-established theoretical knowledge and a good understanding of their contexts and of the impact of testing, they will be able to develop tailor-made, high-quality tests, carry out statistical analyses, and develop beneficial projects and policies.

Figure 1. Brindley's (2001, pp. 129-130) professional development programs in assessment

The content of the 'original ELTE course' taught between 2007-2010 at the ELTD at METU was based on Brindley's (2001) LAL. Due to the limited time in one academic term, the emphasis in this course was more on the two modules that Brindley (2001) classified as 'required' (see Table 1). In the first weeks, the place and effect of measurement and evaluation in the Turkish education system, and what needs to be done for the preparation and standardization of various high-stakes and classroom exams, were covered (i.e., the topics 'Kinds of tests' and 'Kinds of testing'). In the following two weeks, to strengthen the theoretical background needed for the preparation of various exams, topics such as validity and reliability were scrutinized. Weeks 7-13 were devoted to the rules to be considered while writing multiple-choice test items and exams assessing students' grammar and vocabulary knowledge as well as their reading, writing, listening, and speaking skills in the foreign language (Hatipoğlu, 2017b, 2021a, 2021b). Week 14 focused on 'Test Administration', where topics such as quality management and
practical considerations in test administration, place, time, material and rater selection, and monitoring before, during and after classroom exams were discussed. The last week of the course was a question-answer session in which the lecturer and the students reviewed the topics covered during the semester.

Table 1. ELTE course content (2007-2010)

Since the original ELTE's main aim was to develop the students' theoretical knowledge, the course assessment focused on that objective as well. As shown in Table 2, 70% of the exams and other evaluation procedures (Midterm = 30%, Final = 30%, Presentation of testing techniques = 10%) elicited information related to students' theoretical knowledge, while 10% of the final grade came
from 'class participation' and the preparation of a 'final portfolio' (i.e., gathering all materials created by the students and uploading them to the METU learner management system). With this distribution, only 20% of the students' grade was allocated to practices such as preparing and evaluating exam questions inside or outside the classroom.

Table 2. Assessment procedures in the original ELTE course (2007-2010) (Hatipoğlu, 2010, p. 44)

Aiming to follow developments in the field and to improve the only ELTE course in the program, the researcher conducted a study to find out whether the course met the needs and expectations of the pre-service teachers. The results of this study were published in Hatipoğlu (2010). The research data were collected using a questionnaire in which one of the items was: "You have completed the 'English Language Testing and Evaluation' Course. List 3 things that you think should be changed in relation to this course to make the course better if it is to be taught again." As can be seen in Figure 2, 55.7% of the participants thought that the theory-oriented focus/nature of the course should be changed: 37.9% of them listed the lack of opportunities for practice as one of the most important problems, 8.9% argued that most of the taught topics were abstract and conceptual and hard to comprehend when not related to practice, and 2.4% stated that the materials selected for the course were too difficult to read and understand. In addition, 6.4% of the participants wrote that the assessment practices employed in the course mostly measured theoretical knowledge, and that this needed to be changed. In their comments, students reiterated that more space should be allocated to writing, revising, and analyzing various kinds of tests. A bit more than one-fifth (21.9%) of the students also suggested that some of the teaching methods and techniques employed in class should be changed. They claimed that "the presentation of the testing techniques", where students, in groups, were asked to research and present the methods used to test grammar, vocabulary, reading, writing, speaking or listening, was not useful. They argued that, since this was the first time they were taking a course related to testing and evaluation, it was difficult for them to select the most relevant materials for the presentation, and they could not learn the subject unless the lecturer in charge of the course explained it in class. This problem, in my opinion, stems from the so-called "teacher-centred education" in the country, which was found and discussed in various studies

109

 Flipped Spiral Foreign Language Assessment

conducted in Turkey (e.g., Kalkan, 2017; Tezci et al., 2017). Both Kalkan (2017), whose participants were foreign language teachers from seven geographical regions in Turkey, and Tezci et al. (2017), who worked with pre-service teachers, found that the teacher-centred education system is still the predominant model in Turkey. Their results also showed that teachers are still seen as the only source of knowledge and/or information and unless approved by them, student answers/presentations are doubted. In addition, it was reported that in the observed classrooms very rarely environments encouraging students’ active learning, discovery, participation and cooperative practice were created. Figure 2. Changes requested by students in 2010

The remaining 20.1% of the participants were not happy with the number of students in ELTE classes (8.3%) and with the fact that there was only one ELTE course in the whole program (11.8%). They argued that it is almost impossible to acquire even the most basic skills and knowledge related to foreign language testing in one course and that the task was made even more difficult because the classes were crowded and they did not have the chance to receive individual feedback on their work from the lecturer as often as they needed. After analysing the results of Hatipoğlu’s (2010) study, it was decided to change and improve the ELTE course so that it prepared pre-service teachers better for their future jobs. The following sections present the steps taken to transform the course.


METHODOLOGY

Data Collection

The spiral development of FLISLALM started in 2011 and is based on the LAL frameworks proposed by Brindley (2001) and Giraldo (2018). It has also benefited, and continues to benefit, from both quantitative and qualitative data coming from three sources:

1. student questionnaires,
2. student-created output (exams, peer assessment feedback files), and
3. self-assessment student presentations.

(i) Student Questionnaires

The questionnaire employed to elicit reflective and course-evaluative data comprises two parts and is filled in by the students taking ELTE at the beginning, middle, and end of the course. The first part aims to elicit as detailed information about the background of the participants as possible, while the second gathers evidence about how the ELTE course and the teaching method employed in class affect students’ views and beliefs about testing, their field content knowledge, their skills in writing and evaluating test items, and their professional development. The questionnaires contain both closed and open-ended questions.

(ii) Student-Created Output

In ELTE, pre-service teachers are first asked to form groups and then to write classroom-based exams for middle schoolers in Turkey (i.e., Grades 5, 6, 7 and 8) and to give feedback on the exams written by the other groups in class.

(iia) Exams

Students are instructed to write an exam including seven sections: exam specifications, grammar, vocabulary, reading, writing, speaking, and listening. Each exam should be based on the book(s) used by MONE.

(iib) Feedback

Each group is also required to provide three “peer assessments” of the different sections of the exams (e.g., grammar and vocabulary, reading and writing, listening and speaking) written by the other groups in the class. The peer assessment is prepared in written format and uploaded to a folder accessible to all pre-service teachers taking ELTE before an in-class feedback session, where all comments are discussed and evaluated every week. To track the changes in students’ skills and knowledge, each group is asked to revise and resubmit the sections (e.g., grammar and vocabulary) that receive peer feedback.


(iii) Self-Assessment Presentations

At the end of the term, all groups are asked to prepare (voice-over) PowerPoint presentations in which they compare and contrast the first and final versions of their exams. The aim of these ‘self-assessment PowerPoint presentations’ is to check once more what students have learned and what they see as notable improvements in their work.

Data Analysis

Both quantitative and qualitative analyses of the collected corpora are done at the end of each term in which ELTE is taught. These analyses aim to identify the effects of the FLISLALM followed in the ELTE course on students’ LAL.

THE DEVELOPMENT OF THE “FLIPPED SPIRAL LANGUAGE ASSESSMENT LITERACY MODEL” (FLISLALM) FOR TEACHING ELTE (2011-)

The transformation process aimed to design a course that would prepare future English language teachers as well as possible for their jobs as test writers and evaluators within the Turkish context. The planning started by identifying the variables that needed to be considered:

1. there was just one ELTE course in the whole program (i.e., the basic skills, knowledge and principles of language assessment had to be taught within 13-14 weeks);
2. foreign language teachers in Turkey are usually the ones who write the tests administered in class and the ones who are expected to evaluate and adapt the tests that are already available (e.g., on the Internet, the EBA platform);
3. an in-class education model that creates in- and out-of-class time for more practice had to be found;
4. a more practice-based LAL definition that would prepare pre-service English language teachers better for their future jobs had to be adopted;
5. the selected education model and LAL definition had to blend successfully so that students graduate with the required theoretical and practical knowledge in the field of foreign language testing and evaluation.

The search for new models and materials started and progressed with these objectives in mind.

THE FLIPPED CLASSROOM MODEL (FCM)

The first stage of the development of the FLISLALM involved researching and comparing the available instructional models. After a careful and detailed examination of the most frequently used approaches (e.g., direct instruction, lecturing, project-based learning), the flipped (also known as inverted) classroom model (FCM) was selected. In the FCM, traditional classroom activities (e.g., topic presentation) are taken out of the classroom and done at home, while out-of-class activities such as homework, practice, and problem-solving are done in class (Bergmann & Sams, 2012; Lage et al., 2000; Sohrabi & Iraj, 2016; Zownorega, 2013). What is more, in this system, students and teachers assume novel roles (see Table 3). Because students are at the centre of the learning environment, they start taking more responsibility for their own progress. The teachers, on the other hand, are not only the individuals transferring material/imparting knowledge but also assistants, facilitators, interlocutors, problem-solving partners, and guides (Fuchs, 2021).

Table 3. Flipped Classroom Model: Teachers’ and Students’ Roles (based on Artal-Sevil et al., 2020; Lage et al., 2000; Zownorega, 2013)

FCM was selected as the teaching approach for the FLISLALM for the reasons listed below:

1. Student-centred System

As stated in the Background section of the chapter, graduates of the English Language Teaching Departments of the Faculties of Education can teach at any level of the Turkish education system. That is, among the students taking the ELTE course there are usually groups with different interests, needs and teaching plans (e.g., while some are planning to work with young learners, others envisage careers in higher education). As there is only one ELTE course in the program but numerous techniques and strategies appropriate for assessing students of different ages, levels of proficiency and aims for learning foreign languages, it is almost impossible to respond to the needs of all students if the traditional direct teaching method is followed. Therefore, it was decided to use the FCM (i.e., a method that focuses on students and foresees and ensures their active participation in the course) in the ‘revised ELTE course’ (R-ELTE).

2. A Model Compatible with Today’s Learning-Teaching Theories

With the technological developments of the last century, the planet has become a global village. This has led to the emergence of thinking globally but creating locally appropriate solutions in all areas, including education. As a result, new educational theories focusing on 21st-century competencies and expectations were developed. According to these, knowledge is subjective, and, as the 18th-century philosopher Giambattista Vico claimed, people can only understand what they are doing. Knowledge is not discovered and uncovered; on the contrary, it is interpreted and constructed by individuals. It is formed as a result of individuals’ own experiences, observations and interpretations (Larochelle et al., 1998; O’Connor, 2022). In the R-ELTE course using FCM, the aim is to move from the ‘instructivist’ (O’Connor, 2022, p. 412) teaching method (i.e., a lecturer-centred system) to the student-oriented ‘constructivist’ method (i.e., a system focusing on active learning in which students construct their own knowledge). In the course, students are expected to construct/reshape information according to their own needs rather than wait for someone else to convey it to them (Applefield et al., 2000; John, 2018). In R-ELTE, students are made to think about whether they want to work as teachers after graduation and, if the answer is ‘YES’, to decide with which groups of students they would like to work. After this, future English language teachers are asked to examine the MONE curricula and coursebooks used in secondary schools in Turkey so that they can write group-specific exams. While doing all this, they are expected to combine the knowledge of education, ELT methods, linguistics and literature acquired in the previous years with the newly obtained assessment and evaluation information and to prepare exams suitable for their specific contexts.

3. Abstract Content Knowledge

O’Loughlin (2006, p. 71) argues that second language assessment is a notoriously difficult domain of knowledge for students in second-language teacher education programs because of the high level of abstraction around its key theoretical concepts, validity, reliability, and practicality, and how they need to be balanced against each other in designing and using assessment instruments. Without completing these three groups, the LAL of teachers and teacher candidates is not complete (Giraldo, 2018), and it becomes difficult for teachers to prepare appropriate exams that respond to the needs of their education systems, school contexts, and students. With the adoption of the FCM in the R-ELTE, the out-of-class time is used more efficiently. Since the out-of-class time is much longer than the in-class time, students have more opportunities to learn the abstract content knowledge in the field of language assessment. In addition, they are offered a flexible study program (i.e., they can read, watch and/or listen to the course materials at their convenience and as much as they need to). With FCM, the lower cognitive levels of Bloom’s Taxonomy (Remembering, Understanding, Applying) are completed before class (Anderson & Krathwohl, 2001). This makes it possible to work in class on activities requiring higher cognitive skills such as Analysing, Evaluating, and Creating (e.g., peer review, critiquing written questions) (see Tables 3 and 4).

4. Students Should Be in the Classroom Environment with Their Peers and Their Instructors When They Face Problems (Reich, 2012)

In the R-ELTE (see Weeks 6-12), students receive feedback from both the lecturer and two peer groups about their exams. At the same time, they read and assess the exams written by two different groups. In short, thanks to FCM, students get the opportunity to practice in class what they have learned out of class (Bishop & Verleger, 2013). This process is not without problems, since students have to walk the ‘painful path’ of discovering what they know, what they do not know, or what they have learned wrong (Debbağ, 2018; Yaşar, 1998) during those practice sessions. With FCM, students’ peers and instructors are with them when they face these problems, and they help each other find the missing pieces of the ‘puzzle’ step by step. Detailed, tailor-made student and/or group feedback is provided by their classmates and lecturers, and, gradually, students begin to trust themselves and their peers.

GIRALDO’S (2018) DESCRIPTOR-BASED DEFINITION OF LAL (N=66)

Earlier LAL models focused more on developing the theoretical knowledge of teachers (e.g., Brindley, 2001; Newfields, 2006). This focus has shifted in recent years, and experts now agree that there is a need for a more balanced distribution of the Knowledge, Skills, and Principles components of LAL. The “Descriptor-Based LAL Model” presented by Giraldo in 2018 was built on this principle and consists of three main components, eight dimensions, and 66 descriptors (see Figure 3). One of the main reasons for choosing this model for our study was that it was developed specifically for foreign language teachers.

(1) Knowledge

In Giraldo’s LAL model, the Knowledge component ranks high on the list because it is related to language and language use (Inbar-Lourie, 2012, 2013). The Knowledge component has three dimensions ((i) Awareness of applied linguistics; (ii) Theory and concepts; (iii) Own language assessment context) and 24 descriptors (for details, see Giraldo, 2018, pp. 187-188).

(2) Skills

In Giraldo’s (2018, pp. 189-190) LAL model, the Skills component consists of four dimensions: (i) Instructional skills, (ii) Design skills for language assessment, (iii) Skills in educational measurement (advanced skills not always needed), and (iv) Technological skills. These four dimensions comprise 32 descriptors. Some of these skills should be acquired by all teachers, including novices (e.g., teaching skills), while others are only needed by those dealing with more advanced assessment tasks (e.g., skills in educational measurement).


Figure 3. Giraldo’s (2018, p. 187) Descriptor-Based LAL Model

(3) Principles

The last component in Giraldo’s model is called the “Principles” of measurement and evaluation (see Giraldo, 2018, p. 190). Its single dimension, “Awareness of and actions towards critical issues in language assessment”, has 10 descriptors. The principles listed in this section are based on studies conducted by Arias et al. (2012), Coombe et al. (2012), and Malone (2013) and include issues related to ethics, justice, transparency, and democracy. Knowledge of these principles is an important part of teachers’ LAL because it enables them to approach the assessments, methods, and techniques they use critically (Fulcher, 2012; Scarino, 2013).

USING THE “FLIPPED SPIRAL LANGUAGE ASSESSMENT LITERACY MODEL” (FLISLALM) TO TEACH THE R-ELTE COURSE

Week 1

The R-ELTE is planned as a 14-week-long course. In Week 1 (i.e., General Introduction to the Course), the FLISLALM model, the organization of the course, what is expected from the students, and the assessment criteria are introduced (see Table 4; for the full Course Schedule, see Appendix A).


Table 4. Revised ELTE Course Content (Weeks 1-7)

So that all of the ELTE students can follow the explanations together with the course outline (see Table 4 and Appendix A), the handout shown in Figure 4 is distributed in class. First, the instructor and the students discuss what they know about the FCM and whether the model was used in any of the courses students have taken so far. Then, the course instructor explains why and how the FCM will be utilized in the ELTE course and where the students can access the assigned reading/listening materials before class. After that, the assessment criteria of the course are presented and discussed (i.e., Section 2 of the handout, ‘Assessment and Expectations from Students’). It is explained to the students that the R-ELTE aims to prepare them as much as possible for their ‘real’ work environments. Therefore, following the findings of previous studies (Hatipoğlu, 2015, 2016), the two main foci of the course are teaching students to write exams and to give feedback on/evaluate the quality of materials prepared by others (i.e., peer assessment). In addition, to learn to monitor the progress of their knowledge and skill development, students are taught self-assessment. When the different sections of their exams are discussed in class, students are given the chance to talk about the writing process, what they had difficulties with, what they enjoyed the most, and what they would do differently now that they know what they know.


Figure 4. Week 1 Handout


The next topics discussed in Week 1 are the content and format of the exams that the students are expected to prepare and the type of feedback they should write for each of the sub-sections of the tests. Following the practices implemented in foreign language classrooms in the country, students are expected to learn the rules and techniques for testing grammar and vocabulary knowledge and reading, writing, listening, and speaking skills (i.e., students are mainly taught to test individual skills so as to be better prepared for what is done in schools and on high-stakes exams in Turkey). To ensure students’ ‘spiral’ learning and development in foreign language testing and evaluation (i.e., to make certain that there will be an iterative revisiting of subjects, themes, topics, and practices throughout the course), they are instructed to follow a set of ten steps.

That is, students are expected to do their essential reading out of class but to present and discuss their questions and work as a group to solve different problems in class. After completing the ten steps, students are asked to submit the first drafts of their exams in Week 6. The writing of the ‘First Draft of the Exam’ makes up 10% of their overall grade (see Appendix A). Every week after that, the groups receive feedback from their lecturers and classmates and have time to read the relevant sources once more and to revise and resubmit each of their exam sections. Each of the revised submissions is also allocated a maximum of 10 points. Since understanding the requirements for writing the expected feedback necessitates some prior knowledge of testing and evaluation, only general information is given to the students in Week 1 (see Example 1):


Example 1

Each group will write three critical assessments of the different sub-sections of the exams prepared by the other groups in class. Feedback 1 will cover the Exam Specifications and the Grammar and Vocabulary sections of the exam. Feedback 2 will focus on the Reading and Writing sections, and Feedback 3 on the Listening and Speaking parts. Every time, you will receive feedback from two different groups so that all of you get acquainted with all of the secondary school MONE books and all of the exams prepared in this course. The order in which feedback will be provided is given in Table 5. Group 1, for example, will receive Feedback 1 from Groups 2 and 3, and this feedback will be related to the Exam Specifications, Grammar, and Vocabulary sections of the exam. Groups 1 and 3 will evaluate the Reading and Writing sections of Group 2; this will be the second Feedback they write. Details related to the content and format of the expected feedback will be given in Week 6, when many of the theoretical topics related to testing and evaluation have already been covered and the first drafts of the exams have been written and submitted.

Table 5. Feedback groups

Student progress can be measured “externally” by teachers and tests and “internally” by asking students to self-assess (Oscarson, 1989, p. 1; see also Li & Zhang, 2021). The first method involves gathering data from test results or teacher observations and evaluations to ascertain whether a predetermined objective/level has been reached. In the latter approach, students are given the responsibility to evaluate what they can or cannot handle and how much progress they have achieved (Bercher, 2012). The use of self-assessment as an alternative approach to evaluation has been shown to lead to learner autonomy, higher commitment to learning and progress (since students themselves determine what their strengths and weaknesses are), and lower levels of the anxiety, fear, and frustration (Guskey, 2016) that might be observed before and after official tests.

The ELTE course has a self-assessment component as well, and it is seen as an essential element of the ‘flipped spiral development’ of students. For this assessment component of the course, students are asked to prepare presentations (which make up 25% of their overall grade) in which they evaluate their own progress and critically compare the first and last drafts of their exams (i.e., Bloom’s taxonomy level ‘Evaluate’).

After being given information about the workings, content, and assessment procedures of the ELTE course, students are asked to talk among themselves and form groups. The students are reminded once again that they are expected to work successfully in groups for the whole semester (i.e., usually 14 weeks), as is typically the case in schools in Turkey. Studies conducted in Turkish schools (e.g., Çelebi et al., 2016; Özkan & Arslantaş, 2013) show that, according to school administrators, the most important attribute of an effective teacher is the ability to cooperate successfully on education-related issues with their subject/branch group members. Therefore, within the Ministry of National Education Regulation on Secondary Education Institutions published on September 7, 2013 (MEB, 2013: Article 108), there are articles requiring schools to form various boards (e.g., Teachers’ Board, Class or Branch Teachers’ Board, Subject Teachers’ Board) in all public schools in the country, and teachers are expected to cooperate with their colleagues in these groups.

The working groups in ELTE are not formed by the course instructor. Students are given the responsibility to create them, since one of the main aims of the course is to encourage pre-service language teachers to take more responsibility for their own development and, as a result, become more autonomous learners. While talking to their classmates and trying to form groups, students are expected to ask and answer the following questions:

1. Which of my classmates do I know well? With whom do I want to be in the same working group?
2. Do I believe that I will be able to work with those group members successfully throughout the semester?
3. Which group of students (i.e., Year 5, 6, 7, or 8) am I interested in working with? Are there other people who are planning to work with this specific group of students?

After forming their groups, ELTE students are asked to fill in the ‘Working Group Form’ (see Table 6), where they put their names, the grade level for which they would like to write the exams, and their e-mail addresses so that they can communicate quickly and efficiently.


Table 6. Working Group Form

Weeks 2-5

After the general information related to the workings of the ELTE course is presented in Week 1, Weeks 2 to 5 are devoted to building and strengthening the “Knowledge” dimension of students’ LAL and to beginning the construction of its “Skills” dimension as much as possible (see Table 4). Every week, students are assigned out-of-class readings aiming to acquaint them with theories and concepts in applied linguistics and in assessment and evaluation. They are also expected to get to know their language assessment contexts better by examining the curricula and books prepared by MONE for secondary schools in Turkey. Almost in parallel with the accumulation of information in the “Knowledge” dimension of LAL, students are expected and encouraged to come together as groups and to start thinking about and applying that information to the writing of the first drafts of their exams. That is, they are expected to work on the “Skills” dimension of their LAL as well. During the lessons, on the other hand, students are expected to ask questions about the topics covered in the assigned readings and to discuss issues that they do not agree with, that contradict the information given in other resources, or that do not fit with what they are trying to do in their exams. They are also encouraged to share their experiences of writing the first drafts of their exams, to ask for help concerning the points they are struggling with, or to talk about their difficulties in putting theory into practice.


Weeks 6-12

The foci of the lessons in Weeks 6-12 are the “Skills” and “Principles” dimensions of LAL (see Figure 3, Tables 4 and 5, and Appendix A). During these weeks, students are expected to give feedback on the exam sections written by other groups and to correct their own exams in line with the feedback given to them. In order to do this, students must move to a higher level of the “Knowledge” dimension of the spiral LAL (Level 3). It is hoped that assessing the exams prepared by other groups and reading the feedback given to their own groups will help students identify the gaps in their knowledge and skills and that, as a result, they will read more texts on testing and evaluation and will analyze them more carefully, keeping their specific needs in mind. Students are asked to follow the guidelines given in Figure 5 while completing their feedback. The aim is to help them write critical but also fair, balanced and scientifically backed assessments of the products prepared by their classmates.

Figure 5. Feedback instructions


Weeks 13-14

In Weeks 13-14, students are expected to present to the class the PPPs in which they compare and contrast the first drafts and the final versions of their exams. The presentations, as explained in the Course Schedule (see Appendix A), are expected to include specific comments related to the format, content, and appropriateness of all the sections of their exams as well as the reasons for changing, adding, or deleting some parts of the test. The reasons given by the students should be scientifically backed up (i.e., they should refer to the specific sources they used, e.g., Fulcher, 2012, while revising their tests). In their evaluation presentations, students are also asked to list the three most positive features of the course and three things that they think should be changed to make the course more useful for future cohorts. The specific instructions given to the students before they begin preparing their PPPs are listed in Figure 6.

Figure 6. Questions to keep in mind while preparing your PPP

So, the aim in Weeks 13 and 14 is to ‘push’ students to combine the “Knowledge”, “Skills” and “Principles” dimensions of LAL and to help them see/discover which kinds of knowledge and types of skills they have acquired during the term. Another goal is to help them see which objectives they have achieved and in which areas there are still gaps they need to work on.

CONCLUSION

The results of the study show that pre-service language teachers’ LAL changes positively as a result of taking the ELTE course in which the FLISLALM is implemented. The most noticeable improvements are usually observed in relation to the “Knowledge” dimension of LAL in Giraldo’s (2018) model. The FLISLALM allows more than half of the descriptors in the Knowledge dimension to be covered and, as a result, students’ awareness of their own language assessment context (e.g., explains own beliefs, attitudes, context, and needs for assessment; criticizes the kind of washback assessments usually have on his/her teaching context) and of the theories and concepts related to the field of testing (e.g., interprets reliability and validity and their implications; interprets major qualities for language assessment practices and their implications for language assessment; Giraldo, 2018, pp. 188-189) advances considerably. It is also found that students’ capacity and willingness to accept, understand, and implement the new knowledge and skills presented to them in class improve significantly as the course progresses. However, within a single course, it is not possible to introduce all of the required theoretical foundations and practices in the field of testing and to carry out training related to all the skills and sub-skills needed for all-around efficient classroom testing. Therefore, we suggest that more ELTE courses be added to the curriculum and that more needs-analysis-based training be given to pre-service English language teachers in Turkey.

REFERENCES

American Federation of Teachers, National Council on Measurement in Education, & National Education Association. (1990). Standards for teacher competence in educational assessment of students. Educational Measurement: Issues and Practice, 9(4), 30–32. https://buros.org/standards-teacher-competence-educational-assessment-students

Anderson, L. W., & Krathwohl, D. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives. Addison Wesley Longman.

Applefield, J. M., Huber, R., & Moallem, M. (2000). Constructivism in theory and practice: Toward a better understanding. High School Journal, 84(2), 35–53.

Arias, C., Maturana, L., & Restrepo, M. (2012). Evaluación de los aprendizajes en lenguas extranjeras: Hacia prácticas justas y democráticas [Evaluation in foreign language learning: Towards fair and democratic practices]. Lenguaje, 40(1), 99–126. doi:10.25100/lenguaje.v40i1.4945

Artal-Sevil, J. S., Castel, A. F. G., & Gracia, M. S. V. (2020, June). Flipped teaching and interactive tools: A multidisciplinary innovation experience in higher education. In 6th International Conference on Higher Education Advances (HEAd’20). https://web.archive.org/web/20210427174137id_/https://zaguan.unizar.es/record/95592/files/texto_completo.pdf

Bercher, D. A. (2012). Self-monitoring tools and student academic success: When perception matches reality. Journal of College Science Teaching, 41(5), 26–32.

Berry, V., Sheehan, S., & Munro, S. (2019). What does language assessment literacy mean to teachers? ELT Journal, 73(2), 113–123. doi:10.1093/elt/ccy055

Bishop, J., & Verleger, M. A. (2013, June). The flipped classroom: A survey of the research. Paper presented at the 2013 ASEE Annual Conference & Exposition, Atlanta, GA. doi:10.18260/1-2--22585

Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives, Handbook 1: Cognitive domain (2nd ed.). Longman Publishing.

Brindley, G. (2001). Language assessment and professional development. In C. Elder, A. Brown, E. Grove, K. Hill, N. Iwashita, T. Lumley, C. MacNamara, & K. O’Loughlin (Eds.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 126–136). Cambridge University Press.

Can, H. (2020). A micro-analytic investigation into EFL teachers’ language test item reviewing interactions [PhD thesis]. Middle East Technical University.

Çelebi, N., Vuranok, T. T., & Turgut, İ. H. (2016). Zümre öğretmenlerinin işbirliği düzeyini belirleme ölçeğinin geçerlik ve güvenirlik çalışması [The validity and reliability study of the scale of determining the level of cooperation of branch teachers]. Kastamonu Eğitim Dergisi, 24(2), 803–820. https://dergipark.org.tr/en/download/article-file/209704

Coombe, C., Troudi, S., & Al-Hamly, M. (2012). Foreign and second language teacher assessment literacy: Issues, challenges, and recommendations. In C. Coombe, P. Davidson, B. O’Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 20–29). Cambridge University Press.

Davies, A. (2008). Textbook trends in teaching language testing. Language Testing, 25(3), 327–347. doi:10.1177/0265532208090156

Debbağ, M. (2018). Öğretim İlke ve Yöntemleri dersi öğretim programı için hazırlanan ters-yüz edilmiş sınıf modelinin etkililiği [The effectiveness of the flipped classroom model prepared for the Principles and Methods of Instruction course curriculum] (Unpublished doctoral dissertation). Bolu Abant İzzet Baysal Üniversitesi.

Fuchs, K. (2021). Innovative teaching: A qualitative review of flipped classrooms. International Journal of Learning, Teaching and Educational Research, 20(3), 18–32. doi:10.26803/ijlter.20.3.2

Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113–132. doi:10.1080/15434303.2011.642041

Giraldo, F. (2018). Language assessment literacy: Implications for language teachers. Profile: Issues in Teachers’ Professional Development, 20(1), 179–195. doi:10.15446/profile.v20n1.62089

Giraldo, F. (2021). Language assessment literacy and teachers’ professional development: A review of the literature. Profile: Issues in Teachers’ Professional Development, 23(2), 265–279. doi:10.15446/profile.v23n2.90533

Guskey, T. (2016). How classroom assessments improve learning. In M. Scherer (Ed.), On formative assessment: Readings from educational leadership (pp. 3–13). ASCD.

Hatipoğlu, Ç. (2010). Summative evaluation of an undergraduate ‘English Language Testing and Evaluation’ course by future English language teachers. English Language Teacher Education and Development (ELTED), 13, 40–51. http://www.elted.net/uploads/7/3/1/6/7316005/v13_5hatipoglu.pdf

Hatipoğlu, Ç. (2015). English language testing and evaluation (ELTE) training in Turkey: Expectations and needs of pre-service English language teachers. ELT Research Journal, 4(2), 111–128. https://dergipark.org.tr/en/pub/eltrj/issue/28780/308006

Hatipoğlu, Ç. (2016). The impact of the university entrance exam on EFL education in Turkey: Pre-service English language teachers’ perspective. Procedia: Social and Behavioral Sciences, 232, 136–144. doi:10.1016/j.sbspro.2016.10.038

Hatipoğlu, Ç. (2017a). History of language teacher training and English Language Testing and Evaluation (ELTE) education in Turkey. In Y. Bayyurt & N. Sifakis (Eds.), English language education policies and practices in the Mediterranean countries and beyond (pp. 227–257). Peter Lang.

Hatipoğlu, Ç. (2017b). Assessing speaking skills. In E. Solak (Ed.), Assessment in language teaching (pp. 118–148). Pelikan.

Hatipoğlu, Ç. (2021a). Testing and assessment of speaking skills, test task types, and sample test items. In S. Çelik, H. Çelik, & C. Coombe (Eds.), Language assessment and test preparation in English as a foreign language (EFL) education (pp. 119–173). Vizetek.

Hatipoğlu, Ç. (2021b). Assessment of language skills: Productive skills. In S. Inal & O. Tunaboylu (Eds.), Language assessment theory with practice (pp. 167–211). Nobel.

Hatipoğlu, Ç. (2022). Foreign language teacher selection and foreign language teaching and assessment in the reform period of the Ottoman Empire (1700-1922). In G. Sonmez (Ed.), Prof. Dr. Ayşe S. Akyel’e Öğrencilerinden Armağan Kitap: Türkiye’de Yabancı Dil Öğretmeni Eğitimi Üzerine Araştırmalar (pp. 23–38). Eğiten Kitap Yayıncılık.

Hatipoğlu, Ç., & Erçetin, G. (2016). Türkiye’de yabancı dilde ölçme ve değerlendirme eğitiminin dünü ve bugünü [The past and present of foreign language testing and evaluation education in Turkey]. In S. Akcan & Y. Bayyurt (Eds.), 3. Ulusal Yabancı Dil Eğitimi Kurultayı Bildiri Kitabı (pp. 72–89). Boğaziçi Üniversitesi Press.

Inbar-Lourie, O. (2012). Language assessment literacy. In C. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1–9). John Wiley & Sons. doi:10.1002/9781405198431.wbeal0605

Inbar-Lourie, O. (2013). Guest editorial to the special issue on language assessment literacy. Language Testing, 30(3), 301–307. doi:10.1177/0265532213480126

John, P. (2018). Constructivism: Its implications for language teaching and second-language acquisition. Papers in Education and Development, 33-34. https://journals.udsm.ac.tz/index.php/ped/article/view/1483

Kalkan, E. (2017). Eğitim, kültür, öğretmen: Öğretmen odaklı mı? Öğrenci odaklı mı? [Education, culture, teacher: Teacher-centred or student-centred?]. Istanbul Journal of Innovation in Education, 3(2), 51–63.

Kremmel, B., & Harding, L. (2020). Towards a comprehensive, empirical model of language assessment literacy across stakeholder groups: Developing the Language Assessment Literacy Survey. Language Assessment Quarterly, 17(1), 100–120. doi:10.1080/15434303.2019.1674855

Lage, M. J., Platt, G. J., & Treglia, M. (2000). Inverting the classroom: A gateway to creating an inclusive learning environment. The Journal of Economic Education, 31(1), 30–43. doi:10.1080/00220480009596759

Larochelle, M., Bednarz, N., & Garrison, J. (Eds.). (1998). Constructivism and education. Cambridge University Press. doi:10.1017/CBO9780511752865

Levi, T., & Inbar-Lourie, O. (2020). Assessment literacy or language assessment literacy: Learning from the teachers. Language Assessment Quarterly, 17(2), 168–182. doi:10.1080/15434303.2019.1692347

Li, M., & Zhang, X. (2021). A meta-analysis of self-assessment and language performance in language testing and assessment. Language Testing, 38(2), 189–218. doi:10.1177/0265532220932481

Malone, M. (2013). The essentials of assessment literacy: Contrasts between testers and users. Language Testing, 30(3), 329–344. doi:10.1177/0265532213480129

McNamara, T., & Hill, K. (2011). Developing a comprehensive, empirically based research framework for classroom-based assessment. Language Testing, 29(3), 395–420.

MEB. (2013). Millî Eğitim Bakanlığı ortaöğretim kurumları yönetmeliği (7 Eylül 2013) [Ministry of National Education Regulation on the Secondary Education Institutions (September 7, 2013)]. Resmi Gazete, 28758.

Mede, E., & Atay, D. (2017). English language teachers’ assessment literacy: The Turkish context. Dil Dergisi, 168(1), 43–60.

Newfields, T. (2006). Teacher development and assessment literacy. In T. Newfields, I. Gledall, M. Kawate-Mierzejewska, Y. Ishida, M. Chapman, & P. Ross (Eds.), Authentic communication: Proceedings of the 5th Annual JALT Pan-SIG Conference (pp. 48–73). Tokai University College of Marine Science.

O’Connor, K. (2022). Constructivism, curriculum and the knowledge question: Tensions and challenges for higher education. Studies in Higher Education, 47(2), 412–422. doi:10.1080/03075079.2020.1750585

O’Loughlin, K. (2006). Learning about second language assessment: Insights from a postgraduate student on-line subject forum. University of Sydney Papers in TESOL, 1, 71–85.

Oscarson, M. (1989). Self-assessment of language proficiency: Rationale and applications. Language Testing, 6(1), 1–13. doi:10.1177/026553228900600103

Özkan, M., & Arslantaş, H. İ. (2013). Etkili öğretmen özellikleri üzerine sıralama yöntemiyle bir ölçekleme çalışması [A scaling study on effective teacher characteristics within the rank model]. Trakya Üniversitesi Sosyal Bilimler Dergisi, 15(1), 311–330. https://dergipark.org.tr/en/download/article-file/321465

Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing, 18(4), 429–462. doi:10.1177/026553220101800407

Reich, J. (2012). Rethinking teaching and time with the flipped classroom. EdTech Researcher, Education Week. https://www.edweek.org/education/opinion-rethinking-teaching-and-time-with-the-flipped-classroom/2012/06

Şahin, S. (2019). An analysis of the English Language Testing and Evaluation course in English language teacher education programs in Turkey: Developing language assessment literacy of pre-service EFL teachers [PhD thesis]. Middle East Technical University.

Scarino, A. (2013). Language assessment literacy as self-awareness: Understanding the role of interpretation in assessment and in teacher learning. Language Testing, 30(3), 309–327. doi:10.1177/0265532213480128

Spolsky, B. (1985). What does it mean to know how to use a language: An essay on the theoretical basis of language testing. Language Testing, 2(2), 180–191. doi:10.1177/026553228500200206

Stiggins, R. J. (1999). Evaluating classroom assessment training in teacher education programs. Educational Measurement: Issues and Practice, 18(1), 23–27. doi:10.1111/j.1745-3992.1999.tb00004.x

Stiggins, R. J. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83(10), 758–765. doi:10.1177/003172170208301010

Sultana, N. (2019). Language assessment literacy: An uncharted area for the English language teachers in Bangladesh. Language Testing in Asia, 9(1), 1–14. doi:10.1186/s40468-019-0077-8

Taylor, L. (2009). Developing assessment literacy. Annual Review of Applied Linguistics, 29, 21–36. doi:10.1017/S0267190509090035

Tezci, E., Dilekli, Y., Yıldırım, S., Kervan, S., & Mehmeti, F. (2017). Öğretmen adaylarının sahip olduğu öğretim anlayışları üzerine bir analiz [An analysis of pre-service teachers’ teaching conceptions]. Education Sciences, 12(4), 163–176.

Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374–402. doi:10.1080/15434303.2014.960046

Yaşar, Ş. (1998). Yapısalcı kuram ve öğrenme-öğretme süreci [Constructivist theory and the learning-teaching process]. In VII. Ulusal Eğitim Bilimleri Kongresi Basılmış Bildiriler Kitabı (pp. 695–701). Selçuk Üniversitesi. https://www.academia.edu/24736887/YAPISALCI_KURAM_VE_%C3%96%C4%9ERENME-_%C3%96%C4%9ERETME_S%C3%9CREC%C4%B0

Zownorega, S. J. (2013). Effectiveness of flipping the classroom in a honors level, mechanics-based physics class [MA thesis]. Eastern Illinois University. https://thekeep.eiu.edu/theses/1155

ADDITIONAL READING

Baydar, M. B. (2022). Reference to testing principles as an interactional resource in L2 testing and evaluation classroom interaction in teacher education context [MA thesis]. Middle East Technical University. https://hdl.handle.net/11511/99450

Brindley, G. (2001). Language assessment and professional development. In C. Elder, A. Brown, E. Grove, K. Hill, N. Iwashita, T. Lumley, C. MacNamara, & K. O’Loughlin (Eds.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 126–136). Cambridge University Press.

Fuchs, K. (2021). Innovative teaching: A qualitative review of flipped classrooms. International Journal of Learning, Teaching and Educational Research, 20(3), 18–32. doi:10.26803/ijlter.20.3.2

Giraldo, F. (2018). Language assessment literacy: Implications for language teachers. Profile: Issues in Teachers’ Professional Development, 20(1), 179–195. doi:10.15446/profile.v20n1.62089

Giraldo, F. (2021). Language assessment literacy and teachers’ professional development: A review of the literature. Profile: Issues in Teachers’ Professional Development, 23(2), 265–279. doi:10.15446/profile.v23n2.90533

Hatipoğlu, Ç. (2017). History of language teacher training and English Language Testing and Evaluation (ELTE) education in Turkey. In Y. Bayyurt & N. Sifakis (Eds.), English language education policies and practices in the Mediterranean countries and beyond (pp. 227–257). Peter Lang.

Sevgi-Sole, E., & Ünaldı, A. (2020). Rater negotiation scheme: How writing raters resolve score discrepancies. Assessing Writing, 45, 100466. doi:10.1016/j.asw.2020.100466


KEY TERMS AND DEFINITIONS

Content Validity: The degree to which a test or assessment instrument evaluates all aspects of the topic, construct, or behaviour that it is designed to measure. Low content validity shows that the exam does not cover all necessary aspects of the topic, while high values indicate that the test covers most or all of the topic for the target audience.

English Language Testing and Evaluation Course (ELTE): The course on language testing and evaluation taught in the ELT programs in Turkey.

Face Validity: Pertains to whether or not the test “looks valid” to the examinees who take it (i.e., does the test look like a test, and does it look as if it measures what it is supposed to measure).

Flipped Classroom Model (FCM): A model in which traditional classroom activities (e.g., topic presentation) are taken out of the classroom and done at home, while out-of-class activities such as homework, practice, and problem-solving are done in class.

Language Assessment Literacy (LAL): The knowledge, skills, and principles that teachers need to know about language assessment.

Peer Assessment: Classmates critique and provide feedback on each other’s work.

Progress Achievement Tests: Short-term progress tests that are based on the material covered in class. As the name suggests, they check how well pupils have learned or understood the material taught to them in specific weeks/units. They help teachers determine whether remedial or consolidation work is needed.

Self-Assessment: Judgements that students make about their own work, knowledge, abilities and/or progress.


APPENDIX A

FLE 413.03: ENGLISH LANGUAGE TESTING AND EVALUATION

Welcome to FLE 413: English Language Testing and Evaluation

This course aims to teach students to write, implement and evaluate a variety of testing instruments for a specific group of language learners. To achieve these objectives, we will scrutinise topics such as testing and assessment, summative vs. formative assessment, backwash, varieties of tests, test validity and reliability, stages of test construction, testing overall language proficiency, testing individual skills (e.g., writing, reading, listening, speaking) and the evaluation of test results/items. Each session will be composed of one or two hours of lecturing and one or two hours of tutorials. Lectures will provide the overall framework, while tutorials will provide the forum for discussing the issues touched upon in the lectures. Students will be actively involved in discussions, presentations and practical analysis in tutorials. They will be asked to prepare work in advance of the tutorial (e.g., feedback, revision of exam sections), thus giving them the opportunity to sharpen their critical awareness and the metalanguage required to express their ideas.

Box 1.
Lecturer:                     Course Day: Monday
Office:                       Times: 08:40-11:30
Tel:                          Place:
E-mail:

The students will be able to access the course outline, handouts, lecture notes, the exams written by them and their classmates, exam evaluations, announcements and grades through the LMS (odtuclass.metu.edu.tr). That is why they are expected to check the website on a regular basis. If you have a problem or query, do not hesitate to contact me at the lecture or tutorial, in my office or via e-mail.

Good luck!
Çiler Hatipoğlu


Box 4.

4. TENTATIVE COURSE SCHEDULE

Week 1
TOPIC: General Introduction to the course
TASKS & DATES: *Form working groups. **Choose a grade for which an exam will be prepared.

Week 2
TOPIC: Teaching and testing
SOURCE: Heaton (1990), Ch. 1; Hughes (2003), Chs. 1 & 2
TASKS & DATES: *Read the assigned sections before class. **Have a group meeting and familiarise yourselves with the MONE Books.

Week 3
TOPIC: Kinds of tests and testing
SOURCE: Heaton (1990), Chs. 2 & 3, Part 10.8 of Ch. 10; Hughes (2003), Ch. 3
TASKS & DATES: *Start work on the exam: choose a level, distribute responsibilities, read the relevant sections in the course materials. **Finish reading and analysing the MONE Books of your grade. ***Start reading: Heaton (1990), Chs. 4-9; Hughes (2003), Chs. 9-13; Alderson (2000); Buck (2001); Hatipoğlu (2017, 2021a, 2021b); Luoma (2004); Read (2000); Weigle (2002).

Week 4
TOPIC: Stages of test development
SOURCE: Hughes (2003), Ch. 7
TASKS & DATES: *Start writing the Exam Specifications. **Work on the first version of your exam sections. ***Continue reading: Heaton (1990), Chs. 4-9; Hughes (2003), Chs. 9-13; Alderson (2000); Buck (2001); Hatipoğlu (2017, 2021a, 2021b); Luoma (2004); Read (2000); Weigle (2002).

Week 5
TOPIC: Writing Multiple Choice Questions
SOURCE: Haladyna (2004); Hatipoğlu (2009); Heaton (1990), Ch. 3; Hughes (2003), Ch. 8, pp. 75-78
TASKS & DATES: *Finish writing the Exam Specifications. **Revise the sections in the first version of your exams. ***Finish reading: Heaton (1990), Chs. 4-9; Hughes (2003), Chs. 9-13; Alderson (2000); Buck (2001); Hatipoğlu (2017, 2021a, 2021b); Luoma (2004); Read (2000); Weigle (2002).

Week 6
TOPIC: Validity: 1. Content, 2. Criterion-related, 3. Construct, 4. Face
SOURCE: Heaton (1990), Ch. 10; Hughes (2003), Ch. 4; Genesee & Upshur (1996), Ch. 4, pp. 62-68; Osterlind (2002), Ch. 3, pp. 59-106
TASKS & DATES: *Finalise and submit the first version of your Exams (submission date: Wednesday, 23:50). *Submission of Feedback 1: Saturday, 23:50. Feedback 1 content: Test Specifications, Grammar and Vocabulary sections.

Week 7
TOPIC: 1. Reliability_1; 2. Testing GRAMMAR & VOCABULARY_1
SOURCE: Bachman (2004); Heaton (1989), Ch. 10; Hughes (2003), Ch. 5; Genesee & Upshur (1996), Ch. 4, pp. 57-62
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Grammar and Vocabulary sections for the groups that receive feedback.

Week 8
TOPIC: 1. Reliability_2; 2. Testing GRAMMAR & VOCABULARY_2
SOURCE: Bachman (2004); Heaton (1989), Ch. 10; Hughes (2003), Ch. 5; Genesee & Upshur (1996), Ch. 4, pp. 57-62
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Grammar and Vocabulary sections for the groups that receive feedback. *Submission of Feedback 2: Saturday, 23:50. Feedback 2 content: Reading and Writing sections.

Week 9
TOPIC: Testing READING & WRITING_1
SOURCE: Alderson (2000); Buck (2001); Heaton (1989), Chs. 8, 9; Hughes (2003), Chs. 9, 11
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Reading and Writing sections for the groups that receive feedback.

Week 10
TOPIC: Testing READING & WRITING_2
SOURCE: Alderson (2000); Buck (2001); Heaton (1989), Chs. 8, 9; Hughes (2003), Chs. 9, 11
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Reading and Writing sections for the groups that receive feedback. *Submission of Feedback 3: Saturday, 23:50. Feedback 3 content: Listening and Speaking sections.

Week 11
TOPIC: Testing LISTENING & SPEAKING_1
SOURCE: Buck (2001); Heaton (1989), Ch. 6; Hughes (2003), Ch. 12
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Listening and Speaking sections for the groups that receive feedback.

Week 12
TOPIC: Testing LISTENING & SPEAKING_2
SOURCE: Heaton (1989), Ch. 7; Hughes (2003), Ch. 10; Luoma (2004)
TASKS & DATES: Wednesday, 23:50: Submission of the revised versions of the Listening and Speaking sections for the groups that receive feedback.

Week 13
TOPIC: Self-evaluation session
SOURCE: Hughes (2003), Appendices 1-3; Heaton (1989), Ch. 11

Week 14
TOPIC: REVIEW
TASKS & DATES: Friday, 23:50: Submission of the final version of the PPPs comparing the first and last versions of the exams.


Box 2.

1. LEARNING OUTCOMES

On successful completion of this course, students will be able to
*list the differences between testing and assessment
*use basic terms and concepts related to language testing appropriately where/when necessary
*utilize summative and formative assessment techniques where necessary
*carry out various processes and practices related to the assessment of language proficiency successfully
*perform statistical analysis of testing data
*design, implement and evaluate a variety of testing instruments for specific groups of language learners
*evaluate tests and test results/items

2. MATERIALS

2.1 Main Coursebooks:
Alderson, Charles. (2000). Assessing Reading. Cambridge: Cambridge University Press.
Bachman, Lyle. (2004). Statistical Analyses for Language Assessment. Cambridge: Cambridge University Press.
Buck, Gary. (2001). Assessing Listening. Cambridge: Cambridge University Press.
Genesee, Fred & Upshur, John A. (1996). Classroom-based Evaluation in Second Language Education. Cambridge University Press.
Hatipoğlu, Çiler. (2009). Writing and Evaluating Multiple Choice Items. A training module prepared for the E-INSET Project (http://e-inset.mersin.edu.tr/).
Hatipoğlu, Çiler. (2017). Assessing speaking skills. In Ekrem Solak (Ed.), Assessment in language teaching (pp. 117-148). Pelikan Yayin Evi.
Hatipoğlu, Çiler. (2021a). Chapter 6: Testing and assessment of speaking skills, test task types, and sample test items. In Servet Çelik, Handan Çelik and Christine Coombe (Eds.), Language assessment and test preparation in English as a foreign language (EFL) education (pp. 119-173). Vizetek.
Hatipoğlu, Çiler. (2021b). Chapter 9: Assessment of language skills: Productive skills. In Sevim Inal and Oya Tunaboylu (Eds.), Language Assessment Theory with Practice (pp. 167-211). Ankara: Nobel. (ISBN: 978-625-417-220-5)
Heaton, J. B. (1990). Writing English Language Tests. London & New York: Longman.
Haladyna, Thomas M. (2004). Developing and Validating Multiple-Choice Test Items (Third Edition). Lawrence Erlbaum Associates, Publishers.
Hughes, Arthur. (2003). Testing for Language Teachers (Second Edition). Cambridge: Cambridge University Press.
Luoma, Sari. (2004). Assessing Speaking. Cambridge: Cambridge University Press.
Osterlind, Steven. (2002). Constructing Test Items: Multiple Choice, Constructed Response, Performance and Other Formats (Second Edition). Kluwer Academic Publishers.
Read, John. (2000). Assessing Vocabulary. Cambridge: Cambridge University Press.

2.2 Other readings:
Bachman, Lyle F. & Palmer, Adrian S. (1996). Language Testing in Practice: Designing and Developing Useful Language Tests. OUP.
Baker, David. (1989). Language Testing: A Critical Survey and Practical Guide. Edward Arnold.
Cameron, Lynne. (2001). Teaching Languages to Young Learners. Cambridge University Press.
Cumming, Alister & Berwick, Richard (Eds.). (1996). Validation in Language Testing. Multilingual Matters Ltd.
Davies, Alan. (1990). Principles of Language Testing. Blackwell.
Douglas, Dan & Chapelle, Carol (Eds.). (1993). A New Decade of Language Testing Research: Selected Papers from the 1990 Language Testing Research Colloquium. Alexandria, Virginia: Teachers of English to Speakers of Other Languages, Inc.
Hamp-Lyons, Liz. (1991). Assessing Second Language Writing in Academic Context. Ablex Publishing Corporation.
Harley, Birgit, Patrick Allen, Jim Cummins & Merrill Swain (Eds.). (1990). The Development of Second Language Proficiency. CUP.
Harrison, Andrew. (1983). A Language Testing Handbook. Modern English Publications.
Hatipoğlu, Çiler. (1999). Evaluation of the Boğaziçi University English Proficiency Test. MA thesis, Boğaziçi University, Department of Foreign Language Education.
Hatipoğlu, Çiler. (2010). Summative evaluation of an undergraduate ‘English Language Testing and Evaluation’ course by future English language teachers. English Language Teacher Education and Development (ELTED), 13, 1-12.
Henning, Grant. (1987). A Guide to Language Testing: Development, Evaluation, Research.
Hill, Clifford & Parry, Kate (Eds.). (1994). From Teaching to Assessment: English as an International Language. London & New York: Longman.
Kopriva, Rebecca J. (2008). Improving Testing for English Language Learners. New York & London: Routledge.
Linn, Robert L. (Ed.). (1989). Educational Measurement (Third Edition). New York & London: American Council on Education & Macmillan Publishing Company.
Low, Graham. (1985). Validity and the problem of direct language proficiency tests. In J. Charles Alderson (Ed.), Lancaster Practical Papers in English Language Education (Volume 6): Evaluation (pp. 151-168). Lancaster: Pergamon Press.
McKay, Penny. (2006). Assessing Young Language Learners. Cambridge University Press.
Oller, John W., Jr. (1991). Language and Bilingualism: More Tests of Tests. London & Toronto: Lewisburg, Bucknell University Press.
Spolsky, Bernard. (1985). What does it mean to know how to use a language? An essay on the theoretical basis of language testing. Language Testing, 2(2), 180-191.
Valdes, Guadalupe & Figueroa, Richard A. (1994). Bilingualism and Testing: A Special Case of Bias. Ablex Publishing Corporation.
Weigle, Sara Cushing. (2002). Assessing Writing. Cambridge University Press.
Woods, Anthony, Paul Fletcher & Arthur Hughes. (1986). Statistics in Language Studies. Cambridge: Cambridge University Press.


Box 3.

3. ASSESSMENT

Exam writing: 40%
In groups, students will write and re-write an exam (e.g., achievement, midterm, final) consisting of 6 parts (i.e., grammar, vocabulary, listening, reading, writing, speaking) and specifically designed for the needs of a particular group of students (e.g., 14-year-old students who have an intermediate level of proficiency in English and who will be attending school in an English-speaking country next year). (1 EXAM (10 pts) + 3 REVISED VERSIONS OF THE EXAM SUB-SECTIONS (10 pts each) = 40 pts)

Exam evaluation: 30%
In groups, students will write critical evaluations related to the format, content, and appropriateness of the exams prepared by their classmates. The students are expected to support their comments with quotations and examples from suitable sources. (3 EVALUATIONS X 10 POINTS = 30 pts)

Self-assessment presentations: 25%
The groups will prepare PPPs consisting of 23-25 slides in which they will compare and contrast the first and final versions of their exams. The presentation should include specific comments about the format, content, appropriateness, etc., of all the sections of their exams. A recording should accompany the PPP (i.e., what you need to prepare is a SLIDE SHOW PPP). These slide show PPPs will allow students to better explain the material and examples placed on their PPPs. Groups should send their PPPs to the lecturer for more thorough analysis and evaluation.

Attendance/Class participation: 5%
Students are expected to attend classes regularly and participate actively in class discussions and group work activities.

BONUS: DATA COLLECTION: 10%
Data collection tools will be available on the LMS or Google Docs. If the students are willing to undertake a data collection task, they will need to: (i) collect data from informants with the characteristics identified in the Questionnaire*; (ii) fill in the assigned questionnaires on time themselves.
*The questionnaires will be submitted to the course lecturer on the day of the Final exam, and the students will get 0.5 points for each authentic questionnaire. If the instructor finds that the students fabricated the data presented in any of the data collection tools, the students will not get any points for these questionnaires.
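As a worked check of the weighting above (the figure of 20 questionnaires is an inference from the 0.5-point rate, not stated in the syllabus): the graded components sum to 40 + 30 + 25 + 5 = 100 points, and the bonus adds up to 10 points on top, which at 0.5 points per authentic questionnaire corresponds to at most 10 / 0.5 = 20 questionnaires, for a possible maximum of 110 points.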

Exam Writing Instructions
1. Form groups consisting of 4-5 students.
2. Each group should choose a grade level (e.g., Grade 5, 6, 7, 8) and write a test specifically for that grade following the MONE books uploaded to the GOOGLE FOLDER.
3. The test should include SEVEN SECTIONS: (i) Exam Specifications, (ii) Grammar, (iii) Vocabulary, (iv) Reading, (v) Writing, (vi) Listening, (vii) Speaking.
4. Each section should have TEN questions.
5. The type of questions in each section should be different (e.g., MCIs in the Grammar section, matching questions in the Reading section, filling in the blanks in the Listening section, etc.).
6. The exam written by each group will be uploaded to the GOOGLE FOLDER (so that all groups can see them and provide feedback).
7. The deadline for the submission of the first version of the Group Exams is Wednesday, 23:50.
8. The various sections of each group's exam will be given feedback by two other groups.
9. After the feedback sessions, the groups will revise their exams, prepare a rubric, and upload the new exams to a new folder in the GOOGLE GROUP.


Section 3

Language Assessment in Education


Chapter 8

Teaching and Assessment in Young Learners' Classrooms

Belma Haznedar
https://orcid.org/0000-0002-7025-0158
Bogazici University, Turkey

DOI: 10.4018/978-1-6684-5660-6.ch008

ABSTRACT

Language learning in early childhood has been the subject of great interest in both first language (L1) and second language (L2) acquisition research. For the past 40 years, we have witnessed significant advances in the study of child language, with particular reference to the cognitive, linguistic, psychological, pedagogical, and social aspects of child language. This chapter aims to shed light on some of the theoretical paradigms and their implications for language learning and assessment in young children whose exposure to another language begins early in life. In view of the diversity facing pedagogical practices worldwide, the authors aim to show the connection between classroom practices and assessment tools appropriate for young language learners, with special reference to formative and ongoing assessment.

INTRODUCTION

It has been estimated that roughly two-thirds of the world population consists of individuals with access to more than one language (Rothman et al., 2019). With the world becoming more bi/multilingual and bi/multicultural, the pressure to improve language learning and teaching has rapidly increased in recent years. In many parts of the world, parents pay a fortune to ensure that their children receive a quality education, with well-developed language skills in more than one language from the early years on. As Johnstone (2009) notes, the introduction of foreign/second languages at the primary level is perhaps the most striking language policy in education, with English being the most commonly taught language all over the world. Indeed, due to global trends and developments, many European (e.g., Germany, Italy, the UK, Greece, Spain) and Asian countries (e.g., Korea, Japan, Thailand) have taken steps to introduce the teaching of English to children in primary schools (Enever & Moon, 2010; Garton & Copland, 2019; Kırkgöz, 2009; Mourão & Lourenço, 2015). In most of these countries, English instruction begins around age 6 (Garton et al., 2011; Jenkins, 2015; Johnstone, 2009).


Apart from economic and social reasons, the primary reason behind this trend is the assumption that early childhood is the ideal time for learning an additional language. Given that first language (L1) acquisition takes place in early childhood without any effort or systematic training, it has long been assumed that children have the capacity to learn additional languages more successfully and faster than adults (Singleton, 2005; Johnson & Newport, 1989). The main idea here is that even though there is no single critical age, the ability to learn a second language weakens over time, and the possibility of reaching native-like ultimate attainment decreases. Some researchers, however, have been critical of this view, known as 'the younger the better' position in the literature (Singleton, 2001). Today, reaching a consensus among researchers and providing a conclusive explanation in the age-of-acquisition debate appears to be a distant prospect. On the one hand, given sufficient exposure and interaction, one can argue that normally developing children can learn another language at an early age. On the other hand, there seems to be no solid empirical evidence demonstrating that early L2 beginners certainly outperform adolescent beginners when the number of instructional hours is held constant (see García-Mayo & García Lecumberri, 2003; Muñoz, 2006). Indeed, based on data from primary and secondary school students, more recent work questions the role of age of onset in language learning (Muñoz, 2008a, 2008b; Pfenninger & Singleton, 2019).

Similarly, Copland (2020) criticizes 'the younger the better' position on pedagogical grounds, with special reference to the challenges that teachers of young learners are faced with. For Copland, unlike mother tongue development, 'language learning experience in class settings is neither personalized nor intensive' (p. 11). In many countries, children start learning English as part of the primary curriculum. However, the amount of exposure they receive is rather limited, only a couple of hours per week (Garton et al., 2011). Given that children need reinforcement of learning between the weekly lessons, there is a risk that learning the language may not be as effective as expected in the long run. In other words, the quantity of exposure in the class affects children's learning (Lindgren & Muñoz, 2013; Philips, 2020). To this end, researchers now recognize that there is much more to language learning than just the age of acquisition. Educational, social, psychological and contextual factors also play significant roles in young children's second language development (Copland et al., 2014; Moyer, 2013).

Our aim in this chapter is to focus on this group of learners, whose exposure to an additional language begins in early childhood in class settings. We pay special attention to the language learning and teaching process as well as to assessment practices for young learners. The organization of the chapter is as follows: We first describe certain milestones of early childhood second language learning. Then, we discuss elements of classroom practices for young children. Next, we provide samples of assessment tools for young language learners.

THEORETICAL BACKGROUND

Who Are Young Language Learners?

On standard assumptions, young learners, also known as child second language (L2) learners or successive bilinguals, are children whose first exposure to another language occurs after their first language has been established for at least 3-5 years (Haznedar, 2013; Meisel, 2008). Despite a lack of agreement among researchers and practitioners, the upper bound differentiating successive bilinguals/child L2 learners from adult L2 learners is considered to be 7-10 years of age, which is, in fact, an issue relating to the critical period debate discussed above. Given the scope and the objectives of this paper, our major focus here concerns children aged 5-12 in the pre-primary or primary school years in formal L2 settings, as this is the most widespread population all over the world.

As this paper examines language learning and language assessment in pre-primary and primary school children, it will be useful to review some defining characteristics of children aged 3-5 and 6-12. Despite variations observed across countries, children generally start preschool around age 3 and primary education around age 6-7. In Piaget's terms, these children are assumed to be at the pre-operational stage, during which they have yet to develop a theory of mind: they are not yet able to appreciate others' views but are ego-centric, seeing the world from their own perspectives and attending to only one aspect of a task at a time (Pinter, 2011). At such a young age, the child's capacity for categorization varies. Markman (1994), for instance, notes that young children are found to be more interested in thematic (causal, temporal and spatial) relations among objects than in taxonomic relations. The child at age 3, for instance, can view the relationship between objects and labels in a different way. When shown three pictures with various labels such as milk, cow, and dog, three-year-olds appear to form thematic relations among the labels, associating cow with milk. At age 6, on the other hand, the very same child categorizes objects in terms of taxonomic relations. When shown pictures of various objects such as dog, cat, pig, cow, milk, and cheese, a child of this age tends to pay attention to taxonomic relations and forms groups such as dog, cat, pig, cow vs. milk and cheese.

Despite such remarkable changes in the child's cognitive and linguistic development in the first several years of life, elements of the concrete operational stage take time to emerge, appearing in children around 7-11 years of age. It is around this time period that children start to think in abstract terms, use analogy, and appreciate cause-effect relations. This stage is also characterized by the emergence of the ability to handle different aspects of a task and a gradual decrease in egocentric behaviour.

Although young children in the pre-operational and concrete operational stages still have a long way to go in terms of cognitive and social development, they enjoy listening to the same stories repeatedly, as the plots and the characters are familiar and easy to follow. They also enjoy playing games. At younger ages, for example, they like playing pretend games (pretending to make breakfast for their soft toys, pretending to act like animals: hopping like a rabbit, flying like a bird, etc.). Many different types of games, indoor and outdoor, contribute to the child's cognitive, linguistic, social and emotional development, through which they learn how to react, how to share, and how to act meaningfully and appropriately in a given context (Nicholas & Lightbown, 2008). Knowledge of these cognitive and social milestones in child development is of great importance to teachers who work with young learners in class settings.

Classroom Practices in Young Learners' Classrooms

In general, content in English language teaching (ELT) focuses both on formal features of the language, such as grammar and vocabulary, and on language skills, such as speaking, listening, reading and writing. Indeed, in many countries national curricula, course books and classroom practices are organized according to these divisions (Gaynor, 2014; Garton & Copland, 2019). In young learners' classrooms, teachers focus mainly on listening and speaking skills rather than on the grammatical aspects of English. This is in line with a long-standing debate about the role of grammar in language teaching.

Grammar teaching is usually not given much credit in young learners' classrooms. For many researchers, children at a young age are not ready to analyze grammatical patterns or treat language as an abstract system (Phillips, 1993). To this end, explicit grammar teaching has not been given priority in teaching language to young children (Cameron, 2001; Puchta, 2019). Despite counter-perspectives, as in Cameron's (2001) work, where grammar still has a place in language learning (see also Saraceni, 2008), the main idea here is that, for children, meaning is more important than individual words or sentences in a context, particularly at a time when they are not yet cognitively and linguistically mature.

This brings into question the appropriate and effective methodology in young learners' classrooms. According to Inbar-Lourie and Shohamy (2009), early programs range from awareness-raising to language-focus programs and from content-based curricula to immersion. In early studies, games, songs and stories are recommended tools for young learners, as they are assumed to provide not only contextualized input but also simple and repetitive speaking activities for children, which will eventually add up to their communication skills (Phillips, 1993, p. 7). On similar grounds, in a study examining the role of grammar in language teaching to children, Puchta (2019) also highlights the importance of presenting grammar and vocabulary-based language functions in stories, which provide opportunities for meaningful language use on the part of children.

This view is not far from the model known as Content-based instruction or Content and Language Integrated Learning (CLIL), according to which children learn better when immersed in natural contexts where there are opportunities for the authentic use of the language (Coyle et al., 2010). In comparison to the traditional view of language teaching, which involves teacher-centered explicit grammar instruction in the form of isolated sentences and linguistic units (Hinkel & Fotos, 2002; Long, 2000; Lyster, 2004), CLIL-based instruction refers to situations where the school curriculum is taught through another language with the aim of learning both the content and the language simultaneously (Marsh, 1994). As a widely used approach in language teaching both in the US and Europe, CLIL differs from traditional language teaching methodology (Graddol, 2006) in that in CLIL classes the teacher focuses on and assesses the subject content rather than the learners' mastery of grammatical patterns in the form of the simple past or present, for instance (Cenoz et al., 2014). The implementation of this clearly requires a new perspective and methodology in language teaching. CLIL-based instruction involves learning about various issues through the language used in the class. For example, in a typical science class, the content can be organized in such a way that global warming, environmental concerns, the eco-system, healthy food, etc. are examined within the scope of the course. In line with the principles of CLIL, classroom practices appear to have undergone important changes over the last decade.

Having briefly outlined recent approaches to classroom practices in the field of young learners, we now explore types of assessment and effective assessment tools used in young learners' classrooms, which is the next issue to be addressed in the rest of the paper.

ASSESSMENT IN YOUNG LEARNERS' CLASSROOMS

Assessment has always been a crucial component of learning and teaching in all educational systems, serving to promote learning and improve students' achievement. Schools and teachers assess students in different ways for different purposes. Sometimes students take summative tests at the end of each term or year for a final evaluation. Sometimes they take standardized tests, which provide normative data for assessment and evaluation. Sometimes they go through formative assessment, which is an ongoing process on the part of both the teacher and the students. Overall, assessment in a school context is a process of obtaining information about student progress or about a particular program implemented in the school. The primary aim is to use this information to plan future steps and educational policies over time, both for the school system and for the students.

As this paper aims to address the assessment of language skills in young learners, we will not go into the details of standardized assessment, which is not assumed to be reliable and valid for young learners (Shepard et al., 1998). For some researchers, standardized assessment should not even be given priority until grade 3, perhaps not until grade 4 (Shepard et al., 1998). Therefore, our focus in this paper is placed on informal and ongoing evaluation in language teaching, namely formative assessment tools in young learners' classrooms. In the next two sections, we hope to show a range of widely used assessment tools for young learners, who can develop these language skills in particular when teachers design and implement a clear, consistent, and planned approach to language learning.

Formative Assessment in Young Learners' Classrooms

For over 30 years, many researchers have argued that formative assessment is highly effective in increasing student learning (Black & Wiliam, 1998; Gipps, 1994; Torrance & Pryor, 2002) due to its very nature of documenting, analyzing and reflecting on the information collected from children (Hurst & Lally, 1992). This can be done via various ongoing assessment tools, such as conducting regular observations as well as collecting children's work for portfolios, which are among the widely used devices in early childhood programs (Trumbull & Lash, 2013). In essence, formative assessment is an ongoing process and involves observations of children in interesting, meaningful and challenging experiences (Bowman et al., 2001; Torrance, 2001). Heritage (2007) identifies three types of formative assessment, all contributing to the child's development: (i) assessment that happens during a lesson; (ii) assessment which is preplanned before the instruction takes place; and (iii) assessment specified in the curriculum, which leads to data collection at significant points during the learning process (p. 141). Similarly, in a detailed analysis of the features of formative assessment, Cizek (2010) emphasizes that in addition to analyzing the child's current knowledge and taking the necessary steps for reaching the expected outcomes, formative assessment also encourages children to self-monitor their own progress in terms of achieving learning goals, which in turn promotes reflection by students on their own learning processes. This is an important issue we will examine in the following sections.

Compatible with these principles, recent years have witnessed various works of the Council of Europe with special reference to early language learning assessment (Council of Europe, 2001). Much work has attempted to define principles of teaching and assessing young learners, with special reference to what young children are expected to be able to do at certain stages of their L2 development (Jang, 2014; McKay, 2006; Nikolov, 2016). These studies also led to the development of tests for young language learners which apply standards compatible with the levels in the CEFR. As noted by some researchers, however, the CEFR was not specifically designed for young learners (Hasselgreen, 2005; Papp & Walczak, 2016). Therefore, researchers and test designers have relied mainly on the elements of CLIL. As noted previously, CLIL is an approach whose focus is on the language content young learners are expected to know and the basic proficiency levels, such as A1 and A2 in the CEFR, rather than on standards-based testing. Despite some overlapping elements in the aural/oral literacy skills specified in the CEFR, however, formative assessment tools appear to be more suitable for young language learners whose exposure to an additional language is restricted to class contexts. As can be deduced from our discussion up to this point, these children come from diverse learning contexts and mostly have limited exposure to the foreign language, ranging from 2 to 5 hours per week. To this end, our discussion in the next section will focus on classroom-based ongoing assessment of these children, covering various types of teaching and assessment tools such as observations, self-assessment, portfolios, stories and songs in young learners' classrooms.

Assessment Tools in Young Learners' Classrooms

This section exemplifies process-oriented formative assessment tools. Special attention is paid to observations, portfolios and self-assessment tools, as well as to the use of stories, songs and games as part of the everyday assessment in the class.

Observations

Given the key features of formative assessment presented above, one influential assessment tool to use with young learners is conducting regular observations. Frequent observations of the child in the class form the backbone of process-oriented classroom assessment. This is perhaps among the most effective types of assessment, as it is also closely associated with the child's learning process (Janisch et al., 2007; Shepard, 2000). During observations, the teacher takes regular notes on every child in terms of participation in class activities, performance in pair work and group work, as well as completion of activities in the class. Because these notes also provide information about the child's personal development, they can be used as the basis for designing the activities that follow.

At this point, we would like to show how observational notes can support the design of a lesson and its content as well as reveal the level of understanding on the part of the child. In more specific terms, we would like to show how a sample song can foster assessment of the child with the help of observations and observational notes taken in the class. Let us think of a scenario like this: among others, one aim of the lesson is to raise first-grade children's awareness of health issues during the pandemic. The name of the song is 'Covid safety song' (www.youtube.com/watch?v=iM2GswfYri0). Let us proceed from this song and show step by step how the types of class activities based on observations and observation notes help the teacher to assess student performance.

Pre-Listening Activities

First, the teacher provides information for the students about the class activity beforehand and shows colourful pictures about the theme of the song. In this particular example, the theme is the COVID-19 coronavirus pandemic, which over the past two years has resulted in widespread school closures, especially during its early days.

During Listening Activities

Activity 1:
1. The teacher asks students to pick the words in the order they hear them while listening to the song.
2. The teacher gives students a couple of words that are not in the song but are about healthy life and hygiene, the topic of the song. Then, s/he asks which words are in the song and which ones are not. This way, the teacher can direct the students' attention to the lyrics of the song. In order for the students to know which words are in the song and which ones are not, they need to listen to the song carefully.

Activity 2: The teacher might give a true/false activity using the lyrics of the song.

Activity 3: The teacher might give a fill-in-the-blanks activity while the students listen to the song a second time.

Wash your ------------- (hands)
Don't touch your ------------- (face)
Keep your distance
Watch your ------------- (space)

After the fill-in-the-blanks activity above (Activity 3), the teacher gives feedback to each student and makes sure that every child has completed the activity properly. Providing feedback can also be done in pairs, depending on the purpose and the design of the lesson plan. This way, the students will have a chance to collaborate and interact with each other.

Post-Listening Activities

1. The teacher distributes the lyrics of the song to the students in a mixed order and asks them to put the sentences into the correct order while listening again.
2. In case some students find it difficult to do the activity, the song is played once again (https://www.youtube.com/watch?v=iM2GswfYri0).
3. In another activity, the teacher asks the students to list the words repeated in the song or the ones that rhyme. The rhymed words are given in bold in the following example:

Wash your hands
Don't touch your face
Keep your distance
Watch your space
If you think you're going to sneeze
Sneeze into your elbow please…

In the sample song above, phrases such as 'your face, your space' show the rhymed words. The use of such songs in teaching raises students' awareness of phonological processes in English. Additionally, if the pronunciations of some sounds pose difficulties for the students, they will have a chance to practice the problematic sounds.

4. The teacher and the students might sing the song once again before the end of the session as a whole group. Singing as a whole group might encourage shy and silent students to join the activities. Efforts must be made to get shy students involved in the activities so that they do not feel isolated in the class. Those students who appear to face difficulties in participating in the singing due to the number of unknown words or new grammatical patterns may be asked to say the repeated patterns in the song.
5. If this lesson is conducted with very young children learning English as a second language in pre-primary or primary school years, they may be asked to draw a picture about the theme of the song. For instance, a little child might draw a picture of a doctor talking with a girl/boy. A single drawing may help the child to develop concepts relating to knowledge and understanding of the world; in this example, issues relating to health, hygiene, the pandemic, safety, etc. To this end, children's drawings can give ideas about their feelings and experiences of the world (Anning & Ring, 2004), and analyses of them can convey messages about the way they learn and progress. However, given individual differences among children, it is extremely important that a drawing activity does not turn into an unnecessary competition among children, which might have a detrimental impact on their motivation.
6. Another related activity with this song can be designed as follows: The teacher writes down some of the words of the song on a piece of paper, cuts out pictures related to the theme of the song, and puts them all into an envelope. Each student, pair or group can pick an envelope and talk about the pictures. It can turn into a cause/effect type of activity with grammatical patterns to be practiced: If your test is positive, stay home; if you are feeling bad, go and see a doctor; stay home and keep your distance.

From a closer perspective, all the activity types related to the song 'Stay safe' give valuable observational data to the teacher about student performance and yield sound information about student learning. They also provide a platform for the teacher to clarify areas in the lesson that students are uncertain about and to find solutions to any problems encountered during the conduct of the lesson. The following questions have the potential to capture the nature of the observational notes that the teacher can make use of:

1. While listening to the song, how did the students respond to the video and the colourful pictures used?
2. How did the students interact with each other during the pair or group work? Did they help each other during the conduct of the activities?
3. Who was able to complete the activities in the 'during listening to the song' phase? Who was not able to complete them, and why?
4. Who was able to complete the activities in the 'post-listening to the song' phase? Who was not able to complete them, and why?

The teacher can compile observational notes with the questions above and make an analysis of each student's performance. By answering these questions for each child, the teacher can document children's learning in detailed terms. This in fact forms the backbone of assessment anchored in the class (Carr & Claxton, 2002). Documenting observations not only helps with planning the follow-up lessons but also conveys a great deal about children's understanding of many aspects of the lesson. Put differently, the analysis of which part of the lesson went well, which part failed, which part embraced almost all children in the class, which part was found easy, and which part posed challenges to the majority can be used to assess cognitive, linguistic and social aspects of their development. This approach offers the teacher the unique opportunity to reflect on the learning process in order to identify how to move from the current state to the next step.

What we have discussed so far was mostly associated with the teacher's documentation of observational notes for each and every single child in the class. However, the teacher can also use and revise these observational notes to set further instructional goals, which will eventually lead to professional development in the form of self-reflective inquiry (Butler, 2002). As a matter of fact, using such formative assessment tools and incorporating them into class practice adds to the teacher's professional development. This way, the teacher can learn not only how to assess the specific needs of the students but also how to ground the instructional choices he/she makes in the following steps.

In sum, as we have attempted to show in the context of a sample song in this section, regular observations, taking systematic notes during observations, and analyzing them for improving classroom teaching have far-reaching outcomes for students' language learning processes. It could be that children show better performance in certain tasks and activities, but not necessarily in others. Keeping a regular track of student performance in this sense provides the teacher with the knowledge of how to design and modify the following sessions in line with their developmental patterns. To this end, whether conducted individually or in pairs or groups, observation notes are a great asset for continuous assessment, showing the stages of development children go through during the language learning process. Moreover, as noted above, observation notes also contribute to the teacher's professional development and self-assessment, letting them know about the weaknesses and challenges as well as the strengths of the methodology adopted in language teaching.

The notion of self-assessment applies to learners, too. Indeed, consistent with advances in the field of language teaching, recent years have seen a rise in research on learner autonomy and self-assessment. The next section, therefore, briefly reviews crucial elements of self-assessment in young learners' classrooms.

Self-Assessment

As has been discussed previously, an important feature of formative assessment concerns the notion of self-assessment, which is also given much attention in the CEFR. The main idea here is that young language learners need to become aware of their learning processes. In order to ensure this, the teacher creates an atmosphere in which children develop skills to assess and evaluate their own learning over the course of time. For instance, when they feel ready, they can fill in charts or speech bubbles about the tasks they have successfully completed and the level of language skills they have achieved. The following chart is a simple example, showing how young language learners can express what they can do in relation to the theme of 'family and friends'.


Figure 1. A sample self-assessment chart on the theme of 'family and friends'

Overall, the ultimate aim of self-assessment is to make sure that children can make use of the knowledge gained in this process and organize their own learning in the following years. One way to help them organize their own learning and self-reflection is to support them in creating portfolios of their own work, which is the next issue we examine as part of ongoing assessment.

Portfolios

Portfolios are collections of work samples children produce during their learning process (Puckett & Black, 2000). If used effectively, they provide a flexible platform for children to review their own learning while selecting some samples of their own choice. Language teachers and educators have been creating language portfolios since the 1990s, in line with the work produced by the Council of Europe (Little et al., 2011; Schärer, 2000). Various country-based language portfolios have been modeled and tested in many European countries. The CEFR-based language portfolio has three major components: (i) the language passport contains some demographic and factual information about the learner, including any certificates or qualifications; (ii) the language biography gives the learner's historical background of the learning process, along with some notes or checklists as well as self-assessment materials the learner keeps over time; (iii) the language dossier is a collection of sample works the learner selects and compiles over the course of time. These might involve samples of any kind of product, ranging from a simple mask to various online or offline products. As noted by Puckett and Black (2000), these sample works also provide evidence of the child's progress in relation to the learning goals of the curriculum. Given recent advances in digital technologies and their integration into classroom practices, portfolios have the potential to present a lot of information about the child's learning processes (Boardman, 2007). Moreover, as has been the case in school systems over the last several decades, language portfolios can also be used as a tool to provide information about the child's progress to parents during their school visits, where children themselves present their own work to others. Sharing portfolio work in such a way contributes to children's ability to think about, analyze and discuss their work with others. Overall, the crucial point is that portfolios are not only connected to children's direct learning experiences (Frey et al., 2012) but, perhaps more importantly, are student-developed and can be used as learning goals (Litchfield & Dempsey, 2015).

As the type of assessment we adopt within the scope of this paper is formative assessment, which is assumed to be embedded into the teaching practices in the class, the next section focuses on the use of storybooks with young children and shows how they can be used as effective tools for both language teaching and assessment in the class.

Stories

Another influential teaching and assessment tool to use with young learners is a story-based methodology. The use of stories has long been argued to be one of the most effective teaching and learning tools in terms of providing young learners with meaningful, motivating, and fun learning contexts in language classrooms (Brewster, Ellis & Girard, 2002; Davies, 2007; White, 1995, 1997). Starting even before formal education, active involvement with stories is important because, if they are used effectively, they have the potential to contribute not only to young learners' language and literacy skills but also to their cognitive abilities. As has been noted by many researchers for years, stories allow children to make links between real life and their imaginative worlds (Brewster et al., 2002) and develop continuity in children's learning, in particular with respect to building connections between school subjects across the curriculum (Ellis & Brewster, 1991; Davies, 2007). Since young learners mainly focus on meaning rather than specific grammar patterns, their oral skills and literacy as well as their vocabulary knowledge improve at the same time.

Much work in the field of English language teaching has long emphasized the integration of literature into language teaching (Collie & Slater, 1990; Fojkar et al., 2013; Ghosn, 2002; Miller & Pennycuff, 2008). For many researchers, a teaching program based on authentic and colorful storybooks motivates young children to read and supports their academic, emotional, and intercultural development. Storybooks also contribute to children's oral and literacy skills (Krashen, 2004; Laufer & Rozovski-Roitblat, 2011; Liu & Zhang, 2018). In a detailed early study on book-based programs, also known as book floods, Elley (1989) observes that book floods have positive effects on young learners' incidental vocabulary development. While reading books, the students learn new vocabulary items without getting explicit instruction about them. On similar grounds, Liu and Zhang's (2018) recent meta-analysis shows the impact of graded readers on vocabulary learning, in particular when accompanied by supplementary materials such as questions and vocabulary worksheets.

Stories also integrate oral and written language through extensive meaningful input. As pointed out by Dlugosz (2000), using stories in young learners' classrooms from the beginning will lead to improvement in reading as well as in understanding and speaking the other language. From a similar perspective, Kim and Hall (2002) discuss the effects of an interactive book reading program and suggest that such programs contribute to learners' pragmatic competence. An immediate question here concerns how stories can be integrated into classroom practice and used as part of assessment. The next section, therefore, focuses on classroom practices with special reference to story use in language teaching.

How to Use Stories With Young Learners?

Initially, the teacher might start with well-known universal stories or fairy tales, as they are generative and children can identify with universal themes like hope, love, happiness, fear or anger, no matter which country they live in (Ghosn, 2002; McQuillan & Tse, 1998; Rixon, 1995). This way, young learners would probably also know the story in their mother tongue and would be able to follow it better at a time when they have not yet developed the necessary language skills in the new language.

In addition to their rich content and themes, stories offer many advantages for young children from a linguistic perspective. As can be seen in the famous Snow White and the Seven Dwarfs story, for instance, it is possible to find recurring structural patterns and phrases such as Who's been sitting on my stool?, Who's been eating off my plate?, Who's been picking my bread?, Who's been meddling with my spoon?, Who's been handling my fork?, etc. In line with the number of dwarfs in the story, the very same structural pattern with 'who's been', not taught in either primary or secondary school until later years in the Turkish ELT curriculum, for instance, is repeated at least seven times, providing a meaningful context for either the introduction or the revision of the structural patterns under discussion. Such recurring patterns pave the way for children in various ways. Those who have not paid attention to such forms will have a chance to notice them, and those who have already noticed them will have a chance to interact with peers and use them in meaningful ways. Given that an important aspect of this paper is classroom-based ongoing assessment, let us now examine how these resources can be used in young learners' classrooms.

Methodology in Using Stories

Similar to the use of songs exemplified in the previous sections and other types of task-based activities, a story-based methodology and assessment in young learners' classrooms can involve various phases, such as pre-storytelling, while-storytelling, and post-storytelling.

Pre-Storytelling/Reading

Pre-storytelling requires careful planning on the part of the teacher. Before using the story, the teacher should first decide on the objectives of the lesson and predict the potential outcomes of the particular story chosen. The objectives specified for a particular lesson can range from cultural and linguistic objectives to cross-curricular objectives to be built with the other courses. Both of these initial efforts are closely related to the assessment procedures to be used during the learning process.

The teacher should also decide how to make the content of the story accessible to young learners. This can be done through visual materials, introducing the characters of the story before reading, and relating the story to children's own experiences. As is known, young children usually identify themselves with the characters and situations in the story. Therefore, the teacher can prepare activities that allow them to associate learning with their real-life experiences. Let us suppose that the theme of the lesson is wild animals and their habitats. Before introducing the 'wild animals' in the story, the teacher might ask children about their experiences at a zoo, the kinds of animals they have seen before, and the activities or fun they have had at the zoo. The crucial point here is that at least some portions of learning are based on learners' real-life experiences, one of the basic mental processes young learners employ during their daily routines. A learning environment like this not only encourages children's participation but also provides the teacher with insight into the way children understand the content presented to them.

During Storytelling/Reading

One crucial point while reading/telling the story is to make sure that young learners are engaged in doing something related to the story (Wright, 1997). For example, while reading/telling the story, the teacher may ask students to put the pictures of the characters in the story into their order of appearance. Alternatively, depending on whether they are literate or not yet literate in the second language, children can put the sentences from the story into order. These activities will certainly allow the teacher to gather information about the way children process and understand the content.

Post-Story Reading/Telling

For the post-reading stage, the teacher can decide in advance which activities to use to consolidate the language taught in the story. Games compatible with the objectives of the lesson could be ideal at this stage. A bingo game related to the theme covered in the lesson, for instance, might consolidate the learners' speaking and listening skills. Post-story reading/telling activities can also involve other types of games played in pairs/groups, such as guessing games or class surveys, or producing a picture dictionary based on the theme of the lesson.

As shown in detailed terms in the previous sections, songs can also be effective tools not only for consolidating what has been presented to young learners in a story but also for assessing how well they have understood the story and the accompanying activities, including comprehension questions as well as creative questions such as: What do you see in this picture?, Who do you think is the main character in this story?, What do you think these characters are doing?, If you were a character in this story, which character would you want to be and why? Asking such questions might motivate students to produce in the target language and helps the teacher understand the way children would like to communicate and express themselves, sometimes through using the language taught, sometimes via gestures or other ways of interaction (MacNaughton & Williams, 2004). Even if students sometimes fail to produce some patterns or vocabulary and resort to their mother tongue, this creates a natural environment for the teacher to supply his/her students with the required tools in the target language. With careful planning and regular observations, the teacher can use all of this information for assessment purposes and prepare more tasks and activities in line with the themes specified in the program. Moreover, adapting Butler's (2002) perspectives on self-reflection, we think that the following questions have the potential to inform classroom instruction as well as establish accountability on the part of the teacher:

1. Which parts of the story and the activities went well?
2. In which parts did the students have problems in understanding and completing the tasks?
3. Which parts of the lesson plan and the activity types need to be revised?
4. What exactly did the students learn in this particular story?
5. Were there any activities I as a teacher planned but somehow could not conduct during the lesson?
6. What was the problem? Were the difficulties related to timing or the materials I prepared, or were the activities inappropriate for the level and the interests of my students?
7. Which methods, materials and techniques did I use to ease my students' learning? Which ones did I not use? Why?
8. Were there any students whom I could not get connected to during the story session? If yes, how can I compensate for this next time?

As can be seen in the questions above, during a story session the teacher can observe and assess not only child performance in the class but also his/her own teaching skills. Together with the children's and teachers' self-assessment skills we have addressed up to this point, all these classroom-based evaluations have far-reaching outcomes for the future and require collaboration among teachers, school administration and even parents. However, sometimes all these efforts might get disrupted due to unexpected circumstances, as in the case of the global Covid-19 pandemic. Since March 2020, almost all of us have found ourselves having to adapt quickly to dramatically different teaching contexts in response to the pandemic, which has resulted in online teaching as well as widespread school, college and university closures. This includes the young age range we have examined within the scope of this paper. Therefore, before closing our discussion on formative assessment, a few words are spared for story-based online assessment in young learners' classrooms.

Online learning platforms may allow us to explore various methods of formative assessment via self-assessment and collaborative interaction. Based on some activity sheets the teacher prepares from the storybook used in the lesson, students may initially complete the assigned activities offline. When they come to the class, they can check the answers depending on the nature of the task designed, sometimes focusing on speaking skills, sometimes on reading and writing skills. This way, children will have an opportunity to go over the story once again and consolidate what they have grasped. They can also complete the offline activities in advance in pairs or groups while collaborating with others, writing their own stories, for instance. Back in class, the teacher leads them to check their understanding by asking questions such as: Tell us what this picture is about; Add another sentence to your friend's story; What is different in your friend's story? As discussed previously, the teacher can assess what has really worked during the lesson and what has not, what shows that the children have learned something, and what further teaching skills and materials can be used to promote children's learning.

Overall, whether conducted online or offline, teachers need to check children's progress systematically and gather information on areas where they might need extra support. By using this information, teachers will have a chance to plan curriculum content and methodology for the upcoming lessons.

CONCLUSION

In this paper, we have examined a number of issues surrounding assessment with particular reference to young learners' classrooms. We first dealt with the introduction of foreign/second languages in early years education, then moved on to the characteristics of young learners and the classroom practices used with this group of learners. In the second section of the paper, we presented various types of assessment tools in young learners' classrooms, with special reference to formative assessment tools such as observations, portfolios, self-assessment and stories. While most of the ideas presented within the scope of this chapter apply to learners of various age ranges, our focus has been placed on young second language learners, who learn through hands-on experiences, games, songs and stories rather than paper-and-pencil activities. Therefore, we have particularly focused on formative assessment tools, with a focus on ongoing assessment in the class, and have exemplified how these tools provide meaningful and useful information for teachers. We have highlighted the idea that formative assessment in young learners' classrooms is closely associated with actual classroom practices, including all types of observations of students' naturally occurring responses during class activities, stories, songs and games. Overall, we have endorsed the view that assessment needs to be embedded in the learning and teaching processes through teachers' day-to-day notes, observations or stories, which have the potential to provide rich contexts for learning. As we have also suggested, however, there is more to do given the challenges we have faced during the pandemic. Given the widespread and sustained nature of the virus and its variants perhaps in the years to come, further work needs to explore how to plan accessible online and offline lessons and assessments together, and how best to engage young learners and build relationships with them effectively.

REFERENCES

Anning, A., & Ring, K. (2004). Making sense of children's drawings. Open University Press.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. doi:10.1080/0969595980050102
Boardman, M. (2007). "I know how much this child has learned. I have proof!" Employing digital technologies for documentation processes in kindergarten. Australian Journal of Early Childhood, 32(3), 59–66. doi:10.1177/183693910703200309
Bowman, B., Donovan, S., & Burns, S. (2001). Eager to learn: Educating our pre-schoolers. Report of the Committee on Early Childhood Pedagogy, Commission on Behavioural and Social Sciences and Education, National Research Council. Washington, DC: National Academy Press.
Brewster, J., Ellis, G., & Girard, D. (2002). The primary English teacher's guide. Penguin.
Butler, D. L. (2002). Qualitative approaches to investigating self-regulated learning: Contributions and challenges. Educational Psychologist, 37(1), 59–63. doi:10.1207/S15326985EP3701_7
Cameron, L. (2001). Teaching languages to young learners. Cambridge University Press. doi:10.1017/CBO9780511733109
Carr, M., & Claxton, G. (2002). Tracking the development of learning dispositions. Assessment in Education: Principles, Policy & Practice, 9(1), 9–37. doi:10.1080/09695940220119148
Cenoz, J., Genesee, F., & Gorter, D. (2014). Critical analysis of CLIL: Taking stock and looking forward. Applied Linguistics, 35(3), 243–262. doi:10.1093/applin/amt011
Cizek, G. (2010). An introduction to formative assessment: History, characteristics, and challenges. In H. Andrade & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 3–17). Routledge.
Collie, J., & Slater, S. (1990). Literature in the language classroom: A resource book of ideas and activities. Cambridge University Press.
Copland, F. (2020). To teach or not to teach: English in primary schools. In H. H. Uysal (Ed.), Political, pedagogical and research insights into early language education (pp. 10–18). Cambridge Scholars Publishing.
Copland, F., Garton, S., & Burns, A. (2014). Challenges in teaching English to young learners: Global perspectives and local realities. TESOL Quarterly, 48(4), 738–762. doi:10.1002/tesq.148
Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Council of Europe, Modern Languages Division.
Coyle, D., Hood, P., & Marsh, D. (2010). CLIL: Content and language integrated learning. Cambridge University Press. doi:10.1017/9781009024549
Davies, A. (2007). Storytelling in the classroom: Enhancing traditional oral skills for teachers and pupils. Questions Publishing.
Dlugosz, D. W. (2000). Rethinking the role of reading in teaching a foreign language to young learners. ELT Journal, 54(3), 284–290. doi:10.1093/elt/54.3.284
Elley, W. B. (1989). Vocabulary acquisition from listening to stories. Reading Research Quarterly, 24(2), 174–187. doi:10.2307/747863
Ellis, G., & Brewster, J. (1991). The storytelling handbook for primary teachers. Penguin.
Enever, J., & Moon, J. (2010). A global revolution: Teaching English at primary school. Metropolitan University.
Fojkar, M. D., Skela, J., & Kovač, P. (2013). A study of the use of narratives in teaching English as a foreign language to young learners. English Language Teaching, 6(6), 21–28.
Frey, B. B., Schmitt, V. L., & Allen, J. P. (2012). Defining authentic classroom assessment. Practical Assessment, Research & Evaluation, 17(2), 1–18.
Garcia Mayo, M. P., & Garcia Lecumberri, M. L. (Eds.). (2003). Age and the acquisition of English as a foreign language: Theoretical issues and fieldwork. Multilingual Matters. doi:10.21832/9781853596407
Garton, S., & Copland, F. (Eds.). (2019). The Routledge handbook of teaching English to young learners. London: Routledge.
Garton, S., Copland, F., & Burns, A. (2011). Investigating global practices in teaching English to young learners. ELT Research Papers, 11(1), 1–24.
Gaynor, B. (2014). From language policy to pedagogic practice: Elementary school in Japan. In S. Rich (Ed.), International perspectives on teaching English to young learners (pp. 66–86). Houndmills: Palgrave Macmillan.
Ghosn, I. (2002). Four good reasons to use literature in primary schools. ELT Journal, 56(2), 172–179. doi:10.1093/elt/56.2.172
Gipps, C. V. (1994). Beyond testing: Towards a theory of educational assessment. Routledge.
Graddol, D. (2006). English next. British Council Publications.
Hasselgreen, A. (2005). Assessing the language of young learners. Language Testing, 22(3), 337–354. doi:10.1191/0265532205lt312oa
Haznedar, B. (2013). Child second language acquisition from a generative perspective. Linguistic Approaches to Bilingualism, 3(1), 26–47. doi:10.1075/lab.3.1.02haz
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140–145. doi:10.1177/003172170708900210
Hinkel, E., & Fotos, S. (2002). New perspectives on grammar teaching in second language classrooms. Routledge.
Hurst, V., & Lally, M. (1992). Assessment and the nursery curriculum. In G. Blenkin & A. V. Kelly (Eds.), Assessment in early childhood education (pp. 46–68). Paul Chapman.
Inbar-Lourie, O., & Shohamy, E. (2009). Assessing young language learners: What is the construct? In M. Nikolov (Ed.), The age factor and early language learning (pp. 83–96). Mouton de Gruyter.
Jang, E. E. (2014). Focus on assessment. Oxford University Press.
Janisch, C., Liu, X., & Akrofi, A. (2007). Implementing alternative assessment: Opportunities and obstacles. The Educational Forum, 71(3), 221–230. doi:10.1080/00131720709335007
Jenkins, J. (2015). Global Englishes. Routledge.
Johnson, J., & Newport, E. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21(1), 60–99. doi:10.1016/0010-0285(89)90003-0 PMID:2920538
Johnstone, R. (2009). An early start: What are the key conditions for generalized success? In J. Enever, J. Moon, & U. Raman (Eds.), Young learner English language policy and implementation: International perspectives (pp. 31–41). Garnet Publishing.
Kim, D., & Hall, J. K. (2002). The role of an interactive book reading program in the development of second language pragmatic competence. Modern Language Journal, 86, 332–348.
Kırkgöz, Y. (2009). English language teaching in Turkish primary education. In J. Enever, J. Moon, & U. Raman (Eds.), Young learner English language policy and implementation: International perspectives (pp. 189–195). Garnet Education.
Krashen, S. (2004). The power of reading: Insights from the research. Libraries Unlimited.
Laufer, B., & Rozovski-Roitblat, B. (2011). Incidental vocabulary acquisition: The effects of task type, word occurrence and their combination. Language Teaching Research, 15, 391–411.
Lindgren, E., & Muñoz, C. (2013). The influence of exposure, parents and linguistic distance on young European learners' foreign language comprehension. International Journal of Multilingualism, 10(1), 105–129.
Litchfield, B., & Dempsey, J. (2015). Authentic assessment of knowledge, skills, and attitudes. New Directions for Teaching and Learning, 142, 65–80.
Little, D., Goullier, F., & Hughes, G. (2011). The European language portfolio: The story so far (1991–2011). Council of Europe.
Liu, J., & Zhang, J. (2018). The effects of extensive reading on English vocabulary learning: A meta-analysis. English Language Teaching, 11(6), 1–10.
Long, M. H. (2000). Focus on form in task-based language teaching. In R. D. Lambert & E. Shohamy (Eds.), Language policy and pedagogy: Essays in honor of A. Ronald Walton. John Benjamins.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in Second Language Acquisition, 26, 399–432.
MacNaughton, G., & Williams, G. (2004). Teaching young children: Choices in theory and practice. Open University Press.
Markman, E. M. (1994). Constraints on word meaning in early language acquisition. Lingua, 92, 199–227.
Marsh, D. (1994). Bilingual education & content and language integrated learning. International Association for Cross-cultural Communication, Language Teaching in the Member States of the European Union (Lingua). Paris: University of Sorbonne.
McKay, P. (2006). Assessing young language learners. Cambridge University Press.
McQuillan, J., & Tse, L. (1998). What's the story? Using the narrative approach in beginning language classrooms. TESOL Journal, 7, 18–23.
Meisel, J. M. (2008). Child second language acquisition or successive first language acquisition? In B. Haznedar & E. Gavruseva (Eds.), Current trends in child second language acquisition (pp. 55–82). Amsterdam: John Benjamins.
Miller, S., & Pennycuff, L. (2008). The power of story: Using storytelling to improve literacy learning. Journal of Cross-Disciplinary Perspectives in Education, 1(1), 36–43.
Mourão, S., & Lourenço, M. (2015). Early years second language education: International perspectives on theory and practice. Routledge.
Moyer, A. (2013). Foreign accent: The phenomenon of non-native speech. Cambridge University Press.
Muñoz, C. (2006). Age and the rate of foreign language learning. Multilingual Matters.
Muñoz, C. (2008a). Symmetries and asymmetries of age effects in naturalistic and instructed L2 learning. Applied Linguistics, 29, 578–596.
Muñoz, C. (2008b). Age-related differences in foreign language learning: Revisiting the empirical evidence. International Review of Applied Linguistics in Language Teaching, 46, 197–220.
Newport, E. L. (1991). Contrasting conceptions of the critical period for language. In S. Carey & R. Gelman (Eds.), The epigenesis of mind (pp. 111–130). Erlbaum.
Nicholas, H., & Lightbown, P. (2008). Defining child second language acquisition, defining roles for L2 instruction. In J. Philp, R. Oliver, & A. Mackey (Eds.), Second language acquisition and the younger learner: Child's play? (pp. 27–51). John Benjamins.
Nikolov, M. (2016). Trends, issues, and challenges in assessing young language learners. In M. Nikolov (Ed.), Assessing young learners of English: Global and local perspectives (pp. 1–18). Springer.
Papp, S., & Walczak, A. (2016). The development and validation of a computer-based test of English for young learners: Cambridge English young learners. In M. Nikolov (Ed.), Assessing young learners of English: Global and local perspectives. Springer.
Pfenninger, S. E., & Singleton, D. (2019). Starting age overshadowed: The primacy of differential environmental and family support effects on L2 attainment in an instructional context. Language Learning, 69(1), 207–234.
Philips, M. (2020). Multimodal representations for inclusion and success. In H. H. Uysal (Ed.), Political, pedagogical and research insights into early language education (pp. 70–81). Cambridge Scholars Publishing.
Phillips, S. (1993). Young learners. Oxford University Press.
Pinter, A. (2011). Children learning second languages. Palgrave Macmillan.
Puchta, H. (2019). Teaching grammar to young learners. In S. Garton & F. Copland (Eds.), The Routledge handbook of teaching English to young learners (pp. 203–219). Routledge.
Puckett, M., & Black, J. (2000). Authentic assessment of the young child. Prentice Hall.
Rixon, S. (1995). The role of fun and games activities in teaching young learners. In C. Brumfit, J. Moon, & R. Tongue (Eds.), Teaching English to children: From practice to principle (pp. 33–48). Longman.
Rothman, J., González-Alonso, J., & Puig-Mayenco, E. (2019). Third language acquisition and linguistic transfer. Cambridge University Press.
Saraceni, M. (2008). Meaningful form: Transitivity and intentionality. ELT Journal, 62(2), 164–172.
Schärer, R. (2000). European language portfolio pilot project phase (1998–2000): Final report. Council of Europe. https://rm.coe.int/16804586bb
Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14.
Shepard, L., Kagan, S., & Wurtz, E. (Eds.). (1998). Principles and recommendations for early childhood assessments. National Education Goals Panel.
Singleton, D. (2001). Age and second language acquisition. Annual Review of Applied Linguistics, 21, 77–89.
Singleton, D. (2005). The Critical Period Hypothesis: A coat of many colors. International Review of Applied Linguistics in Language Teaching, 43, 269–285.
Torrance, H. (2001). Assessment for learning: Developing formative assessment in the classroom. International Journal of Primary, Elementary and Early Years Education, 29(3), 26–32.

154

 Teaching and Assessment in Young Learners’ Classrooms

Torrance, H., & Pryor, J. (2002). Investigating formative assessment, teaching and learning in the classroom. Open University Press, McGraw Hill. Trumbull, E., & Lash, A. (2013). Understanding formative assessment: Insights from learning theory and measurement theory. WestEd. Wright, A. (1995). Storytelling with children. Oxford University Press. Wright, A. (1997). Creating stories with children. Oxford University Press.

ADDITIONAL READING

Babaee, M., & Tikoduadua, M. (2013). E-Portfolios: A new trend in formative writing assessment. International Journal of Modern Education Forum, 2(2), 49–56.

British Council Publications. (n.d.). Assessing young learners: A toolkit for teacher development. Retrieved from https://www.teachingenglish.org.uk/sites/teacheng/files/Assessing_young_learners_TEv10.pdf

Caudwell, G. (2020). Assessing young learners. British Council Publications. Retrieved from https://www.britishcouncil.org/exam/aptis/research/projects/assessment-literacy/assessing-young-learners

Guddemi, M., & Case, B. J. (2004). Assessing young children. Pearson Assessment Report. Pearson Education.

Heritage, M. (2010). Formative assessment: Making it happen in the classroom. Corwin. doi:10.4135/9781452219493

Nikolov, M. (2016). Trends, issues and challenges in assessing young language learners. In M. Nikolov (Ed.), Assessing young learners of English: Global and local perspectives. Springer.

Nikolov, M., & Timpe-Laughlin, V. (2021). Assessing young learners' foreign language abilities. Language Teaching, 54(1), 1–37. doi:10.1017/S0261444820000294

KEY TERMS AND DEFINITIONS

Formative Assessment: Monitoring student learning and providing ongoing feedback over time.

Language Portfolio: The collection of children's productions during language learning.

Self-Assessment: Making self-evaluations and judgments about an individual's learning process.

Young Learners: Children aged 5-12 in the pre-primary or primary school years in formal L2 settings.


Chapter 9

The Long-Term Washback Effect of University Entrance Exams:

An EFL Learner and Teacher's Critical Autoethnography of Socialization

Ufuk Keleş
https://orcid.org/0000-0001-9716-640X
Faculty of Educational Sciences, Bahçeşehir University, Turkey

ABSTRACT

In this chapter, the author explores the long-term washback effects of taking the nationwide university entrance exam (UEE) on his L2 socialization. He scrutinizes how he used his agency to break away from such effects in his later life. His theoretical framework incorporates L2 socialization theory and the concept of "desire" in TESOL. Methodologically, he employs critical autoethnography of socialization. The findings reveal that his L2 socialization was shaped by studying for and being taught to the test (aka the UEE), which greatly helped him earn a place at a top-notch university yet created many obstacles in his undergraduate studies and professional life. The findings further showed that the UEE's format was, to some extent, egalitarian in that it provided high schoolers from low socio-economic status families with the opportunity to study in prestigious universities.

INTRODUCTION

My history of learning, using, and teaching the English language goes back to 1994, when I started learning English at high school at the age of fourteen. Upon receiving a rather high score from the UEE in 1998, I earned a place at Boğaziçi University, where I studied English language and literature for five years. Upon graduation, I worked as an English instructor first at a language school from 2003 to 2006, and next at Yıldız Technical University for almost 13 years. I received my MA degree in TEFL from Bilkent University in the 2012-13 educational year.

DOI: 10.4018/978-1-6684-5660-6.ch009

Copyright © 2023, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.


In 2016, I quit my teaching career to pursue my doctoral studies in the Curriculum and Instruction Department at the University of Alabama, which I completed in 2020. Currently, I am working as an assistant professor at Bahçeşehir University's ELT department. I started off the chapter with the brief personal information in italics above for I believe it is crucial for autoethnographers to provide their readers with some autobiographical data as early and as thoroughly as possible (Keleş, 2022a). This way, I believe my readers - you - will soon start making connections between my personal narrative(s) and your own experiences. Doing so, you may approach my thoughts, beliefs, and emotions with a comparative and dialogic lens as well. That said, I expect you to engage actively with this autoethnographic narrative of mine, in which I reflect on the long-term washback effects of the university entrance exam (UEE) on my academic and professional life. The UEE consists of multiple-choice questions only and measures test takers' grammar and vocabulary knowledge along with their reading skills. As a result, these test takers strategically ignore developing their communicative skills (i.e., listening, writing, and speaking) while preparing for the exam in high school (Sayın & Aslan, 2016) – and I am one of them. This is the reason why I am interested in the UEE. Just like in the lives of thousands of high schoolers taking it, the UEE has played a substantial role in my language learning, using, and teaching history. At this point, I must note that I am not aiming for a personal narrative that is generalizable to a larger population. On the contrary, I ask you to read it as a particular story - a mundane one that I can only hope will resonate with yours. The informed, academic, and critical voice in me tells me that the UEE is one of the core elements of the education system nationwide, especially in the field of ELT (Hatipoğlu, 2016). Across the country, our language classrooms cannot go further than teaching reading, grammar, and vocabulary in most cases (Akın, 2016; Sayın & Aslan, 2016; Yıldırım, 2010). Although the standardized compulsory education system requires language learners to take a minimum of 76 lesson hours of English from the fourth grade onwards (Kırkgöz, 2008), Turkey lags behind other countries (Kasap, 2019; Yaman, 2018) since we are unable to achieve the desired high-quality English education (Erdoğan & Savaş, 2022). Surely, there are many contextual reasons that feed into this undesired, ineffective, and vicious education system (Hatipoğlu, 2016). One main reason is the UEE that language department students take in the last year of high school or after finishing it. In preparation for the UEE, these students cram for grammar rules and all possible exceptions to them, develop their test-taking strategies, and learn as much vocabulary as possible (Hatipoğlu, 2016). To support these students, their English teachers also focus on grammar and vocabulary instruction as they feel compelled to teach to the test. According to Shohamy et al. (1996), negative washback occurs when teachers feel pressured and anxious because their teaching performance is assessed by their students' test-taking performance. Likewise, in order to be regarded as a 'good' teacher, many English teachers in Turkey are oftentimes compelled to neglect working on developing the communicative skills of their students, who will soon become pre-service English teachers (Hatipoğlu, 2016).
Once they complete their formal ELT education, as novice English teachers, they revisit their own language learning experiences and teach English in the way they were taught, which Lortie (1974, p. 2) calls "apprenticeship of observation." As a result, teaching grammar and vocabulary becomes the norm; and with it comes the vicious circle of I-understand-but-can't-speak-English. Not everyone in Turkey was, has been, or will be the victim of this vicious circle. However, having learned English for almost three decades and taught it for about twenty years, my experience tells me that it is a long-standing nationwide problem – at least for many of my friends, students, and colleagues who have taken the UEE to pursue an English language-related degree at university and later become English teachers.


As noted by Alderson and Wall (1993, p. 1), "tests are held to be powerful determiners of what happens in the classroom". The main purpose of this autoethnographic study is to explore what happened to me first as a learner and later as a teacher in the classrooms before and after taking the UEE. Agreeing with Ellis (2009), who asks "Who knows better the right questions to ask than a social scientist that has lived through the experience?" (p. 102), I seek answers to the research questions below:

1. How has taking the nationwide university entrance exam influenced my L2 socialization both as an EFL learner and teacher in different communities of practice (CoPs) over 25 years?
2. How have I used my agency to emancipate myself from the long-term washback effects of the UEE?

With your companionship, I look for answers to these research questions throughout the chapter, hoping that my answers will help you introspect into your own beliefs/thoughts/emotions about your own language learning and test-taking experiences.

BACKGROUND LITERATURE

The education system in Turkey relies heavily on tests (Hatipoğlu, 2016). Starting from their primary education, students take many high-stakes tests to finally study in prestigious universities (Doğançay-Aktuna & Kızıltepe, 2005). In a rather competitive environment, the performances of students, teachers, and even schools are evaluated by looking at how well students have performed on previous exams (Hatipoğlu & Erçetin, 2016). As perhaps the most critical of the many high-stakes tests in Turkey, the UEE has serious consequences since its results determine who will be able to study in which degree program at which university. However, there is a common belief among practitioners, academics, students, and parents that the central standardized UEE has a negative washback effect on local ELT practices in Turkey (Hatipoğlu, 2016).

Definition of Washback Effect

Washback broadly refers to any effect that a test has on teaching and learning, be it positive or negative, intended or unintended (Alderson & Wall, 1993; Bachman & Palmer, 1996; Cheng, 2005; Cheng et al., 2004; Hung, 2012). In this study, I mainly use Watanabe's (2004) five-dimensional model (see Table 1), as it defines the washback effect thoroughly and helps me understand the UEE's particular washback effects more comprehensively.


Table 1. Watanabe’s (2004) five-dimensional model of washback Dimension

Range

Explanation

Specificity

general specific

The effect may be produced by any test. The effect is attributed to a particular test.

Intensity

strong weak

High-Stakes tests have stronger effects. Low-stakes tests have weaker effects.

Length

long-term short-term

The effect endures even after taking the test. The effect diminishes right after taking the test.

Intentionality

intended unintended

The effect was targeted by the test designers. The effect was targeted by the test designers.

Value

positive negative

The effect yields useful results. The effect leads to harmful results.

Given the dimensions in Watanabe's model in Table 1, I acknowledge that the UEE is definitely a high-stakes test with strong and long-term effects. Many scholars converge on the opinion that the more important the educational consequences a test bears, the more strongly its washback effect is felt by students and their teachers (e.g., Alderson & Wall, 1993; Gates, 1995; Green, 2007; Watanabe, 2004). Similarly, Vallette (1994) suggests that washback is especially strong when students' performance on a test determines their future. Since the score students receive from the UEE defines how close they come to their future aspirations and makes them readjust their future trajectories, the UEE certainly has strong, long-term washback effects. When it comes to intentionality, it is safe to say that the UEE yields intended results. The Student Selection and Placement Center (OSYM), the governmental institute that prepares and administers large-scale exams in Turkey, deliberately prepares the UEE in multiple-choice format so that it produces objective, credible, and efficient results. Agreeing with Roediger and Marsh (2005), who remark that the results of multiple-choice exams are far from being subjective or invalid, it is safe to assume that the structure of the UEE, in and of itself, produces the intended washback effect regarding feasibility, efficiency, and reliability. Determining whether a test results in beneficial or detrimental washback is highly context-specific (Cheng, 2005), depending on the needs, qualities, and expectations of students, their teachers and families, and the broader society. However, most washback studies conducted in Turkey found that the UEE has had far more negative washback effects than positive ones (Toksöz & Kılıçkaya, 2017), frequently because teachers teach to the test; that is, they narrow their focus to the skills and knowledge the UEE requires and ignore those it excludes. Specifically, since the UEE assesses test-takers' reading skill, grammar knowledge, and lexical base, most pre-service EFL teachers start their undergraduate studies without having sufficiently developed their speaking (especially pronunciation), listening, and writing skills, even though they receive satisfactory and above scores from the UEE.
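Read as a classification scheme, Watanabe's five dimensions lend themselves to a compact data structure. The sketch below is offered only as an illustration, not as part of Watanabe's (2004) work; the names (WashbackProfile, describe) are hypothetical, and the values simply encode the characterization of the UEE argued above.

```python
from dataclasses import dataclass

# A minimal sketch of Watanabe's (2004) five-dimensional washback model.
# Class and method names are illustrative, not from any established library.

@dataclass
class WashbackProfile:
    specificity: str     # "general" or "specific"
    intensity: str       # "strong" or "weak"
    length: str          # "long-term" or "short-term"
    intentionality: str  # "intended" or "unintended"
    value: str           # "positive" or "negative"

    def describe(self) -> str:
        # Summarize the profile along all five dimensions.
        return (f"{self.specificity}, {self.intensity}, {self.length}, "
                f"{self.intentionality}, {self.value} washback")

# The UEE as characterized in this section: a specific, high-stakes test
# whose effects endure, whose format is intended, and whose value is
# reported as largely negative in the Turkish context.
uee = WashbackProfile(
    specificity="specific",
    intensity="strong",
    length="long-term",
    intentionality="intended",
    value="negative",
)

print(uee.describe())
# -> specific, strong, long-term, intended, negative washback
```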

Washback Studies in the Context of Turkey

Research into washback in language testing and education in Turkey is a recent area of inquiry for educational linguists. Most research conducted on washback in Turkey is about high-stakes exams (Toksöz & Kılıçkaya, 2017), particularly the UEE. Among these studies are Hatipoğlu's (2016), in which she examined the long-term washback effect of the UEE on language teachers; Sayın and Aslan's (2016), which focuses on the negative effects of English language tests on pre-service ELT teachers; and others (Karabulut, 2007; Yıldırım, 2010).


These studies mostly focus on the value of the UEE as they scrutinize whether it has had a beneficial or detrimental effect on test-takers. The findings of recent studies on the UEE's washback effect in Turkey show that the UEE, on the whole, produces a negative washback effect. For instance, in her study analyzing the UEE's long-term washback effects, Hatipoğlu (2016) found that nearly all of the participants held that the UEE had an overwhelmingly negative impact on all levels of English language education in Turkey, including planning, defining, and designing the system. Also, most high school teachers provided their students with strategic help to master the UEE's format instead of teaching English in communicative ways. Similarly, the findings of Sayın and Aslan's (2016) study revealed that since the English section of the UEE focused on grammar, vocabulary, and reading only, the students strategically ignored developing their communicative skills (i.e., listening, writing, and speaking) while preparing for the exam in high school. This neglect, however, surfaced when they started studying ELT at university. Karabulut's (2007) and Yıldırım's (2010) studies, although conducted in separate settings at different times, showed similar findings regarding the UEE's negative washback in that the high schoolers who prepared for the UEE studied only reading, grammar, and vocabulary. In harmony with their students' study habits, the teachers in these studies preferred to teach to the test and prepare their students for the UEE, almost completely ignoring the productive language skills. Overall, the alignment of teachers' desires with those of their students results in stronger desires on both sides (Motha & Lin, 2014). In the same vein, English teachers' adherence to teaching to the test deepens in line with their students' desire to be taught to the test. However, in the long run, such mutual benefit may yield 'unintended' yet 'inevitable' detrimental results.

CONCEPTUAL FRAMEWORK

In this chapter, I explore the long-term washback effects of the UEE that I took in 1998 on my own thoughts/emotions/experiences first as an English learner and later as an English teacher. In doing so, my conceptual framework is guided by the amalgamation of Duff's (2007, 2012) theorization of L2 socialization in different communities of practice (CoPs) (Lave & Wenger, 1991; Wenger, 1998, 2004, 2011) and Motha and Lin's (2014) concept of desire in language learning. The long-term washback effects of my personal UEE-taking experience are manifested in my personal L2 socialization. Duff (2012, p. 564) defines L2 socialization as a process by which learners develop their language skills in linguistic communities by interacting with more experienced learners/users/members. While doing so, novices observe the existing language learning norms and performances of the old(er)-timers and practice their knowledge through interaction with other members of that CoP. Wenger (2011, p. 1) defines CoPs as "groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly." CoPs are everywhere: at home, at school, on campus, at work, and so on (Lave & Wenger, 1991). To serve the purposes of this study, the CoPs I refer to in my L2 socialization are my classrooms at high school, where I learned English before taking the UEE, and the campuses and workplaces where I have continued my academic and professional life ever since I took the UEE.


Looking back, I now realize that having been taught to the test by my teachers in my first CoP led me to think/feel/believe that this was how language teaching would "naturally" occur. After all, I was a novice, and my teachers were old(er)-timers. Who was I to question their teaching style? They surely did know how to teach a language class, which was the only CoP for me to learn English in. In later CoPs, however, I gradually understood that the dynamics of each CoP were contextual and different from those of the previous one. Complementing L2 socialization theory, Motha and Lin's (2014) multi-faceted theorization of desire illuminates how my interpersonal relationships have been gradually (re)produced and transformed within various CoPs. These scholars extend the meaning of desire from individual "wants" to social "goals." For them, communities, institutions, and nations also have desires. Arguing that "at the center of every English language learning moment lies desire" (p. 331), these authors propose a framework for theorizing desire on five interconnected levels that include the desires of language learners, communities, teachers, institutions, and states (or governments). This multi-level conceptualization of desire allowed me to explore the emotional aspect of my L2 socialization by comparing my desires to others' at different levels. To illustrate, when my desire to earn a good place at university overlapped with my teachers' desire to teach to the test and prepare me for the UEE, my L2 socialization was a smoother process. Conversely, when my desire to speak English accurately as a fresh university student clashed with my professors' desire to receive instantaneous, thoughtful, and fluent answers to their discussion questions, I forced myself into remaining silent to avoid making mistakes while speaking English.

METHODOLOGY

I utilize autoethnography to guide the methodological framework of this study. Being an established qualitative method of inquiry in educational research (Denzin & Lincoln, 2011), autoethnography has gradually gained popularity among applied linguists as "a newly introduced method of research" (Yazan, 2018, p. 6). Today, many scholars converge on the idea that autoethnography is a form of qualitative research method that allows researchers to analyze their personal experience within their sociocultural contexts (e.g., Chang, 2008; Ellis, 2004; Yazan & Keleş, 2023; Starr, 2010; Wall, 2008). To frame their autoethnographic work, autoethnographers frequently refer to the morphological constituents of the term: auto (self), ethno (culture), and graphy (narration) (e.g., Ellis, 2004; Canagarajah, 2012; Holman Jones, 2005; Keleş, 2022a; Lionnet, 1990). In other words, autoethnography is "writing about the personal and its relationship to culture" (Ellis, 2004, p. 37). To make connections between the methodology and the topic of interest, I refer to the washback effect in the sense Hughes (2003) suggests: to explore the UEE's effects not only on my own teaching and learning (auto), but also on the educational system and society as a whole (ethno), and to craft this manuscript (graphy) accordingly. As opposed to doing research 'on' the researched to understand a phenomenon, autoethnographers focus on the 'self' to make meaning out of their own lived experiences in the societies in which they live for a certain period (Keleş, 2022b; Sardabi et al., 2020). Assuming the double roles (as researcher and researched) simultaneously, autoethnographers destabilize the borders between the researchers and the researched and, hence, invert "binaries between individual/social, body/mind, emotion/reason, and lived experience/theory in academic work" (Gannon, 2006, pp. 475-476). In this study, I (as the researcher) use autoethnography to explore the long-term washback effects of taking the UEE on my later language user and teacher practices (as the participant).


Mainly, there are two approaches to writing autoethnography: evocative and analytic (Anderson, 2006). Those who write their autoethnographies in the 'evocative' genre prefer telling their stories through richly descriptive personal narratives (Ellis, 2009). They aim to find new ways of expressing emotionally charged experiences. According to Ellis (2004), good autoethnographic works consist of "thinking like an ethnographer, writing like a novelist" (p. 330). While doing so, evocative autoethnographers reject reductive academic analysis or theorization and defy the traditional 'reporting' language of academia (Ellis & Bochner, 2006). On the other hand, analytic autoethnography emphasizes the link between theory and practice through insider knowledge. Unlike evocative autoethnography, analytic autoethnography focuses on connecting personal experiences with existing research and theories, moving away from sole emotional response to reach a scholarly analysis (Cook, 2014). Although they come from different perspectives, evocative and analytic autoethnographers converge on the idea that the researcher's personal experiences play a central role in understanding the cultural practices that shape those experiences. In this paper, I incorporate an analytic lens with an evocative writing style in order to address my research questions in the best way possible: I marry the basic principles of evocative and analytic autoethnography in crafting this manuscript, which is about my intense emotions as much as it is about my deep critical thoughts and experiential beliefs. I follow Bochner and Ellis' (2016) definition of evocative autoethnography as well as their unorthodox writing style while I connect my narrative with existing theoretical concepts as suggested by Anderson (2006), the forerunner of analytic autoethnography. Incorporating evocative and analytic autoethnographies, I aim to provide an analysis of the long-term washback effects of the UEE on my academic and professional life without being confined to traditional academic prose writing. Traditional scholars follow the conventions of the third-person academic voice in their written academic discourse to enjoy author authority and turn their readers into passive recipients of information (Adams et al., 2015). Unlike them, I deliberately use the first-person voice, departing "radically from the analytic, third-person spectator voice of traditional social science prose" (Bochner & Ellis, 2016, p. 82). Refraining from assuming a 'God's eye' view, I approach this study as "an enlarged conversation" (Goodall, 2000, p. 11) with my readers – you. Doing so contributes to the conversational style I would like to achieve between you and me. Content-wise, I define this manuscript as a critical autoethnography of socialization. While the "socialization" component denotes my theoretical stance, the "critical" aspect refers to my epistemological position. To me, language learning is both a product and a process of socialization, which needs a critical analysis in order to fully grasp the power dynamics operating underneath the education system. I frame this study as a critical autoethnography since it is the study and critique of cultural norms through my own lens as a researcher/participant (Holman Jones, 2018). Critical autoethnography calls for an examination of systems, institutions, and discourses to expose social problems (Boylorn & Orbe, 2014).
The UEE itself is a critical topic of interest among educational scholars in Turkey and, hence, begs for a careful examination from multiple angles by multiple researchers in multiple contexts. From my own point of view, I approach the UEE as a social policy and practice that has had undeniable effects on my L2 socialization both as a language learner/user and a teacher. Overall, on one hand, the UEE granted me the opportunity to study at a prestigious university although I came from a lower-SES working-class family. On the other hand, it created many obstacles in my undergraduate and later my graduate studies as well as my professional life in ELT.


Memory Work as Primary Data

Personal memory is "the bricks and mortar" (Giorgio, 2013, p. 408) of most autoethnographic studies, especially the ones in which researchers focus on their life stories that encompass multiple years of experience. For this autoethnographic study, I use my memory as the primary data source to explore how taking the UEE has influenced my L2 socialization over more than 25 years as an English language learner, user, and teacher. Many scholars approach memory data with caution for they believe 'memory work' is highly subjective. For them, people experiencing the same event may recall what happened differently and thus are likely to create different 'versions' of a story (Tullis-Owen et al., 2009). I agree that my memory work is partial and subject to my self-positioning along with my instantaneous positioning of the audience (Bochner, 2007). However, I do not aim to produce 'accurate,' 'static,' 'objective,' and 'consistent' stories in this study because I am aware that no story is exempt from the interpretation of the narrator. Nevertheless, I believe "the power of narrative memory comes not from precision or accuracy but from how we [in the present] relate to our constructions and re-constructions of the past" (Hayler, 2010, p. 5). Therefore, I assert that my own memories constitute data, which are as valuable as field notes, recorded interviews, or otherwise collected information by a researcher (Winkler, 2018). That is the reason why I opted for autoethnography in the first place. Although memories play a central role in my data collection, to some extent I agree with Chang (2008), who warns against heavy reliance upon personal memory and recalling as the only data source. Therefore, ever since I decided to write this autoethnographic paper, I have kept a reflective journal to note down any memories regarding my language learning, teaching, and using history. In addition, I revisited my previous Facebook posts along with the essays I wrote as part of course assignments during my master's (2012-2013) and doctoral studies (2016-2020). However, I did not treat these data sources as quantifiable products of a 'data mining process,' but viewed them as 'memory triggering items.' That is, I revisited them with the purpose of re-remembering more memories. Once such memories resurfaced, I added them to my journal.

Data Analysis

Bearing in mind that "an aspiring new autoethnographic scholar can miss the trees for the forest" (Anderson & Glass-Coffin, 2013, p. 64), I utilized Chang's (2008, p. 72) "chronicling the past" strategy in this paper. According to Chang, chronicling the past helps autoethnographers revisit memorable events from their lived experiences in a sequential order. Cautioning against the precariousness of memory work, Chang advises autoethnographers to take a systematic and purposeful approach while collecting personal memory data that extends over many years. Following this strategy, I first produced a chronological autobiographical timeline, as reflected in the findings and discussion section below. Second, in my reflective journal, I wrote down memorable experiences in accordance with the relevant CoPs in my timeline (i.e., Mehmet Çelikel High School, Boğaziçi University, TÖMER Language School, Yıldız Technical University, and Bilkent University). To craft the findings and discussion section, I selected the critical moments that were either directly about my experiences with the UEE before and after taking it or indirectly about experiences that were affected by taking it. These critical moments were mostly emotionally charged experiences. Ellis (1999) calls this process "emotional recall," in which autoethnographers can "imagine being back in the scene emotionally and physically" (p. 211).


“imagine being back in the scene emotionally and physically” (p. 211). While collecting data, I reli(e) ved my past experiences, yet filtered them through the mindset of an educational researcher as I am now. As a final remark, I must note that I did not approach my reflective journal as a data source for producing codes or themes in the sense of traditional data analysis that seeks recurrent themes to explain a given phenomenon. Instead, I used it as a self-generated document to guide me through the selection process of my personal memories in an organized way while writing this manuscript. This selection process was rather ‘subjective’ in that it did not follow any predetermined criteria. I chose particular items from the journal while exploring any connections between the UEE and my L2 learner/user/teacher socialization. The guiding principle in this process was to understand how I had made meaning out of my experiences at the onset of happening, and how I approached these memories at the moment of writing up this manuscript. I focused on the changes in my interpretation of a memory rather than approaching each memory piece as a ‘factual’ and isolated datum.

FINDINGS

My L2 socialization has evolved in multiple phases. However, for the purposes of this study, I focus on only four of them: 1) my high school years at Mehmet Çelikel High School, where I acquired advanced grammar, vocabulary, and reading skills through formal English education, which was guided by the teaching-to-the-test practices of my English teachers, who mainly aimed to help us succeed in the UEE; 2) five years of undergraduate education at Boğaziçi University, where I experienced multiple linguistic as well as other challenges while studying English language and literature; 3) my professional life in different workplaces where I taught English mainly through "apprenticeship of observation" (Lortie, 1974); and 4) my continuous professional development settings, where I received my pedagogical formation certificate and later my MA degree in English language teaching.

Phase 1 - Mehmet Çelikel High School: English as a Means to Achieve in the UEE

I first started to learn English truly when I was 14 years old … Whenever I learned a new grammar rule, I tended to learn every detail including the so-called exceptions. I was quite good at it. I also had a significant memory, which allowed me to learn new vocabulary items without making too much effort. (From an assignment for Language Testing Course - MA TEFL, Spring, 2013)

The excerpt above speaks to my high school years, where I learned English intensively for two semesters and later prepared for central university entrance exams. I chose this excerpt as it summarizes my language learning experiences in high school. Learning English was not something I had planned for my future; yet, in my first CoP - the language classroom - I performed well. In a few months, I realized that I was good at learning English in the way it was ideologically defined in formal schooling. I understood the grammar structures without having to study for hours, retained new words without making much effort, and enjoyed constant praise from my teachers for my top performance in achievement tests. My English teachers treated me as a 'distinguished' student because I always participated actively in the classroom. I frequently raised my hand and answered questions 'correctly' almost all the time. That's right, Mr. Keleş! Well done! I completed the prep class with the highest achievement scores. Also, I was a well-behaved student, who never disappointed his teachers or the school administration with discipline issues.


In time, my good manners and active participation resulted in good relations with my teachers, which in return increased my desire to learn English even more. In the last two years of my high school life, our English teachers prepared us for the central university entrance exam through coursebooks, practice/mock tests, and grammar- and vocabulary-based worksheets, which were all designed in accordance with the UEE's format. Back then, I never questioned the way our teachers taught us English. We spent most class hours answering multiple-choice questions and discussing our answers. Our teachers' instruction relied mostly on giving us mock tests. While we were answering multiple-choice questions, they would proctor the class and check the stopwatch so that they could recreate the physical and mental conditions of the actual UEE. Once the time was over, they would give us the 'correct' answer and provide grammar instruction where necessary. Other times, we would read paragraph-long texts and look up the unknown words in the dictionary. We did not have any pronunciation practice, classroom conversations, or presentations. We seldom worked on communicative skills. Although I was not aware that the lack of communication was a problem back then, now I see that this problem was based not only on my English teachers' styles but also on contextual constraints; institution- and nation-level second language policies and practices; and the structure and administration of the nationwide UEE. For my English teachers, I was one of the brightest students, and their satisfaction with my progress made me very happy, which increased my desire to keep actively participating in class. By active participation, I mean being willing to do mechanical exercises such as 'fill in the blanks,' 'choose the correct option' (for multiple-choice questions), and 'match the word with its meaning.' Unlike many of my friends, I was good at doing these exercises. Solving grammar puzzles and memorizing new vocabulary items were never difficult for me. At that time, however, it was impossible for me to realize that I viewed English as a school subject. Being considered an 'achiever' somehow clouded my judgment and prevented me from seeing that I was unable to speak English. Looking back in time, I realize that my language learning desire and style in high school were in harmony with our teachers' desire. They were teaching to the test while we wanted to receive top scores on the test. To that end, they preferred to provide us with detailed grammar instruction using meta-language. As a result, despite being unable to speak or write in English, I knew very well what grammatical rules applied to tenses, active/passive voice, comparatives and superlatives, reported speech, articles, countable/uncountable words, and so on. I trusted my grammar knowledge so much that I could have written a grammar book! No matter how detailed, I needed such knowledge to answer the multiple-choice questions in the UEE. At the end of high school, I was lucky enough to achieve a high score on the centralized university entrance test and earned a place at one of the top-ranking universities in Turkey, Boğaziçi University, where I would pursue my undergraduate degree in English Language and Literature.

Phase 2 - Boğaziçi University: Redefining What Learning English Means for Me

I was so happy to earn a place at Boğaziçi University. It was a huge leap for me although it was a small step for humankind. […] Nevertheless, as a teenager who came from a poor family, it was difficult for me to compete with rich kids who were also very smart. […] The university was (and still is) mostly for students who came from upper or upper-middle class and who went to private schools where they acquired English as a means of communication. For them, English was not a school subject as it was for me. […]


Earning a place there was hard but continuing to study was much harder. However, I was somehow able to graduate although I had one of the lowest GPAs in the department. (From an assignment for Multicultural Education course - PhD, Spring, 2018)

The excerpt above speaks to how I felt about my enrollment at Boğaziçi University. Earning a place there has been one of my life achievements. On one hand, I was proud because my achievement was affirmed by the score I received from the UEE. I had answered all the multiple-choice English questions correctly and become one of the top one-percent students. On the other hand, I was frightened because I soon realized that my living expenses were much above my budget. Spending almost all my savings on the initial school fees on the first day of registration, my excitement turned into panic. I was worried that I would go through economic hardship while studying there. Since my father's death eight years before, my mother had been struggling to provide for me and my siblings. Obviously, she would not be able to support me financially. I soon started working as a private English tutor to afford my university education. In time, I realized that there was a huge gap in the department between students like me, who came from public high schools, and others, who came from private high schools. While those from public schools could not speak English fluently although they had solid grammar and vocabulary knowledge, those from private schools had developed 'perfect' speaking skills. The more I learned about the student profile in the department, the more marginalized I felt among them. Even now, I do not feel like I have ever fully belonged to the Boğaziçi University community, either as a student or as an alumnus. On the first day of my undergraduate studies, reality struck me hard. I had never realized that sitting in a university classroom would be very different from sitting at a high school desk. As soon as our Poetry professor came into the classroom, she started reciting a poem, "Shall I compare thee to a summer's day?", which, I would later learn, was Shakespeare's 18th sonnet. Later, I learned that the professor made this "dramatic entrance" every year to welcome new students. She continued, "Thou art more lovely and more temperate-," but suddenly stopped and asked: "Who would like to continue?" I was puzzled at the question since nobody had bought their textbooks yet, and since I could never imagine someone in the class would have heard the poem before, let alone be able to recite it. To my surprise, Seda (hereafter, all names are pseudonyms) raised her hand and continued the poem. I was even more shocked. How come? Soon after, I learned that Seda was from Robert College. There, like in most prestigious private high schools, Seda had decided to excel in English rather than social or natural sciences. She had taken extra literature and humanities classes and read literary works in English. Unlike her, the only books I had read by then were EFL coursebooks, a few graded readers, and test booklets, like many other students from public high schools. Now I believe that the professor's artistic entrance might have been deliberate – to identify those who came from Robert College, from which she herself had graduated years ago. There was no way for me and my peers from public schools to have such profound knowledge of English poetry at that time.
Feeling like a total stranger to my department's faculty members, I felt disempowered and chose to let it go rather than struggle to achieve. From the very first class onwards, I knew that I was an ordinary public-school student who could not utter a single sentence in English. Another memorable incident that made me think my English language skills were inadequate was when our Psychology professor asked us to read the first two chapters from the textbook after the first week of my first semester. Until then, I had never read any authentic books in English that were not written for English learners. Soon after I started reading the first chapter (Biological Basis of Behavior), I realized that I knew only half of the content-specific vocabulary. Despite the help of an English-Turkish dictionary, I was still unable to comprehend the information in that book.


It took me almost a whole evening to read and fully comprehend five pages. The problem was that I needed to read 65 more pages. At that time, completing one of the twelve labors of Hercules seemed much easier than finishing the reading assignment. To my surprise, one of my roommates, Cenk, a business administration student who was also taking the same course as I was, finished reading both chapters in two hours without using a dictionary. He had graduated from Tarsus American High School, another private high school, established by US missionaries in southern Turkey more than a century ago. Although he had majored in science, his English vocabulary knowledge was far more advanced than mine. He could also speak English very fluently. These two encounters with Seda and Cenk led me to believe that my previous English education had been insufficient and superficial for pursuing my higher education – after all, it had not prepared me for the challenges behind the campus gate. Although my grammar knowledge was solid – well, not as solid as Cenk's – and I had trusted my vocabulary in high school, I soon understood that I was not able to communicate in English either textually or verbally. Although I chose to study English literature, I had never read anything in English other than inauthentic EFL course materials, let alone any English literary works. Although I was competent in answering each and every one of the multiple-choice questions correctly in the university entrance exam, I had never read any texts originally written for L1 speakers of English either (Keleş, 2020). In time, I became a quiet student sitting (and sometimes sleeping) in the back of the class although I had been a distinguished student in high school. I remember one of my professors categorizing the students in her class into three groups. The first group was made up of 'active participants' who listened to her attentively, participated in classroom discussions, and always took notes. Students in the second group were usually quiet but regularly took notes. She called them the 'silent majority.' And lastly, those in the back rows were the ones who would neither speak nor take notes. She called these students 'the cool marginals.' I was somewhat content with being labeled a cool marginal. After all, it fit my physical looks, with my long hair and 'hippie-looking' outfit. Although I lived close to the campus, I spent most of my day in the quad socializing with my friends. I went to the classroom only when signing the attendance sheet was an obligation. During the lectures, I usually had my earphones on, hidden under my long straight hair, listening to 'rock' music. I kept avoiding participating in classroom discussions partly because of my 'foreign language speaking anxiety' and partly because of my dislike for our professors' favoritism. I do not remember a single moment in which a professor called out a student's name to ask about their opinions - I do not think they knew my name at all! They preferred asking questions to the whole class, and one of the 'active participants' in the front rows would answer. Had my name been called out, perhaps I would have felt more compelled to read the course assignments to prevent the feeling of shame.
One day, upon receiving no response to her question, a professor said she did not want to ask the question directly to a particular student as she found it anti-democratic among grown-ups. This tendency, however, encouraged me further to sit at the back and do nothing. Also, the professors never asked us to work in groups to discuss a topic and then present our responses as a group. Neither did they assign us any presentations for the next class. Consequently, I was never forced to 'communicate' with my peers or my professors in English. In that atmosphere, mostly my reading skills developed, as I had started reading literary works in English. To some extent, my listening skills improved as well. Yet, I was still unable to speak in English.


I remember that my first utterance in English came in one of the Mythology classes in my fifth semester. Before then, I had never participated in classroom discussions due to the high level of anxiety that derived from my fear of making mistakes. I recall saying: "No, Poseidon was Zeus' brother, not his son!" I remember this sentence crystal clearly, as it marked one of the milestones of my language use in the classroom. From then on, when possible, I would raise my hand and make short but precise comments on any given topic during lectures. Until then, I had identified as a silent (or silenced) underachiever who pretended not to care about the 'apolitical' curriculum of my department. Looking back in time, I now realize that my grammar-focused L2 education in high school, the public/private school divide fostering class-based injustice in my department among my cohort and professors, and the lack of guidance and support from professors and the university administration all made it difficult for me to socialize into campus life at Boğaziçi University. Instead of looking for solutions to issues which I thought were too overwhelming, along with my serious financial problems, I preferred to accept the 'cool marginal' identity in the first two years of my undergraduate studies. However, I was eventually able to socialize into this CoP by making friends with like-minded fellow students and bringing my personal literary interests and academic studies together, which in return helped me graduate, albeit a little later than the normal period.

Phase 3 - Workplace: Teaching English as an Apprentice of Observation

Back then, I thought my job was to teach the rules as a means of knowledge transmission in a friendly atmosphere so that my students could come to my classes willingly and learn 'accurate' English. In time, however, I earned more experience and learned a lot more about teaching through pedagogical books, my colleagues, and in-service trainings. I started to believe that learning a language meant more than learning the grammar rules. It had a purpose: to communicate. (From an assignment for Methodology Course - MA TEFL, Spring, 2013)

When I graduated, I was able to secure a teaching job at a state university's language school for (young) adult learners. Although teaching English had not been on my career agenda, I soon liked being a teacher because I noticed that it aligned with my desire to share my knowledge of the language with learners who were willing to receive it for their future prospects. Also, I was good at doing the job. For the whole three years I worked there, I received no complaints from my students. Mainly, I was teaching as I was taught. Yes, grammar instruction. Since I had not received any formal English language teacher education, I relied only and heavily on my apprenticeship of observation (Lortie, 1974), which basically meant imitating how my English teachers at high school had taught me. Even when a challenge arose, I would think of what my best English teacher, Fehmi Bey, would do, regardless of the contextual differences. I was good at giving students what they were seeking – grammar structure and vocabulary. Most of the language learners in my classes were around my age. They were university graduates for whom English proficiency meant faster employment or finding a better job than they already had. They knew that they had to learn English for their careers, but they were somehow unaware that they would have to write e-mails, talk to English speakers, prepare and present business reports, and so on. However, none of these skills were focused on during our courses, and only a few students asked us to teach these skills. The way they had been exposed to language instruction was so embedded in their teacher-centered educational life that they never questioned my language teaching practices.


What is worse, I never thought that I needed to focus on developing my students' communicative skills. I rigidly followed the coursebook and the monthly course plans. I thought that if they succeeded in completing all 12 levels (four elementary, four intermediate, and four advanced), they would automatically be able to write e-mails, talk over the phone, read "authentic" texts, and communicate in English on business trips – well, I was wrong. One day, an ex-student of mine, Harun, visited me, since we had developed a close teacher/student relationship over a year. We had lunch together, conversing about what we had been doing since he graduated. Harun was two years younger than me and had recently graduated from university with a BS in mechanical engineering when he started learning English in my classroom in my first year of teaching. Before he started looking for a job, he had decided to complete his English education. He successfully finished all of the course levels in one year. He was always very active in the classroom and did his assignments regularly. In time, Harun and I had developed a close teacher/student bond which was only slightly more formal than an elder/younger brother relationship. After he received his certificate of completion, we lost contact for almost a year until he paid me a visit that day. Harun had found a job at an international car factory in another city. Before employment, the company had given him a written English test, which he successfully passed. In his third month, his manager took him on a business trip to Indonesia for two weeks. He was mainly expected to help his manager with translation and communication with the local employees in their factory there. However, Harun was so stressed about 'having to speak English' that he fell ill (diarrhea and vomiting) for the first two days. Until then, he had never thought of going abroad and speaking in English. On the following days, he was less stressed and forced himself to speak English as well as he could. I remember him saying: "I was never ready for that. You never told us about that - not prepared us!" Although he told his story jokingly and we laughed about it, this incident made me start questioning my instructional practices. In a way, I was benefiting from the national L2 policies and institutional practices as a teacher, although I had realized that such practices did not work for me when I was a university student. Just as I was not ready for EMI at university because of the grammar-based L2 education I had received at high school, Harun was not prepared to speak English abroad. Something was wrong! How could he feel so scared of speaking the language he had 'learned' so successfully? Harun's short visit compelled me to question my teaching style, which resembled that of my high school teachers. Just as my teachers' focus on grammar and vocabulary instruction did not prepare me for my university education, my teaching style did not help Harun with the business correspondence skills he needed at work. The story that I wrote as a Facebook post in Figure 1 criticizes this teaching style.


Figure 1. A Facebook post on my personal wall that criticizes EFL education in Turkey

In Figure 1, I mock the general practices of language learning and teaching in Turkey, including mine, in a rather sarcastic Facebook post dated December 26, 2012. It humorously depicts the discrepancy between knowledge of grammar structures and actual language use. The first comment below the post shows that this issue was acknowledged by other people in Turkey.

Harun's visit made me realize that I had become a cogwheel in the very system that I had suffered from as a university student. In a way, I was trapped by the "apprenticeship of observation" (Lortie, 1975), which means that I was teaching in the way I had been taught. Looking back, I see that, just as my teachers' desire was shaped by the institutional desire of my high school administrators, who wanted to increase their percentage of graduates going on to university, my teaching desire was shaped by the language school's desire to have more students move on to the next levels by passing grammar-based achievement tests. My students resembled the high school me. They never complained about their progress as long as they moved on to the next level. On average, I had the lowest rates of repeating students. The administration was satisfied with my performance, and so were my students and I - until the day Harun paid me a visit. From then on, I started looking for new ways to improve my students' speaking skills.


The peer learning group that I set up in my third and last year at the TÖMER language school is illustrative of my efforts to change my teaching style. Since I was the old(est) timer due to the high teacher turnover, I had become the head teacher of the English department, where the majority of the ten teachers were newly minted graduates. I set up groups of three English teachers: Funda, Sema, and I would observe classes and provide feedback to the new recruits. Unlike me, Funda and Sema were ELT graduates who had started working at the language school a year after me. I asked Funda to observe my classes. After observing my morning class for two weeks, she told me that I was acting like a stand-up comedian rather than a teacher. I remember feeling proud of my role, but she then said, "You do not let them do the job. You do the job for them. Since you are funny, they listen to you. But they only listen to you." Although she sounded a little harsh, she had a point. Agreeing with her critique, I decided to change my instruction from teacher-centered to learner-centered by asking my students to work in pairs or groups instead of telling them to complete the exercises in their handouts individually. At the same time, I reduced my teacher talk.

Motha and Lin (2014) note that an introspection of our desires may be useful to evaluate critically "whose interests are being privileged in the context of our educational practice of socializing certain desires and prohibiting others" (p. 337). Ever since I started working as an English teacher, I have questioned my desires, skills, and beliefs in the professional communities in which I have taken part. Harun's visit is illustrative of my self-critique. His example was an eye-opening experience for me. That he could not speak English on a business trip to Indonesia led me to question my teaching style. From then on, I tried my best to develop myself professionally and become a more helpful teacher. Towards the end of my early career period, I may now say that I felt the urge to change my teaching style instead of maintaining the status quo. Although a great majority of my students were content with my teaching performance, I knew that they could have done better under different conditions. Therefore, I decided to find innovative ways to encourage my students to use their communicative skills even if they did not have such a conscious desire themselves.

To that end, I enrolled in a one-year pedagogical formation certificate program at Yildiz Technical University in 2006 to learn about ELT in a more formal fashion. However, it did not meet my expectations, since the courses were mostly related to general learning theories rather than second language acquisition, and the lectures mostly covered educational theories rather than classroom practices. The only ELT course the program offered was Grammar Teaching, which was not exactly what I needed. Nevertheless, I completed the certificate program. Although I was officially 'certified' to be an English teacher, I did not feel that I had become an expert teacher, since the certificate program did not equip me with the necessary knowledge of and experience in ELT. In 2012, I applied to Bilkent University's MA TEFL program, designed for in-service English teachers from universities in Turkey and supported by Fulbright. Looking back, I realize that my motivation to enroll in this program was to seek empowerment in my professional life. My participation in the one-year intensive MA TEFL program turned out to be one of the milestones in my life as an English language learner, teacher, and user.

Phase 4 – Bilkent University – Breaking Away from Traditional EFL Instruction

Being the actor of a one-person show might be tempting but comes with the price of a passive audience whose job is merely to applaud you at the end of the show. Learning a language, however, requires more than reception. Therefore, as a teacher, being an extra in an interactive play contributes much more to the future performance of my students, especially when they are required to perform in their own shows in the real world's theatre halls. (From an assignment for Curriculum Development course - MA TEFL, Spring, 2013)

The excerpt above explains why I joined the MA TEFL program at Bilkent University in 2012 - to improve my teaching skills. Looking back, I realize my studies there developed not only my teaching skills but also my English proficiency to a great extent. I was in a class with nine English instructors who came from different universities across Turkey. Since admission to this program was merit-based, they were all good at their jobs. Only three of them were non-ELT graduates like me. Two of them had been born in Australia, and they spoke English more fluently than the rest of the cohort. In time, we developed a close relationship as a group. We had two professors: Dr. Kelly was a US citizen who had been living in Turkey for years, and Dr. Kaya, who was the same age as me, had recently received her PhD degree in the US.

One of the objectives of the 100% EMI program was to improve our English language skills. To that end, we were required to speak English with our professors in and out of the classroom. Unlike in my bachelor's studies, our professors took extra care to have us work in pairs or in groups with different classmates for our classroom activities, course assignments, and online discussions. Since almost all of us lived in student dorms, we developed close friendships over time. We shared our thoughts, beliefs, and aspirations regarding our private, academic, and professional lives both in and out of the classroom. In this CoP, made up of new members who were willing to learn from the more experienced ones, we received great assistance from our professors and from each other, supporting one another both emotionally and academically. Throughout the program, we provided feedback on each other's written work. That the program was designed to encourage peer work overlapped with our desire to make friends with like-minded colleagues. To date, we have all maintained our relationships, often asking each other for opinions, information, and advice.

I learned a great deal from my peers and my professors about learning and teaching English during the classes, while at the same time I developed my English proficiency. The program enhanced my writing skills via written assignments, reflective essays, blog posts, and thesis writing. Also, the 'English-only' policy enhanced my English speaking skills to a great extent through in-class discussions, peer and group work, and presentations. Until then, I had always felt anxious about speaking English. There, I overcame this fear - except when I have to give a presentation at a conference, which remains a challenge for me.

Looking back, Bilkent University has had an important role in my L2 socialization. Before, I used to avoid speaking English with my colleagues during formal meetings. I would even refrain from asking teacher educators questions during in-service trainings, workshops, and seminars. However, when I returned to YTU, I participated in such events more actively. Also, I had learned the meaning of ELT-related terminology, which allowed me to express my beliefs, opinions, and concerns more efficiently.

While describing a CoP in detail, Wenger (2004) theorizes domain (i.e., the area of knowledge that brings the community together), community (i.e., the group of people for whom the domain is relevant), and practice (i.e., the body of knowledge, methods, tools, stories, and documents that members mutually develop) as its important components. The success of a CoP pertains to the quality of the relationships among members, be they old(er) timers or newcomers. In a short time, my cohort and I all socialized into our CoP at Bilkent University, since we were all interested in developing our knowledge (practice) in the field of ELT (domain) with each other (community). Our professors enhanced our practice through pair and group work, which I had never experienced before. We provided feedback on each other's theses, did our assignments in collaboration, and contributed to classroom discussions respectfully. As a result, I learned so much about my profession, built a network of close friends/colleagues, and produced a master's thesis in nine months. Since almost all our interactions were in English, we developed our English communication skills as well.

DISCUSSION

In this autoethnographic study, I turned the spotlight on myself and explored how my UEE experiences have affected me and my teaching style, and what actions I have taken to alleviate the exam's negative long-term washback effects. Overall, the findings corroborated the existing washback studies in the context of Turkey, which found that taking the UEE had a negative washback effect on high school students' communicative and productive skills (Hatipoğlu, 2016; Karabulut, 2007; Sayın & Aslan, 2016; Yıldırım, 2010). Being a student at Boğaziçi University, where my cohort was made up of students coming mostly from private high schools, I was one of the few who were unable to speak English to communicate with their peers and professors, and who chose to auto-silence themselves to avoid, delay, and even overcome academic challenges. Given that my professors did not acknowledge this washback effect, which mostly influenced students who came from public high schools, especially those in small cities of Anatolia, the problem grew bigger and bigger, to the point that it led some students (including me) to fall behind their peers and even to drop out of school. I was fortunate enough (or had the power to struggle) to be able to graduate despite all the social, cultural, linguistic, and economic hardships that I had to go through.

On the other hand, owing to the fact that the UEE assessed only grammar and vocabulary knowledge along with reading skill, I developed substantial English language knowledge while preparing for it – like the participants in Külekçi's (2016) study. Acquiring such profound grammar and vocabulary knowledge is the only positive washback effect previously discussed in the literature. However, the UEE has another positive washback effect, one that is understudied although it has a deeper, longer-lasting effect on test takers, which I call the social justice effect. High-stakes exams are used to place students into higher education institutions, evaluate the quality of education, and control nepotism in the allocation of scarce opportunities (Al-Jamal & Ghadi, 2008; Cheng, 2005; Ghorbani, 2012). Looking back, I realize that it was the scope and the format of the UEE that allowed me to achieve such a high score, which earned me the right to study at one of the most prestigious universities. If the UEE had not existed, and if universities had required applicants to submit a score report from an international language test such as TOEFL or IELTS, or if they had administered their own proficiency tests as some of them do after students' enrollment, I strongly believe that I would not have been as successful on such tests as I actually was on the UEE. Perhaps my teachers would have had different teaching styles and focused more on our fluency rather than accuracy. However, I do not believe that it would have sufficed to be accepted by Boğaziçi University, given that there would have been so many applicants from the private schools in İstanbul. The students in these schools, since they had been exposed to English longer than me, would have been more fluent than me, just as most of my actual cohort in my department was. In brief, I believe the UEE has a more positive washback effect on students with limited social, linguistic, and financial resources than on those who are more affluent.
Therefore, I believe the UEE serves as a tool for ensuring equality of opportunity, and its abolishment may harm social justice in Turkey, which is already fragmented and damaged by the public vs. private school divide.


Another positive washback effect of the UEE is that I was regarded as a good teacher in my professional life, since I frequently benefited from the remarkable amount of grammar and vocabulary knowledge I had acquired while preparing for the UEE. This knowledge also served as a tool to conceal my weakness in and lack of knowledge of SLA theories, EFL teaching methodologies, and classroom management. Although it worked for me, I realized after a while that it did not actually work for my students – Harun being one of them – who needed to develop their communicative skills even though they were not aware of that. Instead of maintaining the status quo, I decided to leave my comfort zone and change my teaching style so that I could turn my classrooms into settings where I would encourage my students to engage in communicative practice rather than accumulate grammar knowledge. Doing so, I would transform my teaching style from "filling in the blanks" to "feelin' the blanks" (Keleş & Yazan, 2022).

In closing, I must say that I have opened myself up to you by giving you so many details about my L2 socialization. Overall, however, it felt like therapy for me – talking to like-minded people. It also allowed me to systematically review my past memories, a joy quite similar to that of tidying up my house. At this point, I hope that my story has been a vantage point for future researchers - you - to revisit your memories of taking the UEE (or any other centralized nationwide exam) and revise its role in your current practice.

ACKNOWLEDGMENT

I am grateful for having been a student of Mr. Bahri Yıldız, who has been more than an English teacher for me and who recognized my potential much earlier than I did. He has been one of the most influential people in my life, changing the course of my future significantly. This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

REFERENCES

Adams, T. E., Holman Jones, S., & Ellis, C. (2015). Autoethnography. Oxford University Press.

Akın, G. (2016). Evaluation of national foreign language test in Turkey. Asian Journal of Educational Research, 4(3), 11–21.

Al-Jamal, D., & Ghadi, N. (2008). English language general secondary certificate examination washback in Jordan. Asian EFL Journal, 10(3), 158–186.

Alderson, J. C., & Wall, D. (1993). Does washback exist? Applied Linguistics, 14(2), 115–129. doi:10.1093/applin/14.2.115

Anderson, L. (2006). Analytic autoethnography. Journal of Contemporary Ethnography, 35(4), 373–395. doi:10.1177/0891241605280449

Anderson, L., & Glass-Coffin, B. (2013). I learn by going: Autoethnographic modes of inquiry. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 57–83). Left Coast Press.

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests (Vol. 1). Oxford University Press. doi:10.2307/328718


Bochner, A. P. (2007). Notes toward an ethics of memory in autoethnographic inquiry. In N. K. Denzin & M. D. Giardina (Eds.), Ethical futures in qualitative research: Decolonizing the politics of knowledge (pp. 197–208). Left Coast Press.

Bochner, A. P., & Ellis, C. (2016). Evocative autoethnography: Writing stories and telling lives. Routledge. doi:10.4324/9781315545417

Boylorn, R. M., & Orbe, M. P. (Eds.). (2014). Critical autoethnography: Intersecting cultural identities in everyday life. Routledge.

Canagarajah, A. S. (2012). Teacher development in a global profession: An autoethnography. TESOL Quarterly, 46(2), 258–279. doi:10.1002/tesq.18

Chang, H. (2008). Autoethnography as method. Left Coast Press.

Cheng, L. (2005). Changing language teaching through language testing: A washback study (Vol. 21). Cambridge University Press.

Cheng, L., & Curtis, A. (2004). Washback or backwash: A review of the impact of testing on teaching and learning. In L. Cheng, Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 3–17). Routledge. doi:10.4324/9781410609731-9

Cook, P. (2014). To actually be sociological: Autoethnography as an assessment and learning tool. Journal of Sociology, 50(3), 269–282. doi:10.1177/1440783312451780

Denzin, N. K., & Lincoln, Y. S. (Eds.). (2011). The SAGE handbook of qualitative research. SAGE.

Doğançay‐Aktuna, S., & Kızıltepe, Z. (2005). English in Turkey. World Englishes, 24(2), 253–265. doi:10.1111/j.1467-971X.2005.00408.x

Douglas, K., & Carless, D. (2013). A history of autoethnography. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 84–106). Left Coast Press.

Duff, P. (2007). Second language socialization as sociocultural theory: Insights and issues. Language Teaching, 40(4), 309–319. doi:10.1017/S0261444807004508

Duff, P. (2012). Second language socialization. In A. Duranti, E. Ochs, & B. Schieffelin (Eds.), Handbook of language socialization (pp. 564–586). Wiley-Blackwell.

Ellis, C. (1999). Heartful autoethnography. Qualitative Health Research, 9(5), 669–683. doi:10.1177/104973299129122153

Ellis, C. (2004). The ethnographic I: A methodological novel about autoethnography. AltaMira Press.

Ellis, C. (2009). Fighting back or moving on: An autoethnographic response to critics. International Review of Qualitative Research, 2(3), 371–378. doi:10.1525/irqr.2009.2.3.371

Ellis, C., & Bochner, A. P. (2006). Analyzing analytic autoethnography: An autopsy. Journal of Contemporary Ethnography, 35(4), 429–449. doi:10.1177/0891241606286979

Erdoğan, P., & Savaş, P. (2022). Investigating the selection process for initial English teacher education: Turkey. Teaching and Teacher Education, 110, 1–18. doi:10.1016/j.tate.2021.103581


Gannon, S. (2006). The (im)possibilities of writing the self-writing: French poststructural theory and autoethnography. Cultural Studies ↔ Critical Methodologies, 6(4), 474–495. doi:10.1177/1532708605285734

Gates, S. (1995). Exploiting washback from standardized tests. In J. D. Brown & S. O. Yamashita (Eds.), Language testing in Japan (pp. 107–112). Japan Association for Language Teaching.

Ghorbani, M. (2012). The washback effect of the university entrance examination on Iranian English teachers' curricular planning and instruction. Iranian EFL Journal, 2, 60–87.

Giorgio, G. (2013). Reflections on writing through memory in autoethnography. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 406–424). Left Coast Press.

Goodall, H. L. (2000). Writing the new ethnography. AltaMira Press.

Green, A. (2007). IELTS washback in context: Preparation for academic writing in higher education (Vol. 25). Cambridge University Press.

Hatipoğlu, Ç. (2016). The impact of the university entrance exam on EFL education in Turkey: Pre-service English language teachers' perspective. Procedia: Social and Behavioral Sciences, 232, 136–144. doi:10.1016/j.sbspro.2016.10.038

Hatipoğlu, Ç., & Erçetin, G. (2016). Türkiye'de yabancı dilde ölçme ve değerlendirme eğitiminin dünü, bugünü ve yarını [The past, present, and future of foreign language testing and evaluation education in Turkey]. In Proceedings of the third national conference on language education (pp. 72–89). Academic Press.

Hayler, M. (2010). Autoethnography: Making memory methodology. Research in Education, 3(1), 5–9.

Holman Jones, S. (2005). Autoethnography: Making the personal political. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (3rd ed., pp. 763–791). SAGE.

Holman Jones, S. (2018). Creative selves / creative cultures: Critical autoethnography, performance, and pedagogy (Creativity, education and the arts). Palgrave Macmillan. doi:10.1007/978-3-319-47527-1

Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge University Press.

Hung, S. T. A. (2012). A washback study on e-portfolio assessment in an English as a foreign language teacher preparation program. Computer Assisted Language Learning, 25(1), 21–36. doi:10.1080/09588221.2010.551756

Karabulut, A. (2007). Micro level impacts of foreign language test (university entrance examination) in Turkey: A washback study [MA thesis]. Available from ProQuest Dissertations & Theses Global. (304856856)

Kasap, S. (2019). Akademisyenlerin gözünden Türkiye'deki İngilizce eğitimi [English language education in Turkey from the perspective of academics]. Yüzüncü Yıl Üniversitesi Eğitim Fakültesi Dergisi, 16(1), 1032–1053. doi:10.23891/efdyyu.2019.152

Keleş, U. (2020). My language learning, using, and researching stories: Critical autoethnography of socialization (Publication No. 28154180) [Doctoral dissertation, The University of Alabama, Tuscaloosa]. ProQuest Dissertations and Theses Global.


Keleş, U. (2022a). In an effort to write a "good" autoethnography in qualitative educational research: A modest proposal. The Qualitative Report, 27(9), 2026–2046. doi:10.46743/2160-3715/2022.5662

Keleş, U. (2022b). Autoethnography as a recent methodology in applied linguistics: A methodological review. The Qualitative Report, 27(2), 448–474. doi:10.46743/2160-3715/2022.5131

Keleş, U. (2023). Exploring my in-betweenness as a growing transnational scholar through poetic autoethnography. In L. J. Pentón Herrera, E. Trinh, & B. Yazan (Eds.), Doctoral students' identities and emotional wellbeing in applied linguistics: Autoethnographic accounts. Routledge. doi:10.4324/9781003305934-7

Keleş, U., & Yazan, B. (2022). "Fill in the blanks" vs. "feelin' the blanks": Communicative language teaching and ELT coursebooks. In H. Celik & S. Celik (Eds.), Coursebook evaluation in English language teaching (ELT) (pp. 161–186). Vizetek.

Kırkgöz, Y. (2008). A case study of teachers' implementation of curriculum innovation in English language teaching in Turkish primary education. Teaching and Teacher Education, 24(7), 1859–1875. doi:10.1016/j.tate.2008.02.007

Külekçi, E. (2016). A concise analysis of the foreign language examination (YDS) in Turkey and its possible washback effects. International Online Journal of Education & Teaching, 3(4), 303–315.

Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press. doi:10.1017/CBO9780511815355

Lionnet, F. (1990). Auto-ethnography: The an-archic style of Dust Tracks on a Road. In H. L. Gates (Ed.), Reading black, reading feminist (pp. 382–413). Meridian.

Lortie, D. (1975). Schoolteacher: A sociological study. University of Chicago Press.

Motha, S., & Lin, A. (2014). "Non-coercive rearrangements": Theorizing desire in TESOL. TESOL Quarterly, 48(2), 331–359. doi:10.1002/tesq.126

Roediger, H. L., III, & Marsh, E. J. (2005). The positive and negative consequences of multiple-choice testing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(5), 1155–1159. doi:10.1037/0278-7393.31.5.1155 PMID:16248758

Sardabi, N., Mansouri, B., & Behzadpoor, F. (2020). Autoethnography in TESOL. In J. Liontas (Ed.), The TESOL encyclopedia of English language teaching (pp. 1–6). Wiley & Sons.

Sayın, B. A., & Aslan, M. M. (2016). The negative effects of undergraduate placement examination of English (LYS-5) on ELT students in Turkey. Participatory Educational Research, 3(1), 30–39. doi:10.17275/per.16.02.3.1

Shohamy, E., Donitsa-Schmidt, S., & Ferman, I. (1996). Test impact revisited: Washback effect over time. Language Testing, 13(3), 298–317. doi:10.1177/026553229601300305

Starr, L. J. (2010). The use of autoethnography in educational research: Locating who we are in what we do. Canadian Journal for New Scholars in Education, 1(3), 1–9.


Toksöz, I., & Kılıçkaya, F. (2018). Review of journal articles on washback in language testing in Turkey (2010-2017). Lublin Studies in Modern Languages and Literature, 41(2), 184. doi:10.17951/lsmll.2017.41.2.184

Tullis-Owen, J. A., McRae, C., Adams, T. E., & Vitale, A. (2009). Truth troubles. Qualitative Inquiry, 15(1), 178–200. doi:10.1177/1077800408318316

Valette, R. M. (1994). Teaching, testing, and assessment: Conceptualizing the relationship. National Textbook Company.

Wall, S. (2008). Easier said than done: Writing an autoethnography. International Journal of Qualitative Methods, 7(1), 38–53. doi:10.1177/160940690800700103

Watanabe, Y. (2004). Methodology in washback studies. In L. Cheng, Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 19–36). Lawrence Erlbaum Associates.

Wenger, E. (1998). Communities of practice: Learning, meaning, and identity. Cambridge University Press. doi:10.1017/CBO9780511803932

Wenger, E. (2004). Knowledge management as a doughnut: Shaping your knowledge strategy through communities of practice. Ivey Business Journal, 68(3).

Wenger, E. (2011). Communities of practice: A brief introduction. Retrieved from https://scholarsbank.uoregon.edu/xmlui/bitstream/handle/1794/11736/A%20brief%20intoduction%20to%20CoP.pdf

Winkler, I. (2018). Doing autoethnography: Facing challenges, taking choices, accepting responsibilities. Qualitative Inquiry, 24(4), 236–247. doi:10.1177/1077800417728956

Yaman, İ. (2018). Türkiye'de İngilizce öğrenmek: Zorluklar ve fırsatlar [Learning English in Turkey: Challenges and opportunities]. RumeliDE Dil ve Edebiyat Araştırmaları Dergisi, 11, 161–175. doi:10.29000/rumelide.417491

Yazan, B. (2018). TESL teacher educators' professional self-development, identity, and agency. TESL Canada Journal, 35(2), 140–155. doi:10.18806/tesl.v35i2.1294

Yazan, B., & Keleş, U. (2022). A snippet of an ongoing narrative: A non-linear, fragmented, and unorthodox autoethnographic conversation. Applied Linguistics Inquiry, 1(1). Advance online publication. doi:10.22077/ALI.2022.5561.1003

Yıldırım, Ö. (2010). Washback effects of a high-stakes university entrance exam: Effects of the English section of the university entrance exam on future language teachers in Turkey. The Asian EFL Journal Quarterly, 12(2), 92–116.


ADDITIONAL READING

Boylorn, R. M., & Orbe, M. P. (Eds.). (2020). Critical autoethnography: Intersecting cultural identities in everyday life. Routledge. doi:10.4324/9780429330544

Cheng, L. E., Watanabe, Y. E., & Curtis, A. E. (2004). Washback in language testing: Research contexts and methods. Lawrence Erlbaum Associates Publishers. doi:10.4324/9781410609731

Green, A. (2013). Washback in language assessment. International Journal of English Studies, 13(2), 39–51. doi:10.6018/ijes.13.2.185891

Pan, Y. C. (2008). A critical review of five language washback studies from 1995-2007: Methodological considerations. JALT Testing & Evaluation SIG Newsletter, 12(2), 2–16.

Xu, Q., & Liu, J. (2018). A study on the washback effects of the test for English majors (TEM). Springer. doi:10.1007/978-981-13-1963-1

KEY TERMS AND DEFINITIONS

Accuracy vs. Fluency: Accuracy refers to the ability to use the necessary vocabulary, grammar, and punctuation correctly in a given language. Fluency, on the other hand, denotes the flow and efficiency with which people express their ideas, particularly when speaking. While the focus of language accuracy is on avoiding grammar, spelling, pronunciation, and/or vocabulary mistakes, fluency centers on using communicative skills effectively.

Analytic vs. Evocative Autoethnography: While evocative autoethnography has a free-form writing style that relies on emotions to connect with the audience, analytic autoethnography takes a more traditional ethnographic stance to avoid obscuring the compatibility of autoethnographic works with traditional ethnographic practices.

Apprenticeship of Observation: A concept that describes what students learn about teaching by observing their teachers during the thousands of hours they spend in classrooms before entering teacher education programs.

Autoethnography: Simply put, a qualitative research methodology that draws on and analyzes or interprets the researcher/participant's lived experience and connects their beliefs/thoughts/emotions to sociocultural items, structures, and resources.

Communities of Practice (CoPs): Groups of people who share a common concern, a set of problems, or an interest in a topic and who come together to fulfill both individual and group goals, sharing best practices and creating new knowledge to advance a domain of professional practice through social interaction and collaboration in which newcomers learn from old(er) timers.

Critical Autoethnography: A form of autoethnography that seeks to describe and systematically analyze personal experience in an effort to make invisible power dynamics visible, denaturalize them, and transform society into a better version of itself.

High-Stakes Test: A test with important consequences for the test taker, such as progressing to the next level, stage, or grade; receiving a certificate, diploma, or scholarship; entering professional or academic life; or becoming licensed to practice a profession.


L2 Socialization: A process by which second (or foreign) language learners gradually acquire competence in that language, first by seeking peripheral membership and then by maintaining legitimate membership in a community where that language is spoken.

Teaching to the Test: A colloquial term for any method of education whose curriculum is heavily focused on preparing students for a standardized test.

Washback Effect: A term, also known as backwash, that refers to the impact of testing on curriculum design, teaching practices, and learning behaviors.

ENDNOTE

1. My reference to the UEE is limited only to the English language subtest that foreign language students are responsible for answering on the test day.

Chapter 10

Dynamic Assessment in an Inclusive Pre-K FLEX Program Within Universal Design for Learning (UDL) Framework

Hilal Peker
https://orcid.org/0000-0002-2642-3015
University of Central Florida, USA

ABSTRACT

This chapter discusses how dynamic assessment (DA) is utilized for both instruction and assessment within the universal design for learning (UDL) framework to support the inclusive education of young learners with special needs in a program offering French as a foreign language (FFL). The author focuses on incorporating DA in order to better understand student learning in this inclusive, pre-kindergarten FFL program. There have been some studies conducted with foreign language programs at the elementary level and higher, or with typical young learners in English as a second or foreign language settings; however, there are not enough studies focusing on foreign language programs with special needs students (SNSs), because these programs are not often available to many SNSs due to the practice of exemption. Thus, it is crucial to use DA as a tool for both instruction and assessment in order to understand SNSs' needs and learning gains. In this chapter, DA is examined and implications for inclusive education are provided.

INTRODUCTION

In this chapter, the author discusses how Dynamic Assessment (DA) is utilized for both instruction and assessment within the Universal Design for Learning (UDL) framework to support the inclusive education of young learners with special needs. The author will refer to a previously published pilot study that was conducted with the same participants in order to discuss the quantitative results, as there are no other studies that have examined the application of DA in inclusive classrooms into which both typically developing children and children with special needs are integrated (Regalla & Peker, 2017). However, the purpose of this chapter is to emphasize the usage of DA and the learning of pre-kindergarten students who participated in a program offering French as a foreign language (FFL). This FFL program is offered to all pre-kindergarten students enrolled in a unique charter school. In this school, approximately 50% of the population consists of students with special needs (SNSs) who are integrated into classrooms with typically developing students (TDSs). An earlier pilot study (Regalla & Peker, 2017) conducted on this program indicated that SNSs could participate successfully in foreign language classes when appropriate methods, such as multimodal instruction within UDL, are used. However, the findings of this study also indicated that traditional assessments were not appropriate for measuring the learning of all students (i.e., both typical students and students with disabilities) (Regalla & Peker, 2017). Thus, in this chapter, the author focuses on how to incorporate DA into the inclusive pre-kindergarten FFL program in order to better understand student learning.

Little research exists in this area because much of the previous work in DA has been conducted with foreign language programs at the elementary level and higher, or with young learners in an English as a second or foreign language setting. Furthermore, very few early childhood foreign language programs exist (Regalla & Peker, 2015), and only about 25% of U.S. schools offer foreign language at the elementary level (Pufahl & Rhodes, 2011). In addition, foreign language programs are not often available to many SNSs because of the practice of exempting them from foreign language requirements (Aindriu, 2022; Selvachandran et al., 2020; Wight, 2015). Considering that "there is no 'one size fits all' approach to teaching students with special needs" (Regalla et al., 2017), each student should be assessed on an individual basis, and one of the best assessment types for achieving this is DA.

BACKGROUND

Assessment and evaluation play a crucial role in foreign language classrooms, as teachers are constantly evaluating progress in their students' language skills both formally and informally. Foreign language teachers make use of both formative and summative assessments. Formative assessments are part of the daily instructional routine. Teachers use formative assessments during instruction to informally evaluate students' language development, to plan subsequent instruction, and to provide useful feedback to students on their performance (Brown, 2010). On the other hand, summative assessment is defined as "a method of measuring a person's ability, knowledge, or performance in a given domain" (Brown, 2010, p. 3). Summative assessment often takes place at the end of a unit of instruction, at a time specifically designated for testing. Summative assessments are frequently formal assessments, with feedback normally consisting of a letter grade or score that can determine a student's progress to the next level. Because formative and summative assessments have very different goals, they are usually conducted independently of one another.

DA is a type of assessment in which both teaching and learning take place as a whole activity, in harmony with mediation between a learner and more capable peers or a teacher (Poehner, 2008). DA is based upon the framework of sociocultural theory and Vygotsky's Zone of Proximal Development (ZPD), which refers to the difference between what the learner can achieve alone and what the learner can accomplish in collaboration with the assistance of more capable peers (Vygotsky, 1978). Vygotsky calls this assisted performance or activity mediation; however, in some resources this term is used interchangeably with "scaffolding" (Donato, 2000; Gibbons, 2003; Hogan & Pressley, 1997; Lantolf, 2000; Mantero, 2002; Takahashi, 1998; Wood, Bruner, & Ross, 1976). According to the ZPD, scaffolding is used by the more capable peer to move learners beyond their actual level of performance, the one the learner is able to achieve alone, to the next level. Regarding foreign language teaching and learning, the use of scaffolding and DA has been studied from a sociocultural perspective as it contributes to learner development (Aljaafreh & Lantolf, 1994; Anton, 2003; Brooks & Donato, 1994; Davin, 2013; Davin & Donato, 2013; Hasson, 2018; Shrestha, 2020; Swain & Lapkin, 2000).

A unique feature of DA that differentiates it from other types of traditional, summative assessments is that the assessor can actually provide assistance to the learner while conducting the assessment. The assistance that occurs during assessment not only gives the assessor a clear idea of what the learner can do independently, but it also provides information about the student's need for assistance during instruction (Poehner, 2008). However, too much assistance does not benefit learner development. The assistance provided by foreign language teachers as they scaffold students should remain within the ZPD of the learner and move the learner just beyond what he/she can do independently (Vygotsky, 1978; Wertsch, 1979). Based on Poehner's (2008) explanation, scaffolding should be initiated as implicit assistance and then slowly turned into more explicit assistance. To exemplify, an assessor or a teacher may opt for pausing and looking at the student(s) to indicate an error in a student response so that the students can identify the error by themselves. If the student(s) do not provide an answer, the assessor may then repeat the question and provide additional implicit assistance. They may even provide explicit assistance in the form of an additional explanation in the students' first language (Aljaafreh & Lantolf, 1994).

According to sociocultural theory, knowledge is constructed through social interaction (Vygotsky, 1978; Wertsch, 1985), using language as the primary tool for the mediation of social interactions (Vygotsky, 1986). A "more capable peer" or a teacher/assessor can use this primary tool to provide the learner with scaffolding so that the learner can move beyond the current or actual level of performance (Vygotsky, 1986). This way, teachers can adjust their speech/responses based on students' current level as a means of assistance in order to guide classroom interactions within their students' ZPD (Davin, 2013). In DA, the emphasis on scaffolding is as much on learner development as on assessing the learner's level. One type of DA is called interactionist; according to interactionist DA, scaffolding occurs between the assessor and the learner as a continuously evolving interaction (Poehner, 2008). For instance, Anton (2003) used interactionist DA to assess oral proficiency at the university level, and the results indicated that DA provided a more accurate understanding of students' proficiency levels. Furthermore, scaffolding in DA can be scripted or flexible. When it is scripted, teachers have pre-planned prompts that are usually designed around learner needs or based on common learner errors (Lantolf & Poehner, 2004). In addition, Davin and Donato (2013) mentioned that scripted prompts could be given a value to standardize them and measure student development.
On the other hand, flexible mediation used in DA may include prompts specifically targeted to the individual learner rather than following a series of pre-scripted prompts. In a study of an elementary Spanish program, for instance, Davin (2013) used both flexible and scripted DA as she investigated the integration of DA and instructional conversations (IC). IC is a discussion-based lesson format that helps learners to expand upon their ideas through language (Cazden, 2001), thus providing teachers with more opportunities for scaffolding within a learner's ZPD (Tharp & Gallimore, 1991). In Davin's (2013) study, the teacher used a combination of pre-scripted DA and IC as flexible mediation to assess learners' less predictable errors. She found that the use of both scripted DA and IC aided in meeting overall instructional goals and produced learning gains for individual students.
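To make the scripted prompt continuum concrete, the short sketch below models a graduated-prompt DA item of the kind described above, in which each successively more explicit mediation move lowers the score awarded. It is a minimal illustration only: the prompt wordings, the five score values, and the function names are assumptions, not items drawn from the instruments of the studies cited here.

```python
# A minimal sketch of a scripted DA item with graduated prompts, ordered
# from most implicit to most explicit mediation. Prompt wordings and score
# values are illustrative assumptions.

SCRIPTED_PROMPTS = [
    (None, 5),                                         # unprompted attempt
    ("pause and look expectantly at the student", 4),
    ("repeat the question", 3),
    ("offer a choice between two answers", 2),
    ("explain the question in the learner's L1", 1),   # most explicit
]

def administer_item(question, learner_succeeds):
    """Walk the prompt hierarchy until the learner answers correctly.

    `learner_succeeds(level)` simulates the learner's response after the
    mediation given at that level. Higher returned scores mean that less
    mediation was needed, so the score indexes the learner's ZPD.
    """
    for level, (prompt, score) in enumerate(SCRIPTED_PROMPTS):
        if prompt is not None:
            print(f"Mediation: {prompt} -> {question!r}")
        if learner_succeeds(level):
            return score
    return 0  # no correct response even with the most explicit mediation

# Example: a learner who needs the question repeated (level 2) scores 3.
print(administer_item("Quelle couleur est-ce ?", lambda level: level >= 2))
```

Because the resulting score records how much mediation was needed rather than a simple right/wrong judgment, it captures developing abilities that a static test score would miss.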


In another vein, students who use a language other than English have sometimes been misidentified, and these students are labeled as "language learning disabled" (Rosa-Lugo, Mihai, & Nutta, 2012; Roseberry-McKibben, 2008). Because of the over-identification of these students' language impairments, or due to the mislabeling of a language difficulty as a disability (Rosa-Lugo et al., 2012), DA may provide an alternative assessment framework to traditional standardized testing (Peña, 2000). Considering such populations, DA could be considered a fair assessment; indeed, Gutiérrez-Clellen and Peña (2001) emphasized that DA was found to be more fair and valid in assessing learners' language skills, mainly thanks to the opportunities that DA exposes students to (e.g., prompting and probing). In addition, Peña et al. (2006) indicated that mediated interventions helped all students make learning gains; however, typically developing students showed greater gains compared to students with language impairments. Moreover, Hasson et al. (2013) created the Dynamic Assessment of Preschoolers' Proficiency in Learning English (DAPPLE) and assessed student success regarding phonology, vocabulary, and syntax. The tasks they used were interactive, with multiple prompts and cues. The results indicated that DAPPLE could differentiate typically developing students from bilingual students with language impairments (Camilleri et al., 2014). In addition, according to previous literature (Laing & Kamhi, 2003; Peña et al., 1992; Rosa-Lugo et al., 2012), DA was found to be an effective tool for measuring the skills and needs of SNSs through a test-teach-retest continuum.

Moreover, DA is linked to Response to Intervention (RTI), which is often used for students with special needs (Grigorenko, 2009; Lidz & Peña, 2009). RTI is defined by Grigorenko as "attempts to find the best way to educate children by taking into account patterns of response and adjusting pedagogical strategies depending upon these responses" (2009, p. 114). Similar to DA, in RTI, teachers move from less intense interventions to higher levels, and assessment and intervention are on the same continuum (Berkeley et al., 2009; Fuchs et al., 2003; Gresham, 2004; Mellard et al., 2010). As in RTI, teachers use student output to inform themselves about the next stage, to understand student needs, and to provide support (Grigorenko, 2009; Lidz & Peña, 2009; Wixson & Valencia, 2011). A study by Spector (1992) found that children who benefited from the prompts and cues provided during a DA showed increased word recognition abilities throughout the course of the study.

UNIVERSAL DESIGN FOR LEARNING (UDL) AND SPECIAL NEEDS STUDENTS

Being the most prominent inclusive education policy in the U.S., the Individuals with Disabilities Education Act (IDEA, 2004) promotes RTI, which enables all learners, regardless of their disabilities, to obtain high-quality instruction as well as more intensive and structured intervention for their academic and behavioral success. Besides this policy, in 2015 the U.S. passed the Every Student Succeeds Act (ESSA), which "promotes the use of UDL in instruction and assessment (P.L. 114–95) to help meet the high academic standards set by the policymakers" (Scott et al., 2022, p. 334). Similar to RTI, UDL provides a comprehensive system based on best practices and evidence-based strategies that lead to the improvement of students' problem-solving skills. Compared to RTI, however, UDL utilizes modern technology in achieving the common purpose (i.e., high-quality and structured intervention).

UDL, inspired by the Universal Design (UD) principles (Rose & Meyer, 2002), is an instructional approach in which teachers consider diverse learners' needs in designing instruction instead of providing only adjustments or modifications for individual SNSs (Kurth, 2013; Pisha & Coyne, 2001). Understanding and taking into account the diversity of students is the key to applying UDL. According to Kurth (2013), UDL provides students with multiple opportunities "to represent knowledge (how content and directions are presented to students), express knowledge (how students demonstrate their knowledge), and engage in the classroom (how students stay motivated and involved in learning)" (p. 35). Indeed, UDL draws its principles from these opportunities through different modalities of representation, action and expression, and engagement (see Figure 1).

Figure 1. Universal Design for Learning Guidelines

From Center for Applied Special Technology (2022). Universal design for learning guidelines (Version 2.2). Retrieved from https://udlguidelines.cast.org/more/downloads. Reprinted with permission.

The concept of representation refers to providing multiple representations of a subject; when instruction is provided in multiple modalities or representations, the subject being taught becomes more accessible to larger groups of all-ability students. Action and expression refers to multimodal means of communication, for instance, SNSs using iPads to indicate which French word refers to which object. Last, engagement refers to stimulating student motivation and interest through hands-on and creative instruction (Center for Applied Special Technology, 2022; Courey et al., 2012). An example would be SNSs using realia, such as real plates and food, to talk about a picnic. In classrooms designed based on UDL, providing support through vocabulary, emphasis on key concepts, use of visual and audio aids, and giving students enough time to process input is crucial (Best et al., 2015). In addition, modeling and providing constructive feedback are as important as providing support through multimodal techniques (Thoma et al., 2009). However, based on previous studies on UDL, the aforementioned UDL principles have been incorporated into the curriculum, but courses have not been designed around applying the principles, specifically for students with disabilities (Scott et al., 2017; Scott et al., 2022). Thus, it is important to incorporate UDL into the curriculum for SNSs while also considering the assessment component of the teaching process.

In each stage of UDL, DA can be utilized as an assessment and instruction component, because DA is used as a tool for both instruction and assessment (see Figure 2). In other words, DA enables us to understand learner needs or their L2 proficiency level while assessing the outcomes. For instance, in this FLEX program, after the introduction of French culture (e.g., some food names or the flag), DA could be used to understand how much of the multimodal instruction helped students to understand the WHY of the learning process (see Figure 1). Then, based on what teachers observe during the DA, new and different modalities of representation could be provided, or teachers could proceed to the next stage (i.e., action and expression). Since learning and assessment form a continuum, DA acts as a catalyst between the stages of UDL.

Figure 2. Universal design for learning and dynamic assessment continuum
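As a rough sketch of the continuum in Figure 2, the loop below treats DA as the checkpoint between UDL stages: a low mediated score triggers re-teaching through a different modality of the same stage, while a sufficient score lets the student advance. The stage names follow the UDL guidelines; the modality lists, the threshold value, and the dynamic_assess callback are illustrative assumptions, not elements of the program itself.

```python
# Sketch of the UDL-DA continuum: DA is run between UDL stages, and its
# mediated score decides whether to re-present the content in another
# modality or to advance. Modalities and threshold are assumed values.

UDL_STAGES = {
    "representation": ["realia", "video", "gestures", "audio"],
    "action and expression": ["pointing", "iPad selection", "speech"],
    "engagement": ["games", "songs", "role-play"],
}
ADVANCE_THRESHOLD = 4  # e.g., correct with at most one prompt on a 5-point rubric

def teach_with_da(student, dynamic_assess):
    """Move a student through the UDL stages, running a DA between steps.

    `dynamic_assess(student, stage, modality)` is assumed to return a
    mediated score; a low score triggers a new modality of the same stage
    instead of advancement, mirroring the re-teaching loop described above.
    """
    for stage, modalities in UDL_STAGES.items():
        for modality in modalities:
            if dynamic_assess(student, stage, modality) >= ADVANCE_THRESHOLD:
                break  # learning evidenced; proceed to the next UDL stage
        else:
            print(f"{student}: needs further support at the {stage} stage")
            return
    print(f"{student}: completed all UDL stages with DA support")

# Example with a stub assessor that succeeds once a second modality is tried.
teach_with_da("Ana", lambda s, st, m: 5 if m != UDL_STAGES[st][0] else 2)
```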

As discussed earlier in this literature review, DA has been studied in foreign language classrooms and has been utilized to differentiate preschool bilingual special needs students from typically developing bilingual students. DA has also been found to be effective in assessing special needs students. However, there is a lack of studies focusing on DA in a pre-kindergarten foreign language program that is inclusive of students with special needs. Thus, the purpose of this study is to investigate the use of DA in an inclusive, foreign language, pre-kindergarten setting by referring to the first published pilot study (Regalla & Peker, 2017) in an inclusive context and by looking at the current study from an inclusive perspective with a focus on DA dialogues between the teacher and students. The following research questions and sub-questions were asked to address the purpose of this chapter:

1. What can DA tell us about foreign language learning for students of all ability levels within the UDL framework? How could DA be used for students with special needs and typically developing students with regard to their assessment scores?
2. How does a pre-kindergarten French teacher use prompting strategies to elicit second language (L2) responses from her students during instruction?

METHOD

Setting

The current study was conducted at a charter school serving children from pre-kindergarten through grade five in a fully inclusive setting, with approximately 50% of the students having various special needs. The researchers launched the French program in 2014 as a result of a partnership between the school principal and a professor at a local university. At the time of this study, French was offered to all pre-kindergarten students in three classrooms with a total of 48 students. This program was designed as a typical foreign language exploratory (FLEX) program intended to introduce the French language and culture (Lipton, 1992). The main goal of this FLEX program was to teach basic interpersonal communication skills in French and to stimulate interest in learning another language and culture. FLEX programs are also called "exploratory programs" because in these programs one or more languages are taught and students get a chance to explore the language and culture of the target language. Students benefit not only from the particular language taught but, for the most part, from the experience and process of learning a language.

Participants

In the current study, there were 31 students in total from three pre-kindergarten classrooms. The parents of these students were contacted after the university's Institutional Review Board (IRB) approved the procedures. More than half of these students (N = 17) had Individual Education Plans (IEPs) for various special learning needs, including cognitive developmental delays, language impairments, hearing impairments, and identification on the autism spectrum. The two groups of students in this study were categorized based on their IEP status: the students with IEPs are the SNSs (N = 17), and the ones who did not have IEPs are the TDSs (N = 14). French classes for all three pre-kindergarten classrooms were taught by the same French teacher. The teacher was provided with professional development by a language professor. This professor held a French teaching certification and had been working with the French teacher to design materials and lessons and to provide feedback regarding French teaching.

Program Design, Instructional Materials, and Theoretical Framework

Each class received a 30-minute French lesson two days a week, for a total of 36 lessons over 18 weeks. French lessons focused on interpersonal and interpretive communication at the Novice Low and Novice Mid levels as described in the World-Readiness Standards (The National Standards Collaborative Board, 2015) and the American Council on the Teaching of Foreign Languages (ACTFL) Proficiency Guidelines. Themes covered during the lessons included greetings, numbers, body parts, colors, and clothing. French classes were taught within the framework of UDL, which was inspired by the Universal Design (UD) principles (Rose & Meyer, 2002). According to these principles, a teacher focuses on removing the barriers and limitations of a learning environment rather than focusing on learners' limitations (Rose et al., 2006). Al-azawei (2016) emphasized, "designing 'accessible' content and delivering it in an 'accessible' learning environment can improve learning experience regardless of individual learning abilities" (p. 40). Considering the principles of UDL, the French classrooms were designed around both groups of students' needs, and French materials were made accessible to students of all abilities (e.g., multisensory realia, toys, audio-visuals). Since UDL promotes multimodality, the thematic instruction was also supported by the video series Little Pim (2022). The series was designed to teach foreign languages to young children from birth through age six and was chosen because of its engaging format and presentation of language in context without the use of translation.

Data Collection and Analyses

Video-Recordings. For each of the 18 weeks of French instruction, video was recorded in all three prekindergarten classrooms once per week to assess students' abilities to understand and respond to their teacher in French. Short segments of video were recorded in each class, for a total of 54 video segments. These video segments focused on the French teacher's questioning and use of prompting strategies. Thirty video segments were chosen for analysis in this study, selected for their clarity of audio and representation of the data. The data from the 30 video segments represent all instructional themes taught in each of the three prekindergarten classes at various points during the French instruction. In this chapter, transcripts from five video segments are shared to show data from each of the three pre-kindergarten classrooms, all instructional themes, and a variety of prompting strategies used by the teacher. To analyze the video data, the videos were viewed to determine prompting categories and to count the number of times the teacher used a specific prompting strategy. In the 30 videos analyzed for this study, the prompting strategies used by the teacher were counted and tallied for total numbers of each prompting strategy. Five prompting categories were found: repetition, suggestion (of an incorrect response in French), initial sound prompts, gestures, and use of English. The categories were determined by comparing the teacher's responses on a questionnaire to the lessons recorded on video. To ensure inter-rater reliability, the videos were viewed separately by two researchers; the inter-rater reliability was 95%. Counts from each reviewer were cross-checked, and differences in counts and categorization were resolved by discussion until agreement was reached.

Questionnaires. The French teacher completed a short questionnaire towards the end of the French instruction. The questionnaire had two parts. The first part asked the teacher to explain how she prompted students in class when they could not answer one of her questions in French. Second, the French teacher was asked to predict how much prompting each student would need during assessments. The numeric coding for the prediction section was as follows: 1 = significant prompting needed, 2 = some prompting needed, and 3 = little to no prompting needed. For the analysis of the data, the Statistical Package for the Social Sciences (SPSS) was used. The numeric coding described above was entered into SPSS.
Then, an independent t-test was conducted to analyze the group differences between the two groups of students (i.e., SNSs and TDSs) based on the French teacher's predictions of the students' prompting needs.

Dynamic Assessments. After sixteen weeks of instruction, students were assessed individually on their interpretive and interpersonal communication skills in French. A dynamic assessment pre-scripted rubric was created by the researchers that addressed the topics covered in the French classes (see Figure 3). The researcher pulled students individually for assessments during their last two weeks of French classes. The researcher conducted all of the assessments and scored each student on their own rubric, marked with only a code number assigned to each student to protect privacy. Another rater/researcher observed the assessments and scored the students independently on a separate rubric for inter-rater reliability. When students gave an unexpected answer that did not fit the rubric, each rater marked it with a special note on the rubric, and the two raters later discussed the student's response until agreement was reached on an appropriate score. Students were asked a total of eight questions during the assessment: to say their name and age, and to identify two body parts, two types of clothing, and two colors (see Figure 3).
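For readers who wish to replicate this step outside SPSS, the short sketch below runs the same kind of independent-samples t-test in Python with SciPy. The prediction codes and group sizes shown are hypothetical placeholders for illustration, not the study's raw data.

from scipy import stats

# Hypothetical teacher prediction codes on the 1-3 scale described above
# (1 = significant prompting needed, 3 = little to no prompting needed).
# These values are illustrative only; they are not the study's raw data.
sns_codes = [1, 2, 2, 1, 3, 2, 2, 1, 2, 2]        # students with special needs
tds_codes = [3, 2, 3, 3, 2, 3, 3, 2, 3, 3, 2, 3]  # typically developing students

# Independent-samples t-test, the same procedure run in SPSS in the study.
t_stat, p_value = stats.ttest_ind(sns_codes, tds_codes)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")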


Figure 3. Dynamic Assessment Rubric.

From Regalla, M., & Peker, H. (2017). Prompting all students to learn: Examining dynamic assessment in a pre-kindergarten, inclusive French program. Foreign Language Annals, 50(2), 323–338. Reprinted with permission.

This rubric was designed as a 5-point rubric; however, since this evaluation instrument is specific to this unique program, its reliability may be questionable despite its high content and construct validity. The DA procedures were as follows. First, as an example of the procedure, students are shown a blue card and asked what color the card is.
If students provide a correct answer, they get 5 points. However, if students provide the correct answer after the teacher provides prompts or clues, they get 4 points. If the student does not provide the correct answer when the first question is asked, the teacher is supposed to provide clues to help the student either understand the question or find the correct answer with help. For instance, if a student cannot answer the first question when the teacher shows only one blue card, the teacher can pull up one more card and ask, "Which one is the blue card?" In this case, providing another option helps the student to remember. Thus, the assessment is structured in the form of a conversation that helps the student understand questions and find answers with clues and extra help, as in the ZPD. Scaffolding is built through each question, and mediation of meaning occurs during question and answer between student and teacher. To determine the normality of the data, skewness and kurtosis values were examined and found to be within the acceptable range (i.e., ±1.96). Next, an independent t-test was run in SPSS to analyze the group differences between the SNSs and TDSs based on the dynamic assessment scores. Then, the teacher's predictions of students' prompting needs were analyzed through the second t-test, as mentioned earlier. Lastly, depending on the results of these two t-tests, correlational statistics were computed to investigate whether there was parallelism between the teacher's predictions of students' prompting needs during their performance in class (i.e., questionnaire data) and how students were assessed through dynamic assessment.
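The graduated scoring just described lends itself to a compact illustration. The sketch below is a simplified Python rendering, not the published instrument: it assumes one point is deducted for each additional layer of mediation on the 5-point rubric, down to a floor of 1, and the function name and exact point mapping are inferred from the description above.

def score_da_item(answers_by_prompt_level):
    """Score one DA item on the 5-point rubric sketched above.

    The argument lists the student's answers, one per mediation level:
    index 0 is the unprompted question, index 1 the first clue (e.g., a
    second card to choose between), and so on. One point is assumed to
    be lost per layer of mediation, with a floor of 1 point; this mapping
    is inferred from the rubric description, not taken from the instrument.
    """
    for level, is_correct in enumerate(answers_by_prompt_level):
        if is_correct:
            return max(5 - level, 1)
    return 1  # no correct response even with full mediation

# A student who answers correctly after one clue ("Which one is the
# blue card?") would earn 4 of 5 points.
print(score_da_item([False, True]))  # -> 4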

RESULTS

What Can DA Tell Us About Foreign Language Learning for Students of All Ability Levels Within the UDL Framework?

In answering the aforementioned question, I will refer to Regalla and Peker (2017) for quantitative results, but their interpretation will be synthesized with the qualitative results in the current study. To begin with, when the difference between the two groups regarding their rubric scores was examined, the independent t-test results indicated a statistically significant difference (p = .001). On average, the SNSs had a mean score of 20 on the assessment (SD = 10), while the TDSs had a mean score of 33 (SD = 7). When the French teacher's predictions of the two groups of students' prompting needs were examined, the difference between the SNSs and the TDSs regarding the teacher's perception of students' prompting needs was statistically significant (p = .039); the SNSs' mean score was 2.2 (SD = .8), while the TDSs had a mean score of 2.7 (SD = .6). Furthermore, examining the relationship between DA scores and the teacher's perception of students' prompting needs was necessary because the two groups differed significantly from each other regarding their DA scores and prompting needs. The correlation was positive and significant (r = .75, N = 31, p < .001), indicating that the students who needed less prompting obtained higher scores. Last, student mean scores were examined to find out if there was a statistically significant difference between the SNSs' mean scores and the TDSs' mean scores for the questions representing each topic assessed. Regalla and Peker (2017) found that there was no statistically significant difference between the SNSs' mean scores and the TDSs' mean scores for the question Comment t'appelles-tu? [What is your name?], but a significant difference was shown between the two groups for all other questions (i.e., Quel âge as-tu? [How old are you?], Qu'est-ce que c'est? (la tête) [What is this? Head], Qu'est-ce que c'est? (la main) [What is this? Hand], Quelle couleur est-ce? (vert) [What color is this? Green], Quelle couleur est-ce? (rouge) [What color is this? Red], Qu'est-ce que c'est? (le chapeau) [What is this? Hat], and Qu'est-ce que c'est? (le pantalon) [What is this? Pants]).
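To show how such summary statistics translate into the significance levels reported above, the sketch below recomputes an independent t-test directly from the group means and standard deviations. The group sizes are assumed (the chapter reports N = 31 overall but not the exact split), so the output is illustrative rather than a reproduction of the published result.

from scipy import stats

# Reported summary statistics: SNSs M = 20, SD = 10; TDSs M = 33, SD = 7.
# The split of the 31 students into groups is assumed for illustration.
n_sns, n_tds = 10, 21

t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=20, std1=10, nobs1=n_sns,
    mean2=33, std2=7, nobs2=n_tds,
)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")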

How Does a Pre-Kindergarten French Teacher Use Prompting Strategies to Elicit L2 Responses From Her Students During Instruction?

According to the questionnaire data, the French teacher used multiple prompting strategies: providing the initial sound of the correct response (N = 44), suggesting incorrect responses with a questioning tone of voice (N = 78), repeating the question (N = 113), and using gestures (N = 34). However, even though the French teacher did not name "using English" as a prompting strategy, the 30 video segments indicated that she used English (L1) as a prompting strategy 55 times. The prompting strategy the French teacher used most often was repetition, followed by suggestion, use of English, initial sound prompts, and gestures, respectively. In many cases, repetition was the teacher's first choice of prompting strategy. If repetition was insufficient prompting for a student, the teacher followed with another appropriate prompting strategy, such as gesturing to a body part or providing the initial sound of a number. If other prompting strategies did not produce a student response, the teacher chose to use English in order to arrive at a student response in French. The following transcripts are samples of the French teacher's prompting strategies.

The transcript below shows the teacher's use of gestures, suggestion of incorrect responses, and English prompts. The teacher initially points to her head to ask the question, Qu'est-ce que c'est? [What is this?] When the students respond by repeating the question, the teacher uses English to clarify the question. After the clarification, the teacher uses suggestions of incorrect responses and the suggestion of a song to prompt students who do not respond.

T: "Classe, qu'est-ce que c'est?" [Class, what is this?] Pointing to head.
SS: "Qu'est-ce que c'est?" [What is this?]
T: "Ok class, remember, every time I ask 'Qu'est-ce que c'est?' [What is this?] and point here," points to head, "I am asking you 'what's this?' You are going to say 'la tête' [the head]. Ok class, 'Qu'est-ce que c'est?' [what's this], 'Qu'est-ce que c'est?'" Points to head.
SS: "La tête." [The head.]
T: "Arianna, qu'est-ce que c'est?" [Arianna, what is this?] Teacher is pointing to her head.
S: Silence.
T: "Qu'est-ce que c'est?" [What is this?]
S: Silence.
T: "Remember the song, 'tête, épaules, genoux, pieds' [head, shoulders, knees, feet]?" Teacher singing body parts to the tune. "Qu'est-ce que c'est?" [What is this?] Pointing to head. "Le nez? La main?" [The nose? The hand?]
S: "La tête!" [The head!]
T: "Excellent! [Excellent!] Brittany, qu'est-ce que c'est? [What is this?]"
S: "La tête." [The head.]
T: "Excellent! [Excellent!] Andrea, qu'est-ce que c'est? [What is this?]"
S: "La tête." [The head.]
T: "Excellent!" [Excellent!]


The following transcript shows an example of the teacher using gestures to prompt students and again using English to clarify the difference between shorts and pants.

T: "Qu'est-ce que c'est?" [What is this?] Teacher is pointing to her pants.
SS: "Le pantalon." [The pants.]
T: "Excellent! [Excellent!] Qu'est-ce que c'est?" [What is this?] Teacher is pointing to a student volunteer's shirt.
SS: "Le t-shirt." [The t-shirt.]
T: "Bien, qu'est-ce que c'est?" [Good, what is this?] Teacher is pointing to volunteer's shorts.
SS: "Le pantalon." [The pants.]
T: "No, this is le pantalon [the pants]." Pointing to her pants. "This is le short [the shorts]." Pointing to volunteer's shorts. "It is shorter than the pants." Pointing back and forth between the volunteer's shorts and her own pants. "Le short, le pantalon [the shorts, the pants]. Let's try again. Qu'est-ce que c'est?" [What is this?] Teacher pointing to her pants.
SS: "Le pantalon." [The pants.]
T: "Excellent! Qu'est-ce que c'est? [Excellent! What is this?]" Teacher is pointing to volunteer's shorts.
SS: "Le short." [The shorts.]

In the transcript below, the teacher uses a variety of prompts to encourage a special needs student to produce a correct French response. The teacher asks Danny (a typical student) the questions Comment tu t'appelles? [What is your name?] and Quel âge as-tu? [How old are you?] Next, she asks Eric (one of the SNSs) to answer the same questions. Eric responds quickly to Comment tu t'appelles? [What is your name?] but struggles with Quel âge as-tu? [How old are you?] The teacher uses gestures, English prompts, and initial sound prompts to encourage a French response from Eric.

T: "Bonjour, comment tu t'appelles?" [Hello, what is your name?]
S: "Je m'appelle Danny." [My name is Danny.]
T: "Quel âge as-tu, Danny?" [How old are you, Danny?]
S: "Cinq ans." [Five years old.]
T: "Excellent, Danny!" Teacher shakes student's hand, moves to next student. "Bonjour, comment tu t'appelles?" [Hello, what is your name?]
S: "Je m'appelle Eric." [My name is Eric.]
T: "Excellent! Quel âge as-tu?" [Excellent! How old are you?]
S: Silence.
T: Shows fingers. "Quel âge as-tu?" [How old are you?]
S: "Five."
T: "How old are you, en français [in French]?"
S: Silence.
T: Teacher shows fingers, provides initial sound prompt "s."
S: "Cinq." [Five.]
T: "Excellent!"


DISCUSSION

The findings indicate that DA can provide valuable insights about the difference between the two groups of students (i.e., the SNSs, who had IEPs, and the TDSs). When scores between the two groups are compared, the TDSs outperform the SNSs. The overall mean scores for the TDSs were higher than the mean scores for the SNSs. There was also a difference between the mean scores on each of the eight questions posed in the DA, with the TDSs scoring a higher mean for each. These findings align with the literature, because prior research on DA has shown that TDSs benefit more from the prompting in DA than SNSs (Peña et al., 2006). However, this does not mean that SNSs do not benefit from DA. They benefit more from DA than from traditional assessments, because traditional assessments do not consider learner variability, which is the primary focus of UDL.

The most notable finding is the comparison of both groups' scores on the question Comment t'appelles-tu? [What is your name?]. For this question, the difference in mean scores was not statistically significant between the TDSs and the SNSs in the current study. One reason that can account for the close proximity in scores for this particular question is the amount of time that students had to practice this exchange of information in the L2. In every lesson observed and recorded for this study, the French teacher started her lesson with warm-up or introductory questions as well as the distribution of French class name tags. The teacher varied some of her warm-up questions in her introduction activity to review prior learning, such as Quel âge as-tu? [How old are you?] and Qu'est-ce que c'est? [What is this?]. Despite this variation in questions, Comment t'appelles-tu? [What is your name?] remained part of the introduction activity in each French lesson observed and recorded. During this introductory exchange of information, each student had the opportunity to hear the question and answer the teacher in French as he/she retrieved a name tag. This finding is significant because it shows the benefits of a repeated L2 exchange between the teacher and each individual student. Previous studies focusing on foreign language repetition also showed that repetition within a dialogue or discourse was a good predictor of learning rate in formal settings (Dufva & Voeten, 1999; French, 2006; Jackson & Ruf, 2018; Service, 1992; Service & Kohonen, 1995). The repetition of this exchange over the eighteen weeks of French classes benefitted all students: repetition improved not only the TDSs' learning but also the SNSs'. Furthermore, the most remarkable result was the SNSs' earning nearly equivalent scores to their typically developing peers on the assessment of Comment t'appelles-tu? [What is your name?].

Although the differences in mean scores were statistically significant between the SNSs and the TDSs for the other seven questions, closer examination of the mean scores can provide additional insights regarding the learning of students with special needs. For instance, all students, including the SNSs, were able to produce at least one L2 output when enough support was provided during DA. Similarly, previous studies in UDL have shown that when multimodal instruction is provided to optimize SNSs' learning and access to an L2, more students show learning gains (Lim et al., 2019; Regalla & Peker, 2015; Thibault, 2000). In the current study, some SNSs needed more prompting than others.
However, even in cases where SNSs did not display evidence of interpersonal communication during the assessment, they were able to display evidence of their interpretive skills in French. For example, a student could have earned two points for a display of interpretive skills by pointing to a red colored card after hearing "Can you show me rouge [red]?" Furthermore, the findings suggest that the amount of practice time spent on a particular topic influenced the mean scores of the SNSs. As stated earlier, the teacher frequently reviewed prior learning in her introduction activity. Therefore, the topics introduced by the teacher earlier in the 18 weeks of French classes were reviewed more frequently. There was a gradual decrease in the mean scores for the SNSs on the questions related to topics presented towards the end of the 18 weeks of French classes. This indicates that SNSs may need more time, but it does not mean that they did not learn. When enough multimodal instruction is provided, as in UDL, and enough time is allowed for processing learning, these SNSs may reach the same level as their typically developing peers. For example, the lowest mean score of 1.64 was earned for le pantalon [pants] when students were presented with the question Qu'est-ce que c'est? [What is this?]. Articles of clothing were the last topic introduced by the French teacher, leaving the fewest practice opportunities for students. When the mean score for the last topic presented (clothing) is compared to the mean score for the first topic presented (giving names), the findings show the effect of practice time and repetition on the scores of students with special needs. When given the opportunity for more practice time and more repetition, the SNSs in this study showed less need for prompting and were able to earn higher assessment scores. These findings align with previous studies indicating that SNSs may not learn at a pace similar to TDSs'; however, when adequate time and support through multimodal instruction are provided, SNSs may experience success (Camilleri et al., 2014; Simon-Cereijido & Gutierrez-Clellen, 2014; Regalla & Peker, 2016, 2017; Regalla et al., 2017).

The findings of this study also contribute to our knowledge of the use of DA. For example, the French teacher's predictions of student prompting needs positively correlated with the DA results. This indicates that the French teacher had a clear understanding of her students' prompting needs after working with them for approximately 16 weeks by using DA within UDL. Given the teacher's understanding of student prompting needs and the connection between prompting needs and assessment scores, teacher-conducted assessments with the option of flexible DA could have shown greater student learning on this assessment. In this study, the researcher conducted the assessments due to the scheduling constraints of the French teacher. Students had seen the researcher many times during video recording and were familiar with the researcher. However, if time had permitted the teacher to conduct the assessments with the option of flexible DA, the teacher could have elicited more L2 responses, and more efficiently, than an outsider using a pre-scripted series of prompts. In parallel with this idea, flexible mediation was found to be the most effective scaffolding technique (Hasson, 2018) to apply within pre-kindergarten students' ZPD, because this type of mediation helps students scaffold the input at an appropriate pace, and it also helps teachers adjust their prompts and support to specific or individual student needs (Davin, 2013). In some cases, the use of English in the prompt may be the most appropriate prompt in a student's ZPD. For example, when the teacher prompted Jason for the color "pink" during a classroom activity, she asked about his favorite color in English. As a result, Jason was able to say the correct color in French after being prompted to use the L2.
In this study, the French teacher often used an English prompt to elicit an L2 response during class when she was aware that other prompting strategies were not effective for the whole class or for a particular student. However, in the pre-scripted DA, English could not be used before other non-English prompts. Because each of the pre-scripted prompts used in this study was assigned an order and a value, the assessor was obligated to use prompts that may not have been the most effective for each student. Prior research shows that the use of English has been effective in eliciting L2 production during small group tasks (Brooks & Donato, 1994; Davin & Donato, 2013; Hasson, 2018; Swain & Lapkin, 2000). In this study, English was used to support a prompt that might not otherwise be understood by novice-level learners if the teacher were to use only French in her prompting strategies.


CONCLUSION AND RECOMMENDATIONS

As mentioned earlier, the sample in this study is unique because, to the best of my knowledge, no other school in the U.S. has so far integrated SNSs into regular classrooms with TDSs except the one I worked with in this study. Since the sample size of this study is small due to the unique nature of these classrooms, reliability may be a concern. However, the instruments were valid in assessing what was taught because they were customized, or tailored, specifically for this unique population in consideration of students' IEPs. Therefore, the findings still suggest that DA can be an effective method of assessing the foreign language learning of all students, including those with special needs, especially within the UDL framework. For example, although SNSs required more prompting than TDSs on most of the DA questions, all SNSs in the study were able to provide L2 responses showing evidence of either interpersonal or interpretive communication skills. Furthermore, the findings show that the amount of time spent practicing an L2 exchange influences the scores of students with special needs. SNSs' scores were nearly equivalent to the scores of their typically developing peers for the question on which the teacher had spent the most practice time, especially at the action and expression stage of UDL. As mentioned by Regalla et al. (2017), SNSs may still experience success when adequate time is provided, even if their pace may be slower than their typically developing peers'. Considering previous findings in the literature (Camilleri et al., 2014; Simon-Cereijido & Gutierrez-Clellen, 2014; Regalla & Peker, 2016, 2017; Regalla et al., 2017) as well as the current study's findings, SNSs should be provided with adequate time, which could be achieved through the extension of some activities or by including students in the IEP system. Most schools may have IEPs, as explained at the beginning of this chapter; however, these IEPs should also be examined in order to understand whether the components of UDL are intended and achieved in reality. Thus, one suggestion for creating a more diverse curriculum that would meet diverse learner needs is integrating DA into the UDL curriculum as part of both assessment and instruction. DA can be even more effective if the teacher who knows the students' prompting needs is able to conduct the assessment. While conducting such an assessment, it is better to use flexible mediation because it allows the teacher to reach each student's ZPD more effectively without the obligation to follow a list of pre-scripted prompts. As Regalla and Peker (2017) emphasized, it is more effective in understanding "what all children can do when flexible, rather than pre-scripted, mediation is used, particularly by a teacher who is familiar with students' specific prompting needs and is thus able to tailor the assessment to each student's ZPD" (p. 335). As mentioned earlier (see the UDL and SNSs section), teachers could provide support through vocabulary, emphasize key concepts through audio-visuals, and give students enough time to process input in order to reach students' ZPD and communicate with them. Overall, lessons designed within UDL, when accompanied by DA, could address learner needs better, and teachers could have a chance to better evaluate students' current ZPD through the mediation they achieve during DA.

ACKNOWLEDGMENT

The author would like to express her tremendous gratitude to the research school, United Cerebral Palsy (UCP) of Central Florida, for making this study possible. The author would also like to thank her research partner, Dr. Michele Regalla, for her tireless work for the FLEX Program and her dedication to promoting foreign language accessibility for all students.


REFERENCES

Aindriú, S. (2022). The reasons why parents choose to transfer students with special educational needs from Irish immersion education. Language and Education, 36(1), 59–73. doi:10.1080/09500782.2021.1918707
Al-Azawei, A., Serenelli, F., & Lundqvist, K. (2016). Universal design for learning (UDL): A content analysis of peer-reviewed journal papers from 2012 to 2015. The Journal of Scholarship of Teaching and Learning, 16(3), 39–56. doi:10.14434/josotl.v16i3.19295
Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second language learning in the zone of proximal development. Modern Language Journal, 78(4), 465–483. doi:10.1111/j.1540-4781.1994.tb02064.x
Antón, M. (2003). Dynamic assessment of advanced foreign language learners. Paper presented at the American Association of Applied Linguistics, Washington, DC.
Berkeley, S., Bender, W. N., Peaster, L. G., & Saunders, L. (2009). Implementation of response to intervention: A snapshot of progress. Journal of Learning Disabilities, 42(1), 85–95. doi:10.1177/0022219408326214 PMID:19103800
Best, K., Scott, L. A., & Thoma, C. A. (2015). Starting with the end in mind: Inclusive education designed to prepare students for adult life. In R. G. Craven, A. J. S. Moren, D. Tracey, P. D. Parker, & H. F. Zhong (Eds.), Inclusive education for students with intellectual disabilities (pp. 45–72). Information Age Press.
Brooks, F. B., & Donato, R. (1994). Vygotskian approaches to understanding foreign language learner discourse during communicative tasks. Hispania, 77(2), 262–274. doi:10.2307/344508
Brown, H. D., & Priyanvada, A. (2010). Language assessment: Principles and classroom practices. Pearson.
Camilleri, B., Hasson, N., & Dodd, B. (2014). Dynamic assessment of bilingual children's language at the point of referral. Educational and Child Psychology, 31(2), 57–72. doi:10.53841/bpsecp.2014.31.2.57
Cazden, C. (2001). Classroom discourse: The language of teaching and learning. Heinemann.
Courey, S. J., Tappe, P., Siker, J., & LePage, P. (2012). Improved lesson planning with universal design for learning (UDL). Teacher Education and Special Education, 36(1), 7–27. doi:10.1177/0888406412446178
Davin, K. J. (2013). Integration of dynamic assessment and instructional conversations to promote development and improve assessment in the language classroom. Language Teaching Research, 17(3), 303–322. doi:10.1177/1362168813482934
Davin, K. J., & Donato, R. (2013). Student collaboration and teacher-directed classroom dynamic assessment: A complementary pairing. Foreign Language Annals, 46(1), 5–22. doi:10.1111/flan.12012
Donato, R. (2000). Sociocultural contributions to understanding the foreign and second language classroom. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning. Academic Press.


Dufva, M., & Voeten, M. M. (1999). Native language literacy and phonological memory as prerequisites for learning English as a foreign language. Applied Psycholinguistics, 20(3), 329–348. doi:10.1017/S014271649900301X
Every Student Succeeds Act of 2015, PL 114-95, 114 U.S.C.
French, L. M. (2006). Phonological working memory and L2 acquisition: A developmental study of Quebec Francophone children learning English. Edwin Mellen Press.
Fuchs, D., Mock, D., Morgan, P. L., & Young, C. L. (2003). Responsiveness to intervention: Definition, evidence, and implications for the learning disabilities construct. Learning Disabilities Research & Practice, 18(3), 157–171. doi:10.1111/1540-5826.00072
Gibbons, P. (2003). Mediating language learning: Teacher interactions with ESL students in a content-based classroom. TESOL Quarterly, 37(2), 247–272. doi:10.2307/3588504
Gresham, F. M. (2004). Current status and future directions of school-based behavioral interventions. School Psychology Review, 33(3), 326–343. doi:10.1080/02796015.2004.12086252
Grigorenko, E. L. (2009). Dynamic assessment and response to intervention: Two sides of one coin. Journal of Learning Disabilities, 42(2), 111–132. doi:10.1177/0022219408326207 PMID:19073895
Gutierrez-Clellen, V. F., & Pena, E. (2001). Dynamic assessment of diverse children: A tutorial. Language, Speech, and Hearing Services in Schools, 32(4), 212–224. doi:10.1044/0161-1461(2001/019) PMID:27764448
Hasson, N. (2018). The dynamic assessment of language learning. Routledge.
Hasson, N., Camilleri, B., Jones, C., Smith, J., & Dodd, B. (2013). Discriminating disorder from difference using dynamic assessment with bilingual children. Child Language Teaching and Therapy, 29(1), 57–75. doi:10.1177/0265659012459526
Hogan, K., & Pressley, M. (1997). Scaffolding student learning: Instructional approaches and issues. Brookline Books.
Individuals with Disabilities Education Act (as amended), 20 U.S.C. Sec. 1401 et seq.
Jackson, C. N., & Ruf, H. T. (2018). The importance of prime repetition among intermediate-level second language learners. Studies in Second Language Acquisition, 40(3), 677–692. doi:10.1017/S0272263117000365
Kurth, J. A. (2013). A unit-based approach to adaptations in inclusive classrooms. Teaching Exceptional Children, 46(2), 34–43. doi:10.1177/004005991304600204
Laing, S. P., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally and linguistically diverse populations. Language, Speech, and Hearing Services in Schools, 34(1), 44–55. doi:10.1044/0161-1461(2003/005) PMID:27764486
Lantolf, J. (2000). Sociocultural theory and second language learning. Oxford University Press.


Lantolf, J. P., & Poehner, M. E. (2004). Dynamic assessment: Bringing the past into the future. Journal of Applied Linguistics, 1(1), 49–74. doi:10.1558/japl.1.1.49.55872
Lidz, C. S., & Peña, E. D. (1996). Dynamic assessment: The model, its relevance as a nonbiased approach, and its application to Latino American preschool children. Language, Speech, and Hearing Services in Schools, 27(4), 367–372. doi:10.1044/0161-1461.2704.367
Lidz, C. S., & Peña, E. D. (2009). Response to intervention and dynamic assessment: Do we just appear to be speaking the same language? Seminars in Speech and Language, 30(2), 121–133. doi:10.1055/s-0029-1215719 PMID:19399697
Lim, N., O'Reilly, M. F., Sigafoos, J., Ledbetter-Cho, K., & Lancioni, G. E. (2019). Should heritage languages be incorporated into interventions for bilingual individuals with neurodevelopmental disorders? A systematic review. Journal of Autism and Developmental Disorders, 49(3), 887–912. doi:10.1007/s10803-018-3790-8 PMID:30368629
Lipton, G. (1992). Practical handbook to elementary foreign language programs, including FLES, FLEX, and immersion programs (2nd ed.). National Textbook.
Mantero, M. (2002). Scaffolding revisited: Sociocultural pedagogy within the foreign language classroom. Retrieved October 2016 from www.eric.ed.gov
Mellard, D., McKnight, M., & Jordan, J. (2010). RTI tier structures and instructional intensity. Learning Disabilities Research & Practice, 25(4), 217–225. doi:10.1111/j.1540-5826.2010.00319.x
Pena, E., Iglesias, A., & Lidz, C. S. (2001). Reducing test bias through assessment of children's word learning ability. American Journal of Speech-Language Pathology, 10(2), 138–151. doi:10.1044/1058-0360(2001/014)
Peña, E. D. (2000). Measurement of modifiability in children from culturally and linguistically diverse backgrounds. Communication Disorders Quarterly, 21(2), 87–97. doi:10.1177/152574010002100203
Peña, E. D., Gillam, R. B., Malek, M., Felter, R., Resendiz, M., & Fiestas, C. (2006). Dynamic assessment of children from culturally diverse backgrounds: Application to narrative assessment. Journal of Speech, Language, and Hearing Research, 49, 1037–1057. PMID:17077213
Peña, E. D., Quinn, R., & Iglesias, A. (1992). The application of dynamic methods to language assessment: A non-biased procedure. The Journal of Special Education, 26(3), 269–280. doi:10.1177/002246699202600304
Pimsleur-Levine, J., & Benaisch, A. (2022). Little Pim. Retrieved June 9, 2022, from https://www.littlepim.com
Pisha, B., & Coyne, P. (2001). Smart from the start: The promise of universal design for learning. Remedial and Special Education, 22(4), 197–203. doi:10.1177/074193250102200402
Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting L2 development. Springer. doi:10.1007/978-0-387-75775-9


Pufahl, I., & Rhodes, N. (2011). Foreign language instruction in U.S. schools: Results of a national survey of elementary and secondary schools. Foreign Language Annals, 44(2), 258–288. doi:10.1111/j.1944-9720.2011.01130.x
Regalla, M., & Peker, H. (2015). Early language learning for all: Examination of a prekindergarten French program in an inclusion setting. Foreign Language Annals, 48(4), 618–634. doi:10.1111/flan.12156
Regalla, M., & Peker, H. (2016). Multimodal instruction in pre-kindergarten: An introduction to an inclusive early language program. The National Network for Early Language Learning (NNELL) Learning Languages Journal, 21(2), 11–14. Retrieved from https://files.eric.ed.gov/fulltext/EJ1124522.pdf
Regalla, M., & Peker, H. (2017). Prompting all students to learn: Examining dynamic assessment in a pre-kindergarten, inclusive French program. Foreign Language Annals, 50(2), 323–338. doi:10.1111/flan.12261
Regalla, M., Peker, H., Llyod, R., & O'Connor-Morin, A. (2017). To exempt or not to exempt: An examination of an inclusive pre-kindergarten French program. International Journal of TESOL and Learning, 6(3&4), 83–100. http://untestedideas.net/journal_article.php?jid=ijt201712&vol=6&issue=4
Rosa-Lugo, L. I., Mihai, F., & Nutta, J. W. (2012). Language and literacy development: An interdisciplinary focus on English learners with communication disorders. Plural Publishing.
Rose, D., Harbour, W., Johnston, C. S., Daley, S., & Abarbanell, L. (2006). Universal design for learning in postsecondary education. Journal of Postsecondary Education and Disability, 19.
Rose, D. H., & Meyer, A. (2002). Teaching every student in the digital age: Universal design for learning. Academic Press.
Roseberry-McKibben, C. (2008). Multicultural students with special language needs (3rd ed.). Academic Communication Associates.
Scott, L. A., Bruno, L., Gokita, T., & Thoma, C. A. (2022). Teacher candidates' abilities to develop universal design for learning and universal design for transition lesson plans. International Journal of Inclusive Education, 26(4), 333–347. doi:10.1080/13603116.2019.1651910
Scott, L. A., Thoma, C. A., Puglia, L., Temple, P., & D'Aguilar, A. (2017). Implementing a UDL framework: A study of current personnel preparation practices. Intellectual and Developmental Disabilities, 55(1), 25–36. doi:10.1352/1934-9556-55.1.25 PMID:28181884
Selvachandran, J., Kay-Raining Bird, E., DeSousa, J., & Chen, X. (2020). Special education needs in French immersion: A parental perspective of supports and challenges. International Journal of Bilingual Education and Bilingualism, 25(3), 1120–1136. doi:10.1080/13670050.2020.1742650
Service, E. (1992). Phonology, working memory, and foreign language learning. Quarterly Journal of Experimental Psychology, 45A(1), 21–50. doi:10.1080/14640749208401314 PMID:1636010
Service, E., & Kohonen, V. (1995). Is the relation between phonological memory and foreign language learning accounted for by vocabulary acquisition? Applied Psycholinguistics, 16(2), 155–172. doi:10.1017/S0142716400007062


Shrestha, P. N. (2020). Dynamic assessment of students' academic writing: Vygotskian and systemic functional linguistic perspectives. Springer. doi:10.1007/978-3-030-55845-1
Simon-Cereijido, G., & Gutierrez-Clellen, V. (2014). Bilingual education for all: Latino dual language learners with language disabilities. International Journal of Bilingual Education and Bilingualism, 17(2), 235–254. doi:10.1080/13670050.2013.866630
Spector, J. E. (1992). Predicting progress in beginning reading: Dynamic assessment of phonemic awareness. Journal of Educational Psychology, 84(3), 353–363.
Swain, M., & Lapkin, S. (2000). Task-based second language learning: The uses of the first language. Language Teaching Research, 4, 251–274.
Takahashi, E. (1998). Language development in social interaction: A longitudinal study of a Japanese FLES program from a Vygotskyan approach. Foreign Language Annals, 31(3), 392–406.
Tharp, R. G., & Gallimore, R. (1991). The instructional conversation: Teaching and learning in social activity (Research Reports: 2, Paper rr02). Santa Cruz, CA: National Center for Research on Cultural Diversity and Second Language Learning. Retrieved from http://escholarship.org/uc/item/5th0939d
The National Standards Collaborative Board. (2015). World-readiness standards for learning languages (4th ed.). Author.
Thibault, P. J. (2000). The multimodal transcription of a television advertisement: Theory and practice. In A. Baldry (Ed.), Multimodality and multimediality in the distance learning age (pp. 311–385). Palladino.
Thoma, C. A., Bartholomew, C. C., & Scott, L. A. (2009). Universal design for transition: A roadmap for planning and instruction. Paul H. Brookes Publishing.
Universal design for learning guidelines (Version 2.2). (2022). Center for Applied Special Technology. Retrieved from https://udlguidelines.cast.org/more/downloads
Vygotsky, L. (1978). Mind in society. Harvard University Press.
Vygotsky, L. (1986). Thought and language. MIT Press.
Wertsch, J. V. (1979). The regulation of human action and the given-new organization of private speech. In G. Zivin (Ed.), The development of self-regulation through private speech (pp. 78–98). John Wiley & Sons.
Wertsch, J. V. (1985). Vygotsky and the social formation of mind. Harvard University Press.
Wight, M. S. (2015). Negotiating language learner identities: Students with disabilities in the foreign language learning environment. Dissertation Abstracts International Section A, 76.
Wixson, K. K., & Valencia, S. W. (2011). Assessment in RTI: What teachers and specialists need to know. The Reading Teacher, 64, 466–469. doi:10.1598/RT.64.6.13
Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 17.


ADDITIONAL READING

Alsaadi, H. M. A. (2021). Dynamic assessment in language learning: An overview and the impact of using social media. English Language Teaching (Toronto), 14(8), 73–82. doi:10.5539/elt.v14n8p73
Baek, S. G., & Kim, K. J. (2003). The effect of dynamic assessment based instruction on children's learning. Asia Pacific Education Review, 4(2), 189–198. doi:10.1007/BF03025361
Galkiene, A., & Monkeviciene, O. (2021). Improving inclusive education through universal design for learning. Springer International Publishing AG. doi:10.1007/978-3-030-80658-3
Ismailov, M., & Chiu, T. K. F. (2022). Catering to inclusion and diversity with universal design for learning in asynchronous online education: A self-determination theory perspective. Frontiers in Psychology, 13, 819884. doi:10.3389/fpsyg.2022.819884 PMID:35265016
Nelson, L. L. (2021). Design and deliver: Planning and teaching using universal design for learning (2nd ed.). Paul H. Brookes Publishing Co.
Peña, E. D., Gillam, R. B., & Bedore, L. M. (2014). Dynamic assessment of narrative ability in English accurately identifies language impairment in English language learners. Journal of Speech, Language, and Hearing Research, 57(6), 2208–2220. doi:10.1044/2014_JSLHR-L-13-0151 PMID:25075793
Poehner, M. E., Swain, M., & Lantolf, J. P. (2018). The Routledge handbook of sociocultural theory and second language development. Routledge. doi:10.4324/9781315624747
Rossi, M. (2022). Universal design for learning and inclusive teaching: Future perspectives. Elementa, 1(1-2), 103–113. doi:10.7358/elem-2021-0102-ross

KEY TERMS AND DEFINITIONS

Assessment: Making inferences based on students' learning and development in order to design new learning opportunities for students.
Dynamic Assessment: A type of assessment in which both teaching and learning take place as a whole activity, in harmony with mediation between a learner and more capable peers or a teacher.
Flexible Dynamic Assessment: The type of dynamic assessment that includes prompts specifically targeted to the individual learner rather than following a series of pre-scripted prompts.
Foreign Language Exploratory (FLEX) Program: A foreign language program intended to introduce another language and culture. FLEX programs are also called "exploratory programs" because in these programs one or more languages are taught and students get a chance to explore the language and culture of the target language.
Formative Assessment: Evaluating students' language development to plan subsequent instruction and to provide useful feedback to students on their performance.
Scripted Dynamic Assessment: The type of dynamic assessment where the prompts are designed towards learner needs in advance or are based on common learner errors.
Summative Assessment: Evaluating students' language development to provide overall feedback after a certain amount of learning takes place.


Universal Design for Learning (UDL): An instructional approach in which teachers consider diverse learners' needs in designing instruction instead of providing only adjustments or modifications for individual students with special needs.
Zone of Proximal Development (ZPD): The dynamic zone between what the learner can achieve alone and what the learner can accomplish with the assistance of more capable peers.


Chapter 11

An Analysis of the General Certificate Examination Ordinary Level English Language Paper and Students' Performance

Achu Charles Tante
University of Buea, Cameroon

Lovelyn Chu Abang
University of Buea, Cameroon

ABSTRACT

This chapter sets out to analyse the Ordinary Level English Language Paper at the General Certificate of Education Examination from 2012–2015 within the English-speaking sub-system in Cameroon. Five specific research objectives were formulated to guide the study, which used a survey research design. The population of the study comprised 45 English language teachers/examiners and 260 Form Four and Form Five students (approximately 14–15 years old). Qualitative and quantitative data were collected. Two sets of questionnaires were developed, one for teachers and one for students, along with an interview guide for Heads of Department and examiners. Documentation was also employed, such as past GCE questions from 2012–2015, end-of-marking subject reports, and the O/L English Language syllabus. Data analysed using the Pearson Product Moment Correlation showed that there was a correlation between assessment objectives, test content, test item development, assessment rubrics, and students' performance in English Language. Based on the findings, recommendations were made.

DOI: 10.4018/978-1-6684-5660-6.ch011


INTRODUCTION

Like most nations in Sub-Saharan Africa, Cameroon is a multilingual and multicultural country with more than 240 Home Languages (HLs), plus official and other foreign languages. As in other African countries, no HL is used in schooling (Tante, 2010a; Tante, 2010b). Historically, Cameroon was ruled first under the League of Nations following the defeat of Germany in the First World War, and then by Britain and France as a United Nations Mandated Territory. East Cameroon (French-speaking) gained independence from France in 1960, while Southern Cameroon (English-speaking) gained independence in 1961 by joining East Cameroon. Owing to their different experiences in education, governance, and law, the two territories formed a federation, with each state autonomous under a federal government. This chapter is concerned with the English-speaking sub-system, which was enshrined in the 1998 Law on Education providing that each education system would be run autonomously according to its inherited second language, thus creating two sub-systems in Cameroon. In the English-speaking sub-system, six years are spent in primary school and seven in secondary school. The first five years of secondary schooling end with a certificate examination called the General Certificate of Education Ordinary Level (GCE O Level). Success in the GCE O Level enables a student to proceed to High School, where success after two years of study earns the General Certificate of Education Advanced Level (GCE A Level). Success at the GCE A Level opens the way to Higher Education. These seven years are graded as 'Forms', that is, Form 1 to Form 5, followed by Lower Sixth and Upper Sixth.

English language has huge implications for Anglophones since it is a subject on the school curriculum from nursery to Higher Education (English as a second language); it is used across the whole curriculum; it is the language of formal and informal communication; and it is the language of law, the media, and business (Ayafor, 2005). The implications are telling for academic progress, transformative initiatives in education, work mobility, and, generally, the Sustainable Development Goals (SDGs) targets. The government is making great efforts to improve learners' competency, communicative fluency, and accuracy, because the syllabus aims at building users who would be all-rounded in English. A pivotal consideration, then, is how judgments are made about a learner's level in the English language end-of-course examination. Pertinent questions arise, such as the aims and objectives of the exam, the content of the exam, and its tasks and activities. So there are many reasons for developing a concise understanding of the principles and practices of language testing. More and more young people in Cameroon, where English is used as the main language of instruction, find it challenging to access tertiary education because of poor performance in English language at the end of their post-primary schooling. In addition, there is general frustration expressed by employers about the language inadequacy of employees. Communication in English in Cameroon often suffers from language usage problems (Nana, 2013). Statistics of English language scores at the GCE O Level show continuous below-average scores for years (see Table 1 below).
One wonders if there may be a problem with the exam content, objectives, rubrics, organisation, or testing techniques. This study will therefore attempt to analyse the sub-tests that make up the English Language Paper at the GCE O Level Examination in Cameroon from 2012 to 2015, to find out whether any trend can be drawn that may be useful not only in Cameroon but also in other English-as-Medium-of-Instruction contexts.


Table 1. Evolution of performance at the GCE Ordinary Level Examination from 2012–2015 (Subject: English Language)

Year   Registered   Sat      Passed   % Passed
2012   81365        80488    29417    36.55
2013   89898        88789    33781    38.05
2014   91639        89821    11910    13.26
2015   105328       103978   27276    26.23
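The pass rates in Table 1 are simply the number of candidates passing as a share of those who sat. As a quick illustration (in Python; the dictionary layout is this author's, not the Board's), the percentages can be reproduced from the raw counts:

# Candidates who sat and passed the O/L English Language paper (Table 1).
results = {
    2012: (80488, 29417),
    2013: (88789, 33781),
    2014: (89821, 11910),
    2015: (103978, 27276),
}

for year, (sat, passed) in results.items():
    print(f"{year}: {100 * passed / sat:.2f}% passed")  # e.g., 2012: 36.55% passed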

The continual poor English language achievement over the years leaves a lot to be desired. Regarding grade scores per candidate, the trend is the same, as shown in Table 2 below, where success ranges from grades 'A' to 'C'.

Table 2. Evolution of performance at the GCE Ordinary Level Examination (2012–2015) by grade

Year   Grade A   Grade B   Grade C   Grade D   Grade E   Grade U
2012   312       4443      24733     11039     17576     22385
2013   97        2676      31016     16530     22426     16044
2014   04        372       11537     7940      25216     44749
2015   67        2084      25125     11267     28289     38796

Source: Cameroon GCE Board (2016)

A careful look at the number of failures and the low grades scored by most examinees raises questions, because English Language is one of the compulsory subjects, carries the highest coefficient, and appears on the daily teaching timetable.


DESCRIPTION OF ORDINARY LEVEL ENGLISH LANGUAGE PAPER

The examination consists of Papers 1 and 2, with six subtests. Paper 1 comprises four subtests, namely listening comprehension, reading comprehension, grammar, and vocabulary, whereas Paper 2 has two subtests, directed writing and composition writing, plus a third subtest, school-based and spoken English, which has not yet been implemented. Each of the subtests has its general aims and objectives as stated in the examination syllabus, and each also has its specific aims/objectives and prescribed content material. The examination is intended to measure the four basic language skills, namely listening, speaking, reading, and writing, using either integrated tasks or discrete tests. These skills are examined, even though they are disproportionately weighted (see Table 3). The testing of speaking ability had long been neglected in English Language testing; the syllabus review held in 2010 recognised the necessity and relevance of oral testing of English. Table 3 is a summary of the Ordinary Level English Language paper.

Table 3. Distribution of the subtests in the Ordinary Level English paper

Paper 1 (weighting: 30% of the total marks)

Section   Name    Type of Question   Duration     Questions Set   Questions to Be Answered   Marks
A         LC      MCQ                25 minutes   9               9                          9
B         RC      MCQ                35 minutes   17              17                         17
C         Gram    MCQ                10 minutes   12              12                         12
D         Vocab   MCQ                10 minutes   12              12                         12

Paper 2 (weighting: 60% of the total marks; school-based and spoken English: 10%)

Section   Name           Type of Question     Duration                  Questions Set   Questions to Be Answered   Marks
A         DW             Problem solving      1 hour                    1               1                          30
B         Comp           Essay                1 hour                    8               1                          40
C*        SB & Sp. Eng   Cont. Assess./oral   F4-5 work, 5-10 minutes   Open            Open                       40

Key: LC = Listening Comprehension, RC = Reading Comprehension, Gram = Grammar, Vocab = Vocabulary, DW = Directed Writing, SB = School-Based, Sp. Eng. = Spoken English, MCQ = Multiple Choice Questions, Cont. Assess. = Continuous Assessment, Comp = Composition.
Source: GCE O/L English Language syllabus (2016).
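Since the papers carry unequal weights (Paper 1 contributes 30% of the total marks, Paper 2 60%, and school-based/spoken English 10%), a candidate's overall mark is a weighted composite. The sketch below illustrates one plausible aggregation in Python; the function and its normalisation are hypothetical, as the syllabus excerpt does not publish the Board's exact formula.

def composite_mark(paper1_pct, paper2_pct, school_based_pct):
    """Combine component percentages into an overall percentage.

    Weights follow Table 3 (Paper 1 = 30%, Paper 2 = 60%, school-based/
    spoken English = 10%); the aggregation itself is a hypothetical
    illustration, not the GCE Board's published formula.
    """
    return 0.30 * paper1_pct + 0.60 * paper2_pct + 0.10 * school_based_pct

# Example: 60% on Paper 1, 45% on Paper 2, 50% on the school-based component.
print(f"{composite_mark(60, 45, 50):.1f}%")  # -> 50.0%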

AIMS/OBJECTIVES AND CONTENT OF ORDINARY LEVEL ENGLISH LANGUAGE PAPER

The general aims and objectives of the O/L English Language paper are stated below.

Aims: The general aims of the O/L English Language examination syllabus are:

• To encourage the teaching and learning of the four language skills in an integrated manner;
• To encourage communication in speech and writing;
• To promote the use of English as a national and international language;
• To encourage extensive reading and listening, and responding in various ways.


Objectives: Candidates will be expected to demonstrate their ability to:

• Understand and convey information;
• Understand, select, order and present facts, ideas and opinions;
• Evaluate information in reading material and in other media and select what is relevant to a specific purpose;
• Recognize implicit meaning and attitude;
• Show a sense of audience and awareness of style in a variety of situations;
• Listen to, speak and write English that is acceptable in national and international circles;
• Exercise control of appropriate structures and conventions, including punctuation and spelling.

Assessment Objectives

• Listening and reading comprehension, composition and directed writing will assess knowledge, comprehension, application and analysis.
• Grammar and vocabulary will assess knowledge, comprehension and application of Bloom's cognitive domain (GCE Regulations and Syllabuses, 2011).

AIMS/OBJECTIVES AND CONTENT OF THE DIFFERENT SUBTESTS

Paper 1: The Syllabus Content

Listening Comprehension

Listening comprehension aims to encourage and develop listening skills, listening with understanding to varied materials, increasing learners' vocabulary, and responding in various ways to textual materials on different topics.

Objectives

• Listen to and respond in different ways to various textual materials.
• Infer the meaning of words in context.
• Discriminate between sounds.
• Discriminate between stress patterns.
• Identify syllable components.

Sub-test Content

This subtest comprises a listening comprehension text and questions set on it. The text will be either a story, a dialogue, an interview, a talk, a speech, or a radio recording of about 400 to 500 words. There will be 9 Multiple Choice Questions (MCQs), carrying 9 marks.


Reading Comprehension

Reading comprehension aims to encourage the teaching and learning of reading skills, reading with understanding of textual material on varied topics, increasing learners' vocabulary, and fostering a love for extensive reading.

Objectives

• Test the ability to read various kinds of materials.
• Test the ability to respond in different ways to various kinds of materials.
• Test the ability to infer the meaning and use of words in context.

Sub-test Content

The examination will be set on one or more passages totaling about 750 to 1000 words in length. The passages may not have the same subject matter. There will be 17 questions in this section; they will be of MCQ type and carry 17 marks.

Grammar

Grammar aims at encouraging the learning of grammar, which is the basis of the English language, encouraging the application of the rules of grammar, and improving communication, since there is little effective communication without an adequate language base.

Objectives

• Encourage teachers and learners to give grammar the emphasis it deserves.
• Enable learners to understand and be understood.

Sub-test Content

This section consists of 12 MCQs on grammar items and will carry 12 marks. It shall be weighted as 30% of the total subject marks.

Vocabulary

Vocabulary aims at encouraging learners to increase their vocabulary and to pay more attention to words.

Objectives

• Enable learners to understand the meaning of words in context.
• Enable the learner to understand the use of words in context.
• Enable the learner to understand a wide range of idioms.
• Enable the learner to understand a wide range of figures of speech.
• Enable the learner to use a wide range of idioms.
• Enable the learner to use a wide range of figures of speech.

Sub-test Content

This section will consist of 12 MCQs on vocabulary items and will carry 12 marks.

Paper 2

Directed Writing

This section aims at encouraging learners to extract relevant information from a text, encouraging learners to use the information to write clearly and coherently within a specified number of words and paragraphs, and encouraging learners to write with a purpose and for a particular audience.

Objectives

• Encourage learners to respond to stimulus material.
• Encourage learners to select and present facts.
• Encourage learners to write different text types.
• Encourage the slanting and formatting of information appropriately.
• Encourage respect for word and paragraph limits.

Composition

This section aims to encourage learners to write English accurately, with correct spelling, punctuation, grammar and good handwriting, and to express themselves in acceptable English.

Objectives

• Enable learners to acquaint themselves with a variety of written modes.
• Enable learners to show relevance to the topic chosen.
• Encourage creativity in learners.
• Encourage logical and coherent arrangement of ideas in learners.
• Enable learners to express an appropriate point of view and show a sense of audience.

The GCE Examination appears designed to capture both fluency and accuracy. Candidates are expected to recognise the language system and use the four language skills: listening, reading, speaking, and writing. According to the Ordinary Level Subject Reports for GCE General Education subjects (2015), most candidates could not write consistently in one tense, and many lacked the appropriate vocabulary to express their thoughts effectively. Also, most of the candidates thought in their mother tongues or Pidgin English and translated these thoughts verbatim into English, thereby imposing the syntax of these languages on the English Language, leading to poor expression, poor organisation, and poor accuracy, and consequently poor overall performance.

210

 An Analysis of the General Certificate Examination Ordinary Level English Language Paper

subjects (2015), most candidates could not write consistently in one tense and many lacked the appropriate vocabulary to express their thoughts effectively. Also, most of the candidates thought in their mother tongues or pidgin English and translated these thoughts verbatim into English, thereby imposing syntax of these languages on the English Language thereby leading to poor expression, poor organisation and accuracy and consequently poor overall performance. Therefore, it is important to carry out an analysis of the GCE Ordinary Level in English language to find out if the variables determine student performance.

RESEARCH OBJECTIVES

The purpose of this study is to analyse the O/L English Language Paper at the GCE examination in relation to students' performance. Specifically, the study has the following objectives:

1. To find out if the test objectives of the subtests are in line with the test items of the different subtests.
2. To investigate the test content of O/L English Language at the GCE examination.
3. To investigate how the English Language test is developed.
4. To find out if assessment rubrics in English Language affect students' performance.
5. To find out the differences in performance in the subtests.

Research Questions

1. Are the objectives of the subtests of O/L English Language in line with the test items of the different subtests?
2. What is the test content of the O/L English Language paper at the GCE examination?
3. How is the O/L English Language test developed?
4. To what extent do assessment rubrics affect students' performance in O/L English at the GCE examination?
5. To what extent do differences in performance in the subtests affect students' performance?

LITERATURE REVIEW

Conceptual Framework

The variables important to this study that need to be conceptualised include test objectives, test content, test development, test rubrics and student performance. The first terms to consider briefly are assessment and testing, two concepts sometimes used as synonyms, though that may not necessarily be the case. The term assessment in BANA (Britain, Australasia and North America) countries (Holliday, 1994, p. 12) is used in a variety of ways. Lynch (2001, p. 358) offers the following starting point: "[Assessment] has tended to be used as either a synonym for testing, a synonym for evaluation, or has signalled a broader collection of measurement techniques. As I use the term here, assessment will refer to the systematic gathering of information for the purposes of making decisions or judgments about individuals." The above is in agreement with Clarke (1998), Ioannou-Georgiou and Pavlou (2003), and Cheng et al. (2004). Clapham (2000), on the other hand, explains that "testing" is frequently used to apply to the construction and administration of formal or standardised tests, and "assessment" to refer to more informal methods, such as those referred to as alternative assessment. Berwick (1994) similarly identifies two categories: assessment concerned with pupils' educational development and assessment concerned with the outcomes of the educational process. The same is true of Lambert and Lines' (2000, p. 5) dichotomy of traditional and alternative assessment. They divide their book into two main parts to represent what to them are the two "cultures" of assessment, or what they describe as "assessment of education" and "assessment for education". Many descriptors have emerged in conceiving assessment more for learning rather than of learning, such as alternative assessment (Smith, 1995), assessment episode (Mavrommatis, 1997), assessment event/assessment incident (Torrance and Pryor, 1998), enabling assessment (Singh, 1999), empowerment evaluation (Fetterman et al., 1996, cited in Shohamy, 2001), critical language testing, democratic assessment, curriculum-based assessment (CBA) and dynamic assessment (Shohamy, 2001).

On the other hand, language testing, according to Allen (2009), is the practice and study of evaluating an individual's proficiency in using a particular language effectively. As a psychometric activity, the development and use of language tests was traditionally more concerned with the production, development and analysis of tests. Davies (2009) notes how recent critical and ethical approaches to language testing have placed more emphasis on the uses of language tests. In the English Language GCE O Level examination, an attempt is made to combine language performance and language cognition. In the present study, therefore, assessment is taken as the macro system, and testing as the micro system, underlying the purpose of the language judgment. Both words are used interchangeably in this study, with the underlying conception of summative assessment.

The purpose should align with the decision as to whether the language judgment is summative or formative. For example, Rea-Dickins and Gardner (2000) classified assessment into three categories based on its uses: formative, summative and evaluative assessment. Building on this classification, Rea-Dickins (2001) conceives of three assessment identities. Firstly, a bureaucratic dimension to assessment may exist where a school has established an internal monitoring system, even though outcomes from the process are not disseminated beyond the immediate school context; the bureaucratic identity thus deals with internal and external assessments that are not formative. Secondly, the pedagogic identity of assessment is driven by teaching needs. According to Rea-Dickins (2001), it is characterised by the needs of insiders, such as mainstream class teachers or a language support team, for assessment data. This is associated with pedagogic functions within schools, where formative assessment contributes to knowledge about groups or individual learners and influences instructional decisions on this basis. The teacher devises assessment to meet his or her needs for information about individual learners' language achievement, so that it is easier to plan more appropriately for individual learners. Lastly, the learning identity of assessment is related to dimensions of assessment that reflect a primary concern with learning and learner needs (Rea-Dickins, 2001). This assessment is internal to the school context and shares some characteristics with the pedagogic identity. However, it is distinct in that it focuses on learning through assessment and the role of the learner in this process. This kind of assessment is embedded within instruction and can be viewed as contributing to learning, not measuring learning. It is concerned with developing learner awareness, understanding and knowledge. Learners become motivated because assessment is for learning: they become engaged in the interaction, through which they are enabled to develop skills of reflection such as self- and peer-monitoring, which help them reflect meta-cognitively on their own learning. The teacher moves through stages where the assessment activities are for promoting language, that is, the informal end of the continuum. This is formative assessment for learning, which could lead to scaffolding of learning.

Turning to testing, Davies (2009) indicates that the purpose of a language test is to determine proficiency or aptitude in a language and, equally, to discriminate one person's ability from another's. Allen (2009) argues that a language test aims to determine a person's knowledge and/or ability in the language and to discriminate that person's ability from that of others; such ability may be of different kinds: achievement, proficiency or aptitude. Tests, unlike scales, consist of specified tasks through which language abilities are elicited. The term language assessment is used in free variation with language testing, although it is also used somewhat more widely to include, for example, classroom testing for learning and institutional examinations. From the foregoing literature, the GCE O Level English Language examination has as its purpose an end-of-course test of achievement.

The Concept of Objective

Amin (2011) describes objectives as goals/aims/purposes to be attained in terms of what is to be achieved, and says objectives are normally broken down into specifics. Wolf (1984) defines an objective as a statement of a desired change in the behaviour and attitude of a learner. He specifies that there are two categories of objectives: the general and the specific. According to him, a general objective is a unitary statement, and general objectives can refer to internal states in the learner/candidate, as indicated by the use of terms such as "knows", "understands" and "comprehends", provided these terms are clarified in an accompanying set of specific objectives. In formulating objectives, he postulated a number of guiding principles:

• Objectives should be stated in terms of the learner and not in terms of the teacher's purposes or learning experiences.
• A statement of an objective should have a verb denoting the desired learner behaviour or state.
• Objectives should be stated in terms that have uniform meanings.
• Objectives should be unitary statements.
• Objectives should be stated at an appropriate level of generality.
• Objectives should be related to the content and learning experiences provided.
• Objectives should be realistic in terms of the time available for instruction, the characteristics of the learners and the present state of knowledge.

Wolf (1984) further explains that an attempt to define a general objective exhaustively may give the erroneous impression that a general objective is nothing but the sum of discrete proficiencies, rather than a generalised proficiency. Too detailed an analysis of a general objective may undermine its wholeness and, consequently, the integrity of the objective. One example of such over-analysis is in the area of reading comprehension, where attempts have been made to break this competency into a number of different abilities. Some of the specific skills that have been noted are the ability to follow directions, identify main ideas and supporting details in textual materials, select essential information from a paragraph, answer questions related to the grammar, vocabulary and literary aspects of the passage, identify the writer's purpose, and so on. The above analysis may be helpful in suggesting instructional emphases and specific test questions; still, it conveys the impression that reading comprehension is made up of many separate skills, which is not true. Reading comprehension is a fairly generalised ability that manifests itself fairly uniformly through a variety of specific abilities. For the current examination, the content is outlined under 'Aims/Objectives and Content' of the different subtests.

Test Content

From a practical point of view, test design begins with decisions about test content: what will go into the test. These decisions imply a view of the test construct, the way language and language use in test performance are seen, together with the relationship of test performance to real-world contexts of use (McNamara, 2000). In major test projects, articulating and defining the test construct may be the first stage of test development, resulting in an elaborated statement of the theoretical framework of the test. According to McNamara (2000), establishing test content involves careful sampling from the domain of the test, that is, the set of tasks or the kinds of behaviours in the criterion setting, as informed by an understanding of the test construct. The test content for the GCE O Level English Language can be seen under 'Aims/Objectives and Content' of the different subtests.

Test Rubrics

Huba and Freed (2000) state that a rubric for assessment, usually in the form of a matrix or grid, is a tool used to interpret and grade students' work against criteria and standards. An assessment rubric, therefore, is a matrix, grid or cross-tabulation employed with the intention of making expert judgments of students' work both more systematic and more transparent to students. The instructions, or what Bachman and Palmer (1996) call the "rubric", are an essential part of the task work plan, because they specify what the participants need to do to reach the outcome. According to Gagne et al. (2004), teachers make judgments about students every day based on informal and formal appraisals of classroom work, homework, assignments and performance in quizzes and tests. Assessment rubrics for students' achievement assist in evaluation by providing objective guidelines to measure and evaluate learning. Several common features of assessment rubrics can be distinguished, as stated by Stevens and Antonia (2013, cited in Gagne et al., 2004). According to Andrade (2003, cited in Gagne et al., 2004), assessment rubrics, sometimes called "criteria sheets", "grading schemes" or "scoring guides", are excellent tools to use when assessing students' work, for several reasons. An examiner might consider developing and using a rubric if he or she finds him or herself re-writing the same comments on several different students' scripts or tasks; if the examiner's marking load is high and writing out comments takes up much time; or if students repeatedly question the examiner about the assignment requirements, even after the marked assignment has been handed back. A rubric is an assessment tool that clearly indicates marking criteria. It can be used for marking assignments, class participation, or overall grades. Luft (1999) categorises rubrics into two types: holistic and analytic. Holistic rubrics group several different assessment criteria and classify them together under grade headings. Analytic rubrics, on the other hand, separate the different assessment criteria and address them comprehensively: the top axis includes values that can be expressed numerically or by letter grade, and the side axis includes the assessment criteria.

Test Development Process

It is generally said that test development comprises test making and test use, which are further broken down into three steps: basic planning, operationalisation and administration. Of these steps, basic planning is the most important when dealing with test development (McNamara, 2000). The Cameroon GCE Board, as an examination body, takes into consideration recent theoretical ideas of test development, which specify the major elements (test objectives, test purpose, kind of test, test format, test content, timing of the test, administration of the test) used in writing a test, so that terms such as test task, placement test, diagnostic test and achievement test become more familiar to test takers.

The Cameroon GCE Board, conscious of the high-stakes nature of its assessment, has built into its assessment system measures to ascertain quality. Question moderation, typesetting of questions and proofreading of the typeset questions are done by seasoned experts following specific guidelines. Question setting as a whole takes into consideration the prescriptions of the subject syllabus, syllabus coverage, the strength of the questions as per the various ability groups, and Bloom's Taxonomy. For example, at the Ordinary Level, 530 English Language demands that 40% of the questions should be on knowledge, 40% on comprehension and 20% on application and analysis. This prescription is respected to the letter during question moderation and proofreading (Ngole, 2010). Ngole (2010) reiterates the high premium put on syllabus coverage by the questions set. Question printing and packaging are also carefully done under strict confidential cover. It should be noted that for each of these exercises, guidelines are put in place to ensure quality in the face of quantity. A set of rules and regulations also governs the organisation and conduct of the examination to ensure uniformity in application and tight invigilation, making sure that the subject master for each subject under scrutiny is excluded.

Agbor (2006) ascertained that quality assessment of a programme must not fall short of any measure that would "water down" the standards of the programme, from the planning phase through the execution phase (examination) to the marking phase. He further emphasised that whatever is required to enhance standards must be mobilised, whether human, material or financial resources. With regard to human resources, to guarantee quality assessment at the Cameroon GCE Board, not every teacher is eligible to mark the examination. Markers are selected by a panel of Board officials from a pool of applicants based on laid-down criteria. During the marking, a panel of subject officials (a trained group of subject experts called Assessors, Chief Examiners and Assistant Chief Examiners) presides over the marking of each subject. Their work is to follow up what the examiners are doing, and to check and moderate the marking of the scripts. However, before the actual marking commences, the examiners, led by their subject officials, discuss the proposed marking guides for their subjects, updating them where necessary. This is followed by trial marking in conference. At the Advanced Level, one subject official supervises the work of 10 examiners, while at the Ordinary Level, a subject official supervises the work of 15 examiners.
Nevertheless, during marking, any examiner found wanting is immediately turned away and his or her scripts are remarked. These control mechanisms are put in place to safeguard standards and build confidence, which leads to the credibility of the certificates obtained from the GCE Board (Cameroon GCE Board Regulation Guide, 2000–2010).

Another crucial stage to consider in test development, according to McNamara (2000), is test method: the way in which candidates will be required to interact with the test materials, particularly the response format, that is, the way in which candidates will be required to respond to the materials. Test methods also cover aspects of design, together with issues of how candidates' responses will be scored or rated. The fourth stage, according to McNamara (2000), is the trial: trying out the test materials and procedures prior to their use under operational conditions. For him, this stage involves a careful design of data collection to see how well the test is working. In other words, a trial population will have to be found, that is, a group of subjects who resemble the target test population in all relevant respects (age, learning background, general proficiency level, etc.). For instance, for discrete-point test items, a trial population of at least 100, and frequently far more than this, is required. A careful statistical analysis is carried out on responses to items to investigate their quality. In addition, test-taker feedback should be gathered from trial subjects, often by questionnaire, including questions on perceptions of the level of difficulty of particular questions, the clarity of the rubrics, and general attitudes to the materials and tasks; according to McNamara, subjects can easily spot things that are problematic about material which test developers struggle to see. Finally, materials and procedures are revised in the light of the trials, in preparation for the operational use of the test. Data from the actual test performances also needs to be systematically gathered and analysed, to investigate the validity and usefulness of the test under operational conditions.
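To make the trial stage concrete, the sketch below shows the kind of item statistics such an analysis typically inspects: item facility (the proportion of trial candidates answering an item correctly) and a simple discrimination index. It is a minimal illustration in Python; the data, function names and figures are hypothetical assumptions for illustration, not the GCE Board's or McNamara's actual procedure.

# Item-analysis sketch for trial MCQ data (illustrative only).
from statistics import mean

def facility(item_scores):
    """Proportion of trial candidates answering the item correctly (0/1 scores)."""
    return mean(item_scores)

def discrimination(item_scores, total_scores):
    """Mean total score of candidates who got the item right minus the mean
    total score of those who got it wrong (a crude discrimination index)."""
    right = [t for s, t in zip(item_scores, total_scores) if s == 1]
    wrong = [t for s, t in zip(item_scores, total_scores) if s == 0]
    if not right or not wrong:
        return 0.0
    return mean(right) - mean(wrong)

# Hypothetical trial population of five candidates on one item
item = [1, 0, 1, 1, 0]           # 1 = correct, 0 = incorrect
totals = [52, 31, 47, 55, 28]    # candidates' total test scores
print(f"facility = {facility(item):.2f}, discrimination = {discrimination(item, totals):.1f}")

An item with an extreme facility value (close to 0 or 1) or a near-zero discrimination index would typically be flagged for revision before operational use, which is the sort of quality check the trial stage described above is meant to support.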

Performance Test

Snowman and Biehler (2000), as cited in Tambo (2012), distinguished written tests from performance tests in that written tests assess how much students know, while performance tests assess what students can do with what they know. Tambo (2012) goes further to classify performance tests into four main classes as currently used in educational and examination institutions: direct writing, observation of performance on the job, simulation, and portfolios. The direct writing test type, according to Tambo (2012), often asks students to write an essay under conditions that are similar to those of real life. For example, candidates may be asked to write on the effects of constant electricity failure on Cameroon's economy. They are given a duration for the work (days or weeks), and may choose the resources they use, the resource persons, the procedure they adopt, and so on. However, they are aware of the criteria by which the essay will be assessed, such as the length of the essay, the content, etc. (Tambo, 2012). Observation of performance on the job is another type; Tambo (2012) explains that this type of assessment requires students to perform, in a real-life situation, what they have been taught in the classroom. Equally, in communicative language testing, McNamara (2000) states that communicative language tests are intended to measure how test takers are able to use language in real-life situations. As an example of a performance test, student teachers are expected to teach lessons to real students in a specific classroom. The major assessment instruments here include a checklist or a rating scale. A checklist consists of a list of things, criteria or steps that are expected of a student/candidate when the student carries out a performance.


Simulations are used when it is not possible or desirable to observe a student's performance in a real-life situation. The act of simulating something first requires that a model be developed; this model represents the key characteristics or functions of the selected system. Simulation is used in many contexts, such as language testing, technology for performance optimisation, safety engineering, training and education (Tambo, 2012). A portfolio, as described by Tombari and Borich (1999, cited in Tambo, 2012), is a planned collection of pieces or samples that document what a student has achieved in a learning task, as well as the steps the student has taken to get there. Arter and Spandel (as cited in Tambo, 2012) also define it as a purposeful collection of a student's work that tells the story of the student's effort, progress or achievement with respect to defined learning objectives. For example, a portfolio comprising two formal letters, three pieces of fiction writing, two poems, two rough drafts of a composition and a final draft of the composition may be used to assess a student's writing skill at the upper secondary level. Here, the teacher and student are expected to work together in deciding the samples to be selected and the criteria for assessing those samples. The portfolio also functions as a means of self-expression, self-reflection and self-evaluation for the individual student (Herbert, 1998).

METHOD

The design was a survey research design that made use of mixed methods for triangulation. The target population consisted of all English Language teachers of Forms 4 and 5, GCE O/L English Language examiners, and Form 4 and 5 students (aged about 14-16) in both public and denominational secondary grammar schools in three administrative divisions of the North West Region of Cameroon. The sample was made up of 45 teachers and 260 students drawn purposively and proportionately; across the three divisions, the accessible population was drawn from nine public secondary schools and six denominational secondary schools. Both qualitative and quantitative data were collected for this study. Quantitative data were collected using two sets of questionnaires constructed for teachers/examiners and students. The questionnaires were constructed using the Likert scale response format of Strongly Agree (SA) = 4, Agree (A) = 3, Disagree (D) = 2, and Strongly Disagree (SD) = 1 for positively worded questions, and vice versa for negatively worded questions. All the questionnaire items were closed-ended. The questionnaire for teachers/examiners consisted of six sections: Section A was based on items of demographic information, Section B on items on research question 1, Section C on items on research question 2, Section D on research question 3, and Section E on research question 4. The students' questionnaire consisted of two sections: Section A was made up of items on the demographic profile of students, and Section B on research question 5, which was also addressed through the interview guide. After validation, the instruments were administered in the various schools. The technique used to administer the questionnaires and the interview schedule was direct delivery, because the researcher wanted to have contact with the respondents in order to provide clarification when necessary.
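As a minimal illustration of the scoring just described (in Python, with hypothetical responses; the study itself reports no code), the sketch below maps the Likert labels onto their 4 – 1 scores, reverse-codes negatively worded items, and applies the mean-above-2.5 decision criterion used in the findings that follow.

# Likert scoring sketch (illustrative data; not the study's raw responses).
from statistics import mean, stdev

SCORES = {"SA": 4, "A": 3, "D": 2, "SD": 1}   # positively worded items
MIDPOINT = 2.5                                 # decision criterion on the mean

def score(responses, negatively_worded=False):
    """Map Likert labels to 4-1 scores, reverse-coding negative items."""
    values = [SCORES[r] for r in responses]
    if negatively_worded:
        values = [5 - v for v in values]       # swaps 4<->1 and 3<->2
    return values

# Hypothetical responses from five teachers to one questionnaire item
values = score(["A", "SA", "A", "A", "D"])
m, sd = mean(values), stdev(values)
print(f"M = {m:.2f}, SD = {sd:.2f} ->", "accepted" if m > MIDPOINT else "rejected")

On these toy responses the item mean is 3.00 (SD = 0.71), so the item would be judged accepted, mirroring how the mean scores and standard deviations in Tables 8 to 11 are read against the 2.5 midpoint.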


FINDINGS

Table 4. Teachers' highest qualification

Qualification                          N     %
DIPES I (Lower Teacher Diploma)        5     11.11
B.A.                                   16    35.56
DIPES II (Advanced Teacher Diploma)    22    48.89
M.A.                                   2     4.44
TOTAL                                  45    100

Table 4 presents the highest qualifications of the respondents. The respondents' qualifications fell into four categories, beginning with the lowest, DIPES I, with 5 respondents (11.11%); next, B.A. with 16 respondents (35.56%); DIPES II, the second highest qualification, had the highest number of respondents (22, or 48.89%); and finally M.A. (Masters degree) had the fewest respondents (2, or 4.44%), giving a total of 100%.

Table 5. Teachers' teaching experience

Teaching Experience        N     %
1 – 5 yrs                  4     8.89
6 – 10 yrs                 11    24.44
11 – 15 yrs                17    37.78
16 – 20 yrs and above      13    28.89
TOTAL                      45    100

Table 5 presents the teaching experience of the respondents, divided into four periods. Those who had taught for 1 – 5 years were 4 in number (8.89%); those of 6 – 10 years were 11 (24.44%); those of 11 – 15 years were 17 (37.78%); and those of 16 years and above were 13 (28.89%), making a grand total of 100%.


Table 6. Teachers’ marking experience Teachers

Marking Experience

N

%

1 – 5 yrs

7

29.17

6 – 10 yrs

9

37.50

11 – 15 yrs

4

16.67

16 – 20 yrs and above

4

16.67

TOTAL

24

100

Figure 1. Teachers’ marking experience

Table 6 and Figure 1 show the marking experience of the examiners, categorised into four groups: 7 examiners had marked for 1 – 5 years (29.17%), 9 for 6 – 10 years (37.50%), 4 for 11 – 15 years (16.67%) and 4 for 16 years and above (16.67%).

Table 7. Gender of students

Gender     N      %
Female     143    55
Male       117    45
TOTAL      260    100

Table 7 presents data on the gender of the students. There was a total of 260 respondents, of whom 143 were female (55%) and 117 were male (45%).


Table 8. Research question 1 (N = 45 for all items)

• The objectives of O/L English Language are directly drawn from candidates' content and learning experiences. (M = 3.11, SD = .32)
• Objectives of O/L English Language are stated taking into consideration the cognitive, affective, and psychomotor domains. (M = 3.22, SD = .52)
• Objectives are reflected in the test items of the different subtests. (M = 3.40, SD = .50)
• The objectives are realistic in terms of the time available for instruction and the characteristics of the learners before the examination. (M = 2.96, SD = .47)
• TOTAL (M = 3.17, SD = .45)

Table 8 answers research question 1. Item 1 indicates that, out of the 45 respondents, 5 strongly agreed that the objectives of O/L English Language are drawn from the content and learning experiences of the candidates, while 40 respondents agreed with the same statement. Following the mean score and standard deviation, the decision reached indicated that all the respondents accepted that the objectives of O/L English Language are drawn from the content and learning experiences of the learners. The frequency of responses for item 2 is a clear indication that objectives of O/L English Language are stated taking into consideration the three domains for stating objectives, that is, the cognitive, affective and psychomotor domains. Item 3 also indicates that all the respondents were of the opinion that the objectives are reflected in the test items of the different subtests, as shown by the mean score and standard deviation. Item 4 indicates that 39 respondents agreed that the objectives are realistic in terms of the time available for instruction and the characteristics of learners before the examination, while 6 respondents disagreed. Therefore, from the four items that treat research question 1, and the total mean score and standard deviation of the responses, it is evident that the objectives of O/L English Language are in line with the test content and test items of the subtests.

Table 9. Research question 2 (N = 45 for all items)

• The six subtests in O/L English Language are appropriate for the level of examination. (M = 3.31, SD = .56)
• O/L English Language test content materials are structured in such a way that they sufficiently measure the four language skills. (M = 3.38, SD = .49)
• The test content materials for O/L English Language are complex for the level of the examination. (M = 2.11, SD = .65)
• Reading comprehension is well developed such that it covers all the objectives of reading. (M = 3.33, SD = .56)
• Topics on essay writing are familiar to candidates because they are drawn from societal issues and real-life situations. (M = 3.73, SD = .45)
• Teachers should be given equal opportunity in decision making with regard to test content selection. (M = 3.33, SD = .67)
• TOTAL (M = 3.20, SD = .56)


Table 9 treats research question 2, which deals with the test content of O/L English Language. Following the mean scores and standard deviations, item 5 indicates that the six subtests in O/L English Language are appropriate for the level of the examination. Item 6 shows that all the respondents were of the opinion that O/L English Language test content materials are structured in such a way that they sufficiently measure the four language skills. For item 7, 39 respondents (the majority) rejected the claim that the test content materials for O/L English Language are complex for the level of the examination, while 6 (the minority) held that the materials are complex; it is thus clear that the test content materials are not complex for the level. From item 8, just 2 respondents out of 45 rejected the claim that reading comprehension is well developed such that it covers all the objectives of reading; the mean and standard deviation show that the objectives of reading are covered in the reading comprehension test. Again, item 9 shows that candidates are familiar with the essay topics, since these are drawn from societal issues and the real-life experiences of candidates; all the respondents accepted this item. Finally, item 10 states that teachers should be given equal opportunity in decision making with regard to test content selection; most respondents (40, as against 5) supported this view.

Table 10. Research question 3 (N = 45 for all items)

• Only experienced examiners, chief examiners and assessors are involved in the construction and development of test items. (M = 3.09, SD = .76)
• Teachers should take part in decision making on matters regarding the construction and development of test items. (M = 3.60, SD = .50)
• When developing test items, the test developers take into consideration the age and cognitive abilities of the candidates and the resources available. (M = 3.04, SD = .56)
• The test items go through pilot testing so as to test the validity and reliability of the test. (M = 2.38, SD = .89)
• The test items are in line with the set objectives. (M = 3.18, SD = .39)
• TOTAL (M = 3.06, SD = .62)

Table 10 deals with research question 3, which concerns the development of the O/L English Language test. Interpreting the table, the frequency of responses for item 11 shows that 34 respondents were of the opinion that only experienced examiners, assistant chief examiners, chief examiners and assessors are involved in the construction and development of the test items, while 11 respondents were against this view. Following the mean and standard deviation of item 11, the decision shows that it was accepted that only experienced examiners, chief/assistant chief examiners and assessors are involved in the test development process. Analysing the frequency of responses to item 12, all the respondents accepted that teachers should take part in decision making on matters regarding the construction and development of test items. Drawing from the responses to item 13, 39 respondents accepted that when developing the test items, the test makers take into consideration the age and cognitive abilities of candidates as well as the resources available for the examination; 6 respondents were in total disagreement, which could be because these respondents are unaware of the test development process. Again, item 14 indicates controversy in the responses: out of the 45 respondents, 23 were of the opinion that test items go through pilot testing so as to test the validity and reliability of the items, while 22 rejected this. Since the mean for this item is below 2.5 (SD = .89), the decision arrived at indicates that this item was not accepted by the respondents. For item 15, the results indicate that all 45 respondents were in total agreement that the test items are in line with the set objectives. Therefore, looking at the five items that treat research question 3, and the total mean and standard deviation, it is clear that the test development process is not strange to the respondents; if not all, at least the majority of the respondents knew how the English Language test is developed.

Table 11. Research question 4 (N = 45 for all items)

• Assessment rubrics have a negative effect on performance. (M = 2.67, SD = .67)
• Candidates are aware of the assessment rubrics, yet they do not respect the rubrics. (M = 3.18, SD = .61)
• Assessment rubrics for O/L English Language are an effective tool for scoring the outcomes of students' performance in English Language. (M = 2.80, SD = .50)
• The assessment rubrics for English Language are complex, such that candidates fail to follow them, and this has a negative effect on their performance. (M = 2.24, SD = .68)
• TOTAL (M = 2.72, SD = .62)

Table 11 deals directly with research question 4. From the table, item 16 shows that assessment rubrics have a negative effect on candidates' performance: the majority of the respondents agreed with this statement, and since the mean is above 2.5 (SD = .67), the decision is that the item was accepted. From item 17, it is evident that candidates are aware of assessment rubrics, yet they ignore them; this was accepted by 40 respondents, as opposed to 5 respondents who denied it. Following the mean and standard deviation, it is true that students are aware of these rubrics and yet ignore them. Moreover, item 18 set out to find out whether assessment rubrics are an effective tool for scoring students' performance. From the results, the mean score is above 2.5, with a standard deviation of .50; thus, assessment rubrics are an effective tool for scoring students' performance in O/L English Language.


Analysing item 19 indicates that the frequency of responses to this particular item was 34 as against 11: 34 respondents disagreed that the assessment rubrics for O/L English Language are so complex that candidates fail to follow them, with a negative effect on candidates' performance, while 11 accepted that the assessment rubrics are complex. Therefore, analysing the items in Table 11, the mean and standard deviation results show that, to a large extent, assessment rubrics affect students' performance in O/L English Language.

Table 12. Correlations of variables

Variable                  1       2       3       4
1. English Language       -
2. Language Content       .205    -
3. Language Test          .398    .897    -
4. Assessment Rubrics     .318    .897    .017*   -

*. Correlation is significant at the 0.05 level (2-tailed).

Table 12 presents the correlations among the study variables. The correlation matrix can be read either horizontally or vertically to reach a decision about the relationship between any pair of variables.
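For reference, coefficients in a matrix of this kind are conventionally Pearson product-moment correlations; the chapter does not name the statistic explicitly, so the formula below is given as a standard assumption rather than as the authors' stated method. For two variables x and y measured on n respondents:

r_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^{2}}}

The coefficient ranges from −1 to +1, and the asterisked entry in Table 12 is the one whose two-tailed test reached the 0.05 significance level.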

Interview Schedule Responses

For the interview schedule responses, the researcher used descriptive analysis so as to report the exact words of the respondents. A total of seven respondents were randomly selected from the 45 respondents for the interview. The interview responses showed that all the respondents were of the opinion that the test items are in line with the set objectives. In the exact words of the respondents, "the objectives are in line with the test items of the subtests". One of the examiners was emphatic that "the objective [tests] align with the subtest items". This is in conformity with the questionnaire responses that treat research question 1, though some subtests have some loopholes and may need some revision; the writing skill in particular was indicated by most of them as needing adjustment.

Following the interview schedule, item 2 was based on research question 2, which deals with the test content of O/L English Language. According to the interview responses, the O/L English Language test content lives up to expectations when compared with that of other countries like Ghana, Nigeria and Zambia. The respondents equally said the content is appropriate because it has laid-down objectives and subtests that adequately test all four language skills.

Again, item 3 on the interview schedule, based on research question 3, treats the test development process. The interview responses explained how the test items are obtained and developed. It was explained that examiners are asked to submit questions, after which the chief examiners and assessors examine and moderate them. They also said that it is mandatory for all examiners to set questions, and that those whose questions are selected are motivated by the GCE Board. Again, the GCE Board has a question bank where the questions are kept; these questions usually go through trial testing before the candidates are tested. In addition, the teachers' resource centre also at times provides questions to the Board, which is why it runs regional mock examinations whose main mission is to develop resources for students.


Items 4 and 5 specifically answer research question 5, which treats differences in performance in the different subtests. According to the interview responses, paper one, which is made up of listening comprehension, reading comprehension, grammar and vocabulary, is scored electronically, so it may not be very clear whether candidates perform better in paper one or paper two. From the overall results, however, it was reported that candidates may perform better in paper one, not because they are more intelligent, but most often because they guess and gamble and at times get it right. In paper two, where candidates are expected to write, directed writing is the least well scored, according to five respondents. This is because directed writing is guided writing in which candidates are directed to select relevant materials, and this poses problems to many candidates: first, they fail to respect the rubrics; paraphrasing and paragraphing become another issue; and mismanagement of words brings down their performance. It was also reported that most candidates find difficulty in formulating their own wording, and quite a good number of them lift material, which indicates that their sense of initiative is still below expectation. Though some students manage to come up with creative writing that is good, the main problem with directed writing is the instructions. As for composition writing, performance is usually above average, because, it was reported, it offers a wide variety and squares with the average learner's experiences, since most of the topics for essay writing are familiar to the candidates in one way or another. Finally, it was reported that the score for paper two is a little lower than for paper one because, here, candidates' scores are measured in terms of content, expression, accuracy, slanting and orderliness in presentation, and the students do not have a mastery of the different aspects of grammar.

Moreover, item 6 on the interview schedule treats students' perception of O/L English Language. Following the interview responses, it was reported that performance has always been embarrassing at the national level, due to poor mastery of grammar, vocabulary and so on; the students assume that they know English, and there is also the nonchalant attitude of some teachers in delivering the content. One respondent reported that he does not see interest or zeal in his students in learning the English language for the purpose of the GCE examination, and that those who do take an interest at least pass above average; someone has to push the students to do one or two things, and the students limit themselves only to GCE content. To some of the respondents, the students are interested in knowing the language, but the issue is that it is a subject for which stereotyped notes are not given to be reproduced in the examination, so the students do not put in the time because they assume that they already know it. One respondent went further, saying that even when you sample opinions from candidates listing the subjects they expect to pass, they usually begin with English Language. This is very evident in the responses recorded by item 6 on table 14, whereby 219 out of 260 respondents agreed that they are sure to pass O/L English Language at the GCE examination.

DISCUSSION

Research Question 1: Are the objectives of O/L English Language in line with the test content and test items of the different subtests?

Four questionnaire items were developed to answer this research question. Following the results of the mean scores and standard deviations, the decision indicated agreement that the objectives of the subtests are in line with the test content and items of the various subtests. The objectives of O/L English Language generally state that reading and listening comprehension, composition and directed writing will assess knowledge, comprehension, application and analysis, while grammar and vocabulary will assess knowledge, comprehension and application, following Bloom's cognitive domain of objectives. The affective and psychomotor domains are not stated directly in the objectives, but they are indirectly tested via the writing and the rubrics: candidates are asked to write on topics that may portray affective aspects such as attitudes, interests and feelings, and their motor skills are assessed, as mentioned in the rubrics, when they are instructed to write clearly and in an orderly manner. This is in line with Mkpa (1987), who categorises objectives into three main categories. In addition, the objectives of O/L English Language encourage the listening, reading, writing and speaking skills, hold that candidates should be able to speak and be understood, and seek to inculcate excellent use of language in learners; candidates are therefore to be trained to understand, select relevant information, discriminate sounds, and use appropriate structures and conventions to achieve these objectives. The question here is: are the objectives achieved? To an extent, the objectives are not achieved; this is seen in the consistently poor performance in the subject over the years. This finding seems to justify Amin's (2011) description of an objective, in which he states that objectives, as goals, aims and purposes, are to be attained in terms of what is to be achieved.

Research Question 2: What is the test content of the O/L English Language paper at the GCE Examination?

Looking at Table 9, six items were developed to treat research question 2. The findings revealed that the respondents agreed that the six subtests are appropriate for the examination. Also, the content materials are not complex and are well structured, such that they sufficiently measure the four language skills. It was also agreed that the topics for essay writing are familiar and are drawn from societal issues. The respondents were also of the opinion that teachers should be given equal opportunity in decision making with regard to content selection; however, this is not what obtains when it comes to content selection.

The test content of O/L English Language is made up of Paper 1 and Paper 2. Aspects of content include grammar, vocabulary, listening comprehension, reading comprehension, directed writing and composition writing. Paper 1 is made up of listening comprehension, reading comprehension, grammar and vocabulary. It comprises four sections with between 50 and 60 MCQ items. The listening comprehension passage comprises 600 – 700 words and can be set on a talk, a story, a speech, a lecture or an interview; the subject matter usually reflects the Cameroonian reality. Reading comprehension consists of a passage of about 1500 – 2000 words; until 2014 the reading comprehension was a single passage, while in 2015 it was made up of two passages. Paper 2 comprises directed writing and composition writing. For directed writing, candidates are given a passage of about 600 – 700 words on either a talk, a report, an article, an interview or a letter (formal or informal); they write for a particular audience or the public, and this piece of work is not supposed to be more than 150 words.
According to the interview responses, the findings revealed that the English Language test content is appropriate, though some of the respondents were of the view that the spoken test should go operational so as to assess whether students' performance in English Language would improve. Moreover, candidates may be inspired by the fact that they have to speak directly to an audience; this may spur them to learn more so as not to be embarrassed.

Research Question 3: How is the O/L English Language test developed?

Drawing on Table 10, five items were constructed to answer this question.


The findings revealed that the O/L English Language test is developed by seasoned experts following specific guidelines. The literature also revealed that there is a test development cycle which test developers follow: the process begins with groundwork and ends with evaluation and scoring. According to McNamara (2000), there are four main stages involved in the test development process. The findings also indicated, in item 13 as agreed by respondents, that test developers take into consideration the age and cognitive abilities of the candidates; this falls under the very first stage of McNamara's test cycle. Though most of the respondents denied that test items go through pilot testing, the researcher disagreed with them; the researcher is inclined to believe that the responses to item 14 were due to ignorance of, and a nonchalant attitude towards, the process. All in all, the O/L English Language test goes through the four main developmental stages outlined by McNamara (2000).

Research Question 4: To what extent do assessment rubrics affect students' performance in O/L English Language at the GCE Examination?

The findings showed that there is a significant relationship between assessment rubrics and students' performance in English Language (see Table 12). From these findings, it was clear that assessment rubrics, to an extent, have an effect on students' performance; drawing from the mean scores and standard deviations, the decision arrived at revealed that assessment rubrics do have an effect on performance. Uddin (2014), in his study, found a significant impact of rubrics on students' performance, along with a strong positive attitude of both students and teachers towards the use of rubrics. Similarly, Diab and Baloa (2011) researched rubric development for assessing critiques in English and found that the use of rubrics helped students effectively. All in all, it could be said that rubrics make the assessment process more valid and reliable. The literature on rubrics highlighted the necessity of developing and using them. According to Uddin, rubrics help students focus on their work, produce work of high quality, earn better grades and feel less anxious about their task. Rubrics also help students know what the examiner expects of them, provide candidates with the direction needed to check their own work and evaluate their performance, and successfully reduce uncertainty about the candidate's work, significantly reducing the possibility of adding unnecessary material to answers. Most of all, rubrics help students get immediate feedback identifying strengths and weaknesses, which has immense implications for students' performance.

Research Question 5: To what extent do differences in performance in the subtests affect students' performance?

This particular research question was treated in the interview guide. The findings revealed that no detailed comparison could be made between performance in the paper one subtests and the paper two subtests, the reason being that paper one is scored electronically, so examiners do not have direct access to its test scores. From their estimation, however, overall performance in paper one is higher than in paper two. Again, it was reported that paper one invites guessing, and candidates may end up scoring on the MCQ items by chance. Following the end-of-marking subject report, it was reported that directed writing has always been the weakest in terms of performance. This poor performance in directed writing over the years was due to the following reasons:

• Non-respect of instructions (rubrics), such as word and paragraph limits.
• Inability to select all the relevant material.
• Lifting: indiscriminate copying of the stimulus material.
• Poor mastery of language and of slanting techniques, such as transitional words like firstly, in addition, furthermore, nevertheless and however.
• Poor mastery of the role they had to assume in the task.
• Poor mastery of the format of a talk, especially the contents of each part: salutation, opening and closing.
• Using the rubric's mark distribution as subheadings for their answers (Content, Expression, Accuracy and Slanting).
• Uncancelled plans.
• Poor handwriting, etc.

Again, the subject report findings show that for composition a variety of topics is set, made up of eight topics ranging from discursive, argumentative, expository, descriptive and narrative to vague and picture composition. These topics are considered relevant and familiar to candidates, which enables most of them to choose easily. However, it was found that candidates most often score above average for content. That notwithstanding, general performance is usually below average, due to poor mastery of language, especially writing skills. It was also found that poor performance in composition writing was due to the fact that most of the candidates could not write consistently in one tense, and many lacked the appropriate vocabulary to express their thoughts effectively. Most often, candidates think in their mother tongue or Pidgin English and translate these thoughts verbatim into English, thereby imposing the syntax of these languages on the English language, for instance: "you are talking your own", "everything was doing me", "I was shaming", "they did not have my time", "your name is who", "my own has come", "you are looking for my trouble", and "I ate until". Again, instead of writing continuously, candidates tend to list ideas. Also, many words were misused; some examples are: thought for taught, no for know, new for knew, leaving for living, classmaid for classmate, the for there, cut for caught, earn for end, live for leave. Generally, though paper one is electronically scored, statistics have shown that performance in paper one has always been average, while for paper two the overall performance has consistently been below average.

CONCLUSION AND RECOMMENDATIONS

As mentioned above, this study's results imply a correlation between test objectives, test content, test development, assessment rubrics and students' performance. The findings revealed that poor performance in English Language is common, and equally identified the areas of high and low performance. Again, the study revealed that the test content is appropriate, and showed how effective assessment rubrics are for students' performance. Suggestions for ways of improving performance in English Language are thus necessary. To address the issue of poor performance in English Language, it is highly recommended that the consistent use of well-developed rubrics can enhance outcomes for English Language candidates. Once the rubrics are well developed, they will significantly help examiners and test developers rise above the common suspicion of bias in awarding unjustified grades to candidates. On the part of students, a well-developed rubric can help in the attainment of better grades. Candidates should be constantly reminded of the importance of assessment rubrics to their performance.


Again, grammar and vocabulary do not need to be tested as separate skills at the GCE, since they are tested in composition and directed writing: in these areas vocabulary is needed, along with tenses, verbs, clauses, figurative expressions and so on, which are all aspects of grammar. The simple fact that these sections are MCQ again defeats the purpose, because it leads to guessing, as shown by the interview findings; it seems a waste of time to assess a particular aspect over and over. In addition, regarding test content, the proposed spoken English test should only go operational when all the necessary groundwork and resources have been put in place, such as language laboratories, trained raters (to avoid bias and sentiment), equipment and a defined content to be tested. If the evaluation were not to affect the candidates' results in the written parts, there would be laxity and a nonchalant attitude towards it; for this reason, spoken English should be considered part of the main examination. Furthermore, in the English sub-system of education, a pass in O/L English Language should be made a compulsory requirement for writing the GCE A/L, whether in arts or sciences. With these in place, candidates will double their efforts and change their attitude towards the English language. Moreover, students need to improve their handwriting, as good facts presented in a way that cannot be read do not communicate the candidate's thoughts, since the examiner will not find them easy to read. The use of Pidgin English should be avoided in and around the school environment, as it has a negative influence on expression. Parents are advised to encourage their children to speak English at home and to create a conducive home climate.

ACKNOWLEDGMENT

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

REFERENCES

Allen, P. (2009). Definition. A paper presented at a competition at the University of Washington, University of Washington Press.

Amin, M. E. (2011). Concepts of Testing and Principles of Test Construction. Paper presented at the GCE Training Workshop on Multiple Choice Questions, Bamenda.

Ayafor, I. M. (2005). Official Bilingualism in Cameroon: An Empirical Evaluation of the Status of English in Official Domains [PhD Thesis]. Albert-Ludwigs-Universität.

Berwick, G. (1994). Factors which Affect Pupil Achievement: The Development of a Whole School Assessment Programme and Accounting for Personal Constructs of Achievement [Unpublished PhD Thesis]. University of East Anglia.

Bloom, B., Hastings, T., & Madaus, G. (1981). Evaluation to Improve Learning. McGraw-Hill.

Brown, D. J. (2005). Testing in Language Programs. McGraw-Hill.

Brown, H. D. (2003). Language Assessment: Principles and Classroom Practices. White Plains, NY: Pearson Longman.



Chapelle, C. A., & Brindley, G. (2010). Assessment: An Introduction to Applied Linguistics. Hodder & Stoughton Ltd.

Cheng, L., & Curtis, A. (2004). Washback or Backwash: A Review of the Impact of Testing on Teaching and Learning. doi:10.4324/9781410609731-9

Fetterman, D. M., Kaftarian, S., & Wandersman, A. (1996). Empowerment Evaluation. Academic Press.

Fulcher, G., & Davidson, F. (2007). Language Testing and Assessment: An Advanced Resource Book. Routledge, Taylor & Francis Group.

Gagne, R. M., Wager, W. W., Golas, K. C., & Keller, J. M. (2004). Principles of Instructional Design (5th ed.). Thomson Wadsworth.

GCE Board Regulations and Syllabuses. (2011). Cameroon General Certificate of Education Board.

GCE O/L English Language Syllabus. (2016). Cameroon General Certificate of Education Board.

Huba, M. E., & Freed, J. E. (2000). Using Rubrics to Provide Feedback to Students. In Learner-Centred Assessment on College Campuses. Allyn & Bacon.

Ioannou-Georgiou, S., & Pavlou, P. (2003). Assessing Young Learners. Resource Books for Teachers. Oxford University Press.

Lambert, D., & Lines, D. (2000). Understanding Assessment: Purposes, Perceptions, Practice. Key Issues in Teaching and Learning Series. Routledge.

Luft, J. A. (1999). Assessing Science Teachers as They Implement Inquiry Lessons: The Extended Inquiry Observational Rubric. Science Educator, 8(1), 9–18.

Lynch, B. K. (2001). Rethinking Assessment from a Critical Perspective. Language Testing, 18(4), 351–372.

Mavrommatis, Y. (1997). Understanding Assessment in the Classroom: Phases of the Assessment Process – the Assessment Episode. Assessment in Education: Principles, Policy & Practice, 4(3), 381–399.

McNamara, T. (2001). Language Assessment as Social Practice: Challenges for Research. Language Testing, 18(4), 333–349. doi:10.1177/026553220101800402

Mkpa, M. A. (1987). Continuous Assessment Instruments and Techniques used by Secondary School Teachers. Total Publishers.

Nana, G. (2011). Official bilingualism and field narratives: Does school practice echo policy discourse? International Journal of Bilingual Education and Bilingualism.

Ngole, M. J. (2010). Increasing Enrolment at the Cameroon General Certificate of Education Examination and Challenges of Maintaining Quality Assessment. Journal of Educational Assessment in Africa, 13(2), 157–162.

Ordinary Level Subject Reports for the GCE General Education. (2012–2015). GCE Board, Buea.



Rea-Dickins, P. (2001). Mirror, Mirror on the Wall: Identifying Processes of Classroom Assessment. Language Testing, 18(4), 429–462. doi:10.1177/026553220101800407

Rea-Dickins, P., & Gardner, S. (2000). Snares and Silver Bullets: Disentangling the Construct of Formative Assessment. Language Testing, 17(2), 215–243. doi:10.1177/026553220001700206

Shohamy, E. (2001). Democratic Assessment as an Alternative. Language Testing, 18(4), 373–391. doi:10.1177/026553220101800404

Singh, B. (1999). Formative Assessment: Which Way Now? Paper presented at the British Educational Research Association Annual Conference, University of Sussex at Brighton.

Smith, K. (1995). Assessing and Testing Young Learners: Can We? Should We? IATEFL SIG Mini Symposium.

Smith, R. M. (1991). IPARM: Item and Person Analysis with the Rasch Model. Mesa Press.

Stevens, D. D., & Levi, A. J. (2013). Introduction to Rubrics: An Assessment Tool to Save Grading Time, Convey Effective Feedback, and Promote Student Learning (2nd ed.). Stylus Publishing, LLC.

Tambo, L. I. (2012). Principles and Methods of Teaching (2nd ed.). ANUCAM Limbe.

Tante, A. C. (2010a). Young learners' classroom assessment and their performance in English language in English-speaking Cameroon primary school. Journal of Educational Assessment in Africa, 4, 175–189.

Tante, A. C. (2010b). The Purpose of English Language Teacher Assessment in the English-speaking Primary School in Cameroon. English Language Teacher Education and Development Journal, 13, 27–39.

Wallace, C., & Davies, M. (2009). Sharing Assessment in Health and Social Care: A Practical Handbook for Inter-professional Working. Sage Publications. doi:10.4135/9781446215999

Wolf, M. R. (1984). Evaluation in Education: Foundations of Competency Assessment and Program Review (2nd ed.). Praeger Publishers.

ADDITIONAL READING

Bachman, L., & Palmer, A. (2022). Language Assessment in Practice: Developing Language Assessments and Justifying their Use in the Real World. Oxford University Press.

Bachman, L. F., & Palmer, A. S. (1996). Language Testing in Practice: Designing and Developing Useful Language Tests. Oxford University Press.

Green, A. (2013). Exploring Language Assessment and Testing: Language in Action. Routledge. doi:10.4324/9781315889627

Hughes, A. (2003). Testing for Language Teachers (2nd ed.). Cambridge University Press.

Peregoy, S. F. (2017). Reading, Writing and Learning in ESL: A Resource Book for Teaching K-12 English Learners (7th ed.). Pearson.



Tante, A. C. (2013). Teachers' approaches to language classroom assessment in Cameroon primary schools. Exchange: The Warwick Research Journal, 1(1), 1–10. http://exchanges.warwick.ac.uk

Tante, A. C. (2015). Content of English Language Certificate Examination in Primary School in Cameroon: An Analysis. CARL Linguistics Journal, 6, 1–22.

Tsagari, D., & Banerjee, J. (Eds.). (2016). Handbook of Second Language Assessment. De Gruyter Mouton. doi:10.1515/9781614513827

Xerri, D., & Briffa, P. V. (Eds.). (2018). Teacher Involvement in High-Stakes Language Testing. Springer International Publishing AG. doi:10.1007/978-3-319-77177-9

KEY TERMS AND DEFINITIONS

Assessment: The broad concept involving making judgmental decisions on learners. It attempts to provide valid and reliable answers to what, why, who, and how questions. Generally, the literature stresses certain categories such as formative, summative, ongoing, and alternative assessment.

Assessment Rubrics: Instructions that guide the learner on the expectations of a task, activity, or question. Rubrics instruct learners on how their answers should be aligned with the questions.

English Language: One of the official languages of Cameroon. It is used across the curriculum and for communication as English as a second language (ESL).

Examination: A formal end-of-course judgment that determines a learner's achievement, performance, or outcome.

GCE Ordinary Level: The General Certificate Examination Ordinary Level marks the end of secondary schooling in the Cameroon English-speaking sub-system, which has a duration of five years and precedes entry into high school.

Test Development: The stages which the development of a language test goes through: from purpose, description of knowledge and skill, test specification, and test items, through piloting, to evaluating test items.

Test Objectives: These inform the reader about the aims, goals, or purpose of an assessment.

Test Performance: A category of testing in which skill is expected to be displayed more than knowledge. Other types of testing, such as achievement, diagnostic, and proficiency testing, focus on different techniques.



Chapter 12

Evaluation of ESL Teaching Materials in Accordance With CLT Principles through Content Analysis Approach

Muhammad Ahmad
Government College University, Pakistan

Aleem Shakir
Government College University, Pakistan

Ali Raza Siddique
Government College University, Pakistan

ABSTRACT

Owing to the rising need for English for communication at a global level, experts have stressed the significance of teaching English supported by materials based on communicative language teaching (CLT) principles to facilitate the development of communicative competence. This study, therefore, aims to evaluate ESL teaching materials to check their suitability for developing learners' communicative competence. For this purpose, the study employs a content analysis approach to analyse the text of 'English' designed for class two, in the light of a checklist devised on CLT principles. The results reveal that the content of the said textbook does not conform to the CLT principles. Therefore, it is not suitable to facilitate the development of communicative competence in the learners. The study suggests either improving/revising the textbook or replacing it with another, suitable one.

DOI: 10.4018/978-1-6684-5660-6.ch012



INTRODUCTION

English, together with Urdu, is treated as an official language in Pakistan, where it enjoys the status of the language of power and is recognized as a language with more cultural capital than any other language spoken in Pakistan (Rahman, 2007). Business contracts, government documents, shop and street signs, and other activities are maintained in English. Not only this, English is also the language of the court in Pakistan (Hasan, 2009). In addition, English is taught at all levels of education in Pakistan (Kausar, Mushtaq & Badshah, 2016; Panhwar, Baloch & Khan, 2017; Warsi, 2004). Many schools use local languages as well; however, the main focus is on English as a second language. According to the Punjab Education and English Language Initiative (PEELI, 2013), all public schools in Punjab, Pakistan will use English as a medium of instruction. According to Mansoor (2005), the demand for English in higher education is very high; therefore, English is used as a medium of instruction in higher education institutes in all subjects excluding language subjects (Mashori, 2010; Rahman, 2004). However, the focus of this chapter is the primary school education level. With the worldwide spread of English, knowledge of the English language has gained extraordinary significance, causing an increase in the teaching of English as a foreign or second language in many countries of the world (Ander, 2015; Crystal, 2012; Graddol, 2006), which has further resulted in the availability and use of different teaching materials such as computer programs, electronic resources, movies, multimedia, paper-based resources, pictures, songs, and textbooks. The aim of all of these resources has been to create interactivity between teaching and learning. However, the role of the textbook has always been more significant from both the students' and the teachers' perspectives; i.e., from the teachers' perspective it has served as a reference, whereas from the students' perspective the textbook has set the context for instruction (Ur, 2007). The same view is shared by Richards (2001), who says that textbooks help teachers supplement their instruction and help students maintain their contact with the language. In fact, textbooks are pre-constructed and fully specified contents which serve accountability interests by creating a certain amount of uniformity in what happens to students as well as teachers in different classrooms (Prabhu, 1987) and which, in the view of Chambliss and Calfee (1998), offer "a rich array of new and potentially interesting facts, and open the door to a world of fantastic experience" (p. 7). In EFL/ESL contexts, the textbook serves as a universal component (Davison, 1976; Hutchinson & Torres, 1994). It not only "represents the visible heart of any ELT program", but also offers many advantages to learners as well as teachers (Sheldon, 1988, p. 237). In fact, textbooks serve different roles in an ELT curriculum; i.e., they provide an effective source for material presentation, self-directed learning, activities as well as ideas, a reference for learners, and support for less experienced teachers (Cunningsworth, 1995). Moreover, textbooks help teachers save time to spend on worthwhile activities and decrease occupational overburden by yielding a respectable return on investment (for textbooks are less expensive and involve less lesson preparation time as compared to teacher-made materials) (O'Neill, 1982; Sheldon, 1988).
Additionally, the textbook saves students from the danger of inexperienced teachers (Kitao & Kitao, 1997; O'Neill, 1982; Williams, 1983). Hutchinson and Torres (1994) are of the view that textbooks foster innovation by supporting teachers against threatening and disturbing change processes, by introducing new methodologies as well as gradual changes, and by fostering scaffolding which helps teachers create new methods on their own. In addition, the majority of learners learn the language with the help of textbooks which, according to Tomlinson (2010), serve as a guide for them to prepare for exams.


Textbooks have also been criticized in a number of studies. Researchers (Clarke & Clarke, 1990; Florent & Walter, 1989; Porreca, 1984; Renner, 1997; Sunderland, 1992) have criticized EFL/ESL textbooks for depicting cultural as well as social bias. Some of the studies (Brusokaite, 2013; Clarke & Clarke, 1990; Durrani, 2008; Florent & Walter, 1989; Leo & Cartagena, 1999; Macleod & Norrby, 2002; Renner, 1997; Siren, 2018; Ullah & Skelton, 2013) have criticized the textbooks for promoting gender bias, sexism, and stereotyping. The projection of cultural and social biases (e.g. gender bias, sexism, and stereotyping) through EFL/ESL books, in the view of Brusokaite (2013), Renner (1997), and Sunderland (1992), may result in unequal sharing of power relations and female marginalization in target language cultures. Alptekin (1993) adds that target language culture works as a vehicle for language teaching through the textbooks; therefore, it is essential to embed the language in its cultural base, which exposes the learners to a completely unknown culture and causes alienation, stereotyping, and resistance to the learning process. Similarly, Phillipson (1992) criticizes the language textbooks on the ground that they promote Western (particularly British) enterprises with economic and ideological agendas. Gray (cited in Litz, 2005), however, seems to defend the depiction of cultural as well as social elements in language textbooks. He is of the opinion that English language textbooks are the ambassadors of cultural artifacts. Therefore, the students should see the English language textbooks as more than a mere linguistic component and engage themselves more critically with their textbooks. In this way, Gray adds, the learners will be able to improve their language skills for two-way information flow and cultural debates and discussions. The language textbooks have also been observed to be inappropriate in the view of many researchers (e.g. Block, 1991; Thornbury & Meddings, 1999). Block (1991) observes that textbooks use conventional activities and inappropriate as well as outmoded language. In the view of Thornbury and Meddings (1999), textbooks paralyse learners' ability to convey meaning, since they encourage the reproduction of suggested language by the learners instead of enabling them to use their own imagination to use words "as vehicles for the communication of their own meanings" (p. 12). Tickoo (2003) goes a step further, saying: "textbook often acts as a constraint; it goes against my attempt to respond fully to the pupils' needs. Its use also goes against learner creativity… … textbooks are invaluable supports" (p. 256). However, many researchers (e.g. Grant, 1987; O'Neill, 1982; Ur, 2007) seem to guard against the charges leveled by the textbook critics, claiming the textbook to be a valuable support to learners as well as teachers. Litz (2005) is of a different view in this regard. He adds that, at this particular time, there is no consensus on this issue and "this would seem to warrant some degree of caution" (p. 7) in the use of English language textbooks in certain learning as well as teaching contexts. There has been a considerable influence of CLT on the teaching of the English language for the last two decades. Therefore, English language materials have been devised based on CLT principles (Ander, 2015), which have been successful in nurturing communication and developing skills.
The CLT approach emerged in the 1970s as a reaction to grammar-based language teaching approaches, methodologies, and syllabi (Aftab, 2012; Hymes, 1971; Savignon, 1972). While recognizing grammatical competence as an essential component of communication (Larsen-Freeman, 2001), it developed a new understanding of grammar learning, emphasising mainly communicative skills and discovery-based learning (Thornbury, 2006) and providing the learners with a meaningful input of the target language vocabulary and forms (Hinkel & Fotos, 2001). In fact, the CLT approach utilizes different approaches to teach a language with the help of fluency activities (Richards, 2001), and in this way, "grammar teaching in context means the emphasis is on communicative skills" (Ander, 2015, p. 44).


The CLT concept is based on the notions of competence (knowledge of language, or language in mind) and performance (actual use of language by producing meaningful sounds or words) (Chomsky, 1965). The terms competence and performance by Chomsky (1965) were later merged and explained as communicative competence by Campbell and Wales (cited in Ander, 2015) and Hymes (1964, 1966, 1972), which referred to the users' grammatical knowledge of the morphology, phonology, and syntax of a language, supplemented by the social knowledge of when and how to use the language appropriately. Hymes (1966) took Chomsky's notion of competence to be an abstract entity. For this reason, Hymes relied on the ethnographic exploration of communicative competence, which involved "communicative form and function in integral relation to each other" (p. 12). Later, Hymes (1971) added that the linguistic theory of communicative competence should be seen as part of a more general theory involving culture as well as communication. According to Hymes (1972), communicative competence refers to the knowledge of a language and the learners' ability to use it in terms of its appropriateness, context, feasibility, formality, and the performance of a language act. Therefore, communicative competence, which is also known as the ethnography of communication (Cameron, 2001; Hymes, 1964), is considered these days one of the most significant theories underlying the communicative approach to the teaching of a foreign or second language (Leung, 2005). CLT refers to an approach to second or foreign language teaching that aims to develop communicative competence (Richards, Platt & Platt, 1992). In the view of Nunan (1991), CLT encourages the learners to learn the target language by focusing mainly on language learning experiences and incorporating personal experiences into the language learning environment. In this process, the teachers teach topics outside the sphere of traditional grammar, and the learners talk about their personal experiences with their classmates, resulting in the development of language skills for all types of situations. According to Brown, in a CLT classroom, the teacher does not lead the class; rather, he simply facilitates as well as monitors the activities. CLT lessons are theme and topic oriented, and the main aim of CLT has been to develop communicative competence (Hinkel & Fotos, 2001) which, in simple words, means "competence to communicate" (Bagarić & Djigunović, 2007) and enables the learners to communicate in the target language (Savignon, 1997). In this regard, three models have been presented. The first model has been presented by Canale and Swain (1980) and further modified by Canale (1983). In the view of Canale and Swain, communicative competence refers to the skill required for communication purposes as well as to the synthesis of the underlying system of knowledge. By skill, Canale and Swain (1980) mean an individual's capacity to use his knowledge for communicative purposes. They explain knowledge, both conscious and unconscious, by dividing it into three types: (1) grammatical knowledge; (2) knowledge of how to use a language in different social contexts to perform communicative functions; and (3) knowledge of how to combine communicative functions with utterances relative to discourse principles. Canale (1983) adds that the skill further requires a differentiation between the underlying ability and its manifestation in communication.
The second model, presented by Celce-Murcia, Dörnyei and Thurrell (1995), interprets communicative competence in terms of sociocultural content involving actional competence, discourse competence, linguistic competence, sociocultural competence, and strategic competence. Similarly, the third model has been introduced by Bachman and Palmer (1996) and stresses the effective use of language utilising: (1) organisational knowledge (i.e. grammatical and textual); and (2) pragmatic knowledge (i.e. functional and sociolinguistic knowledge).



This study aims to evaluate an English language textbook taught to the students of grade-2 at some private and all public schools in Punjab, Pakistan, to see whether the said book is based on communicative language teaching principles or not, and thereby to determine the suitability of the textbook for the development of communicative competence in the learners. The principles of communicative language teaching have been summarized from Brown (2001) and Richards and Rogers (2007), i.e.: (1) the communicative language teaching classroom focuses on all the components of communicative competence, e.g. discursive, functional, grammatical, strategic, and sociolinguistic; therefore, the goals of a CLT classroom should interlink the organizational features of a language with its pragmatic aspects; (2) language techniques should be devised that involve the learners in authentic, functional, and pragmatic use of language; (3) fluency should be given more importance than accuracy in order to involve the learners in a meaningful use of language; (4) tasks should be introduced that develop skills engaging the learners receptively as well as productively in unrehearsed contexts outside of the classroom; (5) the learners should be provided with opportunities that facilitate their own learning process by developing an understanding of their learning styles and developing suitable strategies for autonomous learning; and (6) the teacher should behave like a facilitator and encourage the learners to construct meaning through interaction. Textbook evaluation refers to a straightforward analytical process of matching, i.e., matching the learners' needs to the available resources (Hutchinson, 1987). Tomlinson (2010) considers textbook evaluation an applied linguistic activity which helps administrators, material developers, supervisors, and teachers to "make judgments about the effect of the materials on the people using them" (p. 15). Textbook evaluation is essential to provide quality education (Allwright, 1981; Cunningsworth, 1995; Panezai & Channa, 2017), since it helps identify the strengths and shortcomings of the texts, tasks, and exercises included in the textbooks (Sheldon, 1988). In the view of Cunningsworth (1995), textbook evaluation ensures "that careful selection is made, and that the materials selected closely reflect [the needs of the learners and] the aims, methods, and values of the teaching program" (p. 7). In addition, textbook evaluation helps the teachers acquire accurate, contextual, systematic, and useful insight into the materials used in the textbooks (Cunningsworth, 1995; Ellis, 1997); improves the usefulness of the textbooks (Graves, 2001); helps develop and administer language-learning programs (McGrath, 2002); and facilitates the selection process of the textbook (Tomlinson, 2010). All of these studies establish the rationale for the evaluation of the textbook in this study. In this regard, many studies around the world have evaluated textbooks from communicative language teaching perspectives. One such study was conducted by Tok (2010), who evaluated 'Spot On', an English language textbook taught in Turkish schools, through a survey technique to highlight the shortcomings as well as the strengths of the said textbook.
The majority of the respondents of the study gave negative views about the activities used in the textbook, declaring them to be meaningless practices, which led the researcher to conclude that the activities in the said book did not improve communicative competence. In a similar context, Ander (2015) analyzed 'New Bridge to Success' to check its suitability in the light of CLT principles. For this purpose, the study utilized a content analysis technique to identify the sub-skills focused on in the textbook and to evaluate the language tasks included in its different sections. The results revealed that the textbook focused more on productive than on receptive or grammar skills. So far as the tasks were concerned, the results showed that the textbook involved controlled, free, and guided communicative tasks. On the basis of these results, the study concluded that the textbook did not represent a balanced distribution of the different skills.


Aftab (2012) conducted multidimensional research to evaluate the Pakistani English language curriculum as well as textbooks. The results revealed that the overall educational system was filled with shortcomings, which were declared to be indirectly responsible for poor English language teaching and learning in Pakistan. Moreover, policies regarding the curriculum as well as textbooks were also observed to be improper. In addition, the activities included in the English language textbooks were found to be artificial as well as controlled. The study suggested improving the training programs for teachers as well as textbook writers, enhancing the process of curriculum development, and prescribing such textbooks as may facilitate English language acquisition. Shah, Hassan and Iqbal (2015) evaluated English language textbooks for Dar-e-Arqam school students of grades 6 and 7. The results revealed that the said books focused more on grammar, which was less required, and focused less on speaking skills, which were mainly required by the learners. The study concluded that the textbooks did not meet the learners' requirements and should therefore either be improved or replaced by appropriate books. Kausar, Mushtaq and Badshah (2016) evaluated English Book-1 of short stories, prescribed by the Punjab Curriculum & Textbook Board (PCTB), Lahore, Pakistan for the students of grade-11, from the learners' as well as the teachers' perspectives. They developed a questionnaire from the checklist devised by Litz (2005) to collect data for the study from 100 students and 10 teachers. The results, based on the respondents' perceptions (about the textbook exercises, outline as well as planning, language type, language skills, theme as well as topic, and the overall view of the textbook), revealed that the said textbook did not meet the English language learners' needs. The textbook's content and exercises, outline, planning, and organization were particularly found to be inappropriate. On the basis of its results, the study concluded that the textbook should be revised to make it suitable to the learners' as well as the teachers' needs. Almost similar results were reported by Naseem, Shah and Tabassum (2015) in their study on a grade-9 English language textbook; they also proposed to revise the English language textbooks. Arshad and Mahmood (2019) evaluated an 11th grade English language textbook taught in Punjab, Pakistan from a CLIL perspective, and found its content ignoring the development of listening and speaking skills. In fact, the elements of textbook and examination (in Pakistan) do not support CLT practices. Teachers are not trained to practice CLT methodology. Moreover, the textbook is patterned on GTM principles, which emphasize the reading of given lessons and the learning of grammar and decontextualized vocabulary, and ignore listening, speaking, and interactive reading and writing skills. To solve these problems, the textbooks should be based on CLT principles. The contents should facilitate a communicative language teaching-learning approach, which will directly affect the classroom proceedings (Khan, 2007). Akram and Mahmood (2011), in this concern, recommend that the textbook should be practical as well as functional. Zafar and Mehmood (2016) find that there is little representation of international culture in Pakistani textbooks. Therefore, they propose the inclusion of international culture to make the learners aware of international as well as national cultures.
Some studies have also evaluated communicative language teaching in Pakistan. For example, according to Yaqoob, Ahmed and Aftab (2015), CLT faces many constraints in Pakistan, such as mother tongue influence, large class size, shortage of time, a non-supportive domestic environment, lack of motivation, and oral exams. They have suggested a stronger teachers' role and the provision of facilities by the government to foster a CLT environment in Pakistan. On the whole, the environment of English language teaching in Pakistan is not favourable (Panhwar, Baloch & Khan, 2017). Pakistani students, particularly those from rural areas, are deficient in all four language skills. They are unable to communicate in English. The reason is that Urdu is the mother (national) language of some of the people in Pakistan. They learn it as a first language. In this way, they have to learn English as a


second language. But there are many people in Pakistan whose first (mother) language is Punjabi, Sindhi, or Pushtu, and they learn English as a foreign language (while learning Urdu as a second language) (Warsi, 2004). Durrani (2016) adds that the students are more inclined to learn through GTM; therefore, they show a less favourable attitude towards CLT. Panhwar, Baloch and Khan (2017) enumerated different contextual problems (e.g. large class size and overuse of traditional teaching methods) as constraints on the development of a CLT environment in Pakistan. Keeping in view the above theories as well as facts regarding EFL/ESL textbooks and the status of CLT particularly in Pakistan, this study aims to evaluate a textbook, titled 'English-2' (which is taught to the students of grade-2 in some private and all public schools in Punjab, Pakistan), through a content analysis technique under five categories. For this purpose, it mainly seeks to answer the following question: 1. Is the content of 'English-2' suitable to facilitate the development of communicative competence in the learners?

RESEARCH METHODOLOGY

Research Design

This is a qualitative study which utilizes a content analysis approach to evaluate an English language textbook taught to the students of grade-2 in private and public schools in Punjab, Pakistan, to see whether the content of the said textbook meets the requirements of CLT or not.

Content Analysis

Content analysis, according to Berelson (1952), is "a research technique for the objective, systematic and quantitative description of the manifest content of communication" (p. 13), which is popularly used to analyse communication artifacts as well as documents in the form of audio, video, pictures, or texts (Bell, Bryman & Harley, 2018; Berelson, 1952; Krippendorff, 2018). It involves a systematic observation of communication artifacts as well as the reading of texts, assigning them codes or labels to highlight the meaningful aspects of the texts (Hodder, 2013). Content analysis facilitates the classification of longer texts into a few categories (Ahuvia, 2001; Weber, 1990), which further helps in counting the frequencies within each category (Ahuvia, 2001). This research studies the content under five categories (adopted from Kausar, Mushtaq & Badshah, 2016; Litz, 2005), i.e. (i) activities and tasks, (ii) skills, (iii) language type, (iv) content and subject, and (v) overall perception, and limits itself to the qualitative content analysis technique for the evaluation of the said textbook to identify communicative language teaching features.
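To make the code-and-count logic of this procedure concrete, the sketch below shows one minimal way it could be implemented in Python. This is an illustration only, not the instrument used in this study: the category labels mirror the five categories named above, but the sample textbook segments and their code assignments are invented for demonstration.

```python
from collections import Counter

# The five analysis categories used in this study.
CATEGORIES = [
    "activities and tasks",
    "skills",
    "language type",
    "content and subject",
    "overall perception",
]

# Each textbook segment is manually coded with one or more categories.
# The segments and code assignments below are hypothetical examples.
coded_segments = [
    ("Read the rhyme.", ["skills"]),
    ("Make words with 'are'. Say the words.", ["activities and tasks", "skills"]),
    ("Help them tell Baba the colour of fruits and vegetables.",
     ["activities and tasks", "language type"]),
]

def category_frequencies(segments):
    """Tally how often each category code was assigned across all segments."""
    counts = Counter()
    for _text, codes in segments:
        counts.update(codes)
    return counts

freqs = category_frequencies(coded_segments)
for category in CATEGORIES:
    print(f"{category}: {freqs[category]}")
```

In a qualitative design such as this one, the counts would only support the narrative reading of the text rather than replace it; the manual assignment of codes remains the analytical step.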

Material and Method

The content of this study comprises a textbook, i.e. 'English 2'. This book has been prepared and published by the Punjab Curriculum and Textbook Board (PCTB) under the supervision of the government of Punjab, Pakistan, to be taught to the students of grade-2 in all public and some non-elite private sector schools.



This book has been written and reviewed by experts in the field. Its content has been divided into 25 units. Realizing the importance of textbook evaluation, different researchers (see e.g. Allwright, 1981; Cunningsworth, 1995; Ellis, 1997; Graves, 2001; Hutchinson, 1987; Litz, 2005; McGrath, 2002; Panezai & Channa, 2017; Sheldon, 1988; Tomlinson, 2010) have stressed the need to evaluate textbooks (see the introduction section for details). The literature on textbook evaluation is very vast. Different researchers have introduced different procedures for this purpose (Hashemi & Borhani, 2015; Litz, 2005; Mohammadi & Abdi, 2014), and most of them (e.g. Aftab, 2012; Hutchinson, 1987; Litz, 2005; Sheldon, 1988; Ur, 2007; Williams, 1983) have presented checklists. Therefore, this study utilizes a self-devised checklist (see Table 1), based on the communicative language teaching principles given by Brown (2001) and Richards and Rogers (2007), to find, identify, and analyze the content of the said textbook.

Table 1. Checklist (each item is answered Yes/No)

1. Activities and Tasks
1.1 Does the textbook contain activities for information sharing, role play and problem solving?
1.2 Do the activities facilitate individual, pair and group work?
1.3 Do the activities introduce grammar points as well as vocabulary items in realistic contexts?
1.4 Do the communicative tasks facilitate grammar learning?
1.5 Do the communicative tasks facilitate independent and original responses?
1.6 Do the activities involve learners' cultural practices?

2. Skills
2.1 Does the textbook facilitate the equal development of language skills for real communication purposes?
2.2 Does the textbook provide practice in natural pronunciation (e.g. stress, intonation) required for communication?
2.3 Does the practice of individual skills facilitate the integration of other skills?

3. Language Type
3.1 Is the language used in the textbook suitable for real, life-like use?
3.2 Does the textbook provide sufficient vocabulary items to be used in different situations for communication purposes?
3.3 Is the vocabulary used in the book related to the students' culture and background?
3.4 Does the textbook facilitate functional use of language?

4. Content and Subject
4.1 Does the textbook contain a variety of subjects and contents?
4.2 Do the contents presented in the textbook relate to the students' life and interests?

5. Overall Perception
5.1 Is the textbook suitable to provide opportunities for communication and interaction?
5.3 Does the textbook facilitate the use of language in as well as outside of the classroom?
5.4 Is the textbook suitable from a communicative language teaching perspective?

Source: Author's own compilation derived from Brown (2001), Kausar, Mushtaq and Badshah (2016), Litz (2005), Richards and Rogers (2007)
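As a rough illustration of how judgments against such a checklist could be recorded and tallied during evaluation, the following sketch models each checklist item as a question carrying a Yes/No answer. The data structure and the sample answers are assumptions made for demonstration only, not the authors' actual coding records.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChecklistItem:
    code: str                      # item number from Table 1, e.g. "1.1"
    category: str                  # checklist category, e.g. "Activities and Tasks"
    question: str
    answer: Optional[bool] = None  # True = Yes, False = No, None = not yet judged

# A few items from Table 1; the answers shown here are illustrative only.
items = [
    ChecklistItem("1.1", "Activities and Tasks",
                  "Does the textbook contain activities for information sharing, "
                  "role play and problem solving?", False),
    ChecklistItem("1.2", "Activities and Tasks",
                  "Do the activities facilitate individual, pair and group work?", False),
    ChecklistItem("3.2", "Language Type",
                  "Does the textbook provide sufficient vocabulary items?", True),
]

def summarize(checklist):
    """Count Yes and No judgments per checklist category."""
    summary = {}
    for item in checklist:
        yes, no = summary.get(item.category, (0, 0))
        if item.answer is True:
            summary[item.category] = (yes + 1, no)
        elif item.answer is False:
            summary[item.category] = (yes, no + 1)
    return summary

for category, (yes, no) in summarize(items).items():
    print(f"{category}: {yes} Yes, {no} No")
```

A per-category tally of this kind makes it easy to see at a glance where a textbook falls short of the CLT criteria, which is essentially what the Results section below reports in prose.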



Checklist Preparation

Some of the major theorists (see e.g. Brown, 1995; Cunningsworth, 1995; Litz, 2005; Sheldon, 1988; Williams, 1983) emphasize that a checklist should be devised on established criteria involving: (1) physical features (i.e. layout, logistical and organizational characteristics); (2) methodological features (involving aims and approaches to determine the organization of the material and its suitability to the learners' needs); (3) culture as well as gender representation components; and (4) functional, grammatical, language skills, and linguistic features. Since the aim of this study is to analyse the textbook from communicative language teaching perspectives, it mainly focuses on the criteria surrounding the last set of features, i.e. functional, grammatical, and language skills. In this regard, categories have been adopted from Kausar, Mushtaq and Badshah (2016) and Litz (2005).

Data Collection and Analysis Procedure

The data for this study have been collected with the help of a checklist and analysed manually by extracting examples of the different categories from the textbook through careful reading.

Level of Evaluation

There are three levels of content evaluation, i.e. pre-use, in-use, and post-use evaluation (Cunningsworth, 1995; Ellis, 1997). Pre-use evaluation predicts the potential performance of the contents for future use; therefore, it is also known as predictive evaluation (Ellis, 1997; Litz, 2005; McGrath, 2002; Tomlinson, 2010). In-use evaluation examines the materials in current use and is also called retrospective evaluation. Similarly, post-use evaluation examines the effects of materials on the users; it is reflective in nature (Litz, 2005; McGrath, 2002; Tomlinson, 2010). This study examines the content used in English-2 at the 'in-use' level to check its effectiveness from communicative language teaching perspectives.

Limitation

The content of this study is limited to one textbook only. Therefore, the results of this study are not generalizable.

RESULTS

Activities and Tasks

Activities are very significant, having cognitive value to promote learning through social interaction (Long, 1990; Vygotsky, 1978). Activities, which make the learning process pleasurable (Gak, 2011), are very beneficial from a language learning perspective, for they help: (i) increase language use; (ii) enhance the quality of language use; (iii) provide an opportunity to individualise instruction; (iv) provide a less threatening environment for language use; and (v) motivate the learners for language learning (Long, 1990). Therefore, such activities should be selected as may facilitate innovation as well as creativity among the learners to enhance their self-worth and competence, focusing mainly on their needs (Gak, 2011).



The textbook does not contain role play and problem-solving activities. However, some information sharing activities have been observed in the text; for example:

1. Maria and Hassan come home. Help them tell Baba the colour of fruits and vegetables.

Most of the activities facilitate individual work, such as:

1. Make words with are.
2. Read the rhyme.

The textbook does not use activities that can engage the learners in pair or group work. However, some of the activities have been found to introduce vocabulary items in a realistic way, e.g.:

1. Maria and Hassan come home. Help them tell Baba the colour of fruits and vegetables.

So far as the communicative tasks are concerned, the textbook does not contain any tasks that may facilitate independent as well as original responses, except the example given above. Moreover, the activities do not involve learners' cultural practices. All of the activities used in the textbook are artificial, controlled, conventional, and guided, and they need the teacher's help to be performed. All of the activities, most of which are practice activities, show that the teacher's role is that of a director/guide, whereas CLT principles define the teacher's role as that of a facilitator. Moreover, the activities are meaningless and unsuitable for providing realistic contexts for the elicitation of real responses from the learners. Therefore, it can be said that the activities do not meet the required criteria to enable the learners to interact with the teacher or fellows to discuss their answers after working independently. In the view of Nunan (1991), this type of deficiency is commonly found in textbooks and can be overcome by the teacher through task modification.

Skills

There are four main language skills, i.e. listening, speaking, reading, and writing. During the 1970s and 1980s, these skills were taught separately in a rigid order, i.e. listening was taught before speaking. However, it was later recognized that people use more than one skill at a time, which resulted in the integration of different skills in the teaching-learning process (Holden & Rogers, 1997). The notion of integrating different skills was highly emphasized by theoreticians as well as researchers (e.g. McDonough & Shaw, 2012; Swan, 1985), which resulted in integrated and multi-skill materials for language teaching. Analysis of the content reveals that the principle of equal development of all four language skills, i.e. listening, speaking, reading, and writing, has not been followed in the content of the textbook. The main focus is on reading, writing, and speaking activities, whereas the listening skill is completely ignored in the content. So far as the integration of different skills is concerned, the content analysis shows that the textbook integrates different skills, but in different proportions, with the main focus on the integration of reading and writing; reading, speaking, and writing; and speaking and writing. However, the integration of the listening skill is ignored. Some examples of language skills from the textbook are given below:

1. Read the rhyme (reading skill);


2. [read] Say the word. Write the middle letter in the blank (reading, speaking, and writing);
3. Make [write] words with 'are'. Say the words (writing and speaking);
4. Look [read], say, write (reading, speaking, writing); and
5. Look [read] and say (reading and speaking).

CLT principles prefer fluency to accuracy. But the content of the textbook prefers accuracy to fluency, i.e. it mainly emphasises rule-based correction activities and tasks. For example:

1. Put the words in correct order. Remember that a sentence starts with a capital letter;
2. Write the correct 'oo' word in each blank;
3. Add a correct ending to write the plurals of the words below; and
4. Put the words in correct order.

The content of this textbook involves multiple skills (mainly focusing on reading, writing, and speaking); however, it ignores the listening skill completely. Therefore, the content of the textbook is not suitable to nurture the language skills equally for the purpose of communication.

Language Type

The language used in the textbook is not functional. There are no conversations or dialogues in the content. The main focus is on reading and writing skills, which have been integrated with speaking. The textbook, however, introduces sufficient vocabulary items to be used in different communicative situations, and these are related to the learners' background and culture. Some examples from the textbook related to the learners' culture and background include: eid, masjid, pray, eidi, truck, rickshaw, bus, tractor, farm, tube well, village, doll, uniform, etc. Moreover, the vocabulary covers almost all walks of life, such as travel, sports, family, education, animals, fruits, colours, seasons, festivals, media, food, etc. In addition, the progression of introducing the vocabulary items has also been positively observed: the textbook introduces the alphabet first, then words, phrases, and shorter sentences. However, the textbook fails to provide sufficient opportunities to use the vocabulary items in local as well as personal contexts through role play and problem-solving activities. In this respect, its material does not conform to the CLT principles.

Content and Subject

The textbook covers a wide variety of contents and subjects, such as the alphabet, consonants, diphthongs, digraphs, verbs, prepositions, tenses, punctuation, nature, the zoo, the market, family, media, festivals, friendship, health, environment, seasons, etc. However, the content of the textbook is concerned with local as well as personal culture and ignores the depiction of the target language culture. Owing to the inclusion of local/personal contents, the textbook is appropriate, whereas due to the exclusion of target language contents/subjects, it is inappropriate for CLT classrooms.

Overall Perception

The textbook is not in complete compliance with CLT principles. It does not show the presence of all of the points presented in the checklist. However, some of the principles have been noticed in the content,


yet they are not sufficient to establish the appropriateness of the textbook. Rather, most of the principles have been ignored. Furthermore, the textbook does not provide sufficient opportunities for communication and interaction. For these reasons, it is not suitable to facilitate the development of communicative competence in the learners. Different studies have given different reasons for such deficiencies, i.e. deficient policies as well as curriculum (Aftab, 2012), and the patterning of textbooks and teaching/learning on GTM principles (Durrani, 2016; Khan, 2007).

DISCUSSION

The aim of this study has been to determine whether or not the content of English-2 is suitable to facilitate the development of communicative competence in grade-2 learners in Punjab, Pakistan. In this regard, the content of the said book has been analyzed under five categories, i.e. (i) activities and tasks, (ii) language skills, (iii) language type, (iv) content and subject, and (v) overall perception. After a careful content analysis, it has been observed that the textbook includes some points which conform to the communicative language teaching principles, i.e. (a) the presence of information sharing activities, (b) a focus on reading, writing, and speaking activities, (c) the progressive introduction of sufficient vocabulary items from different walks of life, (d) the use of a wide variety of contents and subjects from local as well as personal contexts, and (e) the facilitation of individual learning through activities. On the other hand, the analysis also reveals that the said textbook fails to: (a) contain role play and problem-solving activities; (b) provide sufficient opportunities to use the language in local as well as personal contexts through role play and problem-solving activities; (c) use activities that can engage the learners in pair or group work; (d) follow the principle of equal development of all four language skills, i.e. listening, speaking, reading, and writing; (e) prefer fluency to accuracy; (f) focus on the listening skill; (g) include the target language culture; and (h) introduce functional language. Moreover, the activities used in the textbook are artificial, controlled, conventional, and guided. These results show that the textbook fails to follow all of the communicative language teaching principles; therefore, it is unsuitable to be taught to the learners. These results match a number of international as well as national/local level studies. For example, the study by Tok (2010), on a textbook taught in Turkish schools, declares the activities used in the textbook to be meaningless practices which are unable to improve communicative competence. The study by Ander (2015) reports an imbalanced distribution of language skills in the textbook examined. That study also reports the textbook focusing mainly on productive skills (i.e. speaking and writing) and ignoring the receptive skills (i.e. listening and reading). Here, a slight difference is noted: the textbook which is the subject of this study focuses on reading, writing, and speaking skills and ignores the listening skill only, whereas the study by Ander (2015) reports both listening and reading to be ignored. Similarly, the study by Aftab (2012), conducted on Pakistani English language textbooks, also reports the activities to be artificial and controlled. In addition, the study by Shah, Hassan and Iqbal (2015) finds that the textbooks taught in a renowned private sector school in Punjab, Pakistan focus more on grammar skills, which are less required, and less on the speaking skill, which is most required. On the basis of these findings, the study concludes that the textbooks do not meet the learners' requirements. However, the results of this study seem to differ a little bit here, i.e. the textbook of this study focuses on the speaking skill along with grammar. The results of this study validate the results of another study by Kausar, Mushtaq and Badshah (2016), which reports the content and exercises of the textbook


to be inappropriate. That study also reports the outline, planning, and organization of the textbook to be inappropriate; outline, planning, and organization cannot be compared here, because these categories have not been evaluated in this study. English, a modern language which Graddol (2006) refers to as the first global lingua franca, has become the first language of the world (Northrup, 2013), and its use for communication purposes at international and global levels has greatly increased (Northrup, 2013). For these reasons, it is the most widely taught foreign/second language in the world (Crystal, 2012; Graddol, 2006), for which specific textbooks are being used as the main source of suitable materials for the learners. The textbooks, which in the view of Prabhu (1987) are pre-constructed and fully specified contents, are supposed to help the students maintain their contact with the language (Richards, 2001) and to provide an effective source for material presentation, self-directed learning, activities as well as ideas, a reference for learners, and support for less experienced teachers (Cunningsworth, 1995). Being a member of the global community, Pakistan has also recognized the significance of English for communication with international partners. For this purpose, English is widely used in Pakistan for different purposes (see e.g. Kausar, Mushtaq & Badshah, 2016; Mansoor, 2005; Mashori, 2010; Panhwar, Baloch & Khan, 2017; Rahman, 2004, 2007; Warsi, 2004). The extended use of English in different fields in Pakistan has increased its significance in education at different levels. The majority of public and private sector schools use English as a medium of instruction in Pakistan (see Khan, 2018; Mansoor, 2005; Panhwar, Baloch & Khan, 2017; Rahman, 2004, 2007; Warsi, 2004). Despite these practices on a large scale, English language proficiency in Pakistan is not satisfactory (Shamim, 2008, 2011; Aftab, 2012; Warsi, 2004). Different studies report different reasons for this deficiency, i.e. improper policies regarding the curriculum as well as textbooks, poor language teaching-learning, and the use of artificial materials in the textbooks (Aftab, 2012); the preparation of textbooks on grammar translation principles (Khan, 2007); textbooks that do not meet the learners' requirements (Arshad & Mahmood, 2019; Kausar, Mushtaq & Badshah, 2016; Naseem, Shah & Tabassum, 2015; Shah, Hassan & Iqbal, 2015); and the learners' inclination to learn through the grammar translation method (Durrani, 2016; Panhwar, Baloch & Khan, 2017). Still another reason, which seems more relevant here, is that the elements of textbook and examination (in Pakistan) do not support communicative language teaching practices. Moreover, the teachers are not trained to practice communicative language teaching methodology. In fact, communicative language teaching encourages the learners to learn the target language by focusing mainly on language learning experiences as well as incorporating personal experiences into the language learning environment (Nunan, 1991), and it aims to develop communicative competence in a second or foreign language (Richards, Platt & Platt, 1992). Similarly, according to Brown, in a communicative language teaching classroom, the teacher does not lead the class; rather, he simply facilitates as well as monitors the activities.
CLT lessons are theme and topic oriented, and the main aim of communicative language teaching is to develop communicative competence (Hinkel & Fotos, 2001) which, in simple words, means "competence to communicate" (Bagarić & Djigunović, 2007) and enables the learners to communicate in the target language (Savignon, 1997). The situation appears to be adverse in the content of the book examined in this study: it seems to ignore CLT principles. In fact, the Pakistani education system, which Aftab (2012) describes as filled with shortcomings, has not succeeded so far in creating an environment conducive to communicative language teaching (Panhwar, Baloch & Khan, 2017). The reason is that CLT faces many constraints in Pakistan, such as mother tongue influence, large class size, shortage of time, a non-supportive domestic environment, lack


of motivation, and oral exams (Yaqoob, Ahmed & Aftab, 2015), whereas Panhwar, Baloch and Khan (2017) enumerate different contextual problems (e.g. large class size and overuse of traditional teaching methods) as constraints on the development of a CLT environment in Pakistan. This situation should ultimately be checked. In this concern, different concrete measures are required, in general and in relation to the textbooks in particular, i.e.: improvement of the training programs for teachers as well as textbook writers, enhancement of the process of curriculum development, and prescription of such textbooks as may facilitate English language acquisition (Aftab, 2012); improvement or replacement of the textbooks by appropriate ones (Arshad & Mahmood, 2019; Kausar, Mushtaq & Badshah, 2016; Naseem, Shah & Tabassum, 2015; Shah, Hassan & Iqbal, 2015); selection of such contents as may facilitate a communicative language teaching-learning approach (Khan, 2007); and selection of such textbooks as may facilitate functional as well as practical use of language and the inclusion of the target language culture in the textbooks (Zafar & Mehmood, 2016).

CONCLUSION

The textbook does not follow the CLT principles, since it does not: contain role play and problem-solving activities; provide sufficient opportunities to use the language in local as well as personal contexts through role play and problem-solving activities; use activities that can engage the learners in pair or group work; follow the principle of equal development of all four language skills, i.e. listening, speaking, reading, and writing; prefer fluency to accuracy; focus on the listening skill; include the target language culture; or introduce functional language. Moreover, the activities used in the textbook are artificial, controlled, conventional, and guided. Due to these deficiencies, the textbook is not suitable to be taught from a CLT perspective. This might pose a serious hurdle to the development of communicative competence in the learners. The study suggests considering the matter seriously and taking concrete measures to overcome the problem, i.e.: improvement of the training programs for teachers as well as textbook writers; enhancement of the process of curriculum development; prescription of such textbooks as may facilitate English language acquisition; improvement or replacement of the textbooks by appropriate ones; selection of such contents as may facilitate a communicative language teaching-learning approach; and selection of such textbooks as may facilitate functional as well as practical use of language and the inclusion of the target language culture in the textbooks.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

REFERENCES

Aftab, A. (2012). English language textbooks evaluation in Pakistan [Doctoral dissertation]. University of Birmingham.
Ahuvia, A. (2001). Traditional, interpretive, and reception based content analyses: Improving the ability of content analysis to address issues of pragmatic and theoretical concern. Social Indicators Research, 54(2), 139–172. doi:10.1023/A:1011087813505
Akram, M., & Mahmood, A. (2011). The need of communicative approach (in ELT) in teacher training programmes in Pakistan. Language in India, 11(5), 172–178.
Allwright, R. L. (1981). What do we want teaching materials for? ELT Journal, 36(1), 5–18. doi:10.1093/elt/36.1.5
Alptekin, C. (1993). Target-language culture in EFL materials. ELT Journal, 47(2), 136–143. doi:10.1093/elt/47.2.136
Ander, T. (2015). Exploring communicative language teaching in a grade 9 nationwide textbook: New bridge to success [Master's thesis]. Bilkent University, Ankara, Turkey.
Arshad, A., & Mahmood, M. A. (2019). Investigating content and language integration in an EFL textbook: A corpus-based study. Linguistic Forum, 1(1), 8–17.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests (Vol. 1). Oxford University Press.
Bagarić, V., & Djigunović, J. M. (2007). Defining communicative competence. Metodika, 8(1), 94–103.
Bell, E., Bryman, A., & Harley, B. (2018). Business research methods. Oxford University Press.
Berelson, B. (1952). Content analysis in communication research. Free Press.
Block, D. (1991). Some thoughts on DIY materials design. ELT Journal, 45(3), 211–217. doi:10.1093/elt/45.3.211
Brown, H. D. (2001). Teaching by principles: An interactive approach to language pedagogy. Pearson Education.
Brown, J. D. (1995). The elements of language curriculum: A systematic approach to program development. Heinle & Heinle Publishers.
Brusokaitė, E. (2013). Gender representation in EFL textbooks [Doctoral dissertation]. Lithuanian University of Educational Sciences, Vilnius, Lithuania.
Cameron, D. (2001). Working with spoken discourse. Sage.
Canale, M. (1983). From communicative competence to communicative language pedagogy. In J. C. Richards & R. W. Schmidt (Eds.), Language and Communication (pp. 2–27). Longman.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47. doi:10.1093/applin/1.1.1
Celce-Murcia, M., Dörnyei, Z., & Thurrell, S. (1995). Communicative competence: A pedagogically motivated model with content specifications. Issues in Applied Linguistics, 6(2), 5–35. doi:10.5070/L462005216

Chambliss, M. J., & Calfee, R. C. (1998). Textbooks for learning: Nurturing children's minds. Blackwell Publishers.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Clarke, J., & Clarke, M. (1990). Stereotyping in TESOL materials. In B. Harrison (Ed.), Culture and the Language Classroom (pp. 31–44). Modern English Publications/British Council.
Crystal, D. (2012). English as a global language. Cambridge University Press. doi:10.1017/CBO9781139196970
Cunningsworth, A. (1995). Choosing your coursebook. Heinemann.
Davison, W. F. (1976). Factors in evaluating and selecting texts for the foreign-language classroom. English Language Teaching Journal, 30(4), 310–314. doi:10.1093/elt/XXX.4.310
Durrani, H. (2016). Attitudes of undergraduates towards grammar translation method and communicative language teaching in EFL context: A case study of SBK Women's University Quetta, Pakistan. Advances in Language and Literary Studies, 7(4), 167–172.
Durrani, N. (2008). Schooling the 'other': The representation of gender and national identities in Pakistani curriculum texts. Compare: A Journal of Comparative Education, 38(5), 595–610. doi:10.1080/03057920802351374
Ellis, R. (1997). The study of second language acquisition. Oxford University Press.
Florent, J., & Walter, C. (1989). A better role for women in TEFL. ELT Journal, 43(3), 180–184. doi:10.1093/elt/43.3.180
Gak, D. M. (2011). Textbook: An important element in the teaching process. Hatchaba Journal, 19(2), 78–82.
Graddol, D. (2006). English next (Vol. 62). British Council.
Grant, N. (1987). Making the most of your textbook (Vol. 11, No. 8). Longman.
Graves, K. (2001). Teachers as course developers. Cambridge University Press.
Hasan, S. A. (2009). English language teaching in Pakistan. Retrieved on June 9, 2018 from http://www.articlesbase.com/languages-articles/english-language-teaching-in-pakistan-1326181.html
Hashemi, S. Z., & Borhani, A. (2015). Textbook evaluation: An investigation into "American English File" series. International Journal on Studies in English Language and Literature, 3(5), 47–55.
Hinkel, E., & Fotos, S. (Eds.). (2001). New perspectives on grammar teaching in second language classrooms. Routledge. doi:10.4324/9781410605030
Hodder, I. (2013). The interpretation of documents and material culture. Sage.
Holden, S., & Rogers, M. (1997). English language teaching. Delti.

Hutchinson, T., & Torres, E. (1994). The textbook as an agent of change. English Language Teaching Journal, 48(4), 315–328. doi:10.1093/elt/48.4.315
Hutchinson, T., & Waters, A. (1987). English for Specific Purposes: A learning-centred approach. Cambridge University Press. doi:10.1017/CBO9780511733031
Hymes, D. (1964). Introduction: Toward ethnographies of communication. American Anthropologist, 66(6/2), 1–34.
Hymes, D. (1966). Two types of linguistic relativity. In W. Bright (Ed.), Sociolinguistics (pp. 114–158). Mouton.
Hymes, D. (1971). Competence and performance in linguistic theory. In R. Huxley & E. Ingram (Eds.), Language Acquisition: Models and Methods (pp. 3–28). Academic Press.
Hymes, D. (1972). On communicative competence. In J. B. Pride & J. Holmes (Eds.), Sociolinguistics: Selected Readings (pp. 269–293). Penguin.
Kausar, G., Mushtaq, M., & Badshah, I. (2016). The evaluation of English language textbook taught at intermediate level. Gomal University Journal of Research, 4, 32–43.
Khan, H. A. (2007). A needs analysis of Pakistani state boarding schools secondary level students for adoption of communicative language teaching [Master's thesis]. School of Arts & Education, Middlesex University, London, UK.
Khan, R. M. B. (2018, May 20). English in Pakistan. The Nation. Retrieved on June 8, 2019 from https://nation.com.pk/24-May-2018/english-in-pakistan
Kitao, K., & Kitao, S. K. (1997). Selecting and developing teaching/learning materials. The Internet TESL Journal, 4(4), 20–45.
Krippendorff, K. (2018). Content analysis: An introduction to its methodology. Sage Publications.
Larsen-Freeman, D. (2001). Teaching grammar. In M. Celce-Murcia (Ed.), Teaching English as a Second or Foreign Language (3rd ed., pp. 251–266). Heinle & Heinle.
Leo, R. J., & Cartagena, M. T. (1999). Gender bias in psychiatric texts. Academic Psychiatry, 23(2), 71–76. doi:10.1007/BF03354245 PMID:25416009
Leung, C. (2005). Convivial communication: Recontextualizing communicative competence. International Journal of Applied Linguistics, 15(2), 119–144. doi:10.1111/j.1473-4192.2005.00084.x
Litz, D. R. (2005). Textbook evaluation and ELT management: A South Korean case study. Asian EFL Journal, 48, 1–53.
Long, M. H. (1990). Task, group, and task-group interactions. In S. Anivan (Ed.), Language Teaching Methodology for the Nineties (pp. 31–50). SEAMEO Regional Language Centre.
Macleod, M., & Norrby, C. (2002). Sexual stereotyping in Swedish language textbooks. Journal of the Australasian Universities Language and Literature Association, 97(1), 51–73. doi:10.1179/aulla.2002.97.1.005

Mansoor, S. (2005). Language planning in higher education: A case study of Pakistan. Oxford University Press.
Mashori, G. M. (2010). Practicing process writing strategies in English: An experimental study of pre and post process teaching perceptions of undergraduate students at Shah Abdul Latif University Khairpur. English Language & Literary Forum, 12, 25–57.
McDonough, J., & Shaw, C. (2012). Materials and methods in ELT. John Wiley & Sons.
McGrath, I. (2002). Materials evaluation and design for language teaching (Edinburgh Textbooks in Applied Linguistics). Edinburgh University Press.
Mohammadi, M., & Abdi, H. (2014). Textbook evaluation: A case study. Procedia: Social and Behavioral Sciences, 98, 1148–1155. doi:10.1016/j.sbspro.2014.03.528
Naseem, S., Shah, S. K., & Tabassum, S. (2015). Evaluation of English textbook in Pakistan: A case study of Punjab textbook for 9th class. European Journal of English Language and Literature Studies, 3(3), 24–42.
Northrup, D. (2013). How English became the global language. Palgrave Macmillan. doi:10.1057/9781137303073
Nunan, D. (1991). Communicative tasks and the language curriculum. TESOL Quarterly, 25(2), 279–295. doi:10.2307/3587464
O'Neill, R. (1982). Why use textbooks? ELT Journal, 36(2), 104–111. doi:10.1093/elt/36.2.104
Panezai, S. G., & Channa, L. A. (2017). Pakistani government primary school teachers and the English textbooks of Grades 1–5: A mixed methods teachers'-led evaluation. Cogent Education, 4(1), 1–18. doi:10.1080/2331186X.2016.1269712
Panhwar, A. H., Baloch, S., & Khan, S. (2017). Making communicative language teaching work in Pakistan. International Journal of English Linguistics, 7(3), 226–234. doi:10.5539/ijel.v7n3p226
Phillipson, R. (1992). Linguistic imperialism. Oxford University Press.
Porreca, K. L. (1984). Sexism in current ESL textbooks. TESOL Quarterly, 18(4), 705–724. doi:10.2307/3586584
Prabhu, N. S. (1987). Second language pedagogy (Vol. 20). Oxford University Press.
Punjab Education and English Language Initiative. (2013). Can English medium education work in Pakistan? Retrieved June 6, 2019, from https://www.britishcouncil.org/peeli_report.pdf
Rahman, T. (2004). Denizens of alien worlds: A study of education, inequality and polarization in Pakistan. Oxford University Press.
Rahman, T. (2007). The role of English in Pakistan. In A. B. Tsui & J. W. Tollefson (Eds.), Language Policy, Culture, and Identity in Asian Contexts (pp. 219–239). Lawrence Erlbaum.
Renner, C. E. (1997). Women are "busy, tall, and beautiful": Looking at sexism in EFL materials. Retrieved on June 7, 2019 from https://files.eric.ed.gov/fulltext/ED411670.pdf

Richards, J., Platt, J., & Platt, H. (1992). Dictionary of language teaching & applied linguistics. Longman.
Richards, J. C. (2001). The role of textbooks in a language program. RELC Guidelines, 23(2), 12–16.
Richards, J. C., & Rodgers, T. S. (2007). Principles of communicative language teaching and task-based instruction. Retrieved on June 6, 2019 from https://www.pearsonhighered.com/assets/samplechapter/0/1/3/1/0131579061.pdf
Savignon, S. J. (1972). Communicative competence: An experiment in foreign-language teaching. Center for Curriculum Development.
Savignon, S. J. (1997). Communicative competence: Theory and classroom practice: Texts and contexts in second language learning. McGraw-Hill.
Shah, S. K., Hassan, S., & Iqbal, W. (2015). Evaluation of text-book as curriculum: English for 6 and 7 grades in Pakistan. International Journal of English Language Education, 3(2), 71–89. doi:10.5296/ijele.v3i2.8042
Shamim, F. (2008). Trends, issues and challenges in English language education in Pakistan. Asia Pacific Journal of Education, 28(3), 235–249. doi:10.1080/02188790802267324
Shamim, F. (2011). English as the language for development in Pakistan: Issues, challenges and possible solutions. In H. Coleman (Ed.), Dreams and Realities: Developing Countries and the English Language (pp. 291–310). British Council.
Sheldon, L. E. (1988). Evaluating ELT textbooks and materials. ELT Journal, 42(4), 237–246. doi:10.1093/elt/42.4.237
Siren, T. (2018). Representations of men and women in English language textbooks: A critical discourse analysis of Open Road 1–7 [Master's thesis]. University of Oulu, Finland.
Sunderland, J. (1992). Gender in the EFL classroom. ELT Journal, 46(1), 81–91. doi:10.1093/elt/46.1.81
Swan, M. (1985). A critical look at the communicative approach. ELT Journal, 39(1), 2–12. doi:10.1093/elt/39.1.2
Thornbury, S. (2006). How to teach grammar. Longman.
Thornbury, S., & Meddings, L. (1999). The roaring in the chimney. Retrieved on June 7, 2019 from http://www.hltmag.co.uk/sep01/Sartsep018.rtf
Tickoo, M. L. (2003). Teaching and learning English: A sourcebook for teachers and teacher-trainers. Orient Longman.
Tok, H. (2010). TEFL textbook evaluation: From teachers' perspectives. Educational Research Review, 5(9), 508–517.
Tomlinson, B. (2010). Principles of effective materials development. In N. Harwood (Ed.), English Language Teaching Materials: Theory and Practice (pp. 81–98). Cambridge University Press.
Ullah, H., & Skelton, C. (2013). Gender representation in the public sector schools textbooks of Pakistan. Educational Studies, 39(2), 183–194. doi:10.1080/03055698.2012.702892

Ur, P. (2007). A course in language teaching: Practice and theory. Cambridge University Press.
Vygotsky, L. S. (1978). Mind in society. Harvard University Press.
Warsi, J. (2004). Conditions under which English is taught in Pakistan: An applied linguistic perspective. Sarid Journal, 1(1), 1–9.
Williams, D. (1983). Developing criteria for textbook evaluation. ELT Journal, 37(3), 251–255. doi:10.1093/elt/37.3.251
Yaqoob, H. M. A., Ahmed, M., & Aftab, M. (2015). Constraints faced by teachers in conducting CLT based activities at Secondary School Certificate (SSC) level in rural area of Pakistan. Education Research International, 4(2), 109–118.
Zafar, S., & Mehmood, R. (2016). An evaluation of Pakistani intermediate English textbooks for cultural contents. Journal of Linguistics & Literature, 1(1), 124–136.

ADDITIONAL READING

Brown, H. D. (2007). Teaching by principles. Addison Wesley Longman Inc.
Larsen-Freeman, D., & Anderson, M. (2011). Techniques & principles in language teaching (3rd ed.). Oxford University Press.
Prasad, B. B. N. (2013). Communicative language teaching in 21st century ESL classroom. English for Specific Purposes World, 14(40), 1–8.
Rahman, M. M., & Pandian, A. (2018). A critical investigation of English language teaching in Bangladesh: Unfulfilled expectations after two decades of communicative language teaching. English Today, 34(3), 43–49. doi:10.1017/S026607841700061X
Razmjoo, S. A. (2007). High schools or private institutes textbooks? Which fulfill communicative language teaching principles in the Iranian context. Asian EFL Journal, 9(4), 126–140.
Richards, J. C. (2005). Communicative language teaching today. SEAMEO Regional Language Centre.
Richards, J. C., & Rodgers, T. S. (2014). Approaches and methods in language teaching (3rd ed.). Cambridge University Press. doi:10.1017/9781009024532

KEY TERMS AND DEFINITIONS

CLT (Communicative Language Teaching): An approach to second or foreign language teaching that aims to develop communicative competence in learners.

EFL: English as a Foreign Language.

ESL: English as a Second Language.

PEELI (Punjab Education and English Language Initiative): A roadmap for educational reforms in Punjab, Pakistan.

Punjab: A province in Pakistan.

Textbook: A written source of information for students, based on the syllabus of a particular subject, intended to achieve learning outcomes.

Textbook Evaluation: An applied linguistic activity which helps administrators, material developers, supervisors, and teachers to make judgments about the effect of the materials on the people using them.


Section 4

Perspectives in Language Assessment


Chapter 13

Language Assessment:

What Do EFL Instructors Know? What Do EFL Instructors Do?

Dilşah Kalay
https://orcid.org/0000-0003-0681-3179
Kütahya Dumlupinar University, Turkey

Esma Can
Kütahya Dumlupinar University, Turkey

ABSTRACT

Teachers/instructors have the critical role of bridging teaching and assessment: the more knowledgeable the teachers/instructors are, the more effective the assessment becomes. Language instructors therefore need to integrate various assessment strategies into their teaching to make better decisions about learners' progress, which highlights the term "assessment literacy." Besides language instructors' knowledge, what they do in classrooms deserves attention. Language assessment practices are the strategies/methods instructors use in classrooms to reach to-the-point and objective evaluations of students' language development. Within this scope, the purpose of the current study is two-fold: first, to investigate the language assessment knowledge of language instructors and, second, to identify their language assessment practices in classrooms. Based on the findings, it is critical to understand not only what language instructors know but also what they do in classes. As a result, the ultimate goal of standardization in language assessment could be attained.

INTRODUCTION

Assessment has an indispensable role in both teaching and learning as it initiates the process like an engine (White, 2009). In other words, assessment gives teachers the chance to evaluate whether their classroom practices are effective or not. However, there is confusion about the terms "assessment" and "testing". Tests are tools concerning administrative issues, and learners are aware that they will be evaluated; on the other hand, assessment is "an ongoing process that encompasses a much wider domain"
(Brown, 2003; p. 15). To clarify, even though assessment is usually accepted as tests conducted at the end of courses in order to find out whether the course goals are accomplished, it also has a place at the very beginning of the curriculum development process since it is able to detect the problems learners might face. When it comes to foreign language learning/teaching, assessment becomes much more prominent. Language assessment is defined as "a broad term referring to a systematic procedure for eliciting test and non-test data for the purpose of making inferences or claims about certain language-related characteristics of an individual" (Purpura, 2016; p. 191). Language assessment is among the indispensable practices in language teaching/learning as it brings out the quality of instruction (Stiggins, 1999). Brown (2003) highlights how significant language assessment is in the following seven items:

1. Periodic assessments, both formal and informal, can increase motivation by serving as milestones of student progress.
2. Appropriate assessments aid in the reinforcement and retention of information.
3. Assessments can confirm areas of strength and pinpoint areas needing further work.
4. Assessments can provide a sense of periodic closure to modules within a curriculum.
5. Assessments can promote student autonomy by encouraging students' self-evaluation of their progress.
6. Assessments can spur learners to set goals for themselves.
7. Assessments can aid in evaluating teaching effectiveness.

As can be understood, language assessment practices are good informants of students' language progress. However, what is the role of instructors in language assessment? Teaching and assessment are two strictly related concepts; they give information about one another and, as a result, develop each other (Malone, 2013). Teachers/instructors have a critical role in bridging teaching and assessment (Leung, 2014). That is to say, the more knowledgeable and skillful the teachers/instructors are, the more effective and powerful the assessment becomes (Popham, 2009). This means that language instructors are expected to integrate various assessment strategies into their teaching in order to make better decisions about the learners' progress, which highlights the term "assessment literacy". Assessment literacy is defined as "teachers' knowledge and abilities to apply assessment concepts and techniques to inform decision making and guiding practice" (Mertler & Campbell, 2005; p. 16). When it comes to language assessment literacy, Malone (2013) defines it as "language teachers' familiarity with testing definitions and the application of this knowledge to classroom practices in general and specifically to issues related to assessing language" (p. 329). As can be understood, the main focus is on the teachers/instructors since they are responsible for developing assessment methods, administering those methods in the language teaching/learning process, interpreting the results, and making educational decisions (Stiggins, 1999). All in all, teachers/instructors are the leaders, facilitators, and directors of the assessment process, and hence it seems essential for them to have a certain degree of language assessment knowledge (Popham, 2006). In addition to what instructors know about the language assessment process, what they do in language classrooms deserves attention.
Language assessment practices are strategies and methods utilized by teachers/instructors in language classrooms in order to reach to-the-point and objective evaluations of students' language development. Tran (2012) puts forward that language assessment practices can be exemplified as tasks ranging from assessing learners' overall and skill-based language ability to
evaluating a whole language teaching program. However, deciding on which practices to implement in language classrooms is an action that requires caution because of ethical concerns such as objectivity (Can, 2017). When the objectivity and reliability of language assessment are concerned, "teacher individuality", a term posing a threat to standardization in language assessment, comes forward (Shepard, 2000). In other words, commonality of assessment practices should be built among language instructors so as to ensure standardization, and the very first step in doing this is identifying the classroom assessment practices of language instructors. Keeping all these in mind, this chapter aims to investigate and discuss the knowledge and practices of foreign language instructors related to language assessment. Hence, the purpose of the current study is two-fold: the first aim is to investigate the language assessment knowledge of language instructors, and the second is to identify the language assessment practices of language instructors. In line with these objectives, the following research questions are posed:

1. What are the general and skill-based Language Assessment Knowledge (LAK) levels of the EFL instructors?
   a. Is there a relationship among their levels of skill-based LAKS?
   b. Is there any change in LAK level according to:
      i. years of experience,
      ii. educational background,
      iii. BA program graduated,
      iv. having a testing course in BA,
      v. training on testing and assessment,
      vi. being a testing office member?
2. What assessment purposes, methods, and procedures do EFL instructors report using in their language assessment practices?

THEORETICAL BACKGROUND

Principles of Language Assessment

Language assessment can be considered a challenging phenomenon for language instructors who are not specifically trained in this field. However, language teaching and language assessment are notions which cannot be considered without each other; thus, language teachers need to possess a sufficient amount of knowledge about language assessment (Ölmezer-Öztürk, 2018). According to Taylor (2006), assessment can be called "the art of the possible" (p. 58). Having said that, there are some principles which ought to be followed in order to establish healthy and successful assessment practices, and these can be listed as validity, reliability, practicality, washback, and authenticity (Brown, 2003). For Bachman and Palmer (1996), reliability, validity, authenticity, and practicality are among the factors which constitute the usefulness of an assessment method.

Primarily, the importance of the validity of an assessment tool needs to be highlighted. Validity can be described as the degree to which an assessment tool measures what it is supposed to measure (Henning, 1987). In other words, as long as an assessment tool focuses on what it is supposed to focus on and measures students' success in that skill or topic correctly, it can be assumed to have validity. Content
validity deals with whether or not an assessment tool assesses the information and the skills taught in that particular course (Hughes, 1989).

The next principle, which is also one of the main contributors to good assessment practice, is reliability. Bachman and Palmer (1996) define reliability as "consistency of measurement" and add that a reliable assessment tool will be consistent across different assessment situations (p. 19). In other words, reliability shows the ability of an assessment tool to produce similar outcomes with the same students at different points in time. If a test produces similar scores in different situations when administered to the same group, it can be called reliable (Ölmezer-Öztürk, 2018). According to Taylor (2006), reliability means the consistency, accuracy, and dependability of assessment scores.

As the name itself suggests, the following principle, practicality, refers to an assessment tool being practical. To be more precise, Brown (2003) claims that practicality refers to an assessment tool being manageable in terms of cost, time, administration process, and scoring. Likewise, Taylor (2006) puts forward that the practicality of an assessment tool indicates how practical it is in terms of the resources it requires (p. 56).

Another principle, washback, can be defined as the influence of assessment on the teaching methods and learning practices of students (Hughes, 1989). Similarly, Taylor (2006) describes the washback effect as a phenomenon that refers to the influence of testing on classroom practice. According to Brown and Hudson (1998), this washback effect can be positive or negative depending on how closely the objectives of the curriculum and the assessment practices match each other.

As the final principle, Bachman and Palmer (1996) explain authenticity in assessment as "the degree of correspondence of the characteristics of a given language test task to the features of a target language use task" (p. 23). Namely, authenticity refers to an assessment tool being authentic and genuine; in order for that to happen, there should be natural language, meaningful topics, meaningful tasks which arise within context, and particular themes in assessment tools (Brown, 2003).

To sum up, these principles (validity, reliability, practicality, washback, and authenticity) are quite essential in terms of establishing healthy assessment practices in language classrooms. While each of these factors is valuable in its own right, together they form good assessment practice. For this reason, training language teachers and assessment practitioners in these principles is a must (Brown & Hudson, 1998). Moreover, the reliability and validity of an assessment tool can be monitored and improved with the help of training raters, setting out clear criteria, triangulating critical phases of assessment making, and making use of multiple sources of information (Brown & Hudson, 1998).

ASSESSMENT OF LANGUAGE SKILLS

Assessment of students' language knowledge and abilities is a demanding task which requires knowledge and competency in many different skills. Kyriacou (2007) highlights "assessment of students' development" as one of the many skills a foreign language teacher should possess, indicating that assessment of student work requires giving regular and consistent feedback to students, making use of a wide range of tasks, directing students to carry out self-assessment, helping students carry out peer-assessment, conducting assessment of student strategies, and pinpointing common difficulties that students experience in order to find solutions.

Even though the items above can be considered necessary for language assessment in general and should be mastered, instruction and assessment of each language skill might differ in terms of the methods
and assessment tools used (Brown & Hudson, 1998). This chapter will attempt to discuss some assessment practices of four skills: reading, listening, speaking, and writing.

Since reading is a receptive skill, it can be claimed that it is only possible for instructors to observe their students' reading comprehension with the help of reading subskills (Hubley, 2012). Brown (2003) categorizes reading assessment methods into four task types: perceptive, selective, interactive, and extensive. Perceptive tasks include read-aloud tasks, picture-cued tasks, or some multiple-choice items, enabling learners' reading skill development at low levels. Selective reading tasks include true-false exercises, multiple-choice questions, and matching exercises; these kinds of tasks tend to focus on the formal features of the language, such as grammar and vocabulary. Interactive reading tasks require the reader to comprehend the meaning of a text and interact with it; hence, editing, ordering exercises, and information-transfer exercises fall into this category. Finally, extensive tasks are built around longer texts requiring students to perform reading strategies like skimming, scanning, note-taking, and outlining (Brown, 2003). Similarly, Cheng, Rogers, and Hu (2004) state that short answer questions, matching exercises, true-false questions, multiple-choice questions, cloze tests, sentence completion, summaries, student journals, and peer and self-assessment can be listed as effective reading assessment methods (p. 370).

Similar to reading, the assessment of listening is a challenging process. Anderson and Lynch (1988) point out that the difficulty of a listening task depends on many factors, such as the text itself, the student who does the listening, and the context of the listening (p. 81). Furthermore, Anderson and Lynch (1988) mention other factors which may affect the assessment of the listening skill, such as topic familiarity, explicit information, and type of input (p. 86), and add that the purpose of the listening tasks, the response required from the listener, and support materials are essential contributors to the assessment of listening (pp. 87-90). Hence, it can be concluded that carrying out the assessment of listening is an arduous task with many aspects to consider. Whether or not students are able to interpret what they listen to correctly can be assessed with the help of multiple-choice questions, short answer questions, note-taking, dictation, and preparing summaries (Hughes, 1989; Cheng et al., 2004). With the help of these methods, a receptive skill can be transferred into assessable chunks.

According to Brown and Hudson (1998), receptive skills are generally assessed with the help of selected-response assessment tasks. True-false exercises, matching, and multiple-choice questions fall into this category, and even though these types of assessments are not easy for test-makers to create, they are quite convenient to grade as they do not require much production of the language from the student. Because of the practicality of grading and the excessive focus on receptive skills in proficiency exams, selected-response assessment tasks are mostly preferred over other assessment tasks (Brown & Hudson, 1998).

On the other hand, speaking, which is a productive skill, can be assessed in more direct ways compared to reading and listening.
O'Sullivan (2012) lists some common speaking assessment approaches: individual interviews, student monologues, and group conversations. Cheng et al. (2004, p. 374) point out that verbal presentations, interviews, dialogues, discussions, peer assessments, and public speaking are among the assessment practices that can be utilized. Taylor (2006) claims that task types requiring interactive communication are helpful for assessing students' speaking skills and adds that range, accuracy, grammar, vocabulary, coherence, relevance, the ability to produce comprehensible speech, intonation, and strategic competence are among the factors observed in speaking testing (p. 53).


Even though writing assessment might seem less demanding compared to other skills, Weigle (2012) states that assessment of writing is more challenging than it is thought to be. Tribble (1996) points out that teaching writing is a process during which the teacher assumes distinct roles such as audience, assistant, evaluator, and examiner; hence these roles and the process itself have a considerable influence on the type of assessment carried out and the feedback given (p. 133). Moreover, the assessment of a written product needs to be based on "multiple yardsticks"; in other words, the approach to assessment should cover many different points (Carroll & West, 1989, as cited in Tribble, 1996). This information can be helpful in understanding the challenge of writing assessment.

Harmer (2004) points out that different instruction and assessment types can be utilized depending on the writing approach a teacher employs. In other words, the product-oriented writing approach and the process-oriented writing approach are different methods which require different forms of assessment. While teacher feedback is chiefly preferred in product-oriented writing classes, peer assessment and self-assessment can be considered alternative assessment forms in process-oriented writing classrooms (Harmer, 2004). However, there are some common rules and methods to consider in writing assessment. Taylor (2006) lists some components which need to be analyzed in writing assessment: accuracy, spelling, punctuation, content, organization, cohesion, range of structures, vocabulary, register, format, and the influence on the reader (p. 53). Cheng et al. (2004) mention essays, editing pieces of writing, matching items, student journals, peer feedback, self-assessment, and student portfolios as practical assessment tools for the writing skill (p. 372).

Brown (2003) lists four different writing assessment tasks: imitative, intensive, responsive, and extensive. Imitative writing tasks aim to focus on mechanics and form, while intensive writing tasks concentrate on meaning and form, putting most of the emphasis on form. Responsive writing tasks intend to assess both meaning and form, whereas extensive writing tasks have a bigger spectrum and aim to evaluate all skills and competencies related to writing. Brown and Hudson (1998) state that in assessing productive skills like writing and speaking, constructed-response assessments and personal-response assessments are effective as they focus on the construction and production of language; fill-in-the-blank questions, short answer questions, and tasks that require student performance can be counted as constructed-response assessment tasks, while portfolios, self-assessment, and conferences can be listed as personal-response assessment. One important point about these kinds of evaluations is that there is a risk of grader subjectivity; hence, a sound framework is needed in order to establish reliability and validity (Brown & Hudson, 1998).

Although there are some common rules and criteria, the assessment of each skill differs based on the factors mentioned above. However, the main issue in the evaluation of language skills is to train teachers so that they can carry out reliable and valid assessment practices in their classrooms (Taylor, 2006).

Language Assessment Literacy

Assessment of student learning might be one of the most crucial parts of being a language teacher since it is a significant indicator of the extent to which the objectives of a course have been achieved. For this reason, language teachers should be trained and become knowledgeable about language assessment (Malone, 2013).


Most of the time, classroom practitioners are the sole assessors of their students' performances, and to reach a healthy verdict regarding student performance, these practitioners should possess a sufficient amount of knowledge and skills related to assessment (Cheng, 2004; Popham, 2006; Ölmezer-Öztürk, 2018). In other words, they need to be "literate" in language assessment, since assessment literacy has been associated with how much knowledge a practitioner holds regarding assessment (Berry, Sheehan & Munro, 2019). Originating from the notion of assessment literacy, language assessment literacy specifically focuses on instructors' knowledge, practices, and skills related to the assessment of language (Malone, 2011; Malone, 2013; Sevimel-Şahin & Subaşı, 2019). Malone (2013) describes language assessment literacy as being both knowledgeable about terms and issues regarding language testing and utilizing this theoretical knowledge in practice. As a result of the rising popularity of formative assessment practices, language assessment literacy has also become much more prevalent (Sevimel-Şahin & Subaşı, 2019).

Davies (2008) states that skills and knowledge, along with principles, are three important contributors to teachers' language assessment literacy. In Davies' (2008) framework, educating teachers in the required fields and methodologies enables them to improve their skills related to assessment, while knowledge refers to possessing relevant information about language assessment and instruction. Principles, in turn, refer to essential notions such as fairness, reliability, and impact (Davies, 2008). Moreover, Ölmezer-Öztürk (2018) points out that language assessment literacy can be located where knowledge and skills meet. Similarly, Lam (2019) indicates that knowledge of assessment is at the heart of language assessment literacy, followed by other vital factors such as designing materials, grading, and giving feedback. These definitions indicate that language assessment literacy is composed of many factors, among which language assessment knowledge, skills, principles, and practice can be counted. As assessment is an indispensable part of language teaching, teachers need to be educated so that they can be "literate" in assessment (Malone, 2013; Sevimel-Şahin & Subaşı, 2019).

Language Assessment Practice

Even though language assessment has been described in many different ways, when it comes to language assessment practice, every teacher, course, and context is unique, making the practice of assessment different and unique in each context. This uniqueness ought to make assessment practice troublesome to comprehend; even so, not many studies have been conducted which aim to delve into understanding the assessment practices of teachers (Zhang & Burry-Stock, 2003).

There are some factors that contribute to the uniqueness of assessment practice. For instance, teachers' beliefs and attitudes may influence their preferences for assessment types or the way they carry out an assessment activity in their classroom (Acar-Erdol & Yıldızlı, 2018). Moreover, teachers' assessment practices may be influenced by the level and the courses they teach, as well as the curriculum and the programs they are required to follow (Zhang & Burry-Stock, 2003).

In spite of the uniqueness and individuality in assessment practice, there is some common ground in the form of categorizations and rules. For instance, Brown and Hudson (1998) indicate that language assessment methods can be grouped into three different categories: selected-response assessment tools, in which students choose their answers from a set of given choices, such as true-false questions and multiple-choice questions; constructed-response assessment tools, which require students to construct their own answers to the questions, such as fill-in-the-blanks questions and short
answer questions; and personal-response assessment tools, in which students answer with their own responses and opinions, such as peer assessment or portfolio tasks (p. 658). This categorization of assessment tools actually indicates the context-dependent nature of language assessment practice and hints at the uniqueness of assessment practices, too. Brown and Hudson (1998) also point out that there are a few contributing factors in how teachers decide on the assessment practices they are going to employ in their teaching, adding that the possible washback effect of the assessment practice, the feedback-giving process, and the opportunity to make use of as many sources of information as possible during the process should be the primary concerns when deciding on assessment methods and tools. Also, Zhang and Burry-Stock (2003) list a range of notions that form the backbone of any assessment practice for any teacher; their list consists of paper-pencil tests, performance tests, standardized tests, grading exams, reporting exam results to the students, and reflecting on the results of testing in order to make decisions related to teaching plans. What is more, computer-based assessments should be included in this list as a vital assessment method (Brown, 2003).

Another categorization of assessment practice based on its objective can be put forward as formative assessment and summative assessment (Brown, 2003). The growing tendency to implement more constructive and student-oriented approaches in foreign language classes paves the way for the rise of formative assessment techniques in classrooms, and these formative assessment methods lead to more individuality in teachers' assessment practices (Özdemir-Yılmazer & Özkan, 2017). The shift toward more formative assessment practices brings along the need for teachers to become fluent in new methods that encompass the entire learning process, such as peer feedback, self-assessment, and group projects (Acar-Erdol & Yıldızlı, 2018). Another benefit of formative assessment methods is the fact that they can also lead students to become more autonomous, because they need to take control of their own learning experiences (Özdemir-Yılmazer & Özkan, 2017; Acar-Erdol & Yıldızlı, 2018). In addition, Taylor (2006) claims that the "deficit" model, which means measuring how far students are from the desired outcomes, has given way to a "can do" legacy, which can be explained as highlighting the good sides of student performance and focusing on points of improvement. It can be concluded that the popularity of formative assessment methods may be one of the reasons for this shift.

Not only do different assessment practices give information related to student improvement, but they also enable language teachers to notice their own strengths and weaknesses along with the good sides and improvable sides of the curriculum. In other words, while some assessment practices help measure students' development and evaluate teaching outcomes, some practices might help teachers reflect on their own teaching methodology and improve themselves (Cheng et al., 2004; Acar-Erdol & Yıldızlı, 2018). Cheng et al.
(2004) put forward a categorization of assessment practices as student-centered and teacher-centered and indicate that gaining information related to students' development, presenting feedback to students, finding out students' strengths and weaknesses, giving grades, motivating students, documenting students' development, and preparing students for standardized tests can be listed as student-centered assessment practices (p. 367). As for teacher-centered assessment practices, Cheng et al. (2004) point out that designing instruction, diagnosing the strengths and weaknesses of their own teaching, and grouping students at the correct levels of instruction can be counted (p. 367).

To conclude, although assessment practices can be defined and put into many distinct categories, such as formative and summative; teacher-centered and student-centered; or selected-response, constructed-response, and personal-response, they are still context-bound and dependent on the teachers' beliefs and attitudes, courses, syllabi, curriculum, and the students.

METHODOLOGY

Research Design

Adopting a descriptive research design, the current study elaborates on quantitative data collected through two different questionnaires, namely the "Language Assessment Knowledge Scale (LAKS)" developed by Ölmezer-Öztürk (2018) and the "Teachers' Assessment Practices Survey" created by Cheng et al. (2004). With the findings of these questionnaires, the study attempts to identify and present what language instructors know about language assessment and what they actually do in language classrooms.

Participants and Research Setting

The present study is carried out in order to present a general picture of both the language assessment knowledge and the practices of EFL instructors working at the School of Foreign Languages of a state university in Turkey. The participants of the study are 31 EFL instructors, selected through non-probability (convenience) sampling, in which the researcher chooses subjects who are available and convenient in the research context (Creswell, 2012). The demographic information about the participants is presented in Table 1. (Note that the MA-degree percentage is corrected here to 45.2, since 14 of 31 participants corresponds to 45.2%.)

Table 1. Participant profile

Variable                  Category                          f    %
Gender                    Male                              8    25.8
                          Female                            23   74.2
Year of Experience        1-5 years                         2    6.5
                          6-10 years                        13   41.9
                          11-15 years                       8    25.8
                          16-20 years                       8    25.8
                          More than 21 years                0    0
Bachelor's Degree         English Language Teaching (ELT)   25   80.6
                          Non-ELT                           6    19.4
Educational Background    BA degree                         13   41.9
                          MA degree                         14   45.2
                          Ph.D. degree                      4    12.9

Apart from the demographic features of the participants, their experiences with "language assessment" are also examined, as illustrated in Table 2.


Table 2. Experience with "language assessment"

Experience                                        Response   f    %
Being a testing office member                     Yes        19   61.3
                                                  No         12   38.7
Had a separate assessment course in pre-service   Yes        24   77.4
                                                  No         7    22.6
Training on language testing/assessment           Yes        16   51.6
                                                  No         15   48.4

Data Gathering Instruments

The data for the present study were collected through two different scales. First, in order to examine the language assessment knowledge levels of the instructors, the Language Assessment Knowledge Scale (LAKS) developed by Ölmezer-Öztürk (2018) was utilized. This 60-item scale was designed as part of a Ph.D. thesis and was highly reliable, with Ölmezer-Öztürk (2018) reporting a Cronbach alpha coefficient of .91. For the present research, a reliability analysis was also carried out, and Cronbach's reliability coefficient was found to be .82, which proved to be robust (Taber, 2018). The first part of the scale asked about the participants' demographic features, such as age, work experience, and education, as well as their experiences with language assessment/testing. The second part focused on assessment knowledge based on four skills, namely listening, reading, writing, and speaking. With this scale, participants were expected to respond to each item with one of the options "true", "false", and "don't know", and they were graded on their correct answers: a correct response earned one point, whereas incorrect and "don't know" responses earned no points.

For the second aim of the study, in order to identify EFL instructors' language assessment practices, a survey questionnaire developed by Cheng et al. (2004) was applied. The questionnaire comprised four sections: 1) assessment purpose, 2) assessment methods, 3) procedures of assessment, and 4) personal/professional information (eliminated for the present study). With this questionnaire, participants were supposed to check the most appropriate option for themselves (reflecting their classroom assessment practices) in each subsection. The assessment practices were listed based on the four language skills, which enabled skill-based analysis in the questionnaire. A reliability analysis for the practices survey was also conducted, and the Cronbach alpha coefficient was found to be .87, supporting the reliability of the scale.
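For readers who wish to see how the reported internal-consistency figures (.91, .82, .87) can be computed, the following sketch shows one standard formulation of the Cronbach alpha coefficient in Python. It is an illustration only, not the authors' analysis script, and the randomly generated 31 x 60 score matrix merely stands in for the real LAKS responses.

import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Placeholder data: 0/1-scored responses of 31 instructors to 60 items
rng = np.random.default_rng(0)
scores = rng.integers(0, 2, size=(31, 60))
print(round(cronbach_alpha(scores), 2))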

Data Collection, Preparation and Analysis

The data were collected during the spring term of the 2021-2022 academic year. Printed copies of the scales were distributed to the language instructors, and the documents were returned after they completed the scales. The data collection process lasted about two weeks. Prior to the analysis, normality tests were applied to scrutinize whether the scores from the Language Assessment Knowledge Scale (LAKS) (skill-based as well as overall scores) were normally distributed in a bell-shaped curve (Creswell, 2012). According to Tabachnick and Fidell (2013), data with Skewness and Kurtosis values between -1.5 and +1.5 can be accepted as normally distributed.
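The +/-1.5 screening rule can be operationalized in a few lines. The following is a minimal sketch, assuming SciPy is available; the score vector shown is a hypothetical placeholder, not the instructors' actual LAKS totals.

from scipy.stats import kurtosis, skew

def roughly_normal(x, bound=1.5):
    """Tabachnick and Fidell's (2013) +/-1.5 skewness/kurtosis screening rule."""
    # kurtosis() returns excess kurtosis by default, which is what the rule refers to
    return abs(skew(x)) <= bound and abs(kurtosis(x)) <= bound

total_scores = [24, 29, 31, 33, 35, 36, 37, 38, 40, 41, 43, 45, 47, 52, 60]  # placeholder values
print(roughly_normal(total_scores))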


As a result of the normality analysis, the scores for each language skill and the overall scores taken from the LAKS in this study met the normality assumptions.

Furthermore, addressing Research Question 1, EFL instructors' language assessment knowledge was scored. Regarding LAK scores, for each item, the subjects were rated "1" if they gave the correct answer according to what the assessment literature suggests and "0" if their answers were incorrect or they marked the "don't know" option. Based on this scoring, the lowest score for the whole scale was 24, whereas the highest score was 60. After the scoring, first, the level of LAK of the language instructors was revealed via descriptive analyses (means, percentages, standard deviations, etc.) and one sample t-tests with respect to their general and skill-based scores. Second, the Pearson Correlation was implemented in order to reveal whether there was a positive or negative relationship among skill-based LAK and overall knowledge. Finally, inferential statistics such as independent samples t-tests and one-way ANOVA were administered to indicate the effects of different demographic features on the level of language assessment knowledge.

Concerning Research Question 2, to present the language assessment purposes, methods, and procedures participants report using in their assessment practices, descriptive analyses were implemented.
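The analytic steps described above map onto standard routines in statistical software. The sketch below, in Python with SciPy, is purely illustrative: the normally distributed placeholder vectors and the three-way experience grouping are hypothetical stand-ins for the study's actual data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
laks_total = rng.normal(37.84, 8.18, size=31)   # placeholder for the 31 LAKS totals

# One sample t-test against the reference score of 30 (half of the 60-point maximum)
t, p = stats.ttest_1samp(laks_total, popmean=30)

# Pearson correlation between two skill subscores (placeholders)
reading = rng.normal(11.29, 2.05, size=31)
speaking = rng.normal(10.00, 2.54, size=31)
r, p_r = stats.pearsonr(reading, speaking)

# One-way ANOVA across experience groups (illustrative grouping into three bands)
f, p_f = stats.f_oneway(laks_total[:10], laks_total[10:20], laks_total[20:])

print(round(t, 2), round(r, 2), round(f, 2))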

FINDINGS

The findings of the quantitative analyses of the present research are presented in line with the research questions.

1. What Are the General and Skill-Based Language Assessment Knowledge (LAK) Levels of the EFL Instructors?

The purpose of the first research question, with its subcategories, was to illustrate the general and skill-based language assessment knowledge level of the EFL instructors. For this purpose, the responses of the subjects were investigated through descriptive statistics, and the results derived from the analyses are presented in Table 3.

Table 3. Descriptive statistics on general and skill-based LAK level of EFL instructors

                 N    Min   Max   Mean    Std Dev
LAKS_Reading     31   7     15    11.29   2.053
LAKS_Listening   31   5     15    8.45    2.779
LAKS_Writing     31   2     15    8.10    2.650
LAKS_Speaking    31   4     15    10.00   2.543
LAKS_TOTAL       31   24    60    37.84   8.182

The findings of the descriptive analyses indicated that participants' mean score on the language assessment knowledge scale was 37.84 out of 60. That is to say, participants correctly answered 37.84 items on average. This means that if the score of 30, half of the total score, was accepted as the reference point, as presented in Ölmezer-Öztürk (2018), participants' LAK level (with a mean of 37.84) in the present
study proved to be above average. To confirm this and to investigate whether there is a statistically significant difference between this mean score and the reference score (half of the total score, 30), a one sample t-test was carried out. The findings of the one sample t-test are illustrated in Table 4 below.

Table 4. One sample t-test for the overall score

             Mean Diff.   df   t       p
LAKS_TOTAL   7.84         30   5.334   .000*

*p < .001
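As a quick arithmetic check, the t value in Table 4 can be reproduced from the descriptives in Table 3:

$$ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{37.84 - 30}{8.182 / \sqrt{31}} \approx \frac{7.84}{1.47} \approx 5.33, $$

which agrees with the tabled value of 5.334 up to the rounding of the mean difference.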

As the values indicated above, the mean difference (7.84) between the mean score of the participants (37.84) and half of the overall score (30) was revealed to be statistically significant, which suggests that the LAK level of the participants is significantly high.

Table 5. One sample t-test for the skill-based scores

                 Mean Diff.   df   t        p
LAKS_Reading     3.79         30   10.282   .000*
LAKS_Listening   .952         30   1.907    .066
LAKS_Writing     .597         30   1.254    .220
LAKS_Speaking    2.50         30   5.474    .000*

*p < .001

Moreover, in order to investigate whether the mean score for each language skill was significantly lower or higher than the reference score, a one sample t-test was also administered for each subsection of the LAKS. The reference score was determined as 7.5 because the highest score that could be taken from each subsection was 15. The results are shown in Table 5. The mean scores for each skill were 11.29 for assessing reading, 8.45 for assessing listening, 8.10 for assessing writing, and 10.00 for assessing speaking. As the results in the table indicate, the reading- and speaking-based scores revealed statistically significant differences, whereas the listening- and writing-based scores did not differ significantly from the reference score. All in all, when the skill-based LAK levels of the EFL instructors are concerned, it can be suggested that, with the high mean scores and the aforementioned statistically significant differences, participants proved to be more knowledgeable about assessing reading and speaking. Finally, the range of frequencies based on the EFL instructors' LAK scores is represented in the bar chart in Figure 1.


Figure 1. The range of frequencies based on total LAK scores

As Figure 1 illustrates, out of 60, the lowest score was 24, whilst the highest score was 60. When the scores were analyzed individually, it seemed clear that there were only two scores below half the total score (30), which were 24 and 29, and most of the scores were above, pulling the mean score to a higher level. Overall, it could be acknowledged that participants in the present study mostly answered the items in LAKS correctly and, as a result, can be regarded as high-performers.

a. Is There a Relationship Among Their Levels of Skill-Based LAKS?

One of the sub-questions of the first research question aimed to examine whether a positive or negative correlation existed among EFL instructors' levels of skill-based LAKS. In order to carry out the analysis, the Pearson Correlation was conducted, and the results are presented in Table 6. The findings revealed that all the values were significantly correlated except the values between assessing speaking and reading. Thus, all skill-based language assessment knowledge types (assessing all four skills) were highly and positively correlated with the overall knowledge of the participants, which means that if skill-based language assessment training is provided to EFL instructors, their general language assessment knowledge is expected to rise as well. This may indicate that LAK can be accepted as a holistic construct with intersecting branches.

Table 6. The correlation among general and skill-based LAK

            Reading   Listening   Writing   Speaking   TOTAL_LAK
Reading     1         .561**      .540**    .338       .721**
Listening             1           .704**    .500**     .864**
Writing                           1         .613**     .889**
Speaking                                    1          .764**
TOTAL_LAK                                              1

**Correlation is significant at the 0.01 level (2-tailed); N = 31

Apart from that, the table makes it evident that all types of skill-based LAK were moderately correlated among themselves, excluding the relation between reading and speaking. The highest significant correlation was indicated between listening and writing (.704), and the lowest was between listening and speaking (.500). On the other hand, when it comes to the only non-significant relationship on the table, it is clear that there was a weak but positive correlation between speaking and reading (.338). All in all, these positive relationships among skill-based LAK mean that the more knowledgeable EFL instructors get about assessing a language skill, the more likely they are to be more competent in assessing other skills, which highlights the issue of the interrelatedness of skill-based language assessment knowledge.

b. Is There Any Change in LAK Level?

The other sub-question of the first research question was to investigate changes in language instructors' LAK with respect to variables such as years of experience, educational background, the bachelor's program graduated from, and training on testing and assessment. The results for each variable are presented under the subsections below.

i. Years of Experience

Table 7. Changes in LAK in terms of years of experience

Years of Experience   N    M
1-5 years             2    38.50
6-10 years            13   38.46
11-15 years           8    35.88
16-20 years           8    38.63

One of the variables was years of experience: whether the EFL instructors' LAK changed based on the years they had spent in the profession was examined. The participant numbers and mean scores for each experience category are illustrated in Table 7.

Table 8. One-way ANOVA results based on years of experience

                 Sum of Squares   df   Mean Square   F      p
Between Groups   41.713           3    13.904        .191   .902
Within Groups    1966.481         27   72.833
TOTAL            2008.194         30
*p < .05

Additionally, a one-way ANOVA was administered to investigate the effect of years of experience on LAK. The results in Table 8 indicated no significant difference among the four groups (p = .902), revealing that teaching experience does not change the LAK level of language instructors.

ii. Educational Background

The second variable examined under the sub-question was educational background. As indicated in Table 9, out of 31 participants, 13 instructors had a BA degree, 14 had an MA degree, and 4 had a Ph.D. degree.

Table 9. Changes in LAK in terms of educational background

Educational Background   N    M
BA degree                13   36.54
MA degree                14   37.21
Ph.D. degree             4    44.25

According to the one-way ANOVA results in Table 10, which examined the differences among the three groups mentioned above, no statistically significant difference was revealed (p = .245), indicating that the LAK level of language teachers does not change with respect to their educational background.

Table 10. One-way ANOVA results based on educational background

                 Sum of Squares   df   Mean Square   F       p
Between Groups   191.856          2    95.928        1.479   .245
Within Groups    1816.338         28   64.869
TOTAL            2008.194         30
*p < .05
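Both this analysis and the one in Table 8 are one-way ANOVAs comparing LAK scores across groups. A minimal sketch of the computation with SciPy is given below; the group score lists are hypothetical placeholders rather than the study's data.

```python
# Minimal sketch of a one-way ANOVA across participant groups.
# Group score lists are hypothetical placeholders.
from scipy import stats

ba_scores  = [36, 37, 35, 38, 36, 37]   # hypothetical BA-degree group
ma_scores  = [37, 38, 36, 37, 39, 36]   # hypothetical MA-degree group
phd_scores = [44, 45, 43, 45]           # hypothetical Ph.D.-degree group

f_stat, p_value = stats.f_oneway(ba_scores, ma_scores, phd_scores)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
# p > .05 would mirror the study's finding of no significant
# difference in LAK across educational backgrounds.
```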

iii. BA Program Graduated

As another variable, whether or not the participants were ELT graduates was examined. Table 11 presents the numbers of ELT and non-ELT participants.

Table 11. Changes in LAK in terms of the BA program graduated

BA Program Graduated   N    M
ELT                    25   39.20
Non-ELT                6    32.17

The independent samples t-test analysis in Table 12 shows that there is no significant difference between the ELT and non-ELT groups (p = .057). That is to say, whether language instructors graduated from a university English Language Teaching department makes no difference in their LAK level.


Table 12. Independent samples t-test results based on the BA program graduated

             Mean Diff.   df   t       p
LAKS_TOTAL   7.033        29   1.981   .057
*p < .05
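The two-group comparisons reported in Tables 12, 14, 16, and 18 all rest on the independent samples t-test. The sketch below shows the corresponding SciPy call, with hypothetical score lists for the ELT and non-ELT groups standing in for the study's data.

```python
# Minimal sketch of an independent samples t-test (ELT vs. non-ELT).
# Score lists are hypothetical placeholders.
from scipy import stats

elt_scores     = [40, 38, 41, 39, 37, 42, 38, 40, 39, 41]
non_elt_scores = [33, 31, 34, 32, 30, 33]

# equal_var=True corresponds to the classic Student's t-test.
t_stat, p_value = stats.ttest_ind(elt_scores, non_elt_scores, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```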

iv. Having a Testing Course in BA

The impact of a testing course taken during the bachelor's degree was scrutinized as another variable. Table 13 indicates the number of participants who had taken a testing course in their BA: 24 language teachers had taken such a course, whereas 7 had not.

Table 13. Changes in LAK in terms of having a testing course in BA

A Testing Course in BA   N    M
YES                      24   39.29
NO                       7    32.86

When the independent samples t-test findings are analyzed, as is evident in Table 14, there was no statistically significant difference between the two groups (p = .066). In other words, testing and assessment courses taken during the BA do not appear to influence the LAK level of language instructors.

Table 14. Independent samples t-test results based on having a testing course in BA

             Mean Diff.   df   t       p
LAKS_TOTAL   6.435        29   1.910   .066
*p < .05

v. Training on Testing and Assessment

Training on testing and assessment was examined as the fifth variable under the sub-question. Participants were asked whether they had received any training on testing and assessment apart from the courses taken during their BA. As illustrated in Table 15, 16 participants reported having such training, while 15 reported none.

Table 15. Changes in LAK in terms of training on testing and assessment

Training on Testing and Assessment   N    M
YES                                  16   39.38
NO                                   15   36.20


The independent samples t-test was applied in order to examine the difference between these two groups, and the findings in Table 16 revealed no statistically significant difference (p = .288), underlining that training on language testing and assessment does not appear to have a significant effect on instructors' LAK level.

Table 16. Independent samples t-test results based on training on testing and assessment

             Mean Diff.   df   t       p
LAKS_TOTAL   3.175        29   1.083   .288
*p < .05

vi. Being a Testing Office Member

Table 17. Changes in LAK in terms of being a testing office member

Testing Office Member   N    M
YES                     19   39.68
NO                      12   34.92

The final variable examined was being a testing office member: whether preparing language exams for students has any impact on language instructors' LAK level was scrutinized. Table 17 demonstrates that, out of 31 subjects, 19 teachers reported having worked as a testing office member in an institution, whereas 12 had not.

Table 18. Independent samples t-test results based on being a testing office member

             Mean Diff.   df   t       p
LAKS_TOTAL   4.768        29   1.623   .115
*p < .05

As the independent samples t-test findings in Table 18 show, no statistically significant difference was revealed between the YES and NO groups (p = .115). That is, similar to the previous variables, being a testing office member and preparing language exams for learners do not change the LAK level of language instructors. Overall, based on these results, for the variables of years of experience, educational background, BA program graduated, having a testing course in BA, training on testing and assessment, and being a testing office member, no statistically significant change could be concluded, which suggests that these variables neither increase nor decrease the language assessment knowledge of the instructors.


2. What Assessment Purposes, Methods, and Procedures Do EFL Instructors Report Using in Their Language Assessment Practices?

The purpose of the second research question was to identify the purposes, methods, and procedures language instructors report using. Through descriptive statistics, the purposes of assessment and evaluation were first presented under the categories of student-centered, instructional, and administrative purposes. Secondly, assessment methods for each language skill were introduced under the categories of instructor-made, student-conducted, and standardized testing. Finally, under the category of procedures, the sources language instructors resort to, the feedback methods they use, and the time they spend on testing and evaluation were put forward.

a. Purposes of Assessment

Table 19. Purposes of assessment

Purpose                                                                  N (n = 31)   %
Student-centered purposes
  Obtain information on my students' progress                            31           100
  Provide feedback to my students as they progress through the course    30           96.8
  Diagnose strengths and weaknesses in my students                       31           100
  Determine final grades for my students                                 26           83.9
  Motivate my students to learn                                          24           77.4
  Formally document growth in learning of my students                    25           80.6
  Make my students work harder                                           20           64.5
  Prepare students for tests they will need to take in the future
  (e.g., TOEFL, MELAB, CET)                                              15           48.4
Instructional purposes
  Plan my instruction                                                    28           90.3
  Diagnose strengths and weaknesses in my own teaching and instruction   29           93.4
  Group my students at the right level of instruction in my class        19           61.3
Administrative purposes
  Provide information to the central administration                      27           87.1
  Provide information to an outside funding agency                       4            12.9

In one of the subsections of the survey questionnaire developed by Cheng et al. (2004), participants were asked to respond to the 13 purposes presented in Table 19 in order to reveal their own aims in assessing and evaluating their students. As can be seen in the table, the purposes of assessment were separated into three categories: student-centered, instructional, and administrative. For student-centered purposes, more than half of the subjects reported that they used assessment for all of the tasks under this category except preparing students for the standardized tests they will take in the future. This may be because all of the participants were prep school language teachers working at a state university, and the language program at the school does not have such a role. Concerning the second category, instructional purposes, nearly all subjects reported resorting to assessment for planning their instruction and diagnosing the strengths and weaknesses of their teaching. However, grouping students at the right level of instruction was not among the purposes of some of the subjects, which may result from the fact that at the school of foreign languages, students are always grouped by proficiency level at the very beginning of the academic year by the administration through a placement exam; it is generally not the teachers' responsibility. As for administrative purposes, providing information to the central administration was within the scope of the instructors' assessment purposes, whereas delivering information to an outside funding agency was not. In a sense, the instructors reported utilizing assessment and evaluation to give educational information about their students to the school administration, which seems to be an expected outcome. However, as no such requirement exists at their school, they did not report giving information about students' progress to an outside agency.
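The descriptive summaries in Tables 19 through 25 amount to counting how many instructors selected each option and converting the counts to percentages of n = 31. The sketch below shows one way to compute such a summary in pandas; the binary response matrix is a hypothetical placeholder (1 = option selected), not the study's data.

```python
# Minimal sketch of the N / % summaries used in Tables 19-25.
# The response matrix is a hypothetical placeholder (1 = selected).
import pandas as pd

responses = pd.DataFrame({
    "obtain_progress_info": [1, 1, 1, 1, 1, 1],
    "provide_feedback":     [1, 1, 1, 0, 1, 1],
    "determine_grades":     [1, 0, 1, 1, 0, 1],
})

summary = pd.DataFrame({
    "N": responses.sum(),                     # how many selected each option
    "%": (responses.mean() * 100).round(1),   # share of all respondents
})
print(summary)
```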

b. Assessment Methods for Each Language Skill

In the second subsection of the survey questionnaire, EFL instructors' classroom assessment practices for listening, speaking, reading, and writing were examined under three categories: instructor-made assessment methods, student-conducted assessment methods, and standardized testing. The findings for reading are demonstrated in Table 20. Regarding instructor-made assessment methods, most participants reported using short-answer, matching, true-false, multiple-choice, and sentence completion items. It was surprising that cloze items were not reported as widely as the others, because cloze questions are used for an in-depth understanding of grammatical competence as part of the language proficiency test at the school of foreign languages. Apart from that, editing was the least selected item type, which seems logical, as the students generally do not encounter many editing tasks for reading throughout the year.


Table 20. Assessment methods for reading

Assessment Methods                          N (n = 31)   %
Instructor-made assessment methods
  Short-answer items                        29           93.5
  Matching items                            25           80.6
  Interpretive items                        20           64.5
  True-false items                          28           90.3
  Multiple-choice items                     30           96.8
  Cloze items                               19           61.3
  Sentence completion items                 28           90.3
  Editing                                   12           38.7
  Completion of forms (e.g., application)   18           58.1
  Student summaries of what they read       18           58.1
Student-conducted assessment methods
  Student journal                           5            16.1
  Oral interviews/questioning               23           74.2
  Peer assessment                           14           45.2
  Read aloud/dictation                      11           35.5
  Self-assessment                           11           35.5
  Student portfolio                         20           64.5
Standardized testing (reading)              22           71.1

For student-conducted assessment methods, the percentages were not as high as for instructor-made methods. Oral interviews/questioning and student portfolios received the highest rankings, because the reading and writing lessons within the program require such methods for careful assessment; a much lower pattern was observed for student journals, peer/self-assessment, and dictation. In the final category, 22 out of 31 language instructors reported resorting to standardized testing in assessing reading, which is relatively low. This is also a surprising result, because standardized reading tests are generally among the most popular and in-demand assessment methods in educational contexts. Standardization has the utmost importance and value in language teaching, especially in schools with more than 20 classrooms; hence, it seems crucial to make room for standardized reading tests in both language proficiency and placement exams.

Moreover, the results for writing assessment methods are demonstrated in Table 21. Concerning instructor-made assessment methods, short essays and editing a sentence/paragraph were reported to be the most preferred methods for the writing skill. The other methods had low percentages, reflecting the language program of the school of foreign languages and the school's assessment procedures. For example, the writing exams do not generally include long essays or matching/true-false items; students are instead expected to write a paragraph in a specific genre, such as an opinion or argumentative paragraph on a given topic. That is why the instructors do not include those evaluation methods in their assessment process.

Among the student-conducted writing assessment methods, the use of student portfolios got the highest ranking (25 out of 31). On the other hand, similar to the reading assessment methods, fewer than half of the participants reported using student journals and peer/self-assessment methods. This can also be explained by the assessment procedures of the school: since there is a standardized assessment process for all of the prep classes at the school of foreign languages the instructors work for, they have to keep up with those standards and follow the testing office's directions.

The last category was standardized testing for writing, with 16 out of 31 participants reporting that they preferred standardized tests for assessing writing. This finding contradicts the previous results: although standardization is a critical issue for student-conducted writing assessment, the same does not seem to hold for the actual method, which seems ironic. The instructors might perceive standardized tests as tests developed by publishing companies, or tests like TOEFL and IELTS, and thus differentiate between the assessment methods administered to the students at their school and other standardized tests. The difference between the findings might be explained by this discrepancy in perceptions.

Table 21. Assessment methods for writing

Assessment Methods                                                     N (n = 31)   %
Instructor-made assessment methods
  Short essay                                                          30           96.8
  Editing a sentence or paragraph                                      26           83.9
  Long essay                                                           16           51.6
  Multiple-choice items to identify grammatical errors in a sentence   15           48.4
  Matching items                                                       14           45.2
  True-false items                                                     11           35.5
Student-conducted assessment methods
  Student journal                                                      10           32.3
  Peer assessment                                                      14           45.2
  Student portfolio                                                    25           80.6
  Self-assessment                                                      10           32.3
Standardized testing (writing)                                         16           51.6

Furthermore, the findings for listening and speaking assessment methods are illustrated in Table 22. Both instructor-made and student-conducted assessment methods revealed a great variety of assessment types, and most participants reported using the methods shown below for listening and speaking skills. Among the student-conducted methods, oral presentations got the highest ranking (30 out of 31), showing that oral presentations have an important place in the assessment process. However, as is the case for the other two skills, dictation and peer/self-assessment were not among the most popular methods for listening and speaking.


Table 22. Assessment methods for listening and speaking

Assessment Methods                                                N (n = 31)   %
Instructor-made assessment methods
  Take notes                                                      27           87.1
  Prepare summaries of what is heard                              23           74.2
  Multiple-choice items following listening to a spoken passage   26           83.9
Student-conducted assessment methods
  Oral presentation                                               30           96.8
  Oral interviews/dialogues                                       25           80.6
  Oral discussion with each student                               28           90.3
  Retell a story after listening to a passage                     24           77.4
  Provide an oral description of an event or thing                27           87.1
  Peer assessment                                                 12           38.7
  Oral reading/dictation                                          14           45.2
  Self-assessment                                                 11           35.5
  Follow directions given orally                                  22           71.0
  Public speaking                                                 17           54.8
  Give oral directions                                            20           64.5
Standardized testing
  Listening test                                                  18           58.1
  Speaking test                                                   16           51.6

As for the standardized tests, the table indicates relatively low percentages, as expected. To clarify, 18 out of 31 instructors reported using standardized listening tests, and 16 out of 31 (slightly more than half) standardized speaking tests. As is the case for both reading and writing, there appears to be a difference in language instructors' perceptions of what standardized tests are. As a result, although most teachers used the assessment methods prepared and scheduled by the testing office at their school under standardized procedures, they did not resort to standardized listening and speaking tests such as TOEFL and IELTS.

c. Procedures of Assessment

In the final subsection of the survey questionnaire developed by Cheng et al. (2004), the participants were asked to report the sources of their assessments, the forms of feedback they provided, and the time they spent on language assessment.

Concerning the sources of assessment, as demonstrated in Table 23, the highest-rated source was textbooks (30 out of 31). Following that, 22 out of 31 participants stated that they preferred to develop their own assessments. The mandated syllabus/curriculum, the internet, and other instructors were also among the popular sources (19-21 out of 31). The least selected source of assessment was published assessments (14 out of 31), which suggests that technological advances have changed the way language instructors prepare language tests and tasks.

Table 23. Sources of assessment

Sources                          N (n = 31)   %
Instructor                       22           71.1
Other instructors                19           61.3
Print sources
  Textbooks                      30           96.8
  Published assessments          14           45.2
  Mandated syllabus/curriculum   21           67.7
Internet                         19           61.3

As for the forms of feedback provided, Table 24 shows the various types and the ways they are reported to students. During the course, nearly all participants stated that they provided feedback in the form of either verbal or written comments. Total test scores were also among the highly preferred forms of feedback (26 out of 31). On the other hand, teaching diaries, letter grades, and individual conferences with students appeared not to be very popular among language teachers. For checklists, more than half of the subjects reported using this kind of feedback mechanism during courses.

Table 24. Forms of feedback

Forms of Feedback            N (n = 31)   %
During course
  Verbal feedback            29           93.5
  Written comments           30           96.8
  Teaching diary/log         1            3.2
  Conference with students   10           32.3
  Total test scores          26           83.9
  Letter grades              4            12.9
  Checklist                  18           58.1
Final report
  Written comments           25           80.6
  Teaching diary/log         1            3.2
  Conference with students   8            25.8
  Total test scores          26           83.9
  Letter grades              9            29.0
  Checklist                  13           41.9

Regarding how the feedback is provided in the final reporting, the most common report type was the total test score (26 out of 31), and the second most common was written comments (25 out of 31). As is the case for feedback during the course, final reports in the form of teaching diaries, individual conferences, and letter grades were not highly popular with the subjects. Finally, the table indicates that the instructors reported using end-of-course checklists less frequently (13 out of 31).

Table 25. Time spent on language assessment

Total Time    N (n = 31)   %
10% or less   1            3.2
11-15%        3            9.7
16-20%        3            9.7
21-30%        3            9.7
31-40%        8            25.7
40% or more   13           42.0

For the time spent on language assessment, the distribution of the total time (as frequency and percentage) is represented in Table 25. More than half of the instructors (21 out of 31) reported spending more than 30% of their time at school on testing. Indeed, given that 19 of the 31 participants mentioned having worked as testing office members, these findings seem plausible. Based on them, it can be stated that the time instructors spend on assessment, including instruction, exam preparation, and marking/scoring, is not negligible. Most instructors recognize that teaching requires not only providing theoretical information to students but also a standardized evaluation of the teaching process, which demands considerable dedication.

DISCUSSION

The main aim of the present study was to investigate the general and skill-based language assessment knowledge of its participants and to delve into the assessment purposes, methods, and practices they employ. The results of the LAK scale showed a significantly high level of general language assessment knowledge, 37.84 out of 60. When the participants' skill-based knowledge was analyzed, reading and speaking were found to have higher LAK scores than listening and writing. For the reading skill, the instructors' LAK score was 11.29 out of 15, while for the speaking skill it was 10 out of 15. For listening, the LAK score was 8.45 out of 15, and for writing, 8.10 out of 15.

Anderson and Lynch (1988) pointed out that listening assessment is burdensome given the many additional factors that need to be considered, which could explain the low LAK score for listening. Similarly, Weigle (2012) pointed out that writing assessment is complex and challenging, and this is reflected in the findings of this study. Furthermore, technology, which became even more critical with online teaching, and its adverse effects may be a contributing factor in making writing assessment more challenging for instructors (Razı, 2015).

Reading, a receptive skill, was found to have the highest LAK score. Brown and Hudson (1998) indicate that reading assessment is generally conducted with the help of selected-response assessment tasks such as true-false and multiple-choice questions. Swan and Walter (2017) claim that activities such as matching, finding the main idea, and predicting are so widely administered in reading instruction and assessment that many instructors find them very practical, even though there might be other, more effective options. Since these types of assessment tasks are practical and easy to deal with, they might have contributed to the instructors' overall reading-based language assessment knowledge. As assessment knowledge and practice go hand in hand, smooth practice might have influenced the instructors' knowledge.

A surprising outcome of the study is the speaking LAK score, which was 10 out of 15. As speaking is a productive skill, its assessment has been considered more challenging than that of receptive skills (Brown & Hudson, 1998; O'Sullivan, 2012). However, in the context of this study, the syllabus, the in-service training that the participants were exposed to, and the speaking rubric designed specifically for the institution's speaking exams might have had a significant influence on the participants' speaking-based assessment knowledge development.

However, there were no statistically significant relationships between the instructors' LAK level and their years of experience, educational background, bachelor's degree, testing and assessment courses taken during the BA, any training on testing and assessment they received, or whether they had worked as a testing office member. The reason could stem from the fact that almost all of the participants had been working as English instructors in the same institution for more than 5 years, apart from 2 instructors who can be considered novice teachers. Many factors, such as using the same syllabus and the same books, taking part in the same in-service training courses, and using the same assessment methods, may have played a role.

Regarding assessment purposes, it was revealed that more than half of the participants used assessment methods for student-centered purposes, such as obtaining information about their students' development and giving feedback, and almost all of the participants used assessment for instructional purposes, like getting feedback about their own teaching and assessment methods. As Stiggins (1999) points out, assessment is an excellent way to reveal the effectiveness of classroom instruction, and the findings of this study correspond with this, as participants selected obtaining information about students and getting feedback about their own teaching as important reasons for using assessment. Cheng et al. (2004) and Acar-Erdol and Yıldızlı (2018) also clarified the mutual relationship between assessment and teaching, indicating that assessment is a great way to get feedback. Apparently, the participants find assessment beneficial for feedback, too.
For reading assessment, instructors reported primarily using short-answer questions, matching, true-false items, multiple-choice questions, and sentence completion. Brown and Hudson (1998) stated that selected-response and constructed-response assessment tasks are generally preferred for reading assessment, and this study's findings are compatible with this. This may be because reading is assessed with the help of its subskills (Hubley, 2012), and controlled tasks are more suitable for the assessment of the reading skill. As for student-conducted assessment methods for reading, the findings revealed that oral interviews and student portfolios were used, while student journals and peer and self-assessment were not as popular. It is not surprising to see student portfolios among the popular assessment methods, as they are actually an obligatory part of the curriculum in the institution where the data were collected.

As for writing assessment, participants reported choosing short essays and editing exercises, and these findings are similar to what Cheng et al. (2004) and Brown and Hudson (1998) indicated about using essays and personal-response assessments in writing assessment. However, the participants reported not using long essays and true-false items for the assessment of writing, which can again be explained by syllabus restrictions. As the writing syllabus mostly consists of paragraphs and short essays, it is not surprising to see these as the most preferred assessment methods. When it comes to student-conducted assessment methods, the use of portfolios had the highest score, which is also a part of the school's writing syllabus. However, student journals and peer and self-assessment were found to be less preferred, which can be considered one of the study's most surprising findings. Brown (2003) claims that one of the most important aims of assessment is to foster students' self-evaluation. Similarly, Harmer (2004) mentions the importance of self-assessment and peer feedback in writing assessment, and Brown and Hudson (1998) indicate that writing assessment can best be carried out with the help of personal-response assessments. Hence, it is surprising to see peer and self-assessment among the methods that are not preferred. There could be many reasons behind this, such as the curriculum (Zhang & Burry-Stock, 2003) and teachers' beliefs (Acar-Erdol & Yıldızlı, 2018).

For listening and speaking skills, the findings revealed that participants preferred oral presentations during their teaching. According to Brown and Hudson (1998), Cheng et al. (2004), and O'Sullivan (2012), personal-response assessment tasks, interviews, and oral presentations are effective methods of speaking assessment; however, it is interesting to see oral presentations chosen for listening assessment. Furthermore, the results highlight that the participants did not select dictation, peer assessment, or self-assessment as methods they would like to utilize. This finding, which is quite similar to that for writing assessment, could be an indicator of teachers' preference for teacher feedback: although they picked feedback as one of the most important purposes of assessment, they stated that they did not make use of peer assessment and self-assessment. Consequently, this might show a need for teacher training related to formative assessment methods. Moreover, the feedback types the participants reported employing were verbal and written feedback in the form of total test scores. In addition, they reported not preferring methods such as teaching diaries and student conferences to give feedback, which shows that they might like to rely on teacher feedback.

CONCLUSION

This study, which aimed to find out the language assessment knowledge of its participants along with their assessment purposes, methods, and practices, found that the general LAK level of the participants was 37.84 out of 60, which is statistically significantly above the reference score. For skill-based LAK levels, reading had the highest score, followed by speaking, while listening and writing were the skills with the lowest LAK scores. In addition, it was revealed that there were no statistically significant relationships between participants' LAK levels and their education, years of experience, testing office experience, BA degree, or testing training experience. Regarding assessment methods, it was found that short-answer, matching, and true-false questions were preferred over editing and student journals for reading assessment. For writing assessment, short essays and editing, along with student portfolios, were the assessment methods the participants used. Finally, for speaking and listening, the oral presentation was the most preferred method, whereas peer assessment and self-assessment were left out. Furthermore, even though the participants claimed that feedback was one of the main objectives of assessment, it can be concluded that they lean on verbal and written teacher feedback rather than alternative feedback forms. Last but not least, it was revealed that the participants tend to use textbooks while forming assessment tools and spend more than 30% of their time on testing.

The findings of this study present some critical implications for the related literature. First of all, despite the participants' high levels of general and skill-based LAK, it can be said that they might need more training related to formative assessment and alternative feedback methods. Being more knowledgeable about these notions will positively affect their assessment knowledge and language assessment literacy; hence, training instructors in formative and alternative assessment methods is necessary (Popham, 2009; Malone, 2013). Similarly, another implication of this study is the importance of training language instructors about assessment as a whole so that they can keep up with the requirements of different courses and educational developments. For further research, the correlation between instructors' LAK and their assessment practices can be investigated to see if there are any statistically significant relationships. Last but not least, assessment knowledge and practices could be handled separately for each language skill.

ACKNOWLEDGMENT

It is our pleasure to acknowledge the help provided by all the language instructors from Kütahya Dumlupinar University, whose participation offered deeper insight into the present study. Apart from that, this research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

REFERENCES

Acar-Erdol, T., & Yıldızlı, H. (2018). Classroom assessment practices of teachers in Turkey. International Journal of Instruction, 11(3), 587–602. doi:10.12973/iji.2018.11340a

Anderson, A., & Lynch, T. (1988). Listening. Oxford University Press.

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.

Berry, V., Sheehan, S., & Munro, S. (2019). What does language assessment literacy mean to teachers? ELT Journal, 73(2), 113–123. doi:10.1093/elt/ccy055

Brown, H. D. (2003). Language assessment: Principles and classroom practices. Pearson Education.

Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32(4), 653–675. doi:10.2307/3587999

Can, E. (2017). English teachers' classroom assessment practices and their views about the ethics of classroom assessment practices [Unpublished MA thesis]. Cağ University.

Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors' classroom assessment practices: Purposes, methods, and procedures. Language Testing, 21(3), 360–389. doi:10.1191/0265532204lt288oa

Creswell, J. W. (2012). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Merrill.

Davies, A. (2008). Textbook trends in teaching language testing. Language Testing, 25(3), 327–347. doi:10.1177/0265532208090156

Harmer, J. (2004). How to teach writing. Longman.

Henning, G. (1987). A guide to language testing: Development, evaluation, and research. Newbury House.

Hubley, N. N. (2012). Assessing reading. In C. Coombe, P. Davidson, B. O'Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 211–217). Cambridge University Press.

Hughes, A. (1989). Testing for language teachers. Cambridge University Press.

Kyriacou, C. (2007). Essential teaching skills. Blackwell Education.

Lam, R. (2019). Teacher assessment literacy: Surveying knowledge, conceptions, and practices of classroom-based writing assessment in Hong Kong. System, 81, 78–89. doi:10.1016/j.system.2019.01.006

Leung, C. (2014). Classroom-based assessment issues for language teacher education. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1510–1519). Wiley Blackwell.

Malone, M. E. (2011). Assessment literacy for language educators. CAL Digest, October 2011. Available at www.cal.org

Malone, M. E. (2013). The essentials of assessment literacy: Contrasts between testers and users. Language Testing, 30(3), 329–344. doi:10.1177/0265532213480129

Mertler, C. A., & Campbell, C. (2005). Measuring teachers' knowledge and application of classroom assessment concepts: Development of the assessment literacy inventory. Paper presented at the annual meeting of the American Educational Research Association, Montreal, Quebec, Canada. Retrieved from https://eric.ed.gov/?id=ED490355

O'Sullivan, B. (2012). Assessing speaking. In C. Coombe, P. Davidson, B. O'Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 234–246). Cambridge University Press.

Ölmezer-Öztürk, E. (2018). Developing and validating Language Assessment Knowledge Scale (LAKS) and exploring the assessment knowledge of EFL teachers [Unpublished PhD dissertation]. Anadolu University.

Ölmezer-Öztürk, E., & Aydın, B. (2019). Investigating language assessment knowledge of EFL teachers. Hacettepe University Journal of Education, 34(3), 602–620. doi:10.16986/HUJE.2018043465

Özdemir-Yılmazer, M., & Özkan, Y. (2017). Classroom assessment practices of English language instructors. Journal of Language and Linguistic Studies, 13(2), 324–345.

Popham, W. J. (2006). All about accountability / Needed: A dose of assessment literacy. Educational Leadership, 63(6), 84–85.

Popham, W. J. (2009). Assessment literacy for teachers: Faddish or fundamental? Theory Into Practice, 48(1), 4–11. doi:10.1080/00405840802577536

Purpura, J. E. (2016). Second and foreign language assessment. Modern Language Journal, 100(S1), 190–208. doi:10.1111/modl.12308

Razı, S. (2015). Development of a rubric to assess academic writing incorporating plagiarism detectors. SAGE Open, 5(2), 1–13. doi:10.1177/2158244015590162

Sevimel-Sahin, A., & Subasi, G. (2019). An overview of language assessment literacy research within English language education context. Kuramsal Eğitimbilim Dergisi, 12(4), 1340–1364.

Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14. doi:10.3102/0013189X029007004

Stiggins, R. J. (1999). Evaluating classroom assessment training in teacher education programs. Educational Measurement: Issues and Practice, 18(1), 23–27. doi:10.1111/j.1745-3992.1999.tb00004.x

Swan, M., & Walter, C. (2017). Misunderstanding comprehension. ELT Journal, 71(2), 228–236. doi:10.1093/elt/ccw094

Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Pearson Education.

Taber, K. S. (2018). The use of Cronbach's alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296. doi:10.1007/s11165-016-9602-2

Taylor, L. (2006). The changing landscape of English: Implications for language assessment. ELT Journal, 60(1), 51–60. doi:10.1093/elt/cci081

Tran, T. H. (2012). Second language assessment for classroom teachers. Paper presented at MIDTESOL 2012, Ames, IA.

Tribble, C. (1996). Writing. Oxford University Press.

Weigle, S. C. (2012). Assessing writing. In C. Coombe, P. Davidson, B. O'Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 236–246). Cambridge University Press.

White, E. (2009). Are you assessment literate? OnCue Journal, 3(1), 3–25.

Zhang, Z., & Burry-Stock, J. A. (2003). Classroom assessment practices and teachers' self-perceived assessment skills. Applied Measurement in Education, 16(4), 323–342. doi:10.1207/S15324818AME1604_4

ADDITIONAL READING

Coombe, C., Vafadar, H., & Mohebbi, H. (2020). Language assessment literacy: What do we need to learn, unlearn, and relearn? Language Testing in Asia, 10(1), 1–16. doi:10.1186/s40468-020-00101-6

Creswell, J. W., & Guetterman, T. C. (2019). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Pearson.

Giraldo, F. (2021). Language assessment literacy and teachers' professional development: A review of the literature. Profile: Issues in Teachers' Professional Development, 23(2), 265–279. doi:10.15446/profile.v23n2.90533

Nikolov, M., & Timpe-Laughlin, V. (2021). Assessing young learners' foreign language abilities. Language Teaching, 54(1), 1–37. doi:10.1017/S0261444820000294

Nurdiana, N. N. (2022). Language teacher assessment literacy: A current review. Journal of English Language and Culture, 11(1). doi:10.30813/jelc.v11i1.2291

Ölmezer-Öztürk, E., & Aydın, B. (2018). Toward measuring language teachers' assessment knowledge: Development and validation of Language Assessment Knowledge Scale (LAKS). Language Testing in Asia, 8(1), 20. doi:10.1186/s40468-018-0075-2

Xu, Y., & Brown, G. T. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58, 149–162. doi:10.1016/j.tate.2016.05.010

KEY TERMS AND DEFINITIONS

Assessment Literacy: The combination of knowledge and skills that are essential to carry out effective assessment practices.

Assessment Principles: Fundamentals that should be considered when forming and administering an assessment tool.

Language Assessment: Evaluation of language patterns that is applied in order to investigate the output of language instruction.

Language Assessment Knowledge: The extent of information that needs to be possessed related to the terms and notions regarding different forms of language assessment.

Language Assessment Practices: Approaches, methods, and tools that the instructors utilize in classrooms in order to conduct language assessment.

Skill-Based Assessment: Different forms of effective language evaluation based on each language skill.

Standardization in Assessment: Assuring consistency and regularity in terms of language assessment practices.


Chapter 14

Parents' Perceptions of Foreign Language Assessment and Parental Involvement

Nesrin Ozturk
https://orcid.org/0000-0002-7334-8476
İzmir Democracy University, Turkey

Begum Atsan
https://orcid.org/0000-0002-8312-0029
İzmir Democracy University, Turkey

ABSTRACT

International and national foreign language education policies recognize the invaluable role of parents. Because parents' perceptions of foreign language assessment may initiate parental involvement behaviors, a qualitative descriptive study was conducted to investigate the phenomenon. Data were collected from 25 parents via semi-structured interviews and analyzed thematically. Findings confirmed that parents' understanding of foreign language proficiency pertains to the communicative use of the language. However, assessment practices at schools are test-driven, and they may not be authentic, valid, or criterion-referenced. Parents, moreover, highlighted a need for assessment literacy; nevertheless, they do not get support from any stakeholders. Also, the outcomes of assessment practices may initiate parental involvement behaviors that pertain to parenting-helping, communicative-effective, and learning-at-home interactions. This study highlights an urgent need to improve parents' foreign language assessment literacy and parental involvement behaviors to enrich learners' development.

INTRODUCTION

The number of children learning English as a foreign language has increased (Forey et al., 2016); in relation, nonnative speakers of English may even outnumber native speakers (Hosseinpour, Yazdani, et al., 2015). While governments strive to meet the challenges of the 21st century's dynamics, such as globalization, technological advancements, and migration (OECD, 2021), they also try to help students acquire a second and/or foreign language. This is because proficiency in a foreign language helps learners develop intercultural understanding, cognitive skills, and economic and career potential (Forey et al., 2016; Hosseinpour, Sherkatolabbasi, et al., 2015; OECD, 2021). The Organization for Economic Co-operation and Development's (OECD) recent policy of delivering an international foreign language assessment in 2025 may also produce a great shift. As the Programme for International Student Assessment (PISA) test validation model focuses on socio-cultural dynamics, parents' foreign language proficiency, family support regarding a foreign language, and parents' perceptions and attitudes will be in scope for their impact on learners' proficiency (OECD, 2021). In this sense, such practices may lead governments, including Turkey's, to keep revising their foreign language education policies and practices to include parents as a mechanism for language learners' development.

Context of the Study

Nature of the Curriculum

The Turkish Ministry of National Education (MoNE) redesigned the English language education curriculum for primary and middle schools with reference to the Common European Framework of Reference for Languages (CEFR) in 2018 (Milli Eğitim Bakanlığı, 2018b). As the CEFR proposes that language learning is a lifelong process, Turkish students start to learn English as of the 2nd grade, if not earlier (Milli Eğitim Bakanlığı, 2018b). Regarding children's developmental characteristics and the CEFR's framework, the English language curriculum also puts forward developing language skills in an authentic environment and, in relation, employing a diversity of assessment practices.

Nature of the Assessment

The MoNE highlighted that an important aspect of the curriculum is assessment and evaluation. Indeed, "learning, teaching, and testing are part of a whole, interacting constantly with each other in shaping not only teachers' instructional choices but also students' learning strategies, and even parents' attitudes toward what is critical and valuable in educative provisions" (Milli Eğitim Bakanlığı, 2018b, p. 6).

As the CEFR suggests criterion-referenced assessment (Council of Europe, 2020), the MoNE suggests various methods and techniques, such as alternative, process-oriented, self-assessment, diagnostic, reflective, and summative evaluation, to evaluate language skills and competencies (Milli Eğitim Bakanlığı, 2018b). Assessment practices may also vary with the developmental characteristics of the learners. For the 2nd and 3rd graders, assessment practices may be implemented in and out of class, and they reflect the notions of formative assessment. As of the 4th grade, both summative and formative assessment practices may be employed. In this sense, (a) written and oral tests, quizzes, and take-home assignments and (b) high-stakes exams, product-oriented projects, and pen-and-paper tests can be employed for formative and summative assessment, respectively (Milli Eğitim Bakanlığı, 2018b).


Parents and the English Language Curriculum

As the invaluable roles of the "support mechanisms of teachers and other shareholders," including parents, are recognized (Milli Eğitim Bakanlığı, 2018b, p. 4), the MoNE has implemented an assessment platform. On this platform, parents can track all of their children's assessment records from kindergarten through high school, at any level and in any form, with both content and competencies clarified. Moreover, parents are offered modules to familiarize themselves with assessment practices. Thereby, all stakeholders can discuss children's development as well as take initiatives to support their progress or manage insufficiencies, if necessary (Milli Eğitim Bakanlığı, 2018a).

Purpose of the Research

The MoNE's initiatives for parental involvement may seem promising because Turkish parents may have every opportunity to be informed about the objectives as well as the forms of assessment, and they can get support from stakeholders when they feel incompetent to contribute to their children's success. However, the lack of research on parental involvement in Turkey highlights a need to examine the practicality and effectiveness of these initiatives, if at all. Thereby, this study focuses on describing parents' understandings of foreign language assessment (FLA) and its practices at the late elementary and middle school levels. To this end, the following questions will be answered:

1. What are Turkish parents' perceptions of foreign language proficiency?
   a. How do foreign language assessment practices and parents' perceptions of foreign language proficiency align, if at all?
2. How much do Turkish parents know about foreign language assessment, if at all?
3. How are Turkish parents involved in their children's foreign language learning processes, if at all?

BACKGROUND

Assessment

Assessment pertains to an ongoing process where various techniques (Purpura, 2016) as well as skills are used to evaluate performance in an area of study (Larenas et al., 2021). Although assessment may be considered an instrument for measuring students' performance through traditional tests, it is important to highlight that assessment and testing are different terms. Testing is scheduled in the curriculum, planned, and structured, whereas assessment is a non-stop process (Brown, 2004). In this sense, as Colby-Kelly and Turner (2007) stated, assessment has the function of bridging teaching and learning. According to Thomas et al. (2004), a good assessment should help students and teachers determine the suitability of the content, the pace of instruction, and effective teaching approaches, as well as initiate observation of learning throughout the course and self-evaluation.

Foreign language assessment (FLA), similarly, has the purpose of bringing out the achievement of foreign language learners through procedural content such as a syllabus, proficiency tests, or standards (Purpura, 2016). It pertains to the communicative use of productive or receptive language skills across different proficiency levels represented by standards or competencies (Council of Europe, 2020).

Types of Assessment

Summative vs. Formative Assessment. Summative and formative assessment may have similar features; however, their purposes are different. Summative assessment aims to measure learners' performance over the whole semester to check whether the objectives have been met. Summative assessment measures may be final tests or general proficiency exams (Brown, 2004). Formative assessment, on the other hand, is an ongoing multi-phase process, and it may happen at any time in the learning process (Ketabi & Ketabi, 2014). Teachers may employ formative assessment to get feedback for the most appropriate immediate instructional actions (Brown, 2004). Practices such as portfolios, surveys, journals, interviews, and oral presentations can be examples of formative assessment (Ketabi & Ketabi, 2014).

Norm-Referenced vs. Criterion-Referenced Assessment. Norm-referenced tests rank students to show their performance relative to others (Hussain et al., 2015). Some examples may be large-scale and standardized tests like the TOEFL or SAT (Brown, 2004). Criterion-referenced tests, on the other hand, have a structured and specified grading system to assess learners' performances (Hussain et al., 2015). In this sense, classroom tests designed by instructors for course objectives can be considered examples of this type (Brown, 2004).

Diagnostic Assessment. Diagnostic tests aim to detect learners' weaknesses and strengths regarding their knowledge and skills (Lee & Sawaki, 2009). Therefore, crucial elements and modifications can be determined for instructional design (Hughes, 1989). For this purpose, learners' profiles may be identified by, for example, proficiency tests, interviews, and observation to produce a detailed analysis of needs and competencies (Hughes, 1989).

Assessment Principles

Some principles such as practicality, reliability, validity, authenticity, and washback may be considered for the effectiveness of assessment practices (Herrera & Macías, 2015). Initially, a practical test should not be expensive or hard to manage (Brown, 2004). As East (2016) stated, practicality also pertains to having enough time, materials, or people to administer the assessment process effectively. Moreover, it helps compare the efficiency of the practice to other types of assessment (East, 2016).

The other principle that needs recognition is reliability. Reliability pertains to the steadiness and fairness of tests. That is, test procedures and results should be straightforward across diverse situations (Brown, 2004). As test results may be impacted by several factors, a scoring criterion should be designed to support reliability (Navarrete, 1990). In this sense, tests whose scores are consistent across times, places, or occasions may be considered reliable in identifying competencies (Hughes, 1989).

Validity measures how appropriately the test measures the objectives of the content. There are types of validity such as content, criterion-related, construct, and face validity. Content validity pertains to the efficiency and representativeness of the test in terms of its content (Akbari, 2012). Criterion-related validity pertains to the test's effectiveness in predicting competencies (Brown, 2004). Construct validity pertains to the coverage of the test (Hughes, 1989). Face validity pertains to the perception of the test-takers, which cannot be tested empirically by an instructor or an expert (Brown, 2004).

Authenticity is another pillar that needs to be considered for assessment practices. It pertains to identifying learners' ability to handle real-life situations (Brown, 2004; Doye, 1991). Authenticity may be ensured when the test language is natural, items are contextualized, content is appropriate or attractive to the learner, and the organization of themes mimics the real world (Brown, 2004).
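Although the chapter treats reliability conceptually, it is often quantified in practice; one common index of internal consistency is Cronbach's alpha. The sketch below implements its textbook formula in Python; it is an illustration under that assumption, not a procedure taken from this chapter, and the data are hypothetical.

```python
# Minimal sketch of Cronbach's alpha, a common internal-consistency
# index of reliability; illustrative only, with hypothetical data.
import numpy as np

def cronbach_alpha(scores) -> float:
    """scores: a respondents-by-items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                           # number of items
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical matrix: 5 respondents x 4 items.
data = [[3, 4, 3, 4], [2, 2, 3, 2], [4, 4, 4, 5], [3, 3, 2, 3], [4, 5, 4, 4]]
print(round(cronbach_alpha(data), 3))
```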


The last principle is washback, and it pertains to the effects of language testing on learning and teaching. Buck (1988) claimed that washback effects may be both positive and negative. For positive washback, Bailey (1999) suggested some methodologies as follows: assessment practices should focus on abilities, samples may be used extensively and in an unpredictable way, direct testing may be implemented, testing may be criterion-referenced, achievement tests should be based on objectives, and both students' and instructors' understanding of the test should be ensured. Negative washback, on the other hand, arises when the test does not reflect language competencies and pressurizes the context of teaching and learning in terms of its form and content (Taylor, 2005). As Davies et al. (1999) stated, for example, if "the skill of writing is tested only by multiple choice items then there is great pressure to practice such items rather than to practice the skill of writing itself" (p. 225). In this sense, underrepresentation and irrelevance of the content should be reduced to lessen negative washback (Djurić, 2015).

Parental Involvement

Parents may be a great influence on (Deslandes & Rivard, 2013), contributor to (Dewi, 2017; Diaz et al., 2020; Forey et al., 2016), or predictor of student achievement. Indeed, parents may provide a reliable source of information about children's developmental and academic histories (Deslandes & Rivard, 2013; Sparks & Ganschow, 1995). As parents can accurately rate and influence children's learning behavior, such information may be used to make educational decisions, including placement and interventions, goal setting, or designing activities that accommodate learners' strengths and needs (Bedore et al., 2011). Moreover, parents may influence or support the socioemotional development of their children (Dewi, 2017; Hosseinpour, Sherkatolabbasi, et al., 2015; Kalaycı & Öz, 2018), especially regarding variables in the school environment such as anxiety, motivation, self-esteem, attitudes, or aptitude (Choi et al., 2020; Deslandes & Rivard, 2013; Dewi, 2017).

There are frameworks to explicate parental involvement. Epstein, who studied parental involvement for many years, developed a parental involvement framework based on her research findings. Epstein et al. (2002) identified six types of parental involvement interactions that describe behaviors, responsibilities, and deeds to augment students' achievement. These interactions include parenting-helping, communicative-effective, volunteering-organizing, learning at home, decision-making, and collaborating with the community. These interactions are presented discretely in the following; however, they may not be observed discretely in natural settings.

Parenting-helping pertains to family members being aware of and familiar with the child's development and, in relation, providing home environments that enhance learning. Communicative-effective interactions pertain to contact about school events, students' academic and personal development, and insights or observations at home; they help create two-way communication between school and home. Volunteering-organizing activities, on the other hand, pertain to activities that aim to support students and school programs, and they may be initiated by school personnel, parents, and community members. These activities help involve parents as volunteers and audiences in school activities where teachers can work with them to support students' development. Learning at home activities provide parents with information about instructional procedures such as homework, rubrics, or exams so that they can be involved in their children's academic learning at home. These activities may include any curriculum-related activities designed to enable students to share and discuss tasks with parents. Moreover, decision-making activities pertain to parents' contribution to school decisions, governance, and advocacy activities through school councils, improvement committees, and parent organizations.

 Parents’ Perceptions of Foreign Language Assessment and Parental Involvement

organizations. Finally, collaborating with the community activities integrate funds, services, and assets to meet the needs of the school community and coordinate resources as well as services for all stakeholders including families, students, and the school community to contribute service to the community.

Parental Involvement and Assessment

Parental involvement might show variations with regard to parents' socio-economic status, educational and language background, age, self-efficacy, perceptions of learning, attitudes, and being invited for collaboration (Bedore et al., 2011; Deslandes & Rivard, 2013; Forey et al., 2016; Hosseinpour, Sherkatolabbasi, et al., 2015; Kalaycı & Öz, 2018). Regarding assessment, the scope may narrow down dramatically to assessment literacy. In this sense, what parents know about assessment may be a byproduct of their academic histories and socioeconomic characteristics. That is, parents may have developed mental models of what and how children should learn (Deslandes & Rivard, 2013); thereby, assessment outcomes may be a great concern for parents because they may be "threatening" (Deslandes & Rivard, 2013, p. 23). It is also possible that parents equate assessment with tests (Martinez et al., 2004). Therefore, a great deal of misunderstanding between teachers and parents about students' development may occur.

Parents' involvement in the assessment process should be systematized to eliminate misconceptions and support children's development. For this purpose, schools can organize meetings or discussion sessions to inform parents about assessment policies and practices (Dawadi, 2019). Parents may also be given tools showing how grades are calculated and what they mean (Deslandes & Rivard, 2013). Moreover, simulations of assessment practices, in which parents experience how and for what their children are assessed (Deslandes & Rivard, 2013), may be delivered. It is also important not to highlight terminology with depreciative connotations. In this way, parents can understand the assessment process and become equipped to monitor their children's learning (Deslandes & Rivard, 2013). Regarding this practice, parents may be provided with observation checklists of children's growth (Fredericks & Rasinski, 1990). School boards and teachers can also listen to parents' expectations (Fredericks & Rasinski, 1990); accordingly, a tool that collects parents' ratings of home assignments, such as their difficulty, children's engagement, appropriateness, or suggestions, can be designed to increase parental involvement and support.

Teachers can also support parents' assessment literacy and involvement behaviors. They may provide parents with opportunities to visit classrooms so that parents can observe the instructional program and their children (Fredericks & Rasinski, 1990). To increase the impact of such visits and help parents accurately interpret school practices (Bedore et al., 2011), teachers can be trained to communicate and work with parents whose backgrounds and academic histories may be different (Bedore et al., 2011; Dawadi, 2019). Moreover, teachers need to refer to parents' observation records regularly via different modes of communication (i.e., on the phone, face-to-face conferences, or online meetings) and let parents ask for clarification, if need be (Fredericks & Rasinski, 1990).

Parental Involvement, Foreign Language Learning, and Assessment

Foreign language learners may face challenges such as a lack of sociocultural elements supporting their foreign language development. In this sense, home activities such as parents reminding children about homework, asking what they learned at school, and asking about their feelings (Dewi, 2017) may be important. Also, as Butler (2015) argued, parents and children may influence each other's involvement and achievement, respectively.
That is, when parents monitor children's activities or model attitudes, children may show higher school attendance, educational aspirations, positive classroom behavior (Hosseinpour, Sherkatolabbasi, et al., 2015), motivation (Butler, 2015), and positive self-concept (Kangasvieri & Leontjev, 2021). However, although parental involvement has potential for foreign language development or achievement, it has not been sufficiently explored (Forey et al., 2016). In what follows, previous research on parental involvement and foreign language proficiency is therefore reviewed chronologically to detect a potential cumulative trend, if there is any at all.

One of the earlier studies was conducted by Sparks and Ganschow (1995), who examined, from the parents' perspective, whether low-, average-, and high-risk groups showed any differences on tests of native language performance, cognitive ability, and foreign language aptitude and achievement. They collected data from the parents of 79 native-born American teenagers who studied their first-year foreign language (i.e., Spanish, French, German, or Latin) courses at a middle school. Parents were sent the foreign language screening instrument (FLSI-P), which screens for students likely to be at risk of foreign language learning problems. Findings confirmed that parents helped identify the three risk groups, and there were significant differences among the low-, middle-, and high-risk groups. Indeed, high-risk students had significantly lower foreign language aptitude, foreign language grades, and native language skills compared to the others.

Ngadiman et al. (2009) examined Indonesian parents' (N=212) and teachers' (N=11) perceptions of the assessment of English performance. Both groups had similar perceptions regarding the functions and types of the tests; however, they had different understandings of the validity and the content. Regarding the functions of the tests, participants highlighted that tests helped report students' levels to parents, identify weaknesses and strengths of the program, improve teaching, motivate students, monitor students' progress, assign grades, measure students' achievement of objectives, and determine students' rank. Regarding the content of the tests, parents expected them to cover language skills such as reading, writing, speaking, and listening. On the contrary, teachers stated that tests covered grammar and vocabulary items and that they adapted tests from the supplementary materials (i.e., English exercise books). They also thought that those were valid tests because they covered the materials taught in class, yet not the instructional objectives. Moreover, most of the teachers thought that parents associated high grades with high proficiency. That is, grades were success indicators for parents, who confirmed this perception.

Bedore et al. (2011) examined the validity of parent and teacher reports in determining Spanish-English bilingual children's language proficiency. A total of 440 children were given a set of language tests, and parents were asked to rate proficiency in both languages. Findings confirmed that when parents rated their children's first and second language proficiency, their ratings and test scores were positively correlated, as in Sparks and Ganschow's (1995) study. However, teachers' ratings of children's second language and test scores were not correlated. Also, parents focused more on semantics than on morphosyntax.
For parents, communicative functions were more important, while teachers cared about academics. Butler continued examining parental involvement in the following years (2014, 2015) and conducted a study with 572 English language learners and their parents in Mainland China. While parents were surveyed about their socioeconomic status (SES, represented by income), involvement behaviors, and beliefs about English language learning and their children's competencies, children were given a standardized English language test. Then, 96 students were interviewed about parental support and attitudes. Findings confirmed that parents played a substantial role in children's motivation. In particular, low-SES children suffered from a lack of parent-oriented motivation and support. Indeed, those children may be exposed to a set of maladaptive controlling behaviors or beliefs (Butler, 2015). As a result, those children's self-perceived competence and achievement were affected negatively (Butler, 2014).
Hosseinpour, Sherkatolabbasi, et al. (2015) also investigated parental involvement among 70 Iranian couples whose children were 3rd graders. Employing a mixed research methodology, they found that students whose parents showed a higher level of involvement in and a more positive attitude towards foreign language learning performed significantly better on achievement tests than their counterparts. This study also reported that parental involvement varied significantly by parents' knowledge of English, educational background, and income level. In a follow-up study, Hosseinpour, Yazdani, et al. (2015) collected data from the parents of 70 children and examined parents' involvement, attitude, educational background, and level of income in relation to their children's English achievement test scores. All variables were highly correlated with students' test scores. In addition to the previous findings, Hosseinpour, Yazdani, et al. (2015) found that parents' educational and income levels were also correlated with their children's test scores.

A recent study conducted by Forey et al. (2016) with 5- to 8-year-old children and their parents in Hong Kong examined parents' perceptions of supporting children's English development. Participants (N=500) came from a low socio-economic background and worked full-time jobs; 18 parents participated in the semi-structured focus interviews. The data confirmed that almost all parents agreed that parental involvement was important for their children's English development. For this purpose, they read stories, talked to their children, taught them English vocabulary, played games, sang songs, and watched shows or videos in English. However, parents' responses also indicated that they experienced difficulties in organizing activities and choosing materials for their kids. The study also reported that parents refrained from helping their children because they felt insecure about their English grammar, pronunciation, and vocabulary; indeed, they perceived that learning a foreign language may be facilitated via these elements. They also felt humiliated when, after their children were taught some language items, they could not remember those items afterward. Parents, moreover, lacked the time and skills to support their children's development, and when they offered some help, the child displayed a lack of interest. Their beliefs about learning a foreign language did not match school practices; therefore, they just monitored children's homework.

There was also a study from the context of the present research, Turkey. Kalaycı and Öz (2018) examined parental involvement in a private elementary school. One hundred eighty participants were recruited, and 10 were interviewed about their involvement in English language learning. Half of the parents stated that their involvement did not make a significant contribution to children's English language learning and that it was the teachers' responsibility. Others stated that they engaged in their children's English language development; otherwise, it might be affected adversely. Parents stated that they read books, listened to songs, and watched videos with their children. They also practiced vocabulary items and grammatical structures, as the parents in Forey et al. (2016) did. However, these were usually revision activities where children were exposed to the language out of class.
They also helped children with homework by either guiding them through the task demands or helping with spelling or pronunciation. Finally, parents did not initiate interaction with the English teacher; rather, they expected the teacher to do so when there was a problem or a need for help. Indeed, they expected guidance from the teacher to implement strategies.

Dawadi (2019) employed a mixed-methods research design with six Nepali parents. Parents, whose education levels were categorized as high or low, were interviewed about the foreign language test. While most parents considered the test to be fair in the pre-test context, they thought that the test was not valid as it just focused on memorization and a limited set of knowledge. During the post-test interviews, most parents grew suspicious of test fairness, although the test was controlled by the government under strict rules and regulations. In particular, test delivery and grading were considered problematic because of loose invigilation, cheating, and biased or unfair scoring practices.
In this sense, they thought the test lacked face validity and reliability. Moreover, there were differences between the parents with high and low education levels regarding their knowledge about tests. Parents with a low education level lacked knowledge of test structure and format as well as the concept of test accuracy.

Diaz et al.'s (2020) study examined Chilean parents' perceptions of assessment and of their children's grades in English as a foreign language. The authors conducted semi-structured interviews with 74 parents to find out how they helped children prepare for tests or various assessments. Findings highlighted that parents tried to support their children by showing them videos, helping them practice speaking in English, helping them review the subject, translating what they needed, helping them with their homework, searching for examples and information, providing them with visuals, helping them with pronunciation, hiring private teachers, providing emotional support, and playing games. Otherwise, half of the parents stated that their children studied on their own and that they did not intervene in their studies; this was because these parents did not have a sufficient proficiency level in English.

Finally, Larenas et al. (2021) reported Chilean parents' perceptions of their 10th-grade children's English grades. This sample also included 74 parents, with whom semi-structured interviews were conducted. Findings revealed that almost all parents reacted positively to their children's English grades and reinforced their children by acknowledging their accomplishments and effort, praising them for their grades, encouraging them, and practicing English with them. Larenas et al. (2021) argued that the parents in their study had a positive perception of English and supported their children in learning it because it may bring future possibilities in a rapidly changing world.

This array of studies conducted in different countries (i.e., the United States, China, Iran, Turkey, Nepal, and Chile) reported similar findings. It may be concluded that parents can identify their children's foreign language proficiency accurately, and their ratings may correlate with test scores. Parents, however, may mostly help with their children's development of grammar, vocabulary, and pronunciation, although they care about communicative use of a foreign language. Because parents may lack the time, skills, or proficiency in a foreign language, they might support children's development only to a limited extent via various home-based activities such as talking to children, helping children learn vocabulary, watching videos together, monitoring their homework, searching for examples or extra practice, singing songs, helping children review the content, or having them practice the mechanics of the language. Alternatively, parents may feel that it is the teachers' responsibility to help develop foreign language proficiency, as they do not feel confident with this task. Although these findings are limited because they do not focus specifically on parents' understanding of FLA, they highlight an urgent need for parents' assessment literacy. In some studies, the need for parents' foreign language assessment literacy emerges as an expectation.

METHODOLOGY

Research Design

This study employs a qualitative descriptive research methodology and aims to provide an illustrative summary of parents' perceptions and involvement regarding English language assessment in Turkey. As qualitative descriptive research draws from naturalistic inquiry, it "purports a commitment to studying something in its natural state to the extent that is possible within the context of the research arena" (Lambert & Lambert, 2013, p. 255).
In this sense, this study may not produce a theory; however, it may describe or discover the nature of the phenomenon: parental involvement and FLA. For this case study, a purposeful sampling technique expected to yield rich data was employed, and the data collection tool was the interview. The data were analyzed with no pre-existing set of themes or categories; instead, a set of data-driven codes was generated during analysis. The findings are therefore a formal descriptive summary of the content, as suggested by Lambert and Lambert (2013).

Participants

This research employed purposeful convenience and snowball sampling to recruit parents whose children were studying in the last grade of elementary school or in middle school in Turkey at the time of the study. The MoNE does not suggest summative assessment practices for the early grades (Milli Eğitim Bakanlığı, 2018b); therefore, parents whose children were studying in the 4th to 8th grades were included. Twenty-five parents were recruited for this study. Two interviews were conducted with fathers, and 23 with mothers. Female participants were mostly housewives (N=12) and teachers (N=9); the group also included one manager, one worker, one academician, and one entrepreneur. The participants' levels of education were as follows: high school (N=8), college (N=7), master's degree (N=4), middle school (N=3), elementary school (N=2), and Ph.D. (N=1). Participants stated that mothers (N=19) primarily provided or managed educational initiatives for their children; in a few families, fathers (N=2) or both parents (N=4) engaged in educational decisions, processes, or outcomes. Parents mostly stated that their income was at the middle level (N=22), and three noted that they had a high level of income. Most parents had two children (N=16), and the child about whom data were collected was mostly the second; a few parents had an only child (N=3) or three children (N=5). The children were studying in the 8th (N=10), 4th (N=5), 5th (N=4), 7th (N=4), and 6th (N=2) grades at the time of the study, at either a state school (N=17) or a private school (N=8).

Data Collection Tools and Procedures

Data were collected via semi-structured interviews conducted over the phone at the parents' convenience. Participants were informed that the data were confidential and that no personal data that could identify them were collected. Participants were provided with a recap and opportunities to change or correct any statements. The authors developed the interview questions, which were then confirmed by an expert who has implemented and studied FLA professionally for two decades. The questions aimed to identify parents' understandings of FLA, and they pertained to the functions of assessment practices, knowledge of assessment types, availability of support for their assessment literacy, and perceptions of assessment outcomes. The following semi-structured interview questions were discussed with the participants in their native language.

1. How is your child's foreign language proficiency assessed?
   a. Which types of assessment practices are used?
   b. Do you think such methods are appropriate, if at all?
2. What is assessed to determine your child's foreign language proficiency?
3. How do you track your child's foreign language development?
   a. What do grades mean?
   b. Have you got any support from the teacher or someone else to interpret these grades?
4. What do you do when you learn your child's grade for English class?
   a. How do assessment outcomes impact your involvement in your child's foreign language development, if at all?
5. How would you rate your understanding or knowledge of FLA?
   a. Do you participate in any seminars, workshops, or models of FLA practices to improve your knowledge or skills? How do they impact your understanding of FLA?

Data Analysis Procedures

The data were analyzed via thematic analysis. Thematic analysis is a flexible, analytic, and "independent qualitative descriptive approach" for identifying, analyzing, and reporting common patterns within data (Vaismoradi et al., 2013, p. 400). The data were first transcribed verbatim. Then, the two authors coded the data independently and derived the codes and categories. Following these steps, the authors compared the codes, and interrater reliability was calculated as α = .90 for the full set. Next, they discussed and compared the categories to refine the themes. The following themes and categories emerged:

1. Proficiency in a foreign language: parents' perceptions of foreign language proficiency, exam types, content of the assessment, appropriateness of the content;
2. Parents' foreign language assessment literacy: parents' knowledge of FLA, ways to assess foreign language performance, meaning of grades, interpreting the grades, and support for FLA;
3. Parents' involvement in the development of a foreign language: importance of foreign language, supporting and/or tracking foreign language development, functions of grades, and variations in parental involvement.
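For readers who wish to quantify intercoder agreement in a similar design, the sketch below is a minimal, hypothetical illustration rather than the procedure used in this study (which reports an alpha coefficient of .90). It computes Cohen's kappa, one common agreement statistic for two coders assigning nominal codes; the code labels and rater decisions are invented for demonstration only.

    from collections import Counter

    def cohens_kappa(coder_a, coder_b):
        # Cohen's kappa for two equal-length lists of nominal codes:
        # (observed agreement - chance agreement) / (1 - chance agreement).
        n = len(coder_a)
        observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
        # Chance agreement from each coder's marginal label frequencies.
        freq_a, freq_b = Counter(coder_a), Counter(coder_b)
        chance = sum((freq_a[label] / n) * (freq_b[label] / n)
                     for label in set(coder_a) | set(coder_b))
        return (observed - chance) / (1 - chance)

    # Hypothetical codes assigned by two coders to ten interview excerpts.
    coder_1 = ["proficiency", "literacy", "involvement", "literacy", "proficiency",
               "involvement", "involvement", "literacy", "proficiency", "literacy"]
    coder_2 = ["proficiency", "literacy", "involvement", "literacy", "proficiency",
               "involvement", "literacy", "literacy", "proficiency", "literacy"]
    print(f"kappa = {cohens_kappa(coder_1, coder_2):.2f}")  # prints: kappa = 0.85

Values above .80 are conventionally read as very strong agreement, which is broadly in line with the coefficient of .90 reported for the present data set.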

FINDINGS

Proficiency in a Foreign Language: Parents' Perceptions vs. Exams

Parents expressed their perceptions of proficiency in a foreign language, and most (N=19) stated that proficiency in a foreign language pertains to communication. Within this response, communicating with foreigners (N=6) and communicating in a foreign country (N=9) emerged naturally. That is, proficient language users can communicate their needs, demands, interests, and daily routines easily. They can also understand others and help them, if there is a need at all. They can manage their professional or daily tasks easily when they are in a foreign country. Moreover, a few parents related foreign language proficiency to high exam grades, while another expressed that a proficient language user can read, write, listen, and speak in a foreign language (P.21). Six of the parents, on the other hand, stated that they had no idea about foreign language proficiency.

Moreover, six parents expressed the importance of being proficient in a foreign language. This theme emerged naturally as the participants commented on foreign language proficiency. They stated that one should know a foreign language to work at better positions (N=2), have solid relations (N=2) when in a foreign country, work in a foreign country or company, gain more knowledge, be open to other cultures, and have high self-esteem.
Twenty-three parents stated that FLA practices are test-driven and that the tests are either multiple-choice exams (N=11) or written exams (N=10) where learners fill in the blanks, answer simple questions, or write sentence-level answers. Seven parents also mentioned oral exams. Parents (N=16) stated that the content of the exam usually pertains to vocabulary (N=9), the curriculum (content of the class, N=5), and grammar (N=6). One parent (P.12) stated that "they use very traditional techniques on exams. In our time, there would be dialogs, listening and speaking sections, but now they only use question-answer types of exams for grammar or vocabulary". Parents whose children attend a private school stated that exams also include speaking parts (N=5) beyond vocabulary and grammar sections. This is because private schools may deliver two forms of exams; "one is based on the MoNE's curriculum, and it is a multiple-choice type of exam. The school delivers another one, and it includes speaking and writing parts" (P.25). Five parents stated that they do not know the content of the exams, as declared by P.11: "I only know about my kid's English language proficiency through his/her exam grades, but I am not sure about the types of exams". Four did not provide any responses about assessing foreign language proficiency.

Eighteen of the parents expressed that the content and question types in the exams are not appropriate. "Those exams do not assess if the children can use the language or if they can communicate" (P.4). Indeed, "if those questions and content were OK, then they could speak" (P.25) and "express themselves well" (P.6). However, "even at the university level, they cannot understand, read, and speak English" (P.4). It may be because "exam questions cannot be transferred to real life. They just choose the best option" (P.19), and "because there is no practice, they forget those topics soon after the exam" (P.7). Five of the parents, on the other hand, stated that those exams are appropriate. However, while one of them stated, "I do not know much about assessing foreign languages, I cannot say those are inappropriate" (P.16), the others exclaimed that "as if we have any other choices! Do we? We don't" (P.5) and that "tests are part of our education system" (P.9). Also, "if I say the exams are not appropriate, I have to think about alternatives; however, I cannot think about any other methods. I do not know about alternatives" (P.10).

Parents’ Foreign Language Assessment Literacy Regarding parents’ FLA literacy, grades, interpreting grades, knowledge of FLA, and support for FLA were discussed. Twelve of the parents stated that they track their children’s foreign language proficiency or development via exam grades. Regarding this factor, parents (N=14) stated that grades indicate their children learned the content at the time or their children were successful on the exam (N=10). However, grades may be impacted by children’s performance on site, motivation, or interest in the topic (N=3). Few of the parents (N=4) stated that grades may indicate the rank of the children, or they will pass/ fail the class. In this sense, an average grade may mean that their English performance is average, and they can pass the class (P.2). However, some of the parents (N=6) also added that exam grades do not indicate that children know English; however, they just show “my child memorized the exam content well” (P.8) or “the content is interesting or not interesting for him and this affects his performance” (P.19). As P.14 stated, “high grades such as 90 means nothing if she cannot communicate or use it for her needs”. It is because “they just fill in the blanks using single items that are already provided” (P.17). Also, “the number of questions

295

 Parents’ Perceptions of Foreign Language Assessment and Parental Involvement

One of the parents (P.19) gave a remarkable explanation of why they disregard exam grades:

Exam grades mean nothing as he does not use English. I mean, he cannot speak or communicate what he does in his daily life. He may just guess the answers easily because the questions assess basic skills like remembering. Exams do not include questions that require analyzing, synthesizing, or evaluating the content. Moreover, they provide the question stems in Turkish; therefore, students can guess the correct answer even if they do not know, for example, the vocabulary item. In this case, it does not mean he knows English. So, if this is the case, can 80 or 90 mean anything?!

Two parents related assessment to the instructional process and stated that low grades may indeed indicate that "there is something wrong either with him or with us, if the grades are low" (P.20), and that it may be "sometimes the kids and sometimes the teacher… ineffective in teaching and learning process" (P.11).

Parents (N=20) also stated that they are not provided with any information or help to interpret exam grades. As P.24 stated, there may be "a lack of communication between the school, teachers, and parents" regarding FLA. Indeed, teachers may not "explain what a grade represents and which level it stands for" (P.18). P.16 exclaimed:

There is no explanation regarding the percentages of the content; they do not state that 40% of the grade pertains to, for example, speaking. We do not know the types and levels of the questions, either. Therefore, what 60 or 90 stands for is unclear. They do not inform us.

Relatedly, "even if I [a parent] ask about her grades, they say she already got a 100. What do you want to learn more?!" (P.17). On the other hand, five parents stated that teachers talk to parents to explain what the child needs to improve, which "usually pertains to grammar and vocabulary" (P.5). However, parents also thought that "they [teachers] should be teaching them. They just tell us what the child needs to improve!!" (P.4).

Twenty-two of the parents stated that they do not know about FLA. On the other hand, three of the parents knew about FLA to a limited extent due to their previous experiences. One parent stated, "I know about the TOEFL or international exams because my elder daughter took these exams to work abroad. But schools do not use them" (P.21). All parents unanimously stated that schools or teachers do not deliver any workshops, seminars, or meetings to inform them about FLA practices, principles, and purposes. One parent even stated, "I have 3 children and my daughter is a college-graduate. For such a long time, I have never ever heard about such a meeting" (P.20). However, they highlighted that they "would love to attend such a meeting" (P.23).

Parents’ Involvement in the Development of a Foreign Language Parental involvement in children’s foreign language learning were distinctive at two levels: (a) behaviors to support children’s foreign language development in general and (b) behaviors after exam scores are announced. All parents expressed several ways to support their children’s English language development. Those activities pertain to parenting- helping, communicative effective, and learning at home. Parenting helping activities are dominant modes. Parents in this study provide their children with opportunities to access the internet; therefore, they can watch movies or videos in the foreign language (N=6). Also, few 296

 Parents’ Perceptions of Foreign Language Assessment and Parental Involvement

Three of the parents stated that their children play online games, which helps them improve their English; "for example they can learn new vocabulary items" (P.7). One parent also mentioned that her child does research on the internet and thinks critically about the concepts. Parents also track their children's development, which pertains to the parenting-helping kind of involvement. All parents stated that they check their children's grades to track their foreign language development, and a few (N=5) may observe their children's interactions with other foreign language learners, elder siblings, or stimuli in the foreign language. For example, a father stated that he observed his child singing songs in English or translating English to Turkish when shopping; therefore, he had an idea about his child's level and progress. Two of the parents stated that they observe their children's study habits at home.

Moreover, learning at home activities were also recalled. Some of the parents (N=3) may have conversations with their children about the English class, or they (N=3) may check their homework. Three parents also study with their children: two of them stated that they read books together, and one stated that they helped their children memorize vocabulary items. Another parent discusses the content with her child. Whereas more than half of the parents talked about their youngest child, some other parents (N=4) mentioned that an elder sibling helps the younger sibling(s) with homework, provides help, speaks to them, or watches videos together. There were also a few cases in which parents did not feel competent to help their children (N=3) as they "do not know English" (P.13 & P.8) or are "primary school graduates" (P.20).

Furthermore, exam grades may also influence parental involvement, as communicative-effective activities usually pertain to exam scores. That is, some parents (N=10) talked to teachers about the grades and asked for guidance. Grades may also initiate parenting-helping and/or learning at home types of behaviors. That is, when the grade was low, some parents (N=11) tried to motivate their children to study more. They talked to their children, and some explained the importance of English for future life, profession, or high school. Three parents had their kids attend private courses. Furthermore, some parents (N=10) helped their children revise the topics. Two parents relied on their own academic histories; for example, they had their child write vocabulary items six or seven times, as they had done when they were students. Three others may do extra practice with different books or websites. On the other hand, when the grade was high, parents (N=10) usually appreciated their children's success. Only one parent stated, "I do not care much about grades, I prefer him to be motivated to learn the language" (P.16). Four parents, however, stated that they did not or could not track and/or support their children's proficiency or development in the foreign language because they do not know English or are elementary school graduates.

Variations in Parental Involvement

Parental involvement behaviors may show some variation. Parenting-helping (N=19), communicative-effective (N=19), and learning at home (N=12) activities were mostly presented by high-school-graduate housewives and college-graduate teachers (see Figure 1). Indeed, one parent stated, "I am a teacher and I know children's developmental characteristics, capabilities, and needs" (P.19).


Figure 1. Parental involvement by professions and educational degree

Moreover, parental involvement behaviors may change by children's grade and school type. As seen in Figure 2, parental involvement is highest at the 8th and 4th grades. This may be because, in Turkey, children take a national exam for high school placement at the 8th grade, and some cities deliver a large-scale test at the 4th grade. Moreover, parents of state school students may show more involvement compared to those of private school students. This is because, as P.18 and P.17 argued, the English language curriculum and instruction at state schools may not be sufficient to develop students' language proficiency.

Figure 2. Parental involvement by grade and school type

Finally, the rank order of the children may be a factor in parental involvement (see Figure 3). That is, younger children may get more support from their family, both from the parents and/or elder siblings.
Siblings may check younger one’s homework (P.5), talk to the younger ones (P.12), or “cousins and elder sister support their studies” (P.20). Figure 3. Parental involvement by rank order of the child

DISCUSSION The 21st century’s demands and dynamics may push national policies and practices to concentrate on developing citizens’ foreign language proficiency more adequately and comprehensively than the previous eras. Although large scale testing such as PISA may provide substantial feedback for participating countries’ initiatives, parents’ role in children’s success (e.g., Butler, 2014, 2015; Dawadi, 2019; Diaz et al., 2020; Forey et al., 2016; Hosseinpour, Sherkatolabbasi, et al., 2015; Hosseinpour, Yazdani, et al., 2015; Larenas et al., 2021; Sparks & Ganschow, 1995) cannot be ignored, anymore. Indeed, examining national policies and practices with different perspectives may improve the practicability and effectiveness of them. This study identified a discrepancy between the current practices and policies of FLA in Turkey. However, it may not pertain to parents, and parents may not be influential, either. First, parents’ perceptions of a foreign language proficiency align with the MoNE’s and CEFR’s basic notion; communicative use of language competencies in real-life situations as in Ngadiman et al.’s (2009) study. While parents expect their children to be able to read, write, speak, and listen competently, English language exams in practice might focus on morphosyntax as in Bedore et al.’s (2011) research. Indeed, current assessment practices are test-driven and limited in scope, and they are not authentic. Moreover, they lack validity and criterion-references although both national and international policies emphasize it. For these reasons, grades may not indicate achievement of the curriculum standards or CEFR’s competencies as parents are also aware of the fact. This study also highlights an urgent need to help improve parents’ FLA literacy. While parents in this study unanimously declared that they do not know about FLA, teachers may restrain from providing some guidance or explanation about why, how, and what is assessed besides how to interpret the outcomes.


As Kalaycı and Öz (2018) found, however, parents may not be content with teachers reaching out to them only when their child is having problems or failing. Also, as Deslandes and Rivard (2013) argued, parents may get more involved in educational endeavors when they feel that their effort will make a positive difference. In this sense, initiating parental involvement may first start with developing parents' understanding of FLA, although parents do not always announce this need, as in this study. For this task, teachers need to explain the rationale of the test content and assessment types. They also need to discuss the functions of assessment practices and how to evaluate and interpret the outcomes. When parents do not understand these fundamentals, they may develop different interpretations or expectations regarding their children's foreign language proficiency. Yet more seriously, they may lose trust in schools or teachers.

Moreover, this study suggests that parenting behaviors may be universal. Parents in this study engage in activities such as providing their children with internet access to watch videos, visit websites, do extra practice or research, and play games. They also have conversations with their children explaining the importance of a foreign language, help with their homework, study with them, observe their task engagement, or help them memorize vocabulary items. Moreover, they appreciate their children's success and try to motivate them to do better when they fail. They take on the role of teachers and help their children revise the content as much as they can. When they feel incompetent, they push their limits to have their children take private classes. Indeed, these behaviors are typical parental involvement activities reported in previous research (e.g., Diaz et al., 2020; Forey et al., 2016; Kalaycı & Öz, 2018; Larenas et al., 2021). Typical Turkish parental involvement activities, however, may be limited to parenting-helping, communicative-effective, and learning at home. Indeed, the nature of these activities may reflect parents' care. Because these are solely home-based activities and are not suggested or initiated by teachers or professionals, it may take longer to arrive at the right actions. During this period, learners' motivation, aptitude, and achievement in a foreign language may be impacted negatively, as Hosseinpour, Sherkatolabbasi, et al. (2015) argued. That is, parental involvement behaviors might be too important to ignore.

Findings regarding siblings' interaction in the development of a foreign language may be distinctive. Epstein et al.'s (2002) framework may implicitly and potentially recognize any family member's support; however, siblings' support should be set out explicitly. When parents do not feel competent to support their children (e.g., Forey et al., 2016; Kalaycı & Öz, 2018), siblings or any immediate relative may take over their role. However, the nature of the interactions would differ with regard to power dynamics and intimacy.

RESEARCH DIRECTIONS AND PRACTICAL IMPLICATIONS

Regarding the status quo, teachers should find ways to support both parents' FLA literacy and their involvement behaviors. First, they need to know who parents are and what they can do, as parents' backgrounds and academic histories may be different (Bedore et al., 2011; Dawadi, 2019). It is also important to listen to parents' expectations (Fredericks & Rasinski, 1990), understand parental involvement behaviors, and then examine the divergences between policies, practices, and parents' expectations. Thereby, the content and nature of training programs may be adapted to the focus groups' needs, potentials, resources, and competencies. While needs-based meetings help introduce parents to assessment and parental involvement practices, simulations can help them try out those practices (Dawadi, 2019; Deslandes & Rivard, 2013).
Furthermore, teachers may provide feedback on the effectiveness of parental involvement in students' foreign language development and, specifically, in relation to assessment practices.

It may be that we have placed a great burden on teachers' shoulders to help increase parents' FLA literacy and involvement practices. However, there may be teachers who do not know how to handle these issues or who lack the time, knowledge, or motivation to deliver such training. For these reasons, it is important to examine teachers' FLA literacy, their understandings of parental involvement, and their communication skills and motivation to deliver such training. Future research may examine pre-service teacher education programs' inclusion of FLA and parental involvement, and identify in-service teachers' FLA literacy and initiation of parental involvement.

We suggest that each school recruit an assessment specialist to help stakeholders increase their FLA literacy, polish practices, and design new pathways accurately; in this way, teachers' load can be managed fairly. Assessment specialists may help teachers track students' development, design valid and reliable assessment practices, and implement various forms of those practices. They may also help teachers recognize the link between assessment and instruction so that curriculum standards are not jeopardized by negative washback. For this purpose, assessment specialists can work with teachers one-on-one, and they can deliver needs-based FLA workshops and seminars. It is also important for assessment specialists to help increase parents' FLA literacy because it may impact involvement behaviors. Parents may or may not lack such knowledge; however, some may be motivated to learn about it or polish their repertoire. When teachers cannot provide help, parents can turn to assessment specialists and be offered seminars and workshops where they learn about FLA and about procedures to support their children's development in a foreign language. Such practices may also bring schools and parents together while giving parents a voice to influence policies.

Finally, teachers may design authentic assessment practices that require both parents' and children's involvement. Indeed, while such projects may meet parents' expectations of children's communicative language use, parents can also observe and support their children better. Such projects can also help initiate other parental involvement activities such as volunteering and community engagement. When teachers design such assessment practices, all stakeholders (i.e., students, parents, teachers, and community members) may work together for a good reason. Large-scale projects, for example those funded by national or international organizations, may be solid examples; however, they may not be realistic or attainable for all schools. In such cases, small local projects such as charity bazaars or spring festivals may substitute for them.

CONCLUSION

Learning a foreign language is not merely a cognitive act, and it may indeed require various kinds of support. That is, language learning may have social, emotional, and even moral aspects that unite people, and it may not be isolated from the environment that a child is exposed to and interacts with. It may not be an isolated act of memorizing bits and pieces but rather a transformation of ways of thinking and living.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.


REFERENCES

Akbari, R. (2012). Validity in language testing. In C. Coombe, P. Davidson, B. O'Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 30–36). Cambridge University Press.
Bailey, K. M. (1999). Washback in language testing. Educational Testing Service.
Bedore, L. M., Peña, E. D., Joyner, D., & Macken, C. (2011). Parent and teacher rating of bilingual language proficiency and language development concerns. International Journal of Bilingual Education and Bilingualism, 14(5), 489–511. doi:10.1080/13670050.2010.529102 PMID:29910668
Brown, H. D. (2004). Language assessment: Principles and classroom practices. Longman.
Buck, G. (1988). Testing listening comprehension in Japanese university entrance examinations. JALT Journal, 10(1), 15–42.
Butler, Y. G. (2014). Parental factors and early English education as a foreign language: A case study in Mainland China. Research Papers in Education, 29(4), 410–437. doi:10.1080/02671522.2013.776625
Butler, Y. G. (2015). Parental factors in children's motivation for learning English: A case in China. Research Papers in Education, 30(2), 164–191. doi:10.1080/02671522.2014.891643
Choi, N., Sheo, J., & Kang, S. (2020). Individual and parental factors associated with preschool children's foreign language anxiety in an EFL setting. Elementary Education Online, 19(3), 1116–1126.
Chow, B. W. Y., McBride-Chang, C., & Cheung, H. (2010). Parent-child reading in English as a second language: Effects on language and literacy development of Chinese kindergarteners. Journal of Research in Reading, 33(3), 284–301. doi:10.1111/j.1467-9817.2009.01414.x
Colby-Kelly, C., & Turner, C. E. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64(1), 9–38. doi:10.3138/cmlr.64.1.009
Council of Europe. (2020). Common European framework of reference for languages: Learning, teaching, assessment – Companion volume. Council of Europe Publishing.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). Dictionary of language testing (Vol. 7). Cambridge University Press.
Dawadi, S. (2019). Students' and parents' attitude towards the SEE English test. Journal of NELTA, 24(1–2), 1–16. doi:10.3126/nelta.v24i1-2.27677
Deslandes, R., & Rivard, M.-C. (2013). A pilot study aiming to promote parents' understanding of learning assessments at the elementary level. School Community Journal, 23(2), 9–31.
Dewi, S. S. (2017). Parents' involvement in children's English language learning. IJET, 6(1), 102–122. doi:10.15642/ijet.2017.6.1.102-122
Diaz, C., Acuña, N., Ravanal, B., & Riffo, I. (2020). Unraveling parents' perceptions of English language learning. Humanities & Social Sciences Reviews, 8(2), 193–204. doi:10.18510/hssr.2020.8223
Djurić, M. (2015). Dealing with situations of positive and negative washback. Scripta Manent, 4(1), 14–27.
Doye, P. (1991). Authenticity in foreign language testing. In Current developments in language testing (pp. 103–110). Regional Language Centre.
East, M. (2016). Assessing foreign language students' spoken proficiency: Stakeholder perspectives on assessment innovation. Springer. doi:10.1007/978-981-10-0303-5
Epstein, J., Sanders, M., Simon, B., Salinas, K., Jansorn, N., & Van Voorhis, F. (2002). School, family, and community partnerships: Your handbook for action (2nd ed.). Corwin Press.
Forey, G., Besser, S., & Sampson, N. (2016). Parental involvement in foreign language learning: The case of Hong Kong. Journal of Early Childhood Literacy, 16(3), 383–413. doi:10.1177/1468798415597469
Fredericks, A. D., & Rasinski, T. V. (1990). Working with parents: Involving parents in the assessment process. The Reading Teacher, 44(4), 346–349.
Herrera Mosquera, L., & Macías, D. F. (2015). A call for language assessment literacy in the education and development of teachers of English as a foreign language. Colombian Applied Linguistics Journal, 17(2), 302–312. doi:10.14483/udistrital.jour.calj.2015.2.a09
Hosseinpour, V., Sherkatolabbasi, M., & Yarahmadi, M. (2015). The impact of parents' involvement in and attitude toward their children's foreign language programs for learning English. International Journal of Applied Linguistics and English Literature, 4(4), 175–185.
Hosseinpour, V., Yazdani, S., & Yarahmadi, M. (2015). The relationship between parents' involvement, attitude, educational background and level of income and their children's English achievement test scores. Journal of Language Teaching and Research, 6(6), 1370–1378. doi:10.17507/jltr.0606.27
Hughes, A. (1989). Testing for language teachers. Cambridge University Press.
Hussain, S., Tadesse, T., & Sajid, S. (2015). Norm-referenced and criterion-referenced test in EFL classroom. Journal of Humanities and Social Science Invention, 4(10), 24–30.
Jamali, F., & Gheisari, N. (2015). Formative assessment in the EFL context of Kermanshah high schools: Teachers' familiarity and application. Global Journal of Foreign Language Teaching, 5(1), 76–84. doi:10.18844/gjflt.v5i0.48
Kalaycı, G., & Öz, H. (2018). Parental involvement in English language education: Understanding the parents' perceptions. International Online Journal of Education & Teaching, 5(4), 832–847.
Kangasvieri, T., & Leontjev, D. (2021). Current L2 self-concept of Finnish comprehensive school students: The role of grades, parents, peers, and society. System, 100, 1–14. doi:10.1016/j.system.2021.102549
Ketabi, S., & Ketabi, S. (2014). Classroom and formative assessment in second/foreign language teaching and learning. Theory and Practice in Language Studies, 4(2), 435–440. doi:10.4304/tpls.4.2.435-440
Lambert, V. A., & Lambert, C. E. (2013). Qualitative descriptive research: An acceptable design. Pacific Rim International Journal of Nursing Research, 16(4), 255–256.
Larenas, C. D., Boero, N. A., Rodríguez, B. R., & Sánchez, I. R. (2021). English language assessment: Unfolding school students' and parents' views. Educação e Pesquisa, 47, 1–27. doi:10.1590/S1678-4634202147226529
Lee, Y. W., & Sawaki, Y. (2009). Cognitive diagnosis approaches to language assessment: An overview. Language Assessment Quarterly, 6(3), 172–189. doi:10.1080/15434300902985108
Martinez, R.-A., Martinez, R., & Perez, M. H. (2004). Children's school assessment: Implications for family–school partnerships. International Journal of Educational Research, 41(1), 24–39. doi:10.1016/j.ijer.2005.04.004
Milli Eğitim Bakanlığı. (2018a). 2023 vision of national education. http://2023vizyonu.meb.gov.tr/doc/2023_EGITIM_VIZYONU.pdf
Milli Eğitim Bakanlığı. (2018b). English foreign language curriculum. http://mufredat.meb.gov.tr/Dosyalar/201812411191321-İNGİLİZCEÖĞRETİMPROGRAMIKlasörü.pdf
Navarrete, C., Wilde, J., Nelson, C., Martinez, R., & Hargett, G. (1990). Informal assessment in educational evaluation: Implications for bilingual education programs. National Clearinghouse for Bilingual Education.
Ngadiman, A., Widiati, A. S., & Widiyanto, Y. N. (2009). Parents' and teachers' perceptions of the assessment of the students' English achievement. Magister Scientiae, 26, 83–97.
OECD. (2021). PISA 2025 foreign language assessment framework. OECD Publishing.
Purpura, J. E. (2016). Second and foreign language assessment. Modern Language Journal, 100(1), 190–208. doi:10.1111/modl.12308
Sparks, R. L., & Ganschow, L. (1995). Parent perceptions in the screening of performance in foreign language courses. Foreign Language Annals, 28(3), 371–391. doi:10.1111/j.1944-9720.1995.tb00806.x
Taylor, L. (2005). Washback and impact. ELT Journal, 59(2), 154–155. doi:10.1093/eltj/cci030
Thomas, J., Allman, C. B., & Beech, M. (2005). Assessment for the diverse classroom: A handbook for teachers. Bureau of Exceptional Education and Student Services, Florida Department of Education.
Vaismoradi, M., Turunen, H., & Bondas, T. (2013). Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study. Nursing & Health Sciences, 15(3), 398–405. doi:10.1111/nhs.12048 PMID:23480423

ADDITIONAL READING

Jeynes, W. (Ed.). (2022). Relational aspects of parental involvement to support educational outcomes: Parental communication, expectations, and participation for student success. Taylor & Francis. doi:10.4324/9781003128434
Paseka, A., & Byrne, D. (Eds.). (2019). Parental involvement across European education systems: Critical perspectives. Routledge. doi:10.4324/9781351066341 Piechurska-Kuciel, E., & Szyszka, M. (Eds.). (2015). The Ecosystem of the Foreign Language Learner: Selected Issues. Springer. doi:10.1007/978-3-319-14334-7 Rokita-Jaśkow, J., & Ellis, M. (Eds.). (2019). Early instructed second language acquisition: Pathways to competence (Vol. 2). Multilingual Matters. doi:10.21832/ROKITA2500 Wages, M. (2016). Parent involvement: Collaboration is the key for every child’s success. Rowman & Littlefield.

KEY TERMS AND DEFINITIONS

Foreign Language Assessment: An evaluation practice of foreign language learners' extant productive and receptive competencies to inform stakeholders and improve curricular practices, if need be.

Foreign Language Assessment Literacy: An understanding of the purposes and principles of sound assessment and the capacity to examine foreign language performances in order to make informed decisions.

Foreign Language Proficiency: An indicator of how well a foreign language learner uses receptive and productive skills as well as other language competencies, such as syntax, morphology, semantics, and vocabulary, with regard to the context, task demands, and other agents' characteristics.

Parental Involvement: Parents' behaviors with or on behalf of their children for the children's development, academic success, and future endeavors at home and school.


Chapter 15

Academic Integrity in Online Foreign Language Assessment: What Does Current Research Tell Us?

Aylin Sevimel-Sahin
https://orcid.org/0000-0003-2279-510X
Anadolu University, Eskisehir, Turkey

ABSTRACT

The immediate transition to online teaching due to the pandemic has required institutions to employ online assessment more frequently than ever. However, most teachers, students, and schools were not ready for it. Therefore, facing several challenges, they have not planned and practiced their assessment methods effectively in online settings. One of these challenges is the difficulty of sustaining academic integrity in digital environments, and many studies have already concluded that there is a large increase in dishonest behaviors on online assessment tasks. Yet academic integrity is an indispensable concept for improving teaching and learning through reliable, valid, and secure assessment, especially on online platforms. The purpose of this chapter, then, is to discuss academic integrity in relation to online foreign language assessment as practiced during the pandemic by presenting the background to online assessment, academic integrity, and their relationship, and by reporting on recent research studies within this scope.

INTRODUCTION

The recent and ongoing COVID-19 pandemic has required most institutions and schools around the world to prioritize online education. During this period, foreign language teachers have begun to deliver their courses through online platforms and, thereby, to assess their learners online by means of digital tools. But this kind of teaching and assessment has been performed under a sudden and unplanned mode of education (Zhang et al., 2021). That is, most institutions have been unprepared for this rapid shift from face-to-face to face-to-screen education, which is why it is called emergency remote teaching (ERT) (Henari & Ahmed, 2021). ERT is not the same as already established and well-planned online education systems and principles; rather, it is an immediate response to this crisis
situation (Gacs et al., 2020). Therefore, teachers and other stakeholders have made their best efforts to handle this situation efficiently by adapting their policies, strategies, methods, and practices of teaching/learning and assessment. However, most teachers have had difficulties, especially in testing, assessing, and evaluating learners’ language performance online in an effective way (Beck et al., 2020). Although online assessment had already been in use through the integration of digital technologies into instruction (Oldfield et al., 2012), conducting online language assessment has been new to most teachers during ERT. That is, many have been introduced to this type of assessment for the first time and thus have had to cope with its requirements on their own most of the time (Beck et al., 2020). In this sense, a number of teachers have adjusted or modified their language assessment methods and strategies according to what online instruction demands. Some have been successful in handling such assessment (Situmorang et al., 2020), whereas others have had problems such as lacking the required knowledge, being inexperienced, and lacking digital technological competence (Anasse & Rhandy, 2021), not having adequate support or guidelines (Czura & Dooly, 2021), internet connection problems (Hakim, 2020), cheating and plagiarism issues (Blinova, 2022), and not being able to design reliable and valid items (Behforouz, 2022). It is widely acknowledged that language assessment is an integral part of teaching because it gives feedback about instruction that can guide the planning of appropriate learning activities, develop learners’ knowledge and performance, and motivate them toward better learning opportunities (Rogier, 2014). So, most teachers have tried to overcome the problems in online assessment, with or without help, to perform better practices and improve learners’ progress as much as possible even under such conditions. For example, most studies have indicated that formative assessment practices have been preferred over summative ones during the pandemic for several reasons; among the challenges reported, the most highlighted concern has been academic dishonesty in online assessment (Bjelobaba, 2021; Koris & Pal, 2021).
Indeed, one of the most debated and challenging recent concerns is how to ensure academic integrity in online assessment, no matter what the nature (i.e., summative or formative) or the timing (i.e., synchronous or asynchronous application) of the assessment is (Bjelobaba, 2021). This is because foreign language teachers have had serious concerns and problems about this issue, specifically while administering their assessment methods during the pandemic (Gamage et al., 2020). It is therefore important to know what academic integrity means. It refers to being honest, trustworthy, fair, respectful, and responsible in the academic work that one performs (Tauginiene et al., 2018). It is associated with other concepts such as academic misconduct, dishonesty, cheating, and plagiarism, all of which show that academic integrity has been violated (Minocha, 2021). In other words, breaches of academic integrity in assessment occur when dishonest and unethical means, such as cheating, are applied to assessment tasks (Blau et al., 2020). Even though academic dishonesty is not a new issue and has already been studied (e.g., Amzalag et al., 2022; Augusta & Henderson, 2021), its impact on online assessment has been felt more than ever due to the pandemic (Bearman et al., 2020). Under ERT conditions, common concerns and problems include insecure language assessment settings where students are not physically present (Shariffuddin et al., 2022), a lack of the various security strategies needed in online assessment (Reedy et al., 2021), an increasing rate of contract cheating (Erguvan, 2021), the adoption of ‘ad-hoc solutions to testing’ (Janke et al., 2021, p. 1), and a lack of rules for ethical behavior integrated into online versions of assessment (Amzalag et al., 2022). Besides, teachers have believed that students tend to cheat more in online assessment, where its security cannot be controlled (Adzima, 2020; Reedy et al., 2021).
So, they are worried about cheating or plagiarism more than ever (Behforouz, 2022; Celik & Lancaster, 2021). Most of the time, instructors are not sure how to respond to academic dishonesty or how to prevent it in advance while constructing and administering their assessment tools and evaluating the results (Gamage et al., 2020). This is because there is little support, information, or training on how to sustain academic integrity, how to prevent academic misconduct, and what the consequences of plagiarism or cheating should be (Gamage et al., 2020). Therefore, they have had difficulties in performing secure online language assessment under such circumstances. However, maintaining academic integrity is an indispensable principle of sound language assessment because it affects other assessment principles such as reliability and validity, and it is very hard to establish reliable and valid online assessment procedures at ERT times (Muhammad & Ockey, 2021). Furthermore, one of the components of language assessment literacy (LAL) is awareness of the principles of ethics and codes of practice that guide assessment plans and actions (Fulcher, 2012; Giraldo, 2018). Hence, academic integrity is a significant principle of language assessment, and recently there have been many critical questions about it. On the whole, there has been a growing concern with sustaining academic integrity while mitigating academic dishonesty in online assessment, owing to questions about the credibility and reliability of procedures as well as the responsibility of all stakeholders in dealing with this issue (Sabrina et al., 2022; Tiong & Lee, 2021). Therefore, as Bjelobaba (2021) asserts, safeguarding academic integrity has recently become a challenge. In addition, since ensuring academic integrity in digital assessment differs from doing so face-to-face, the concept should be revisited: its definition and characteristics should be revised, and the strategies for coping with academic misconduct should be adapted to the digital age (Reedy et al., 2021). So, there is a clear need to examine academic integrity from the perspective of online foreign language assessment in order to inform and help the concerned stakeholders, especially teachers, who are the central figures in assessment decisions and applications. There is also a need to contribute to the literature in this respect because of the lack of research on this topic specifically within the scope of online foreign language assessment (Celik & Lancaster, 2021). Considering these discussions and arguments, this chapter provides an overview of the latest research on academic integrity in online foreign language assessment, especially within the ERT executed during the pandemic. It begins with an overview of the current state of online foreign language assessment. Then, it presents what academic integrity means, what characteristics it has, what types of violations there are, and what the reasons for academic dishonesty are. Next, it reports current research on the issue, illustrating what kinds of practices have been implemented in different educational contexts. Finally, it concludes with recommendations, suggested guidelines, and future research directions for academic integrity in online assessment in the field of foreign language teaching and learning.

OVERVIEW OF ONLINE FOREIGN LANGUAGE ASSESSMENT

Language assessment is the process of systematically collecting information about learning progress, skills, and abilities through multiple tools, techniques, or methods designed and developed for different purposes, and of evaluating the findings to determine whether instructional objectives are met, what has been learned, and what the problems are in terms of language development, so as to plan teaching and improve learning (Brown, 2004). There are basically two types of assessment. The first is summative
assessment, which means gathering all the data about learners’ language knowledge and abilities and then evaluating their overall performance by focusing on the sum of learning at the end of a course or term (Coombe, 2018). The second is formative assessment, which refers to continuously evaluating and monitoring learners’ language development during the learning process (Coombe, 2018). While one-shot exams or tests, final projects, and the like are mostly used for summative purposes, observation protocols, portfolios, interviews, self-/peer-assessment tasks, and similar instruments are mostly employed for formative purposes. Regarding classroom assessment, Brown (2004) emphasizes that it is inherently formative but can also be performed for summative purposes, in order to inform and improve teaching and learning at all times. As far as the meaning and features of language assessment are concerned, van der Westhuizen (2016) puts forward that the same principles and forms of face-to-face assessment tasks can be applied appropriately to the online administration of assessment. However, to implement online assessment, specific digital platforms or tools are used via the Internet, and the assessment procedure is conducted without a physical setting. Thus, online language assessment refers to the assessment of learners’ language skills and knowledge by using web-based technologies available through the Internet; it can be practiced fully online, as in proficiency exams, or may only demand the online submission of academic work, for both summative and formative purposes (Weleschuk et al., 2019). Moreover, considering the mode of administration, online assessment can be carried out either synchronously, as when students complete an online quiz within a time limit while connected at the same time from different places, or asynchronously, as when they submit an assignment at any time and from anywhere over the Internet (Wright et al., 2013).
As a result, certain digital tools have gained popularity during ERT to undertake assessment procedures. For instance, Canvas, Blackboard, and the like have been used extensively for online assessment purposes as the popular Learning Management Systems (LMS) during ERT (Polisca et al., 2022): Teachers have opportunities to assign homework, grade them, give feedback, follow attendance, and open discussion forums because they are already integrated in such systems as the assessment component. Moreover, Mentimeter, Kahoot, Plickers, and similar Web 2.0 tools have been employed to assess learners’ skills and abilities while delivering online instruction through other digital platforms such as Zoom in which there is no assessment section (Assulaimani, 2021).

309

 Academic Integrity in Online Foreign Language Assessment

On the other hand, it has been revealed that some teachers were satisfied with online assessment during the pandemic because of its benefits, whereas others encountered difficulties that made them refrain from this form of assessment. The advantages of online assessment include the following: using varied digital multimedia that appeal to different assessment purposes as well as to learner autonomy (Oldfield et al., 2012); receiving or giving instant feedback in various forms, such as video- or audio-recorded feedback (Weleschuk et al., 2019); assessing complex language skills thanks to numerous digital applications (Kostova, 2020); and storing, accessing, and evaluating assessment data easily at any time and from anywhere (Khairil & Mokshein, 2018). Against such opportunities stand several challenges: inadequate knowledge and expertise in implementing online assessment; the unavailability of the necessary technical background; a lack of suitable policies, practical guidelines, and support; demands on time, energy, and cost; insecure systems; and difficulty in establishing reliable and valid assessments (Ahmad et al., 2021). Considering these benefits and challenges, Mahfoodh (2021) argues that four factors act as the prerequisites of ideal online language assessment: ‘High-speed Internet connection, digitally literate teachers, digital literate learners, and adequate digital platforms’ (p. 3). When such factors are missing or insufficient, the credibility of online assessment methods and results becomes questionable, which leads to concerns specifically related to academic integrity. The reason is that poor assessment brings about inappropriate practices and unreliable, invalid tasks; as a result, it becomes impossible to find out the real language abilities and skills of each learner because academic misconduct among learners increases (Gamage et al., 2020). Therefore, sustaining academic integrity is essential for undertaking online assessment effectively and securely in order to improve teaching and learning. That is, without good online assessment plans and designs, the quality of education cannot be assessed in an honest, fair, and ethical way. For this reason, it is crucial to establish a clear assessment policy with a code of ethics to prevent misunderstandings and problems before, during, and after assessment procedures are conducted (Hidri, 2020). Without such policies tied to a code of ethics of assessment, it is hard to ensure academic integrity. All in all, it can be concluded that preserving academic integrity in online language assessment is a contemporary challenge because there has been a huge increase in dishonest behaviors, especially during ERT practices. Hence, its significance in assessing language abilities online has been noticed more than ever.

CONCEPT OF ACADEMIC INTEGRITY IN ONLINE FOREIGN LANGUAGE ASSESSMENT

In this section, the meaning of academic integrity, its characteristics, its importance, and its relationship with online assessment are discussed first. Then, violations of academic integrity in relation to dishonesty or misconduct, and the reasons behind them, are exemplified.

Definition and Characteristics

Academic integrity is associated with moral principles and values that affect and guide stakeholders’ behaviors, educational procedures, and the consequences of academic work, research, teaching, learning, and assessment. In essence, academic integrity refers to ‘the commitment from students, faculty, and staff to demonstrate honest, moral behavior in their academic lives’ (Minocha, 2021, p. 2). Similarly, it is defined as
‘the compliance with ethical and professional principles, standards, practices, and a consistent system of values, that serves as guidance for making decisions and taking actions in education, research and scholarship’ by the European Network for Academic Integrity (ENAI) (Tauginiene et al., 2018, pp. 7-8). That is, it is related to such constructs as ethics, morality, fairness, and honesty. Therefore, certain values are required for academic integrity. In this regard, the International Center for Academic Integrity (ICAI) (2021) indicates that academic integrity is ‘a commitment to six fundamental values: honesty, trust, fairness, respect, responsibility, and courage’, and that ‘without them, the work of teachers, learners, and researchers loses value and credibility’ (p. 4). ICAI (2021) also illustrates how such values can be demonstrated by every member of the academic community to sustain academic integrity:

• Honesty is about ‘being truthful; giving credits to the owner of the work; keeping promises; providing factual evidence; aspiring to objectivity’, and it expands firstly from individuals to the community (p. 5).
• Trust is recognized when ‘clearly stating expectations and following through; promoting transparency in values, processes, and outcomes; trusting others; giving credence; encouraging mutual understanding; acting with genuineness’ (p. 6).
• Fairness occurs when ‘applying rules and policies consistently; engaging with others equitably; keeping an open-mind; being objective; taking responsibility for your own actions’ (p. 7).
• Respect is provided when ‘practicing active listening; receiving feedback willingly; accepting that others’ thoughts and ideas have validity; showing empathy; seeking open communication; affirming others and accepting differences; recognizing the consequences of our words and actions on others’ (p. 8).
• Responsibility is indicated when ‘holding yourself accountable for your actions; engaging with others in difficult conversations, even when silence might be easier; knowing and following institutional rules and conduct codes; creating, understanding, and respecting personal boundaries; following through with tasks and expectations; modeling good behavior’ (p. 9).
• Courage is shown when ‘being brave even when others might not; taking a stand to address a wrongdoing and supporting others doing the same; endure discomfort for something you believe in; being undaunted in defending integrity; being willing to take risk and risk failure’ (p. 10).

Regarding these values, it can be concluded that not just students but also teachers and other stakeholders are expected to demonstrate academic integrity throughout their educational lives, because academic integrity influences the quality and standards of teaching, learning, assessment, and research. From the perspective of assessment, it is important to ensure academic integrity by securing assessment practices: preventing attempts to cheat, implementing the related policies and rules, encouraging stakeholders, especially students, to act ethically in their work, and treating them in a fair and unbiased way while assessing their skills and knowledge (Bearman et al., 2020; British Council, 2020). In order to do this, students, teachers, and institutions should possess the fundamental values that ICAI (2021) propounds. Accordingly, certain behaviors are expected from all members of the academic community to maintain academic integrity in assessment in line with ICAI’s (2021) fundamental values (Celik & Lancaster, 2021). Firstly, the members should be honest with themselves and with each other (Gamage et al., 2020). Secondly, students should prepare their assignments in a sincere way, and teachers and institutions should provide transparent and consistent guidelines and standards for assignments and apply them fairly while evaluating them, in order to demonstrate trust
(Augusta & Henderson, 2021). Thirdly, to demonstrate fairness in assessment, students should produce original tasks or assignments, acknowledging others’ work and giving references in accordance with the related policies, while teachers and faculty should communicate clear expectations about dishonest behaviors and, at the same time, behave objectively (Minocha, 2021). Fourthly, students should respect various ideas when using new knowledge in their tasks, and teachers and institutions should respect the diverse characteristics and opinions of students by giving sincere feedback that values them (Tauginiene et al., 2019). Fifthly, students should be aware of their responsibility for their own assignments and know the codes of conduct so that they can oppose malpractices related to dishonesty (Gamage et al., 2020). In the same vein, teachers and other members of faculty should take responsibility for their actions while dealing with dishonesty and be role models of good behavior to sustain integrity in assessment (Egan, 2018). Lastly, all academic members should be courageous when they encounter misconduct, not only in their own work but also in others’ work, even though there may be negative consequences such as low grades (Douglas College, 2020). Otherwise, misconduct would come to seem normal, especially if sanctions were not applied, and ensuring academic integrity would not be possible. Furthermore, with respect to the digital side of assessment, the Centre for Research in Assessment and Digital Learning (CRADLE) defines academic integrity within the scope of digital assessment as ‘to equip students with the competences and values necessary to engage in ethical scholarship while assessment security focuses on securing assessment against cheating, and on detecting any cheating that may have occurred’ (Gamage et al., 2020, p. 5). However, no matter how much institutions, teachers, and learners try to commit to academic integrity values in assessment practices, there are still breaches that have an impact on assessment quality. In fact, breaching academic integrity amounts to academic dishonesty or misconduct, that is, deceitful behaviors and actions such as cheating, plagiarism, unfair treatment, and insecurity involved in assessment (Sabrina et al., 2022). Instead of behaving in a trustworthy, honest, responsible, respectful, fair, and courageous way during assessment procedures, some resort to short-cuts, such as cheating in exams, colluding with others to copy work, presenting others’ ideas as their own, and falsifying references, for reasons such as getting better grades; these are acknowledged as dishonest behaviors in assessment and show that academic integrity has been violated (Blau et al., 2020). Yet, as mentioned before, academic dishonesty in online assessment has grown, especially during the pandemic when ERT was conducted, because of the availability and ease of digital technologies, the lack of knowledge and skills in performing online assessment or the inability to modify assessment strategies, the lack of awareness of what academic integrity means, and inadequate information, support, or policy for handling misconduct (Adzima, 2020; Gamage et al., 2020). Although academic integrity cannot be totally guaranteed when assessment is performed remotely, there are certain ways to tackle academic dishonesty in online assessment so that the quality of education is improved (Gamage et al., 2020).
But first, in order to overcome dishonesty in online assessment by revising existing assessment designs, it is important to be aware of what kinds of violations of academic integrity there are and what their reasons are.

Violations: Types, Features, and Reasons

Academic integrity is violated when students, teachers, and other stakeholders in an educational context do not behave in an honest, fair, ethical, or responsible way. Concerning assessment practices, violations mostly take place when students cheat, plagiarize, fabricate, and engage in
collusion, and when teachers and institutions do not consistently apply the established policies against such unethical behaviors and actions. Hence, it can be surmised that without handling such violations in assessment practices, it is not possible to cultivate the academic integrity needed to improve the quality of teaching and learning. Academic dishonesty or misconduct basically means breaching academic integrity principles and values. It is defined as ‘morally culpable behaviors perpetrated by individuals or institutions that transgress standards of moral behavior held in common between other individuals and/or groups in institutions of education, research, or scholarship’ (Jordan, 2013, p. 252). Therefore, any behavior or action that undermines academic integrity values can give rise to immorality and unfairness, creating advantages or disadvantages for different students, teachers, or faculty members in their academic lives (Tauginiene et al., 2018). It is thus crucial to be aware of what constitutes academic misconduct and to know what kinds of behaviors count as dishonest in order to handle it effectively. The most common types of dishonest behavior represented in academic works and assignments are the following: cheating, contract cheating, plagiarism, collusion, fabrication, falsification, and facilitation. Cheating is ‘the intentional use of study materials, information or any kind of aid, the use of which is not allowed, including consulting others’ (Blau et al., 2020, p. 159). In other words, it is a kind of unauthorized assistance in preparing assignments, completing assessment tasks, or taking exams. Prevalent cheating behaviors include copying in exams, stealing answer keys and materials, possessing crib notes or digital tools especially during exams, and submitting identical assignments (Olt, 2002; Tauginiene et al., 2019). There is also a special form of cheating, particularly on the rise in digital assessment, namely contract cheating. It is defined as ‘the submission of work by students for academic credit which the students have paid contractors to write for them’ (Clarke & Lancaster, 2006, p. 19). It happens when students use a third party, purchasing already finished assignments to submit as their own (British Council, 2020). Such third parties are commercial organizations or services also known as contract cheating websites, paper mills, or essay mills (Tauginiene et al., 2019). It is a very dangerous form of cheating because even text-matching software systems like Turnitin cannot detect whether assignments were written by students or by ghost-writers, since the text is originally generated by professional teams (Erguvan, 2021; Tauginiene et al., 2019). Though contract cheating had already been a challenge as a dishonest behavior in assessment (Bretag et al., 2019), its use has increased during COVID-19 because institutions have moved to online assessments and such online services have targeted students more (Ahsan et al., 2021). Another type of academic misconduct is plagiarizing someone else’s work without acknowledging it. Plagiarism refers to ‘intentional or unintentional uses of another person’s words or ideas without properly crediting their source’ (Youmans, 2011, p. 1); that is, presenting someone else’s work, ideas, and the like as one’s own without citing the source of information, whether intentionally or unintentionally.
Intentional plagiarism is a deliberate theft of ideas committed without crediting the source, despite knowing how to give references and citations (Park, 2003). Unintentional plagiarism, by contrast, stems from a lack of knowledge of how to cite, give references, and paraphrase appropriately (Park, 2003). Nonetheless, whether intentional or unintentional, plagiarism is a threat to academic integrity, and it is important to prevent or penalize it. To tackle it, one first needs to be familiar with the types of plagiaristic behavior. The most frequently encountered ones are: using texts, images, or other types of content that belong to someone else without citing them; copying someone else’s ideas and presenting them as one’s own work; using invalid resources; inappropriate paraphrasing; patchwriting; and translation without acknowledging the original work (Awasthi, 2019; Blau et al., 2020; Tauginiene et al., 2019).
In addition, collusion is another dishonest behavior. It is the act of collaboration among students to share resources and help each other in a way that is not allowed under the regulations of assessment (British Council, 2020). For example, students may assist each other in examination situations or attempt to help others cheat or copy, which might give unfair privilege to a certain group of students (Tauginiene et al., 2019). Moreover, there are other unacceptable behaviors that count as academic misconduct. For instance, fabrication denotes making up or inventing information, data, and the like that do not exist and using them in assigned academic work (Blau et al., 2020); for example, students include non-existent citations or publications or simply provide incorrect sources of data in their assignments. Another dishonest behavior involves falsifying one’s attendance or identity, changing one’s own work and resubmitting it, or manipulating and misrepresenting others’ ideas, all of which pertain to falsification (Tauginiene et al., 2019). In other words, students may recycle previous tasks that have already been graded and resubmit them, or impersonate an identity to access the assessment. Lastly, facilitation means ‘helping others to engage in cheating’ (Holden et al., 2021, p. 2); that is, it refers to intentionally assisting others in breaching academic integrity (Blau et al., 2020). All such behaviors and actions attempted in academic work, teaching, learning, and assessment undermine academic integrity values. The types and features of academic dishonesty have been exemplified so far. Yet there is also a need to present and discuss the reasons behind them, so that the members of the academic community know how to overcome them and, at the same time, can modify their own behaviors to act honestly in their assignments or tasks, or change their assessment designs, evaluation, and related policies. For example, Egan (2018) discusses the reasons why students cheat in assessments, giving examples such as the need or wish to get better scores, being too busy and procrastinating, not having enough time, a heavy workload, not being aware of what counts as cheating, and not caring because of the lack of deterrence. In addition to these, differences in academic ability or a lack of competence, low motivation, personal attitudes, an underdeveloped sense of integrity, discontent with the teaching/learning context, and family or peer pressure for competitiveness have been reported as further reasons for cheating behaviors (Amigud & Lancaster, 2019; Ellis et al., 2020). Furthermore, with regard to plagiarism, a lack of appropriate academic writing skills for preparing assignments, poor language skills, and the temptation of the Internet, given the ease of accessing and copying information, are other causes that lead students to plagiarize (Awasthi, 2019). Also, Amzalag et al. (2022) categorize the reasons for academic dishonesty under two headings: personal-intrinsic factors, such as insufficient knowledge and laziness, and extrinsic factors, such as faculty’s ignorance of unethical behaviors and the lack of necessary policies. Apart from these factors, other reasons for academic dishonesty have been explored by comparing face-to-face and online contexts, given the dominance of digital technologies and the inevitable use of online assessment during the pandemic.
The underlying rationale behind such studies seems to be bound up with digitalization and the characteristics of online environments, for there is a new concept that reflects concerns about dishonesty in online contexts. This is called e-dishonesty, which means ‘behaviors that depart from academic integrity in the online environment’ and which ‘raises new considerations that may not have been previously considered by instructors and administrators […] such as searching the Internet, communicating with others over a messaging system, purchasing answers from others, accessing local/external storage on their computer, or accessing a book or notes directly’ (Holden et al., 2021, p. 2). It can thus be inferred that academic dishonesty transforms into e-dishonesty where online platforms are concerned. Although Adzima (2020) argues that most dishonest behaviors in face-to-face classes may also apply to online ones, she emphasizes the
absence of proctoring, fake identities, a lower risk of being caught, inadequate academic integrity policies for online practices, and the perception that it is easy to cheat online as the new challenges of academic dishonesty in online courses and assessment. Besides, Gamage et al. (2020) point out that the variation in academic integrity policies among universities, which results from the different cultural backgrounds of academic contexts, may have led students to take short-cuts in their online assessment during the pandemic. In addition, Holden et al. (2021) group the reasons for the academic dishonesty that students display in online assessment at ERT times under four factors: ‘Individual factors (e.g., incentives or pressures to get higher grades, the belief that cheating is not a problem according to personal ethics); institutional factors (e.g., existence of cheating culture, inadequate sanctions or penalization established in the policies); medium of delivery (e.g., perception of the tendency to cheat online); assessment-specific factors (e.g., the effect of assessment types such as more opportunities to cheat in summative assessment or less plagiarism in essays if open-book one is employed)’ (pp. 2-5). Finally, the British Council (2020) reports other factors that threaten academic integrity in online assessment, such as a lack of assistance for those who are unfamiliar with the concepts, poor interaction between students and academic staff concerning integrity policies or demands, the absence of consistent action against dishonesty, formats of some assessment tasks that might facilitate easy cheating, and unfair or inconsistent strategies applied to breaches of integrity. In conclusion, it is imperative to know the definition of academic dishonesty and the reasons behind violations of academic integrity in order to overcome such violations and enhance teaching and learning through appropriate assessment designs in online settings.

RECENT RESEARCH ON ACADEMIC INTEGRITY RELATED TO ONLINE ASSESSMENT

In this part of the chapter, contemporary research on academic integrity and academic dishonesty in online assessment at ERT times during the COVID-19 pandemic is presented. However, since very few studies have been conducted within the scope of online foreign language assessment, relevant studies carried out in general education contexts are also reported, so as to guide assessment procedures and give insights into these issues. To begin with, there are systematic literature reviews on online assessment security, challenges, and integrity strategies. For example, Garg and Goel (2022) focused on student academic dishonesty and revealed that both individual factors (e.g., personality, laziness, unwillingness, and the thought that everyone is doing it and nobody is hurt) and environmental factors (e.g., too much course workload, time demands, lack of training and integrity policies, Internet availability) cause online academic dishonesty. They also found that students engage in dishonest behaviors in their assessment tasks through impersonation, using or consulting forbidden aids such as mobile phones, colluding with other students to complete tasks or tests, plagiarizing, and exploiting gaps in the systems, such as multiple attempts to submit assignments. Besides, they reported the strategies employed to address online academic dishonesty, such as modifying the assessment format (e.g., randomizing items and using various assessment tasks), adopting and implementing integrity policies (e.g., ethics training and honor codes), proctoring students through cameras and screen recording when synchronous tests are applied, and using plagiarism detection tools for written assignments. As a result of their review, they concluded that there should be an academic dishonesty mitigation plan in online assessment, which
requires different roles from all the stakeholders concerned in assessment procedures in order to prevent and detect dishonest practices. In addition to this review, Sabrina et al. (2022) investigated the different online assessment options and academic integrity strategies utilized in online courses. They found that written assignments, online discussions, projects, reflections, peer evaluation, and the like are mostly used as online assessment methods. Moreover, they reported the ways in which academic integrity is violated through plagiarism, cheating, fabrication, and related dishonest actions, and demonstrated how such violations are treated. For instance, informing students about the consequences of cheating in line with the relevant policy, providing a module that explains the expected behaviors, randomizing the order of questions, monitoring students via cameras, implementing academic integrity policies by detecting and deterring misconduct in online assessment, and setting deadlines are the most commonly used strategies. In conclusion, Sabrina et al. (2022) underlined the fact that ‘no single method or design is enough to eliminate all sorts of academic integrity violations’ (p. 64), but appropriate online assessment designs, consciousness-raising among students, and suitable institutional policies may help to foster academic integrity in online assessment. Beyond such reviews, some studies address the challenges of online assessment during COVID-19, and they have documented a rise in academic dishonesty due to the unpreparedness of institutions and improper assessment types. For example, Sharadgah and Sa’di (2020) surveyed university faculty about assessment in virtual environments, and their results indicated that the faculty was not ready for online assessment; they therefore held concerns about academic dishonesty, and the majority agreed that most students cheated in online tests. Likewise, Guangul et al. (2020) used questionnaires to find out the most frequently used online assessment types and challenges in a college. The results showed that assignments or project-based assessment within a time limit were preferred over proctored exams, and that the main challenges were academic dishonesty, students’ indifference to assessment submission, infrastructure issues such as Internet problems, and the inability to cover the syllabus and learning outcomes, of which academic dishonesty was the participants’ top concern. For this reason, they suggested some ways to minimize violations of academic integrity, such as combining different assessment methods and strategies and using online presentations. The most prevalent research topic concerning academic integrity and dishonesty in online assessment during the pandemic is the exploration of the perceptions and behaviors of students and other faculty members, especially in higher education institutions. Most of these studies used questionnaires to survey opinions and practices, and they produced somewhat similar findings. For example, several studies revealed that students cheated in online exams more frequently than in face-to-face classes due to the immediate and unplanned transition to online assessment during COVID-19 (Alessio & Messinger, 2021; Blinova, 2022; Janke et al., 2021; Shariffuddin et al., 2022; Valizadeh, 2022). But one study indicated different perceptions about the easiness of cheating in online assessments.
Students believed it was more difficult to cheat in online exams, whereas academic staff did not think there was a difference between invigilated traditional exams and non-invigilated timed online exams with regard to cheating (Reedy et al., 2021). In addition, the most frequently reported ways of cheating comprised using the Internet without permission, colluding with others in exams, copying and pasting information from the Internet, consulting course notes in tests, exchanging and discussing ideas with classmates, and submitting others’ assignments as one’s own (Blinova, 2022; Janke et al., 2021; Shariffuddin et al., 2022; Valizadeh, 2022). Moreover, a number of studies showed that the main reasons for dishonest behaviors, specifically cheating, were a lack of knowledge, low motivation for exams or assignments stemming from the belief that the task is not useful or is too difficult, the desire to get higher
marks, technical problems, the perception that it is not immoral, increased workload, internal and external pressures regarding time, and negative attitudes towards the reliability of online exams (Amzalag et al., 2022; Blinova, 2022; Meccawy et al., 2021; Valizadeh, 2022). Besides, Shariffuddin et al. (2022) investigated how students perceive their lecturers and higher education institutions in terms of preventing academic dishonesty in online assessment. Their findings indicated that lecturers mostly warned students not to cheat or copy, and informed them of the negative consequences of dishonest behaviors by stating the penalties. They also showed that students’ institutions encouraged action against academic dishonesty through the policies they provided. In a similar vein, some studies presented methods for dealing with academic dishonesty and deterring cheating. For instance, using Turnitin to check similarities between assignments, using webcams to monitor online exams, making greater use of formative assessment such as coursework, short quizzes, and projects, limiting the time to complete or submit, randomizing questions or tasks, providing different tasks to different students, and designing questions that appeal to higher-order thinking skills were generally perceived as suitable ways to cope with academic dishonesty in online assessment during the pandemic (Meccawy et al., 2021; Paullet, 2020; Reedy et al., 2021). To sum up, the studies of perceptions and behaviors discussed so far have yielded broadly similar findings about academic integrity in online assessment. When it comes to the context of online foreign language assessment, there are very few research studies on academic integrity in online assessment during COVID-19. There is therefore a need for more research on this topic from the perspective of online language assessment, because assessing language skills and knowledge demands careful assessment planning, administration, and evaluation when online contexts are involved. To start with, Behforouz (2022) conducted a review study and presented critical findings. For example, language assessment principles such as reliability, validity, authenticity, and transparency lost their significance during the pandemic, which caused teachers to question academic integrity. Teachers were accordingly found to be more anxious about plagiarism, cheating, and other misconduct in online language assessment, even though some precautions were taken by teachers and institutions. Behforouz (2022) also noted three serious problems encountered in online assessment: ‘getting assessment answers in advance’, ‘unfair retaking of assessment’, and ‘unauthorized help within the assessments’ (pp. 571-572). Another study addressed plagiarism among students learning English as a foreign language (Nagi & John, 2020) and found that students ignored different forms of plagiarism, were incapable of writing in English properly and so were tempted to copy, and wanted to get higher and better marks for their assignments. Besides, Rofiah and Waluyo (2020) focused on online vocabulary assessment in English using a Web 2.0 tool (Socrative) in order to investigate whether adopting such an assessment tool affected students’ perceptions of cheating. Their study revealed that most students found it easy to cheat and to translate the questions asked in tests in this application, and students stated that they would prefer Socrative to paper-based tests because of the ease of cheating.
The researchers therefore concluded that digital technology creates more opportunities for cheating, so teachers should take precautions such as proctoring students’ finger movements during test time. In addition, another study demonstrated the importance of an online assessment design composed of various steps to promote academic integrity. Bjelobaba (2021) implemented an online assessment design intended to deter cheating in a grammar course for pre-service language teachers; the design included different types of assessment tasks such as videos, exercises, asynchronous discussions, and synchronous seminars. After administering the design, the researcher concluded that peer-assessment and feedback were useful in decreasing collusion, and that using varied assessment tasks was more beneficial for reducing cheating behaviors than summative exams. Apart from such studies, Erguvan (2021)
investigated whether contract cheating increased during the pandemic. The researcher interviewed academics working in the English language departments of universities, and the findings showed that all the faculty members were able to detect contract cheating easily due to the perfection of the assignments, and that students mostly consulted paper mills, friends, or family owing to laziness, the pressure to get the highest grades, and the easy accessibility of such mills. Since contract cheating was on the rise and posed a threat to reliable language assessment, teachers employed individual strategies to handle it, such as changing the types of assessment tasks, using detection software, and giving low grades. These strategies were individual because the institution in this study only informed teachers about the consequences of such dishonest behaviors but did not take action against such cases or implement sanctions. Furthermore, Celik and Lancaster (2021) explored breaches of academic integrity when English was taught online both synchronously and asynchronously during COVID-19. After eliciting the attitudes and potential violations of teachers and students through surveys, the researchers reported that most students committed to the values of academic integrity but were more willing to use translation programs to prepare their assignments when there was no support or help. The researchers also presented the common violations in online English courses under three categories: exam-related violations (e.g., getting help from those whose English is better or searching for the answers on the Internet during the exam), assignment-related violations (e.g., taking advantage of contract cheating websites for homework), and online session-related violations (e.g., avoiding responding to teachers’ questions by offering excuses related to technical problems). Likewise, they grouped the threats to academic integrity in online classes under four headings: exam-related threats (e.g., using only multiple-choice exams to test), assignment-related threats (e.g., assigning tasks too challenging for students’ language capabilities), online session-related threats (e.g., overly long online lessons), and other threats (e.g., lack of digital literacy, ignoring misconduct). The last study, by Alenezi (2022), concerns how English university teachers used online assessment and what attitudes they held towards it at ERT times. The researcher surveyed teachers, and the findings showed that portfolios and online presentations were the most frequently used online assessment types, though teachers preferred summative assessment the most. The researcher also reported the constraints teachers faced, for example, limited time to prepare online assessment, the need for digital literacy in designing it, and technical problems. Regarding cheating, teachers stated that the most common behaviors were getting help from others and copying work from Internet resources. To conclude, these research studies have contributed to the understanding of academic integrity in the online assessment employed during the pandemic by presenting its various characteristics and practices.

RECOMMENDATIONS AND GUIDELINES

The current research investigating what academic integrity means, what sorts of violations of it there are, and how to overcome dishonest behaviors in online assessment has revealed some urgent needs of the stakeholders in the academic community. The prominent ones include little awareness of the concept of academic integrity with respect to online assessment, limited knowledge of what constitutes online academic dishonesty and how to respond to it, the neglect of assessment-related ethical issues in the training of teachers and students, the inability to plan sound assessments and perform them securely, and the lack of robust policies that uphold academic integrity values while reacting to misconduct consistently and fairly. Thus, in this section,
suggested guidelines and strategies from the literature for coping with dishonest behaviors and maintaining academic integrity in online assessment are presented. In the first place, it is important to share an institutional policy of academic integrity for online education with teachers and students, so that they know what good behaviors are expected, what counts as dishonest behavior, what the consequences of misconduct are, and how it should be responded to. Although policies or integrity programs may differ from one institution to another because of cultural or contextual differences or other needs, recommended or established ones may be used as guidelines and adapted to each context (Gamage et al., 2020; ICAI, 2021). The presence of a policy, followed by awareness and explicitness about it, is the first step towards ensuring academic integrity. The content of such policies may therefore be integrated into online course platforms and into each course’s module (Holden et al., 2021). The content might be in written form or visualized using infographics, and it can contain the honor codes, the expectations for lessons and assessments, study skills, and the rules and principles of online assessment (Minocha, 2021). This information should be revisited systematically in courses, seminars, and other training for all stakeholders to raise their awareness further (Surahman & Wang, 2022); otherwise, the policy would be impractical, since students or teachers would not know or understand the meaning of academic integrity and dishonesty, or the policies designed to safeguard integrity and to deter and detect dishonesty. Also, holding discussions with students and teachers and building positive relationships within institutions around integrity values are important for monitoring and mitigating concerns about academic dishonesty policies in digital assessment (Amzalag et al., 2022; Bearman et al., 2020; British Council, 2020). In addition, clear ethical and unethical standards set out in policies, together with stated punishments or penalties, might help to prevent misconduct before it occurs (Blau et al., 2020). For that purpose, cheating and plagiaristic behaviors can be exemplified, and more support can be provided to learners by teachers and other faculty members. Yet presenting the expected ethical behaviors is not sufficient: institutions should give guidance on dealing with any dishonest behavior, and penalties or sanctions should be applied consistently and fairly to deter them (British Council, 2020). Likewise, training and professional development activities are needed to share practices, model good behaviors, and discuss experiences related to misconduct (Minocha, 2021). In terms of online assessment training, pre-service teachers in particular should be trained in academic integrity to shape their perceptions, and in how to overcome dishonesty by planning, administering, and evaluating various online summative and formative assessment forms that are valid, reliable, fair, and secure (Bjelobaba, 2021; Sharadgah & Sa’di, 2020). Considering online assessment procedures such as design and administration, it is essential to rethink the assessment strategies used, take precautions to make assessments secure, and monitor dishonest behaviors so that nobody gains an advantage over others while their skills and knowledge are being evaluated.
For example, if summative assessment is used, invigilated exams are suggested, in which students are proctored via webcams and software is used to block their browsers from searching for answers on the Internet or accessing other devices for help (British Council, 2020; Elkhatat et al., 2021; Holden et al., 2021; Sharadgah & Sa’di, 2020). Moreover, limiting the time and the number of attempts to answer, randomizing the order of questions, and avoiding previously used test items are other strategies for mitigating cheating in online summative assessment (Elkhatat et al., 2021; Gamage et al., 2020; Holden et al., 2021). As for written assignments that can be submitted after a prolonged period, plagiarism detection tools such as Turnitin are recommended to find any matched texts that students have not cited (Awasthi, 2019; Sharadgah & Sa’di, 2020). In fact, the use of more formative assessment alongside summative assessment is suggested when online
procedures are involved. This is because there is a tendency to cheat more, particularly when multiple-choice items are used in one single summative assessment (British Council, 2020). Thus, more frequent use of formative assessment in online settings, with varied tasks that are authentic, individualized, contextualized, performance- and process-oriented, and that require higher-order thinking skills such as critique and problem-solving, should be adopted to increase academic integrity (Egan, 2018; Reedy et al., 2021; Surahman & Wang, 2022). In a similar vein, the use of self- or peer-assessment and reflections, as well as providing feedback on a continuous basis, might be beneficial for safeguarding academic integrity (Bjelobaba, 2021; Egan, 2018). But before using such assessment designs for online administration, teachers should inform students about academic integrity and dishonesty policies and obtain their agreement. For instance, they can distribute notice forms or checklists containing the relevant statements of the policies or principles, which students can read, sign, and keep during the course period (Rahim, 2020). Then, when breaches of academic integrity occur, everybody knows the consequences. For example, penalties may be imposed, such as retaking a substitute exam or assignment, receiving zero points or score reductions, or being suspended for a period (Douglas College, 2020). To summarize the points discussed so far, four factors need attention when academic integrity and dishonesty in online assessment are considered: policy, awareness, ongoing training, and action. First, there should be academic integrity policies so that each stakeholder knows the values and commits to them. Second, stakeholders should be aware of the principles and sanctions; that is, awareness-raising about what is and is not appropriate in online assessment is needed, because the perception of ethics has changed with the widespread use of digital technologies, and some behaviors may seem innocent to students when they are not (Augusta & Henderson, 2021). It is therefore important to revisit the characteristics of academic integrity and dishonesty in this digital age. Third, ongoing training and professional development should be provided to keep up with changing perspectives, statements, and expectations about these issues, so that people know how to behave. Finally, taking action is important for ensuring academic integrity; this action may involve preparing sound assessment designs and practicing them effectively, implementing deterrence strategies and penalties against dishonesty, or simply modelling good behaviors.
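To make the randomization and time-limiting strategies above more concrete, the following minimal sketch (not drawn from any of the cited studies; the names ITEM_BANK, randomized_paper, and student_id are purely illustrative) shows, in Python, how an item bank might be sampled and shuffled per student with a seeded random generator so that each test-taker receives a different but reproducible selection and ordering of questions:

    import random

    # Illustrative item bank; in practice this would come from the LMS.
    ITEM_BANK = [
        "Q1: Choose the correct verb tense ...",
        "Q2: Fill in the missing preposition ...",
        "Q3: Reorder the scrambled sentence ...",
        "Q4: Pick the best paraphrase ...",
        "Q5: Identify the error in the sentence ...",
    ]

    def randomized_paper(student_id, n_items=3):
        # Seeding with the student's ID makes the paper different for
        # each student yet reproducible later for checking or re-grading.
        rng = random.Random(student_id)
        return rng.sample(ITEM_BANK, k=n_items)  # sample without replacement

    for sid in ("student-001", "student-002"):
        print(sid, randomized_paper(sid))

The reproducible seed is a deliberate design choice in this sketch: a purely random shuffle would make it difficult to reconstruct which paper a given student actually saw, whereas a seeded one supports both fairness and later verification.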
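In the same spirit, the text-matching tools recommended above (e.g., Turnitin) rest on comparing a submission against source texts for overlapping word sequences. The toy sketch below is emphatically not Turnitin’s actual algorithm, only an assumed simplification based on word-trigram overlap (the Jaccard index); it also makes visible why fully original, ghost-written text defeats such tools, as noted earlier in the discussion of contract cheating: with no shared word sequences, the similarity score stays at zero.

    def ngrams(text, n=3):
        # Split a text into overlapping word n-grams (here: trigrams).
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def similarity(submission, source, n=3):
        # Jaccard index: shared n-grams over all distinct n-grams.
        a, b = ngrams(submission, n), ngrams(source, n)
        if not a or not b:
            return 0.0
        return len(a & b) / len(a | b)

    source = "academic integrity is a commitment to six fundamental values"
    copied = "academic integrity is a commitment to six core values"
    ghost_written = "a contractor produced this entirely new text from scratch"

    print(similarity(copied, source))         # high overlap -> likely flagged
    print(similarity(ghost_written, source))  # 0.0 -> passes undetected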

FUTURE RESEARCH DIRECTIONS

This chapter has presented contemporary issues and research studies on academic integrity and dishonesty in online assessment, especially studies conducted during the pandemic. Most of these studies used surveys to elicit the perceptions of students, teachers, and other faculty members, and were conducted in higher education contexts without attention to disciplinary differences. Since attempts to study academic dishonesty in online settings are very recent and still scarce (Adzima, 2020; Shariffuddin et al., 2022), there is room to make a significant contribution to the literature. First of all, it may be useful to address the relationships among conceptions of academic integrity, online assessment practices, and the potential causes of e-dishonesty beyond higher education institutions. This is also underlined by Surahman and Wang (2022), who stated that more detailed studies investigating different teaching contexts, such as primary and secondary schools, are needed. Besides, it might be helpful to carry out more empirical studies designed to elicit how integrity policies are operated, what assessment practices are used, and which challenges and solutions arise while tackling dishonest behaviors in online assessment, rather than relying only on perceptions and beliefs, so as to exemplify and guide related procedures in other contexts. For this reason, various research designs and instruments apart from surveys and interviews are needed to reveal what happens, and what should be done, about academic integrity in online assessment, to the benefit of stakeholders. Although the main focus of this chapter is online assessment, it could also be fruitful to compare face-to-face and online classes across different assessment types with respect to dishonest behaviors in order to suggest suitable strategies (Holden et al., 2021), because little is known about what changed before, during, and after the pandemic in terms of academic dishonesty in online assessment methods. Last but not least, there is a scarcity of discipline-based studies on the approaches used in online assessment to mitigate cheating behaviors and promote academic integrity (Reedy et al., 2021). Studying academic integrity in online assessment from the perspective of foreign languages may therefore provide valuable insights into this issue, and it may also enlighten and assist students, teachers, and other members of the academic community.

All in all, more research is needed from various perspectives on academic integrity in online assessment, covering different educational contexts and using other research designs, in order to develop a better understanding of these concepts and, at the same time, to inform and help the concerned stakeholders in planning and practicing such assessment while paying attention to academic integrity values and raising awareness of these notions across disciplines.

CONCLUSION

The present chapter has addressed the concept of academic integrity in online foreign language assessment by discussing recent research conducted during the COVID-19 pandemic. It is hoped that this chapter has contributed to the literature in significant ways by providing the background and the latest conceptions and practices, reporting some guidelines, and suggesting further research areas. It is obvious that there is much more to be done to clarify these issues in this digital era, so this chapter is believed to pave the way for future studies that will concentrate on different aspects of academic integrity in online assessment.

REFERENCES

Adzima, K. (2020). Examining online cheating in higher education using traditional classroom cheating as a guide. Electronic Journal of E-Learning, 18(6), 476–493. doi:10.34190/JEL.18.6.002
Ahmad, N., Rahim, I. S. A., & Ahmad, S. (2021). Challenges in implementing online language assessment: A critical reflection of issues faced amidst the Covid-19 pandemic. In F. Baharom, Y. Yusof, R. Romli, H. Mohd, M. A. Saip, S. F. P. Mohamed, & Z. M. Aji (Eds.), Proceedings of Knowledge Management International Conference (KMICe) 2021 (pp. 74–79). UUM College of Arts and Sciences.
Ahsan, K., Akbar, S., & Kam, B. (2021). Contract cheating in higher education: A systematic literature review and future research agenda. Assessment & Evaluation in Higher Education, 47(4), 523–539. doi:10.1080/02602938.2021.1931660
Alenezi, S. M. (2022). Tertiary level English language teachers’ use of, and attitudes to alternative and online assessments during the Covid-19 outbreak. International Journal of Education and Information Technologies, 16, 39–49. doi:10.46300/9109.2022.16.4
Alessio, H. M., & Messinger, J. D. (2021). Faculty and student perceptions of academic integrity in technology-assisted learning and testing. Frontiers in Education, 6, 629220. Advance online publication. doi:10.3389/feduc.2021.629220
Amigud, A., & Lancaster, T. (2019). 246 reasons to cheat: An analysis of students’ reasons for seeking to outsource academic work. Computers & Education, 134, 98–107. doi:10.1016/j.compedu.2019.01.017
Amzalag, M., Shapira, N., & Dolev, N. (2022). Two sides of the coin: Lack of academic integrity in exams during the Corona pandemic, students’ and lecturers’ perceptions. Journal of Academic Ethics, 20(2), 243–263. doi:10.1007/s10805-021-09413-5 PMID:33846681
Anasse, K., & Rhandy, R. (2021). Teachers’ attitudes towards online writing assessment during Covid-19 pandemic. International Journal of Linguistics, Literature and Translation, 3(8), 65–70. doi:10.32996/ijllt.2021.4.8.9
Assulaimani, T. (2021). Alternative language assessments in the digital age. Journal of King Abdulaziz University Arts and Humanities, 29(1), 597–609. doi:10.4197/Art.29-1.20
Augusta, C., & Henderson, R. D. E. (2021). Student academic integrity in online learning in higher education in the era of COVID-19. In C. Cheong, J. Coldwell-Nielson, K. MacCallum, T. Luo, & A. Scime (Eds.), COVID-19 and education: Learning and teaching in a pandemic-constrained environment (pp. 409–423). Informing Science Press. doi:10.35542/osf.io/a3bnp
Awasthi, S. (2019). Plagiarism and academic misconduct: A systematic review. DESIDOC Journal of Library and Information Technology, 39(2), 94–100. doi:10.14429/djlit.39.2.13622
Bearman, M., Dawson, P., O’Donnell, M., Tai, J., & Jorre de St Jorre, T. (2020). Ensuring academic integrity and assessment security with redesigned online delivery. Deakin University. https://dteach.deakin.edu.au/2020/03/23/academic-integrity-online/
Beck, G., Tsaryk, O. M., & Rybina, N. V. (2020). Teaching and assessment strategies in online foreign languages distance learning. Медична Освіта, 2(2), 6–13. doi:10.11603/me.2414-5998.2020.2.11139
Behforouz, B. (2022). Online assessment and the features in language education context: A brief review. Journal of Language and Linguistic Studies, 18(1), 564–576.
Bjelobaba, S. (2021). Deterring cheating using a complex assessment design: A case study. The Literacy Trek, 7(1), 55–77. doi:10.47216/literacytrek.936053
Blau, I., Goldberg, S., Friedman, A., & Eshet-Alkalai, Y. (2020). Violation of digital and analog academic integrity through the eyes of faculty members and students: Do institutional role and technology change ethical perspectives? Journal of Computing in Higher Education, 33(1), 157–187. doi:10.1007/s12528-020-09260-0 PMID:32837125
Blinova, O. (2022). What Covid taught us about assessment: Students’ perceptions of academic integrity in distance learning. INTED2022 Proceedings, 6214–6218. doi:10.21125/inted.2022.1576
Boitshwarelo, B., Reedy, A. K., & Billany, T. (2017). Envisioning the use of online tests in assessing twenty-first century learning: A literature review. Research and Practice in Technology Enhanced Learning, 12(1), 1–16. doi:10.1186/s41039-017-0055-7 PMID:30595721
Bretag, T., Harper, R., Burton, M., Ellis, C., Newton, P., van Haeringen, K., Saddiqui, S., & Rozenberg, P. (2019). Contract cheating and assessment design: Exploring the relationship. Assessment & Evaluation in Higher Education, 44(5), 676–691. doi:10.1080/02602938.2018.1527892
British Council. (2020). How can universities conduct online assessment that is secure and credible? https://www.britishcouncil.uz/sites/default/files/spotlight_report_how_can_universities_conduct_online_assessment_that_is_secure_and_credible_0.pdf
Brown, H. D. (2004). Language assessment: Principles and classroom practices. Pearson Longman Education.
Celik, O., & Lancaster, T. (2021). Violations of and threats to academic integrity in online English language teaching: Revealing the attitudes of students. The Literacy Trek, 7(1), 34–54. doi:10.47216/literacytrek.932316
Clarke, R., & Lancaster, T. (2006). Eliminating the successor to plagiarism? Identifying the usage of contract cheating sites. Proceedings of the 2nd International Plagiarism Conference. Northumbria Learning Press. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.5440&rep=rep1&type=pdf
Coombe, C. (2018). An A to Z of second language assessment: How language teachers understand assessment concepts. British Council.
Czura, A., & Dooly, M. (2021). Foreign language assessment in virtual exchange – The ASSESSnet project. Collated Papers for the ALTE 7th International Conference, 137–140.
Douglas College. (2020). Academic integrity policy. https://www.douglascollege.ca/sites/default/files/docs/finance-dates-and-deadlines/Academic%20Integrity%20Policy%20w%20Flowchart.pdf
Egan, A. (2018). Improving academic integrity through assessment design. Dublin City University, National Institute for Digital Learning (NIDL).
Elkhatat, A., Elsaid, K., & Almeer, S. (2021). Teaching tip: Cheating mitigation in online assessment. Chemical Engineering Education, 55(2), 103. doi:10.18260/2-1-370.660-125272
Ellis, C., van Haeringen, K., Harper, R., Bretag, T., Zucker, I., McBride, S., Rozenberg, P., Newton, P., & Saddiqui, S. (2020). Does authentic assessment assure academic integrity? Evidence from contract cheating data. Higher Education Research & Development, 39(3), 454–469. doi:10.1080/07294360.2019.1680956
Erguvan, I. D. (2021). The rise of contract cheating during the COVID-19 pandemic: A qualitative study through the eyes of academics in Kuwait. Language Testing in Asia, 11(1), 34. Advance online publication. doi:10.1186/s40468-021-00149-y
Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113–132. doi:10.1080/15434303.2011.642041
Gacs, A., Goertler, S., & Spasova, S. (2020). Planned online language education versus crisis‐prompted online language teaching: Lessons for the future. Foreign Language Annals, 53(2), 380–392. doi:10.1111/flan.12460
Gamage, K. A., Silva, E. K. D., & Gunawardhana, N. (2020). Online delivery and assessment during COVID-19: Safeguarding academic integrity. Education Sciences, 10(11), 301. doi:10.3390/educsci10110301
Garg, M., & Goel, A. (2022). A systematic literature review on online assessment security: Current challenges and integrity strategies. Computers & Security, 113, 1–13. doi:10.1016/j.cose.2021.102544
Giraldo, F. (2018). Language assessment literacy: Implications for language teachers. Profile: Issues in Teachers’ Professional Development, 20(1), 179–195. doi:10.15446/profile.v20n1.62089
Guangul, F. M., Suhail, A. H., Khalit, M. I., & Khidhir, B. A. (2020). Challenges of remote assessment in higher education in the context of COVID-19: A case study of Middle East College. Educational Assessment, Evaluation and Accountability, 32(4), 519–535. doi:10.1007/s11092-020-09340-w PMID:33101539
Hakim, B. (2020). Technology integrated online classrooms and the challenges faced by the EFL teachers in Saudi Arabia during the COVID-19 pandemic. International Journal of Applied Linguistics and English Literature, 9(5), 33–39. doi:10.7575/aiac.ijalel.v.9n.5p.33
Henari, T. F., & Ahmed, D. A. K. (2021). Evaluating educators and students’ perspectives on asynchronous and synchronous modes of e-learning in crisis education. Asian EFL Journal Research Articles, 28(2), 80–98.
Hidri, S. (2020). New challenges in language assessment. In S. Hidri (Ed.), Changing language assessment (1st ed., pp. 3–22). Palgrave Macmillan. doi:10.1007/978-3-030-42269-1_1
Holden, O. L., Norris, M. E., & Kuhlmeier, V. A. (2021). Academic integrity in online assessment: A research review. Frontiers in Education, 6, 1–13. doi:10.3389/feduc.2021.639814
Isbell, D. R., & Kremmel, B. (2020). Test review: Current options in at-home language proficiency tests for making high-stakes decisions. Language Testing, 37(4), 600–619. doi:10.1177/0265532220943483
Janke, S., Rudert, S. C., Petersen, N., Fritz, T. M., & Daumiller, M. (2021). Cheating in the wake of COVID-19: How dangerous is ad-hoc online testing for academic integrity? Computers and Education Open, 2, 1–9. doi:10.1016/j.caeo.2021.100055
Jordan, S. R. (2013). Conceptual clarification and the task of improving research on academic ethics. Journal of Academic Ethics, 11(3), 243–256. doi:10.1007/s10805-013-9190-y
Khairil, L. F., & Mokshein, S. E. (2018). 21st century assessment: Online assessment. International Journal of Academic Research in Business & Social Sciences, 8(1), 659–672. doi:10.6007/IJARBSS/v8-i1/3838
Koris, R., & Pal, A. (2021). Fostering learners’ involvement in the assessment process during the COVID-19 pandemic: Perspectives of university language and communication teachers across the globe. Journal of University Teaching & Learning Practice, 18(5), 11–20. doi:10.53761/1.18.5.11
Kostova, K. B. (2020). Outlines of English as a foreign language testing and assessment in higher education. Announcements of Union of Scientists – Sliven, 35(1), 90–106.
Mahfoodh, H. (2021). Reflections on online EFL assessment: Challenges and solutions. 2021 Sustainable Leadership and Academic Excellence International Conference (SLAE), 1–8. doi:10.1109/SLAE54202.2021.9788097
Meccawy, Z., Meccawy, M., & Alsobhi, A. (2021). Assessment in ‘survival mode’: Student and faculty perceptions of online assessment practices in HE during Covid-19 pandemic. International Journal for Educational Integrity, 17(1), 1–24. doi:10.1007/s40979-021-00083-9
Minocha, S. (2021). Designing assessment for academic integrity. Assessment Programme/Scholarship Steering Group Event, Assessment Hub: Supporting assessment practices around the OU. The Open University. http://oro.open.ac.uk/80306/1/Designing-Assessment-for-Academic-Integrity-ORO.pdf
Muhammad, A. A., & Ockey, G. J. (2021). Upholding language assessment quality during the COVID-19 pandemic: Some final thoughts and questions. Language Assessment Quarterly, 18(1), 51–55. doi:10.1080/15434303.2020.1867555
Nagi, K., & John, V. K. (2020). Plagiarism among Thai students: A study of attitudes and subjective norms. 2020 Sixth International Conference on e-Learning (econf), 45–50. doi:10.1109/econf51404.2020.9385427
Oldfield, A., Broadfoot, P., Sutherland, R., & Timmis, S. (2012). Assessment in a digital age: A research review. University of Bristol.
Olt, M. R. (2002). Ethics and distance education: Strategies for minimizing academic dishonesty in online assessment. Online Journal of Distance Learning Administration, 5(3), 1–7.
Park, C. (2003). In other (people’s) words: Plagiarism by university students—literature and lessons. Assessment & Evaluation in Higher Education, 28(5), 471–488. doi:10.1080/02602930301677
Paullet, K. (2020). Student and faculty perceptions of academic dishonesty in online classes. Issues in Information Systems, 21(3), 327–333. doi:10.48009/3_iis_2020_327-333
Picciano, A. G. (2017). Theories and frameworks for online education: Seeking an integrated model. Online Learning, 21(3), 166–190. doi:10.24059/olj.v21i3.1225
Polisca, E., Stollhans, S., Bardot, R., & Rollet, C. (2022). How Covid-19 has changed language assessments in higher education: A practitioners’ view. In C. Hampton & S. Salin (Eds.), Innovative language teaching and learning at university: Facilitating transition from and to higher education (pp. 81–91). Research-publishing.net. doi:10.14705/rpnet.2022.56.1375
Pu, S., & Xu, H. (2021). Examining changing assessment practices in online teaching: A multiple-case study of EFL school teachers in China. The Asia-Pacific Education Researcher, 30(6), 553–561. doi:10.1007/s40299-021-00605-6
Rahim, A. F. A. (2020). Guidelines for online assessment in emergency remote teaching during the COVID-19 pandemic. Education in Medicine Journal, 12(2), 59–68. doi:10.21315/eimj2020.12.2.6
Reedy, A., Pfitzner, D., Rook, L., & Ellis, L. (2021). Responding to the COVID-19 emergency: Student and academic staff perceptions of academic integrity in the transition to online exams at three Australian universities. International Journal for Educational Integrity, 17(1), 1–32. doi:10.1007/s40979-021-00075-9
Rofiah, N. L., & Waluyo, B. (2020). Using Socrative for vocabulary tests: Thai EFL learner acceptance and perceived risk of cheating. The Journal of Asia TEFL, 17(3), 966–982. doi:10.18823/asiatefl.2020.17.3.14.966
Rogier, D. (2014). Assessment literacy: Building a base for better teaching and learning. English Language Teaching Forum, 3, 2–13.
Sabrina, F., Azad, S., Sohail, S., & Thakur, S. (2022). Ensuring academic integrity in online assessments: A literature review and recommendations. International Journal of Information and Education Technology (IJIET), 12(1), 60–70. doi:10.18178/ijiet.2022.12.1.1587
Sharadgah, T. A., & Sa’di, R. A. (2020). Preparedness of institutions of higher education for assessment in virtual learning environments during the COVID-19 lockdown: Evidence of bona fide challenges and pragmatic solutions. Journal of Information Technology Education, 19, 755–774. doi:10.28945/4615
Shariffuddin, S. A., Ibrahim, I. S. A., Shaaidi, W. R. W., Syukor, F. D. M., & Hussain, J. (2022). Academic dishonesty in online assessment from tertiary students’ perspective. International Journal of Advanced Research in Education and Society, 4(2), 75–84. doi:10.55057/ijares.2022.4.2.8
Situmorang, K., Nugroho, D. Y., & Pramusita, S. M. (2020). English teachers’ preparedness in technology enhanced language learning during Covid-19 pandemic – Students’ voice. Jo-ELT (Journal of English Language Teaching), 7(2), 57–67. doi:10.33394/jo-elt.v7i2.2973
Surahman, E., & Wang, T. (2022). Academic dishonesty and trustworthy assessment in online learning: A systematic literature review. Journal of Computer Assisted Learning, 38(6), 1–19. doi:10.1111/jcal.12708
Tauginienė, L., Gaižauskaitė, I., Glendinning, I., Kravjar, J., Ojsteršek, M., Ribeiro, L., Odiņeca, T., Marino, F., Cosentino, M., Sivasubramaniam, S., & Foltýnek, T. (2018). Glossary for academic integrity. European Network for Academic Integrity. http://www.academicintegrity.eu/wp/wp-content/uploads/2018/10/Glossary_revised_final.pdf
Tauginienė, L., Ojsteršek, M., Foltýnek, T., Marino, F., Cosentino, M., Gaižauskaitė, I., Glendinning, I., Sivasubramaniam, S., Razi, S., Ribeiro, L., Odiņeca, T., & Trevisiol, O. (2019). General guidelines for academic integrity. ENAI Report 3A. https://www.academicintegrity.eu/wp/wp-content/uploads/2019/09/Guidelines_amended_version_1.1_09_2019.pdf
The International Center for Academic Integrity [ICAI]. (2021). The fundamental values of academic integrity (3rd ed.). www.academicintegrity.org/the-fundamental-values-of-academic-integrity
Tiong, L. C. O., & Lee, H. J. (2021). E-cheating prevention measures: Detection of cheating at online examinations using deep learning approach – A case study. Journal of Latex Class Files, 1–9. doi:10.48550/arXiv.2101.09841
Tsigaros, T., & Fesakis, G. (2021). E-assessment and academic integrity: A literature review. In A. Reis, J. Barroso, J. B. Lopes, T. Mikropoulos, & C.-W. Fan (Eds.), Technology and Innovation in Learning, Teaching and Education: Second International Conference, TECH-EDU 2020 Proceedings (pp. 313–319). Springer International Publishing.
Valizadeh, M. (2022). Cheating in online learning programs: Learners’ perceptions and solutions. Turkish Online Journal of Distance Education, 23(1), 195–209. doi:10.17718/tojde.1050394
van der Westhuizen, D. (2016). Guidelines for online assessment for educators. Commonwealth of Learning. doi:10.13140/RG.2.2.31196.39040
Weleschuk, A., Dyjur, P., & Kelly, P. (2019). Online assessment in higher education. Taylor Institute for Teaching and Learning Guide Series. Taylor Institute for Teaching and Learning at the University of Calgary. https://taylorinstitute.ucalgary.ca/resources/guides
Wright, C., Antonios, A., Palaktsoglou, M., & Tsianika, M. (2013). Planning for authentic language assessment in higher education synchronous online environments. Journal of Modern Greek Studies, 246–258.
Youmans, R. J. (2011). Does the adoption of plagiarism-detection software in higher education reduce plagiarism? Studies in Higher Education, 36(7), 749–761. doi:10.1080/03075079.2010.523457
Zhang, C., Yan, X., & Wang, J. (2021). EFL teachers’ online assessment practices during the COVID-19 pandemic: Changes and mediating factors. The Asia-Pacific Education Researcher, 30(6), 499–507. doi:10.1007/s40299-021-00589-3

ADDITIONAL READING

Bretag, T. (Ed.). (2016). Handbook of academic integrity. Springer Singapore. doi:10.1007/978-981-287-098-8
Eaton, S. E., & Christensen Hughes, J. (Eds.). (2022). Academic integrity in Canada: An enduring and essential challenge. Springer Nature. doi:10.1007/978-3-030-83255-1
Hamilton, M., & Richardson, J. (2007). An academic integrity approach to learning and assessment design. Journal of Learning Design, 2(1), 37–51. doi:10.5204/jld.v2i1.27
Rohmana, W. I. M., Kamal, S., Amani, N., & As-Samawi, T. A. (2022). Academic dishonesty in online English as a Foreign Language classroom. EnJourMe (English Journal of Merdeka): Culture, Language, and Teaching of English, 7(2), 230–240. doi:10.26905/enjourme.v7i2.8827

KEY TERMS AND DEFINITIONS

Academic Dishonesty: The violation or breach of academic integrity values and principles, and the demonstration of unreliable, disrespectful, unfair, and insincere behaviors in academic work, assessment, and procedures.
Academic Integrity: The commitment of teachers, learners, and all other stakeholders in education to the moral values of honesty, trust, respect, fairness, responsibility, and courage, reflected in their academic work, assessment, and procedures.
Emergency Remote Teaching (ERT): Online teaching, learning, and assessment temporarily conducted during the COVID-19 pandemic to meet the urgent need for education, implemented via the Internet using web-based technologies, digital platforms, tools, or applications.
Ethics: The system of established moral principles and rules that regulates assessment-related procedures and behaviors such as task design, item construction, administration, and evaluation.
Fairness: The principle of considering each learner’s needs and characteristics so as to treat learners in a suitable, objective, and equal way while assessing their skills and knowledge; achieving fairness requires reliable and valid assessment.
Online Language Assessment: The systematic collection and evaluation of data on language learners’ knowledge, abilities, and skills by integrating various Internet-connected digital tools or applications to administer different assessment tasks for different purposes, with the aim of improving teaching and learning.
Security: The protection, control, and maintenance of assessment data, tools, methods, administration, and results in order to prevent threats to reliability, validity, and other assessment principles and thereby preserve assessment privacy.


Chapter 16

A Bibliometric Analysis on “E-Assessment in Teaching English as a Foreign Language” Publications in Web of Science (WoS)

Devrim Höl
Pamukkale University, Turkey

Ezgi Akman
Pamukkale University, Turkey

ABSTRACT

This chapter examines international publications on e-assessment in second/foreign language teaching indexed in WoS (Web of Science), using the bibliometric method, one of the literature review tools. In particular, the most prolific countries, annual scientific production, the most globally cited documents, authors, institutions, keywords, and changing research trends were analyzed. A total of 3352 research documents published up to June 2022 were retrieved from the Web of Science (WoS) Core Collection database and included in the analysis. The data were analyzed with the open-source RStudio environment and “biblioshiny for bibliometrix,” a tool of the R bibliometrix package. Based on the analysis and discussion of these documents, this study reveals several important results that contribute to the understanding of research trends in e-assessment in second/foreign language teaching.

INTRODUCTION

In recent years, with the drastic changes in education during COVID-19, e-assessment, which was once regarded as machine-directed, far removed from humanistic elements, and riddled with contradictions, has come to be considered a practical and effective assessment tool by educators. In the past, it
was thought that machines, mostly computers used for assessment in foreign language teaching, were incompatible with the affective side of education, including emotions, feelings, and anxiety. Over the last two decades, however, machines, machine learning, artificial intelligence, and algorithms have become an indispensable part of modern life; we interact with them constantly, and they have found a place in virtually every field thanks to the latest, provocative advances in technology. As people came to rely more and more on machines once thought of as cold devices devoid of empathy, they began to notice the advantages these machines bring and to benefit from them. This was the case during the COVID-19 period: perhaps for the first time, with the advent of the outbreak, educators, teachers, and researchers set aside their dispute with machines and technology, as these were the only way to teach, reach, and assess learners during the pandemic.

This period coincided with growing researcher interest in second/foreign language assessment. Although hundreds of studies have been conducted on assessment, there is still no consensus on what constitutes "effective" assessment in EFL/ESL classrooms. In these studies, researchers have tried to describe and investigate "the process of collecting information about a student to aid in decision-making about the progress and language development of the student" (Cheng et al., 2004). In line with this work on EFL assessment, various factors in the language learning process that can be influenced by assessment practices, such as teachers' instruction type and students' learning experiences, have been investigated (Coombs et al., 2018). It is generally accepted that education is greatly influenced by assessment, but assessment practices are in turn shaped by a variety of factors that can occasionally conflict with one another (Ridgway et al., 2004). Likewise, teachers' approaches to assessment, classroom experiences, and pre-service education have been found to be very influential on assessment practices (Coombs et al., 2018). In addition to these factors, technology has played a pivotal role in education over the last two decades and has emerged as another such factor.

Assessment has significant importance in foreign language teaching owing to its strong influence on processes including teaching, learning, and decision-making (Coombs et al., 2018); in research studies and theoretical articles in both mainstream education and TESOL/TEFL literature, assessment is widely discussed because its major role in learning and teaching cannot be overlooked (Troudi et al., 2009). The continuously growing literature on language assessment has been helping language testers look critically at their practices and build a better understanding of language education and assessment; hence, the area of language assessment is constantly challenged in fundamental respects (McNamara, 2001). While the role of useful and effective assessment in the EFL context is highly crucial, technological tools that enable "e-assessment" have come to the fore. Recent developments in information and communication technologies (ICT) have made the adoption of technology in educational settings more widespread, not only in the learning process but also in assessment. The use of ICT in assessment is often referred to as "e-assessment".
According to the Joint Information Systems Committee (JISC) (2007), e-assessment can be defined as "the end-to-end electronic assessment processes where ICT is used for the presentation of assessment activity, and the recording of responses." These processes involve learners, teachers, institutions, and the general public as well. Crisp (2011) defines e-assessment as "the process of constructing, delivering, storing, or reporting student assessment tasks, responses, grades, or feedback with the use of digital technologies" (p. 5). Definitions of e-assessment place a strong emphasis on the value of technology in the implementation of assessment. Thus, it can be easily understood that information and communication technologies (ICTs) are essential for e-assessment.


As technology is often associated with efficiency, language testers might consider technology in language assessment in terms of how effectively it facilitates the testing process. Today, e-assessment has become even more popular owing to the need to change traditional educational procedures and to constant improvements in technology. It is predicted that e-assessment will be used more often in educational systems in the near future, and the tasks used in e-assessment may be quite different from those used in on-paper assessment and early computerized assessment (Boyle & Hutchison, 2009). This indicates that studies in the field of e-assessment have not only developed greatly to date but are also likely to expand significantly over time. As a field within a discipline expands, the quality of the academic publications in that field also determines its place and importance in the scientific world. Certain characteristics of these publications can define the field, which makes it important to look into the issues that may influence its future. Traditional review methods are not sufficient for revealing academic growth within a field; newer approaches, such as the bibliometric method, are better suited to the task. In light of these issues, the current paper investigates the research articles about e-assessment published in the Web of Science (WoS) using the bibliometric analysis method.
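To make the workflow concrete, the following is a minimal sketch, in R, of how a WoS export can be loaded and summarized with the bibliometrix package that underlies the biblioshiny interface used in this chapter; the file name and the choice of k are illustrative assumptions, not the authors' actual settings.

# Minimal bibliometrix workflow, assuming a plain-text WoS export saved
# locally as "wos_export.txt" (file name and k value are illustrative).
library(bibliometrix)

M <- convert2df(file = "wos_export.txt", dbsource = "wos", format = "plaintext")
results <- biblioAnalysis(M, sep = ";")   # core descriptive indicators
summary(results, k = 10)                  # top authors, countries, sources, keywords
plot(results, k = 10)                     # annual production and citation plots
# biblioshiny()                           # opens the interactive web interface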

BACKGROUND

Definition of Assessment in Foreign/Second Language

Assessment is seen as central to the practice of education. For students, good performance gives access to further education and employment opportunities; for teachers and schools, it provides evidence of success as individuals and organizations (Ridgway et al., 2004). Assessment is indeed a critical element of teaching and learning and is crucial to students' academic growth (White, 2009). It is a term that has been defined in many different ways, and its definitions tend to change over time with new developments and shifting views in education. In early studies, the term "assessment" was largely used to refer to procedures for determining whether a series of educational activities had been effective after they were completed (William, 2011). Views of the term expanded greatly over time with developments in education.

Notably in language learning, assessment takes many different forms and shapes. The term "assessment" is used both as a general term for all forms of testing and assessment and as a way to distinguish "alternative assessment" from "testing." For some researchers in the field of applied linguistics, the term "testing" refers to the development and administration of formally structured examinations such as the Test of English as a Foreign Language (TOEFL), whereas "assessment" refers to less formal methods. According to Valette (1994), "tests" refer to standardized exams while "assessments" refer to classroom evaluations of student learning. It is interesting to note that some testers have replaced the word "test" with "assessment" in contexts where they previously would have used "test" (e.g., Kunnan, 1998). There appears to have been a shift in the attitudes of a great number of language testers, such that they may have begun, sometimes unconsciously, to conceive of testing only in reference to standardized, large-scale tests; consequently, they chose "assessment" as the more inclusive and appropriate term. Stating that assessment is a sometimes misunderstood term in education, Brown (2004) claims that assessment "is an ongoing process that encompasses a much wider domain," as teachers assess students' performances subconsciously. He further suggests that "a good teacher never ceases to assess students, whether those assessments are incidental or intended" (p. 4).

Green (2014) provides a rather simple definition of language assessment, stating that it "involves obtaining evidence to inform inferences about a person's language-related knowledge, skills or abilities" (p. 5). He further explains that in this definition, the "evidence" comes from language use tasks and the "inferences" are interpretations of the meaning of the performance based on our beliefs about the nature of language and its role in life. According to Taras (2005), the term assessment is used "to refer to judgments of students' work, and 'evaluation' to refer to judgments regarding courses or course delivery, or the process of making of such judgments." Furthermore, she adds that assessment processes are the mechanics or procedures necessary to arrive at a judgment, which gives teachers helpful feedback and helps in their analysis of the skills and knowledge of their students. This definition gives strong hints of "formative assessment," as it emphasizes the feedback teachers receive to help them analyze classroom instruction and students' skills. McMillan (2001) described the purpose of formative assessment as "to monitor and improve instruction and students' learning." Formative assessment usually occurs during instruction so that teachers can determine whether the instruction is successful and adjust their teaching when necessary (Taylor & Bobbit-Nolen, 2005). Some, but not all, formative assessment can be classified as assessment for learning, which aims to provide students with feedback to assess the quality of their learning and enhance their learning behaviors (Frey & Schmitt, 2007).

Definition of Testing in Foreign/Second Language

While assessment has been defined as a procedure, a process of making judgments, testing can be simply defined as the product used to measure a level of knowledge, a set of skills, or behavior. Testing is one of the most widely used assessment tools in education (Adom et al., 2020). In an early attempt to define the term "test," Carroll (1968) made the following statement: "a psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual" (p. 46). While scholars gave broad definitions covering many disciplines, language testing remained within applied linguistics without a distinct place to grow and evolve. Later, Bachman (1991) stated that developments in the 1980s led to the emergence of language testing as a distinct field within applied linguistics. The first explicitly defined model of language ability for language testing was the skills-and-elements approach articulated by Lado (1961) and Carroll (1968). Johnson and Johnson (2001) stated that foreign/second language testing "involves many technologies and developments which are different from language teaching, and yet it interacts closely with most aspects of language teaching" (p. 187). Bachman (1990) draws attention to the importance of tests in language learning by claiming that they can provide a greater concentration on the particular language skills of interest (p. 21).

Historical Background of E-Assessment

There is no doubt that assessment is fundamental to the teaching process. High-stakes assessments signal educational goals, identify what is important to know, and influence instructional practice. It is crucial to design evaluation methods that represent our fundamental academic goals and reward learners for acquiring traits and skills that will be a long-term advantage to both them and the community. There is substantial evidence from studies suggesting that well-designed assessment systems enhance students' achievement. By contrast, the United States offers striking examples of systems in which narrowly focused, high-stakes evaluation creates misleading gains in student performance; this "friendly fire" results in wasted effort at best and, at worst, harm to learners, instructors, and communities.

In the 1920s, Sidney L. Pressey created various machines that could mechanically evaluate a student by presenting him with a sequence of questions keyed to multiple-choice answers. This marked the beginning of the employment of technology in the assessment process (Skinner, 1961). According to the Tomlinson Report (2004), there were significant problems with the contemporary academic requirements for students aged 14 to 19 in England: curricula were overloaded, too few learners achieved well academically, the dropout rate was unacceptably high, and the most able learners were not stretched by their studies. Young people were not being taught the competencies, knowledge, and personal qualities they would require in the future. The report therefore proposed a significantly different approach to qualifications, one that could only be implemented with extensive use of electronic assessment.

The concept of electronic or digital assessment, also known as e-assessment or machine scoring, has a relatively short history dating back to the 1950s. Early forms of e-assessment were limited to multiple-choice tests and simple computer-based activities (Wainer, 2000). In the 1960s and 1970s, advancements in technology allowed for the development of more sophisticated e-assessment tools, such as computer-adaptive testing (Wainer et al., 2002) and automated scoring of writing samples (Attali, 2007). The use of e-assessment began to gain widespread acceptance in the 1990s and 2000s, as internet usage and computer literacy increased (Baker & Brown, 2011). This led to the development of more advanced e-assessment tools, such as online proctoring and remote invigilation (Kirschner, Van Merriënboer, & Kester, 2014). Nowadays, with advances in artificial intelligence and machine learning, e-assessment is becoming more accurate, efficient, and cost-effective (Rudner, 2019).

Evaluating students using information and communications technology (ICT) is now more commonly known as "e-assessment." In e-assessment, ICT is utilized throughout the entire assessment process, from the development of assignments to the storing of results (JISC, 2007). Bennett (1998) wrote extensively about the future of assessment, including his views on computer-based assessment. He refers to the early innovations of electronic assessment as the first generation, in which electronic tests are substantively the same as those administered on paper (Bennett, 1998, p. 3). In the next generation of e-assessment, he states, the change comes from the nature of test questions and formats (Bennett, 1998, p. 5). He gives the assessment of listening skills in a foreign language as an example: with the arrival of permanent computer-based exam centers equipped to deliver high-quality multimedia, tests with sound and video could be conducted more efficiently.
He also states that item-generating tools may eventually be used to determine test designs and, at some point, may become advanced enough to generate items automatically (Bennett, 1998, p. 8). In what he calls the "third generation" of electronic assessment, Bennett (1998, p. 11) predicted that distance learning would become more widespread and large-scale electronic assessment would become dominant, breaking with traditional ways of assessing learners. After he wrote about the future of electronic assessment, the area grew progressively. More recently, JISC created publications intended as a guide to e-learning and e-assessment procedures in the 21st century (JISC, 2007). It
would not be wrong to state that e-assessment will undoubtedly develop into a significant and prevalent element of educational systems. Over time, the use of ICT for assessment purposes has evolved greatly with new advances in technology. The efficiency of using technology was acknowledged early, since a variety of technologies, including recording equipment, statistical programs, databases, and language-recognition-capable applications, can be utilized in the testing process (Burstein et al., 1996). Recent applications of computer technology in language assessment include test development, storing, scoring, and processing of test data, as well as carrying out sophisticated statistical analyses (Farhady, 2005). Even though using computer technologies is considered a more developed way of testing students in many cases, it is not free of problems. Because test takers have varying levels of experience with computers, employing computer technology for assessment purposes might introduce construct-irrelevant variance, one of the problems associated with this type of assessment (Kirsch, Jamieson, Taylor, & Eignor, 1998). It can be claimed that there was, and still is, a division among scholars' views of using computer technologies in language assessment: some argue that computers serve as an alternative testing medium and a replacement for paper-and-pencil exams, while others question the stability of the construct being measured by computers (Farhady, 2005).

There is an ongoing need, as in many academic fields, to conduct research to gain a better understanding and to determine the gaps in the field that need to be filled. E-assessment is a relatively new area of investigation, given the ongoing developments in ICT and newly developing approaches within the field of assessment. One of the earliest publications on the use of technology in assessment was concerned with item response theory, which offers a method for obtaining accurate statistical data on test items (Hambleton et al., 1991). In line with this, a number of studies on the use of technology for assessment investigated computerized adaptive testing (CAT) (Wainer, 2000), a form of test that mimics what a good examiner would do and adapts to the ability of the candidate. The utilization of these methods, their underlying assumptions, and the development of the first computer-adaptive tests were the primary concerns of language testers at the beginning of the 1980s (Chapelle & Voss, 2017). Chapelle and Voss (2017) noted that improvements in computer recognition of examinees' responses were at first inaccessible to assessment practice and mainly stayed in the realm of research laboratories, which means that the use of technology for assessment purposes had a slow start. Later, studies on adaptive testing examined the effects of various adaptivity schemes on learners' test performance (Vispoel et al., 2000) and strategies for grouping items in a way that preserves their context, permitting multiple items to be selected together because they are associated with a single reading or listening passage (Eckes, 2014). In addition, a number of earlier studies examined the consequences brought about by the computerization of conventional examinations (Raikes & Harding, 2003), as well as the automated grading of students' written work (Dikli, 2006).
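To give a concrete sense of what "adapting to the ability of the candidate" involves, the following is a deliberately simplified, illustrative R sketch of a one-parameter (Rasch) adaptive loop; the item bank, the simulated examinee, and the crude step-size update are assumptions made for demonstration only, since operational CATs use maximum-likelihood or Bayesian ability estimation together with content balancing and exposure control.

# Illustrative sketch of Rasch-based adaptive item selection (all values assumed).
rasch_p <- function(theta, b) 1 / (1 + exp(-(theta - b)))  # P(correct | ability, difficulty)

set.seed(1)
bank <- data.frame(id = 1:20, b = seq(-2, 2, length.out = 20))  # hypothetical item bank
theta_true <- 0.8      # simulated examinee ability
theta_hat  <- 0        # provisional estimate; start at the bank's center
administered <- integer(0)

for (step in 1:8) {
  available <- setdiff(bank$id, administered)
  item <- available[which.min(abs(bank$b[available] - theta_hat))]  # best-matched item
  administered <- c(administered, item)
  correct <- runif(1) < rasch_p(theta_true, bank$b[item])           # simulated response
  theta_hat <- theta_hat + ifelse(correct, 1, -1) * 0.6 / step      # crude shrinking update
}
theta_hat  # drifts toward theta_true as responses accumulate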
In their review of computer-assisted assessment (CAA), Conole and Warburton (2005) stated that CAA has the potential to reduce the workload associated with assessment and to offer innovative and powerful modes of assessment. One line of research on technology in language assessment explores the use of natural language processing technologies to evaluate the spoken and written language of learners (Chapelle & Voss, 2017). Chapelle and Voss (2017) state that recent research on natural language processing for language assessment has developed technology that can also assess learners' constructed linguistic responses.


Purposes of e-Assessment and Types of e-Assessment Tools

E-assessment, or the use of technology in the assessment of foreign language proficiency, has become an essential tool in language education in recent years. The integration of technology in language assessment allows for more efficient, accurate, and reliable assessment results, and can provide valuable data and insights for teachers and institutions to improve their language teaching and learning programs.

The main purposes of e-assessment in foreign language teaching and learning are to measure and evaluate language proficiency, to provide formative and summative feedback to students, and to track student progress over time. E-assessment tools can be used to assess a wide range of language skills, including listening, speaking, reading, and writing, as well as a variety of language competencies, such as grammar, vocabulary, and pronunciation.

Several types of e-assessment tools are available, including computer-based testing (CBT), computer-assisted language learning (CALL), and online language proficiency tests. CBT uses a computer or other digital device to administer tests and quizzes. CALL uses technology to provide interactive and personalized language instruction. Online language proficiency tests use the internet to deliver standardized tests of language proficiency, such as the TOEFL or IELTS.

E-assessment tools can also cover a variety of item types, including multiple-choice questions, fill-in-the-blank questions, short answer questions, and essay questions. Additionally, e-assessment tools can include audio and video recording capabilities, which allow students to record their speaking and listening performance, and speech recognition software can be used to evaluate students' pronunciation, grammar, and fluency.

One of the main advantages of e-assessment is that it allows for a more efficient and accurate assessment of language proficiency. For example, CBT and CALL can provide immediate feedback to students on their language performance, allowing them to quickly identify and address areas of weakness. Furthermore, e-assessment tools such as automated scoring and speech recognition software can provide more objective and reliable assessment results compared to traditional paper-based methods.

In addition, e-assessment can provide valuable data and insights for teachers and institutions to improve their language teaching and learning programs. For example, data on student performance can be used to identify areas of weakness and to tailor instruction to meet the specific needs of individual students. E-assessment can also be used to track student progress over time, providing teachers with a clear understanding of how well students are progressing; teachers can use this data to modify the curriculum, adjust teaching methods, and provide targeted support for students who need it.

Another advantage of e-assessment is that it can be administered remotely, which proved particularly useful during the COVID-19 pandemic. This feature can also help institutions reach a larger pool of geographically dispersed students, and it can serve students who are unable to attend traditional in-person assessments for various reasons.
However, it is important to note that while e-assessment can be beneficial, it is not without its challenges. A study by Panadero and Jonsson (2017) found that e-assessment can create an additional workload for teachers, particularly if they are not fully familiar with the technology or lack technical support. Furthermore, e-assessment may not be accessible to all students, particularly those with limited access to technology. Therefore, it is important for educators to carefully consider the benefits and limitations of e-assessment.


Trends and Innovations in e-Assessment

In recent years, several trends and innovations in e-assessment have had a significant impact on language education. One of the most notable is the increasing use of artificial intelligence (AI) and machine learning (ML) in e-assessment. AI-based tools such as automated scoring and speech recognition software can provide more objective and reliable assessment results compared to traditional methods, which can lead to more accurate evaluations of language proficiency.

Another trend is the use of virtual reality (VR) and augmented reality (AR) technology. These technologies can be used to create immersive and interactive language-learning experiences, which can help to improve students' engagement and motivation. For example, VR can simulate real-life situations such as a conversation in a restaurant or a job interview, letting students practice their language skills in a realistic context.

Gamification is also becoming increasingly popular in e-assessment. Gamification is the use of game elements, such as points, badges, and leaderboards, to make learning more engaging and motivating. This approach can be used to create interactive and fun language-learning activities that improve student engagement and motivation.

Adaptive assessment is another trend in e-assessment. This approach is based on the idea that students should be assessed at their level of proficiency and that assessment should be tailored to the individual needs of each student. Adaptive assessment can be used to provide personalized feedback to students and to identify areas of weakness that need to be addressed, allowing teachers to create customized curricula and provide targeted support.

A further trend is the use of mobile devices, such as smartphones and tablets, to deliver assessments. This approach allows students to take assessments at their convenience and can serve students who are unable to attend traditional in-person assessments for various reasons. Mobile e-assessment can also provide opportunities for authentic assessment by allowing students to demonstrate their language proficiency in real-life situations.

In conclusion, e-assessment is a rapidly evolving field, and many trends and innovations are shaping the future of language education. The use of AI and ML, VR and AR, gamification, adaptive assessment, and mobile devices are just a few examples of how technology is being used to improve the accuracy and efficiency of language assessment and to make language learning more engaging and motivating for students.

Digital Tools in Language Testing

Since the introduction of technology as a tool that improves both teaching and learning, there have been many changes and developments in the educational sector. The incorporation of technology has affected assessment as well, because assessment is an essential and continual component of that process. Teachers are under considerable pressure to assess their students' progress in foreign/second language classrooms, and they face various issues when it comes to assessment, one of which is deciding which method to use. Computer technologies have been employed in foreign/second language testing since the early 1980s (Brown, 1997). Computer-based language testing offers many advantages, including dichotomously scored items that provide immediate feedback, integration of media, and the tracing of a test taker's moves (Roever, 2001). The fact that results were provided immediately after the completion of the test was one of the most appealing aspects of computer-based tests (CBTs), a quality that is particularly useful for educational purposes (Bennett, 1999).

On the other hand, computer-based language testing is not without its drawbacks, including the high costs of establishing new testing centers and the risk of sudden and unexplained computer malfunctions (Roever, 2001). For instance, one concern mentioned by Fulcher (2000, p. 96) was that "the use of multimedia in a listening test may cause a modification in the fundamental characteristics of the content being assessed." He went on to clarify that "it is possible that the process of processing listening texts is changed by video information in a way that we do not yet fully understand."

In recent times, the internet, with all its benefits and drawbacks, has acquired massive influence over our lives. It provides various possibilities for language testing, as it hosts an enormous range of digital tools for almost any task imaginable. Digital assessment tools enable students to answer questions instantly through computers and smartphones, allowing teachers to provide instant feedback for evaluation (Yılmaz, 2017). They also help to create a more interactive and engaging environment for students in the classroom (Howell et al., 2017). Issues that may arise from web-based language testing include cheating, item confidentiality, self-scoring tests having a script containing all answers, data storage security, server failure, and browser incompatibility (Roever, 2001). Furthermore, Roever (2001) discussed the validity issues that come with web-based language testing, caused by computer familiarity, typing speed, delivery failures, loading time, and timers.
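The "dichotomously scored items that provide immediate feedback" mentioned above can be pictured with a few lines of R; the answer key and student responses below are invented solely for illustration, not drawn from any cited test.

# Minimal sketch of dichotomous scoring with instant feedback (key and
# responses are invented examples).
key       <- c(q1 = "b", q2 = "d", q3 = "a")
responses <- c(q1 = "b", q2 = "c", q3 = "a")

score_items <- function(responses, key) {
  correct  <- responses == key[names(responses)]
  feedback <- ifelse(correct, "correct",
                     paste0("incorrect; keyed answer: ", key[names(responses)]))
  list(score = sum(correct), out_of = length(key), feedback = feedback)
}
score_items(responses, key)  # returns 2 out of 3, with per-item feedback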

Web 2.0 Tools

The increased availability of tools and services that are accessed instantly through a web browser, as opposed to being stored on the user's desktop, is one of the most significant innovations of Web 2.0 (Godwin-Jones, 2008). O'Reilly (2007) described Web 2.0 as "a set of principles and practices that tie together a veritable solar system of sites that demonstrate some or all of those principles, at a varying distance from that core" (pp. 18-19). In Web 2.0, every bit of information on the internet is linked to the rest through hyperlinks, a feature that allows students to discover new knowledge as they continue to explore (Solomon & Schrum, 2007). A wide selection of assessment tools is now available on the internet for teachers to use and adapt both inside and outside the classroom. The findings of a study conducted by Gray et al. (2012) indicated that, despite a few obstacles, academics regarded assessment with Web 2.0 tools as essential and beneficial.

Digital E-Assessment Tools in EFL Classrooms

The term "formative assessment" has been defined by Colby-Kelly and Turner (2008) as "the process of seeking and interpreting evidence for making substantively grounded decisions or judgments about the product of a learning task to decide where the learners are in their learning, where they need to go, and how best to get there" (p. 11). According to Tsulaia and Adamia (2020), formative assessment can be a helpful tool for monitoring learning progress, considering students' needs, and identifying problems in order to plan future steps. Because current technological advancements make it feasible to combine instruction and assessment directly, formative assessment may be conducted not only through conventional methods but also by incorporating technology (Walter et al., 2010). Digital formative assessment (DFA) is the outcome of the last two decades of research into formative assessment and computer-assisted assessment (McLaughlin & Yan, 2017). Digital formative assessment tools are thought to address the problems that teachers face when employing formative assessment methods, including time constraints and crowded classrooms (Tsulaia & Adamia, 2020; Hatziapostolou & Paraskakis, 2010). These tools also offer tangible benefits, such as increased motivation, more positive attitudes, and higher levels of engagement (Bhagat & Spector, 2017). The following sections describe some examples of digital tools for formative assessment purposes.

Kahoot!

Kahoot! is a free online learning platform that allows teachers to create and share interactive quizzes, surveys, discussion topics, and jumble games with an unlimited number of participants in the classroom (Atilano, 2017). It includes design elements that promote and encourage learning, such as points, leaderboards, and nicknames. Teachers can easily create quizzes, discussions, and surveys on the platform, and quizzes can include multiple-choice questions, pictures, and videos. Kahoot! generates an atmosphere that is both enjoyable and competitive, which encourages students to learn (Dellos, 2015). It has been used in various educational settings, including foreign/second language learning (Iaremenko, 2017; Zarzycka-Piskorz, 2016; Dellos, 2015), and it is a useful tool for formative assessment purposes in the classroom (Barnes, 2017).

Quizlet

Quizlet is a no-cost online educational platform that provides students with a wealth of resources for studying any topic via flashcards and games. Nakata (2011) defines Quizlet as a flashcard tool that gives students the ability to study vocabulary by matching and associating words. Students can make their own digital flashcards with Quizlet, which also provides a variety of learning modes for practice and study. Research has shown that it helps learners increase their receptive vocabulary knowledge (Milliner, 2013). Quizlet also enables teachers to assess their students' comprehension of key academic vocabulary while simultaneously allowing students to develop their collaboration and communication skills. In addition, the live version of Quizlet gives teachers the option of using a preset deck of vocabulary terms or creating their own (with a minimum of 12 terms) and then providing a link to their students.

Wordwall

Wordwall is an edutainment website with a variety of vocabulary-building games, such as information matching, picture matching, quizzes, and puzzles (Çil, 2021). It includes a variety of mini-games that can be used to teach and assess students in the foreign/second language classroom. According to Hasram et al. (2021), Wordwall can be a useful tool for enhancing students' experiences and maintaining their attention. Wordwall also allows teachers to express their creativity by creating their own materials. In addition, teachers can enter the topic they would like to teach or assess in the classroom and receive a selection of ready-made, customizable activities.

Plickers

Plickers is another web-based tool that can be used for formative assessment purposes. It requires access to the Plickers website (www.plickers.com), the Plickers application, and a smartphone. The website provides unique Plickers cards that can be printed out and distributed to the students. Each Plickers card carries a QR-style design with a unique card number in the corners, and each side of the design is designated a choice: A, B, C, or D. Questions may consist of text, images, or both types of media. Plickers is a suitable digital tool for formative assessment in the classroom, as the teacher can check the percentage of correct responses across the class and the performance of individual students on a question using a digital device. Furthermore, it is an application that can be used by people of any age because of its straightforward nature, quick response, and intuitive user interface (Jinu & Shamna Beegum, 2019).

Productivity on e-Assessment in EFL Classrooms

The need for education, particularly among adults who want to improve their skills, is constantly increasing, and the growing demand for higher education prompts institutions, instructors, and students to turn to the many advantages of distance education (Bennett, 2002). As e-learning becomes more widespread, e-assessment also plays an important role in this context. Only recently did e-assessment shift toward common practice not only in higher education but also in middle and secondary school education. This shift was mainly driven by the recent Covid-19 pandemic, as face-to-face education was unavailable for some time. This event pushed e-learning and e-assessment practices to evolve further and made it necessary for scholars, teachers, and students to become conscious of these practices.

This chapter mainly aims to examine international publications on e-assessment in second/foreign language teaching using bibliometric analysis, a recent and practical way of gathering data that enables researchers to analyze and compile publications throughout the world using various parameters, and thereby to contribute important findings to the literature and to researchers in the second language teaching field. This study aims to identify the major publications, prominent authors, and main themes in the field of e-assessment. The research questions of this study are as follows:

1. What is the productivity status of international publications in the field of e-assessment in EFL classrooms?
a. What are the most prolific countries on e-assessment in EFL classrooms?
b. Who are the most globally cited researchers in the field of e-assessment in EFL classrooms?
c. What are the most prolific journals on e-assessment in EFL classrooms?
d. Who are the most prolific and influential researchers in the field of e-assessment in EFL classrooms?
e. What are the most prolific institutions of published articles on e-assessment?
2. What are the research trends in international publications in the field of e-assessment?
a. What are the most frequent keywords preferred by researchers in e-assessment in EFL classrooms research articles?
b. What are the trends and changes in key themes in e-assessment in EFL classrooms research?

METHODOLOGY

The use of traditional methods of scientific investigation has become increasingly difficult as a direct result of the growing productivity of academics. Bibliometric analysis, made possible by the ongoing development of new technologies, gives researchers access to a variety of opportunities for analyzing large amounts of data. In this study, the researchers used bibliometric analysis to investigate the research articles on e-assessment in terms of major publications, prominent authors, and main themes.

Bibliometric Analysis

Bibliometric methods have long been used to provide quantitative analyses of written publications. Bibliometrics is closely related to the broader term "informetrics" (Egghe & Rousseau, 1990; Wolfram, 2003) and the narrower term "scientometrics" (Bar-Ilan, 2010). Originally, it consisted primarily of bibliographic overviews of scientific production or selections of highly cited publications. These overviews are further subdivided into lists of author productions, national or subject bibliographies, and other more specific information on studies indexed in databases such as WoS and Scopus. The data can be interpreted in various ways using different variables.

Bibliometric Database

To investigate the bibliometric characteristics of e-assessment research articles, the researchers first searched the Web of Science (WoS) website for keywords relevant to the study. The search was conducted on June 16, 2022 at Pamukkale University, Turkey, using the keywords "e-assessment in a second language," "online assessment in foreign language teaching," and "computer adaptive testing in foreign language teaching." Publications with these keywords in their titles, abstracts, or keywords were returned in the search results. The search criteria included publications indexed in the Social Sciences Citation Index (SSCI), Emerging Sources Citation Index (ESCI), Science Citation Index Expanded (SCI-EXPANDED), and Arts & Humanities Citation Index (A&HCI). A further criterion was the research area on the WoS website: the results were narrowed to the relevant research areas, namely "Education & Educational Research," "Education, Scientific Disciplines," "Linguistics," and "Language & Linguistics." Lastly, only research articles were included in this study, excluding other types of published papers. At the end of the search process, a data set covering a total of 3352 documents was obtained from the database. Data were downloaded from WoS in plain text format with complete records; this format includes the document title, year, source, volume, issue, pages, number of citations, and references. As there were more than 500 data records, they were exported from the WoS website as seven separate sets and then recombined and imported into Biblioshiny via R Studio for the analysis. The main information about the data set of 3352 documents is given in Table 1.
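The recombination step can be scripted. The sketch below, in Python, shows one way to merge several WoS plain-text exports into a single file before import; it assumes the standard WoS plain-text layout (FN/VR header lines, records terminated by ER, file terminated by EF), and the file names are hypothetical.

```python
from pathlib import Path

def merge_wos_exports(parts, merged="wos_merged.txt"):
    """Concatenate WoS plain-text exports, keeping one FN/VR header
    and one trailing EF terminator for the merged file."""
    header, records = None, []
    for part in parts:
        lines = Path(part).read_text(encoding="utf-8-sig").splitlines()
        if header is None:
            header = [ln for ln in lines if ln.startswith(("FN ", "VR "))]
        # Drop each file's own header and EF terminator; keep the records.
        records += [ln for ln in lines if not ln.startswith(("FN ", "VR ")) and ln != "EF"]
    Path(merged).write_text("\n".join(header + records + ["EF", ""]), encoding="utf-8")

# Hypothetical file names for the seven exported sets.
merge_wos_exports([f"savedrecs_{i}.txt" for i in range(1, 8)])
```

In practice the bibliometrix/Biblioshiny tooling used in this study handles the import of such files directly; the sketch only illustrates what "recombining" the seven sets amounts to.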


Table 1. Main information about the data set

| Category | Description | Results |
|---|---|---|
| Data distribution | Timespan | 1991:2022 |
| Data distribution | Sources (journals, books, etc.) | 657 |
| Data distribution | Documents | 3352 |
| Data distribution | References | 116028 |
| Document contents | Keyword Plus (ID) | 2330 |
| Document contents | Author's Keywords (DE) | 7476 |
| Authors | Authors | 5996 |
| Authors | Authors of single-authored docs | 1253 |
| Authors collaboration | Single-authored docs | 1402 |
| Authors collaboration | Multiple-authored docs | 1950 |
| Authors collaboration | Co-authors per doc | 2.08 |
| Authors collaboration | International co-authorships % | 13.72 |
| Document types | Article | 2966 |
| Document types | Article; book chapter | 205 |
| Document types | Article; early access | 151 |
| Document types | Article; proceedings paper | 30 |

Table 1 presents the general information on the 3352 research articles obtained from the WoS database. The timespan covers the period from 1991 to mid-2022. These studies were published in 657 sources, with an average of 10.85 citations per document, and involve a total of 5996 authors. Of the 3352 articles, 1402 were written by a single author and 1950 by multiple authors, so single- and multiple-authored documents sum to the corpus total (1402 + 1950 = 3352).

FINDINGS

This section presents the findings for the research questions posed and analyzed through the bibliometric analysis. The aim of the study is to investigate the current productivity level of research publications on e-assessment in education, and to highlight the distribution of e-assessment keywords in research trends and how those trends have evolved. The results are presented in order from broader to narrower in terms of the keywords searched in the bibliometric analysis.

Productivity Status of the Research Articles

Most Relevant Countries

One of the primary goals of this research was to determine which countries worldwide have contributed the most to this field of study. To identify the leading countries in academic publications on e-assessment, the top ten most relevant countries are listed in Table 2.


Table 2. The top ten most relevant countries publishing e-assessment research articles

| Countries | Articles | Percentage (of 2286) | Frequency |
|---|---|---|---|
| USA | 986 | 42.344 | 0.289 |
| China | 271 | 11.854 | 0.081 |
| United Kingdom | 212 | 9.273 | 0.063 |
| Australia | 193 | 8.440 | 0.058 |
| Canada | 177 | 7.742 | 0.053 |
| Spain | 110 | 4.811 | 0.033 |
| Turkey | 109 | 4.768 | 0.033 |
| South Africa | 93 | 4.068 | 0.028 |
| Iran | 78 | 3.412 | 0.023 |
| Japan | 75 | 3.280 | 0.022 |

The data analysis revealed that the United States of America was the leading country, with a total of 986 articles. China came second with 271 articles, followed by the United Kingdom (212), Australia (193), Canada (177), Spain (110), Turkey (109), South Africa (93), Iran (78), and Japan (75).

Most Globally Cited Authors

The number of citations received by each study on e-assessment in EFL classrooms was another domain examined in the current study. Citation counts were taken into consideration as an additional factor in this investigation because they are an important and effective indicator of a publication's impact. Table 3 lists the most cited publications on e-assessment.

Table 3. The top ten authors frequently cited in e-assessment research articles

| Authors | Year | DOI | Total Citations | Total Citations per Year | Normalized Total Citations |
|---|---|---|---|---|---|
| Crookes G. | 1991 | 10.1111/j.1467-1770.1991.tb00690.x | 396 | 12.38 | 2.59 |
| Noels K.A. | 2000 | 10.1111/0023-8333.00111 | 374 | 16.26 | 6.82 |
| Thornton P. | 2005 | 10.1111/j.1365-2729.2005.00129.x | 319 | 17.72 | 10.66 |
| Hornberger N.H. | 2012 | 10.1080/13670050.2012.658016 | 274 | 24.91 | 16.82 |
| Rolstad K. | 2005 | 10.1177/0895904805278067 | 262 | 14.56 | 8.76 |
| Schneider W. | 2008 | 10.1111/j.1751-228X.2008.00041.x | 203 | 13.53 | 10.19 |
| Phipps S. | 2009 | 10.1016/j.system.2009.03.002 | 200 | 14.29 | 10.48 |
| Tikly L. | 2011 | 10.1016/j.ijedudev.2010.06.001 | 193 | 16.08 | 11.71 |
| Dewaele J.M. | 2008 | 10.1111/j.1467-9922.2008.00482.x | 178 | 11.87 | 8.94 |
| Johnson K.E. | 1994 | 10.1016/0742-051X(94)90024-8 | 175 | 6.03 | 7.04 |


The data were analyzed to identify the most globally cited authors in the field of e-assessment. Among the top ten most frequently cited articles (Table 3), those of Crookes G. (396 citations), Noels K. A. (374), and Thornton P. (319) are the three most cited e-assessment research articles. It can be concluded that publication year is an important factor in citation counts: the most cited articles are those published a decade or more ago.
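The per-year column in Table 3 is consistent with a simple definition: total citations divided by the number of years since publication, counted inclusively up to the 2022 search year. A quick check in Python, assuming that definition:

```python
def citations_per_year(total_citations, pub_year, census_year=2022):
    # Years counted inclusively: a 1991 paper counted in 2022 has 32 citing years.
    return total_citations / (census_year - pub_year + 1)

for author, year, tc in [("Crookes", 1991, 396), ("Noels", 2000, 374), ("Thornton", 2005, 319)]:
    print(author, round(citations_per_year(tc, year), 2))
# Crookes 12.38, Noels 16.26, Thornton 17.72 -- matching Table 3
```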

Co-Citation Analysis of Articles

The term "co-citation" refers to two different publications being cited together within the same article (Small, 1973). If a few publications are frequently cited together in articles by different authors, it is reasonable to assume that they share some commonalities (Benckendorff & Zehrer, 2013). Figure 1 shows the co-citation analysis of the e-assessment research articles through network mapping.

Figure 1. Network map of the most-cited references

When the network map of the most-cited references is analyzed, four clusters cover the studies conducted on e-assessment in EFL classrooms, and a meaningful relationship can be observed among the names of the cited authors. The red cluster includes Krashen, Gardner, Dörnyei, and Cohen, and shows the relationship among the leading scholars and authors of methodology in language learning. The green cluster includes Cole, Swain, Ellis, and Cummins, central figures in language acquisition, and its citation relations can be read from that perspective. The purple cluster includes Borg, Freeman, Vygotsky, and Shulman, scholars associated with the field of teacher education. Lastly, the blue cluster represents leading authors in various other fields related to foreign/second language teaching.
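The map itself was produced with Biblioshiny, but the underlying count is simple: two references are co-cited when they appear together in one article's reference list, and edge weights accumulate across articles. A minimal Python sketch with toy data (the reference lists below are illustrative, not drawn from the corpus):

```python
from collections import Counter
from itertools import combinations

# Toy reference lists standing in for the 3352-article corpus.
article_references = [
    ["Krashen 1982", "Gardner 1985", "Dornyei 2001"],
    ["Krashen 1982", "Gardner 1985", "Vygotsky 1978"],
    ["Vygotsky 1978", "Borg 2003"],
]

cocitation = Counter()
for refs in article_references:
    # Every unordered pair in one article's list counts as one co-citation.
    for pair in combinations(sorted(set(refs)), 2):
        cocitation[pair] += 1

print(cocitation.most_common(3))
```

A clustering algorithm applied to the resulting weighted network then groups the most strongly connected references, which is how the four colored clusters in Figure 1 emerge.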


Most Relevant Sources

The most relevant sources in the field of e-assessment were also investigated as part of this study. The ten journals that have published the most research articles on e-assessment in EFL classrooms are listed in Table 4.

Table 4. The top ten journals publishing e-assessment research articles

| Sources | Articles | Percentage (of 619) |
|---|---|---|
| International Journal of Bilingual Education and Bilingualism | 113 | 18.255 |
| Foreign Language Annals | 91 | 14.701 |
| System | 79 | 12.762 |
| Language Teaching Research | 65 | 10.500 |
| Modern Language Journal | 62 | 10.016 |
| Language and Education | 46 | 7.431 |
| TESOL Quarterly | 46 | 7.431 |
| TESOL Journal | 44 | 7.108 |
| TESL Canada Journal | 40 | 6.462 |
| Teaching and Teacher Education | 33 | 5.331 |

The data analysis showed that the International Journal of Bilingual Education and Bilingualism holds the top spot with a total of 113 articles, followed by Foreign Language Annals (91), System (79), Language Teaching Research (65), Modern Language Journal (62), Language and Education (46), TESOL Quarterly (46), TESOL Journal (44), TESL Canada Journal (40), and Teaching and Teacher Education (33).

Most Influential Authors

The bibliometric analysis showed that the articles related to e-assessment had an average of 2.08 authors per article. Table 5 shows the top ten authors with the highest number of published articles in the field of e-assessment.


Table 5. The top ten most influential authors in the field of e-assessment

| Authors | Number of Articles |
|---|---|
| Lo YY | 9 |
| Mady C | 8 |
| De Costa PI | 7 |
| De Graaff R | 7 |
| Farrell TSC | 7 |
| Johnson KE | 7 |
| Kissau S | 7 |
| Macaro E | 7 |
| Yu SL | 7 |
| Zhang LJ | 7 |

It was found that many researchers conducted more than one study and contributed to the field with multiple articles. At the top of the list, Lo has nine published articles, followed by Mady with eight; the remaining authors in Table 5 each have seven. The findings show that the most productive authors have seven or more publications, a high level of productivity.

Most Relevant Institutions

The research articles in the WoS database related to e-assessment were analyzed to find the most productive institutions. Figure 2 shows the relevant findings.

Figure 2. The top ten most relevant institutions of published articles on e-assessment


It can be seen in Figure 2 that the most productive institution in the field of e-assessment is the University of Hong Kong (f=52), followed by Islamic Azad University (f=38), the University of North Carolina (f=36), Pennsylvania State University (f=35), Michigan State University (f=33), the Education University of Hong Kong (f=32), the University of Auckland (f=31), the University of British Columbia (f=31), the University of Queensland (f=31), and the University of Toronto (f=30).

Keywords and Research Trends

Most Frequent Words

Keywords are a good indicator of, and a way of searching for, the prominent issues in a field. The WoS records were examined and the most frequently used words are presented here, as they give cues to those issues. Table 6 shows the most frequently used words in e-assessment research articles and their percentage within the top ten most frequent words list.

Table 6. The top ten most frequent words in e-assessment research articles

| Words | Occurrences | Percentage (of 934) |
|---|---|---|
| Teacher education | 167 | 17.88 |
| Higher education | 134 | 14.35 |
| Bilingual education | 112 | 12.00 |
| English as a second language | 97 | 10.39 |
| Second language acquisition | 89 | 9.53 |
| Education | 71 | 7.60 |
| Second language learning | 68 | 7.28 |
| Professional development | 67 | 7.17 |
| English language learners | 66 | 7.07 |
| Bilingualism | 63 | 6.73 |

At the top of the list, "teacher education" (f=167, 17.88%) takes the lead, which points to the emerging nature of using technology in assessment and the need for educating teachers accordingly. "Teacher education" is followed by "higher education" (f=134, 14.35%), "bilingual education" (f=112, 12.00%), "English as a second language" (f=97, 10.39%), "second language acquisition" (f=89, 9.53%), "education" (f=71, 7.60%), "second language learning" (f=68, 7.28%), "professional development" (f=67, 7.17%), "English language learners" (f=66, 7.07%), and "bilingualism" (f=63, 6.73%).
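Counts such as those in Table 6 come from tallying the keyword fields of the exported records. A minimal Python sketch, assuming author keywords arrive as a semicolon-separated WoS "DE" field (the records below are toy data, not the corpus):

```python
from collections import Counter

# Toy records; in the real data each record's "DE" field holds the
# semicolon-separated author keywords exported from WoS.
records = [
    {"DE": "teacher education; higher education"},
    {"DE": "Teacher Education; bilingualism"},
    {"DE": "higher education; English as a second language"},
]

keyword_counts = Counter(
    kw.strip().lower()          # normalize case and whitespace before tallying
    for rec in records
    for kw in rec.get("DE", "").split(";")
    if kw.strip()
)
print(keyword_counts.most_common(10))
```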

Research Trends

Bibliometric analysis also makes it possible to track changes and trends within a research area. Figure 3 shows the research trends in the area of e-assessment in foreign/second language teaching.


Figure 3. Research trends of e-assessment-themed keywords between 2009-2021

As the figure shows, in 2009 the first research trend in e-assessment concerned issues about second language learners. Between 2011 and 2013, the keywords "diversity," "reflection," and "academic achievement" reached their highest point. Similarly, in 2013, research trends included the keywords "second language learners," "immigration," and "Chinese," pointing to an interest in second language education and its cultural aspects. Between 2013 and 2015, this trend continued but shifted toward different issues in language learning, with keywords including "bilingual education," "ESL," and "English." In 2015, the keywords "English language learners," "case study," and "teacher beliefs" show the research interest turning to teachers, learners, and practices in the language classroom. The trends between 2015 and 2017 continue along the same lines with "teacher education," "English as a second language," and "second language acquisition," similar to the 2017 trends of "education," "second language learning," and "professional development." Between 2017 and 2019, the keywords "higher education," "motivation," and "multilingualism" demonstrate a mild shift, which can be interpreted as an interest in learners' affective factors in language learning. The trends in 2019 include "second language," "translanguaging," and "writing," while between 2019 and 2021 the focus shifts to cultural matters once again with the keywords "second language instruction," "equity," and "migration." More recently, in 2021, a new trend emerged, driven by the pandemic: research trends in 2021 include "covid-19" and "qualitative research."
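The trend lines in Figure 3 reduce to per-year keyword tallies. A small Python sketch of that computation, again on toy (year, keywords) records rather than the real corpus:

```python
from collections import Counter, defaultdict

# Toy (year, keywords) records standing in for the corpus.
records = [
    (2013, ["diversity", "reflection"]),
    (2013, ["diversity"]),
    (2021, ["covid-19", "qualitative research"]),
    (2021, ["covid-19"]),
]

by_keyword = defaultdict(Counter)
for year, keywords in records:
    for kw in keywords:
        by_keyword[kw][year] += 1

# Report the year in which each keyword peaks.
for kw, per_year in by_keyword.items():
    peak_year, count = per_year.most_common(1)[0]
    print(f"{kw}: peaks in {peak_year} ({count} articles)")
```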


Changes in Key Themes

The development and expansion of a field within a discipline occurs over time. During this process, the affiliated keywords used in publications on the subject vary, an indication of the growth of the field itself as well as of the other fields with which it is connected. The progression of the affiliated keywords used in e-assessment articles is illustrated chronologically in Figure 4.

Figure 4. Changes in the affiliated keywords of e-assessment articles

Figure 4 yields interesting findings, as it shows the progression of research trends in e-assessment in second/foreign language teaching; this changing trend can be observed through bibliometric analysis and paves the way for future researchers. Between 1991 and 2013, the most prominent keyword in e-assessment articles was "education," followed by "higher education," "policy," "preservice," "reform," and "classroom." These keywords indicate that e-assessment was practiced predominantly in the higher education context, alongside changes in the traditional nature of teaching and assessing learners. Between 2014 and 2018, the affiliated keyword "education" became even more prominent, and the remaining keywords changed to "children," "technology," and "experience." It can be stated that, with developments in technology, its use was no longer restricted to the higher education context and became more widespread across the field of education. In the last three years, the affiliated keywords other than "education" changed to "identity," "corrective feedback," "acquisition," and "accuracy," indicating a more individualized approach in the field of e-assessment in foreign/second language teaching.

CONCLUSION

This study aimed to examine international publications on e-assessment in second/foreign language teaching indexed in the Web of Science (WoS) using the bibliometric method, one of the literature review tools. In particular, the most prolific countries, annual scientific production, the most globally cited documents, authors, institutions, keywords, and changing research trends were analyzed. A total of 3352 research documents from the WoS Core Collection database, covering publications until June 2022, were included in the analysis. The data were analyzed with the open-source R Studio program and the "biblioshiny for bibliometrics" application, an R program tool. Based on the data analysis and discussion of these documents, this study has revealed some important results that will contribute to the study of trends in e-assessment in second/foreign language teaching.

The analysis yielded several significant findings. First, nearly half of the publications analyzed on e-assessment were published in the USA. One reason for this may be the country's technological development: the five most prolific countries are developed countries, namely the USA, China, the United Kingdom, Australia, and Canada, and their productivity in such a recent field may likewise be related to the development of educational technology in those countries. The remaining five countries, except for Japan, are populous and mostly developing countries, such as Turkey, South Africa, and Iran; the reason may be that these countries need more urgent improvements in assessment or e-assessment practices. It was also found that, although Spain, Turkey, South Africa, Iran, and Japan are among the top ten prolific countries, they have low citation frequency compared with the top five countries, and citation indicates direct or indirect influence on other articles and on the field (Hu et al., 2011).

Another important finding concerns the contribution of individual authors. The top ten authors each have more than one study in the field, with output ranging from seven to nine articles. If researchers are expected to develop specific and broad competencies in a field, it is important to publish repeatedly within that field and help develop it, rather than publishing only one or two papers and moving on to another field. For that reason, we, as researchers, consider this an important finding of the study.

The third and most striking finding concerns research trends. When the WoS data were analyzed, a very active and changing trend emerged over the years. For instance, where the research trend once centered on the keywords academic achievement, reflection, and diversity, it has since moved to COVID-19 and qualitative research and, secondly, to second language acquisition, equity, and migration. We believe this gives insight into the changing direction of second/foreign language education and research. Similarly, the examination of changes in key themes revealed striking differences over the last two decades. Notably, while the research trend between 1991 and 2013 centered on "education, higher education, policy, pre-service, and reform," between 2019 and 2022 the most influential affiliated keywords are "education, English, identity, corrective feedback, and schools." This is a clear sign of the movement from general perspectives on second/foreign language teaching toward more individualized and tailored teaching approaches.

REFERENCES

Adom, D., Mensah, J. A., & Dake, D. A. (2020). Test, measurement, and evaluation: Understanding and use of the concepts in education. International Journal of Evaluation and Research in Education, 9(1), 109–119. doi:10.11591/ijere.v9i1.20457


Atilano, M. (2017). Game on: Teaching research methods to college students using Kahoot! Library Faculty Presentations & Publications. Retrieved from https://digitalcommons.unf.edu/library_facpub/56

Attali, Y. (2007). Construct validity of e-rater® in scoring TOEFL® essays. ETS Research Report Series, 2007(1), i–22. doi:10.1002/j.2333-8504.2007.tb02063.x

Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.

Bachman, L. F. (1991). What does language testing have to offer? TESOL Quarterly, 25(4), 671. doi:10.2307/3587082

Bar-Ilan, J. (2010). Citations to the "Introduction to informetrics" indexed by WOS, Scopus and Google Scholar. Scientometrics, 82(3), 495–506. doi:10.1007/s11192-010-0185-9

Barnes, R. (2017). Kahoot! in the classroom: Student engagement technique. Nurse Educator, 42(6), 280. doi:10.1097/NNE.0000000000000419 PMID:29049160

Benckendorff, P., & Zehrer, A. (2013). A network analysis of tourism research. Annals of Tourism Research, 43, 121–149. doi:10.1016/j.annals.2013.04.005

Bennett, R. E. (1998). Reinventing assessment: Speculations on the future of large-scale educational testing. A policy information perspective. Retrieved from https://www.researchgate.net/publication/234731732_Reinventing_Assessment_Speculations_on_the_Future_of_Large-Scale_Educational_Testing_A_Policy_Information_Perspective

Bennett, R. E. (1999). Using new technology to improve assessment. Educational Measurement: Issues and Practice, 18(3), 5–12. doi:10.1111/j.1745-3992.1999.tb00266.x

Bennett, R. E. (2002). Using electronic assessment to measure student performance. The State Education Standard, National Association of State Boards of Education.

Bhagat, K. K., & Spector, J. M. (2017). Formative assessment in complex problem-solving domains: The emerging role of assessment technologies. Journal of Educational Technology & Society, 20(4), 312–317.

Boyle, A., & Hutchison, D. (2009). Sophisticated tasks in e-assessment: What are they and what are their benefits? Assessment & Evaluation in Higher Education, 34(3), 305–319. doi:10.1080/02602930801956034

Brown, H. D. (2004). Language assessment principles and classroom practices. Longman.

Brown, J. D. (1997). Computers in language testing: Present research and some future directions. Language Learning & Technology, 1(1), 44–59.

Burstein, J., Frase, L., Ginther, A., & Grant, L. (1996). Technologies for language assessment. Annual Review of Applied Linguistics, 16, 240–260. doi:10.1017/S0267190500001537

Carroll, J. B. (1961). Fundamental considerations in testing English proficiency of foreign students. In L. F. Bachman (Ed.), Testing the English proficiency of foreign students (pp. 30–40). Center for Applied Linguistics.

Carroll, J. B. (1968). The psychology of language testing. In A. Davies (Ed.), Language testing symposium: A psycholinguistic approach (pp. 46–69). Oxford University Press.


Chapelle, C. A., & Voss, E. (2017). Utilizing technology in language assessment. Language Testing and Assessment, 149–161. doi:10.1007/978-3-319-02261-1_10

Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors' classroom assessment practices: Purposes, methods, and procedures. Language Testing, 21(3), 360–389. doi:10.1191/0265532204lt288oa

Çil, E. (2021). The effect of using Wordwall.net in increasing vocabulary knowledge of 5th grade EFL students. Language Education & Technology, 1(1), 21–28.

Colby-Kelly, C., & Turner, C. E. (2008). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64(1), 9–37. doi:10.3138/cmlr.64.1.009

Conole, G., & Warburton, B. (2005). A review of computer-assisted assessment. Research in Learning Technology, 13(1), 17–31. doi:10.3402/rlt.v13i1.10970

Coombs, A., DeLuca, C., LaPointe-McEwan, D., & Chalas, A. (2018). Changing approaches to classroom assessment: An empirical study across teacher career stages. Teaching and Teacher Education, 71, 134–144. doi:10.1016/j.tate.2017.12.010

Crisp, G. (2011). Teacher's handbook on e-assessment. Australian Learning and Teaching Council.

Dellos, R. (2015). Kahoot! A digital game resource for learning. International Journal of Instructional Technology and Distance Learning, 12, 49–52.

Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning, and Assessment, 5(1). Retrieved July 7, 2022, from http://www.jtla.org

Eckes, T. (2014). Examining testlet effects in the TestDaF listening section: A testlet response theory modeling approach. Language Testing, 31(1), 39–61. doi:10.1177/0265532213492969

Egghe, L., & Rousseau, R. (1990). Introduction to informetrics: Quantitative methods in library, documentation and information science. Elsevier Science Publishers.

Ellegaard, O., & Wallin, J. A. (2015). The bibliometric analysis of scholarly production: How great is the impact? Scientometrics, 105(3), 1809–1831. doi:10.1007/s11192-015-1645-z PMID:26594073

Farhady, H. (2005). Language assessment: A linguametric perspective. Language Assessment Quarterly: An International Journal, 2(2), 147–164. doi:10.1207/s15434311laq0202_3

Frey, B. B., & Schmitt, V. L. (2007). Coming to terms with classroom assessment. Journal of Advanced Academics, 18(3), 402–423. doi:10.4219/jaa-2007-495

Fulcher, G. (2000). Computers in language testing. In P. Brett & G. Motteram (Eds.), A special interest in computers (pp. 93–107). IATEFL Publications.

Godwin-Jones, B. (2008). Emerging technologies: Web-writing 2.0: Enabling, documenting, and assessing writing online. Language Learning & Technology, 12(2), 7–13.


Gray, K., Waycott, J., Clerehan, R., Hamilton, M., Richardson, J., Sheard, J., & Thompson, C. (2012). Worth it? Findings from a study of how academics assess students' Web 2.0 activities. Research in Learning Technology, 20(1), 1–15. doi:10.3402/rlt.v20i0.16153

Green, A. (2014). Exploring language assessment and testing: Language in action. Routledge.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Sage.

Hasram, S., Nasir, M. K. M., Mohamad, M., Daud, M. Y., Abd Rahman, M. J., & Mohammad, W. M. R. W. (2021). The effects of Wordwall online games (WOW) on English language vocabulary learning among year 5 pupils. Theory and Practice in Language Studies, 11(9), 1059–1066. doi:10.17507/tpls.1109.11

Hatziapostolou, T., & Paraskakis, I. (2010). Enhancing the impact of formative feedback on student learning through an online feedback system. Electronic Journal of e-Learning, 8(2), 111–122.

Howell, D. D., Tseng, D. C., & Colorado-Resa, J. T. (2017). Fast assessments with digital tools using multiple-choice questions. College Teaching, 65(3), 145–147. doi:10.1080/87567555.2017.1291489

Hu, X., Rousseau, R., & Chen, J. (2011). On the definition of forward and backward citation generations. Journal of Informetrics, 5(1), 27–36. doi:10.1016/j.joi.2010.07.004

Iaremenko, N. V. (2017). Enhancing English language learners' motivation through online games. Information Technologies and Learning Tools, 59(3), 126–133. doi:10.33407/itlt.v59i3.1606

Jinu, R., & Shamna Beegum, S. (2019). Plickers: A tool for language assessment in the digital age. International Journal of Recent Technology and Engineering, 8(2S3), 166–171. doi:10.35940/ijrte.B1031.0782S319

JISC (Joint Information Systems Committee). (2007). Effective practice with e-assessment: An overview of technologies, policies and practice in further and higher education. Retrieved from https://www.jisc.ac.uk/media/documents/themes/elearning/effpraceassess.pdf

Johnson, K., & Johnson, H. (2001). Encyclopedic dictionary of applied linguistics: A handbook for language teaching. Foreign Language Teaching and Research Press and Blackwell.

Kirsch, I., Jamieson, J., Taylor, C., & Eignor, D. (1998). Computer familiarity among TOEFL examinees (TOEFL Research Report No. 59). Educational Testing Service.

Lado, R. (1961). Language testing: The construction and use of foreign language tests. Longman.

McLaughlin, T., & Yan, Z. (2017). Diverse delivery methods and strong psychological benefits: A review of online formative assessment. Journal of Computer Assisted Learning, 33(6), 562–574. doi:10.1111/jcal.12200

McMillan, J. H. (2001). Classroom assessment: Principles and practice for effective instruction (2nd ed.). Allyn & Bacon.

McNamara, T. (2001). Language assessment as social practice: Challenges for research. Language Testing, 18(4), 333–349. doi:10.1177/026553220101800402


Milliner, B. (2013). Using online flashcard software to raise business students' TOEIC scores. Annual Report of JACET-SIG on ESP, 15, 52–60.

Nakata, T. (2011). Computer-assisted second language vocabulary learning in a paired-associate paradigm: A critical investigation of flashcard software. Computer Assisted Language Learning, 24(1), 17–38. doi:10.1080/09588221.2010.520675

O'Reilly, T. (2007). What is Web 2.0: Design patterns and business models for the next generation of software. Communications & Stratégies, 65, 17–37.

Panadero, E., Jonsson, A., & Botella, J. (2017). Effects of self-assessment on self-regulated learning and self-efficacy: Four meta-analyses. Educational Research Review, 22, 74–98. doi:10.1016/j.edurev.2017.08.004

Raikes, N., & Harding, R. (2003). The horseless carriage stage: Replacing conventional measures. Assessment in Education: Principles, Policy & Practice, 10(3), 267–277. doi:10.1080/0969594032000148136

Ridgway, J., McCusker, S., & Pead, D. (2004). Literature review of e-assessment. NESTA Futurelab.

Roever, C. (2001). Web-based language testing. Language Learning & Technology, 5(2), 84–94.

Skinner, B. F. (1961). Teaching machines. Scientific American, 205(5), 90–106. doi:10.1038/scientificamerican1161-90 PMID:13913636

Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4), 265–269. doi:10.1002/asi.4630240406

Solomon, G., & Schrum, L. (2007). Web 2.0: New tools, new schools. International Society for Technology in Education (ISTE).

Taminiau, E. M., Kester, L., Corbalan, G., Spector, J. M., Kirschner, P. A., & Van Merriënboer, J. J. (2015). Designing on-demand education for simultaneous development of domain-specific and self-directed learning skills. Journal of Computer Assisted Learning, 31(5), 405–421. doi:10.1111/jcal.12076

Taras, M. (2005). Assessment-summative and formative-some theoretical reflections. British Journal of Educational Studies, 53(4), 466–478. doi:10.1111/j.1467-8527.2005.00307.x

Taylor, C. S., & Bobbit-Nolen, S. (2005). Classroom assessment: Supporting teaching and learning in real classrooms. Prentice Hall.

Tomlinson, M. (2004). 14-19 curriculum and qualifications reform: Interim report of the working group on 14-19 reform. DfES. www.14-19reform.gov.uk

Troudi, S., Coombe, C., & Al-Hamliy, M. (2009). EFL teachers' views of English language assessment in higher education in the United Arab Emirates and Kuwait. TESOL Quarterly, 43(3), 546–555. doi:10.1002/j.1545-7249.2009.tb00252.x

Tsulaia, N., & Adamia, Z. (2020). Formative assessment tools for higher education learning environment. International Scientific-Pedagogical Organization of Philologists "WEST-EAST" (ISPOP), 3(1), 86–93. doi:10.33739/2587-5434-2020-3-1-86-93


Valette, R. (1994). Teaching, testing and assessment: Conceptualizing the relationship. In C. Hancock (Ed.), Teaching, testing and assessment: Making the connection (pp. 1–42). National Textbook Company.

Vispoel, W. P., Hendrickson, A. B., & Bleiler, T. (2000). Limiting answer review and change on computerized adaptive vocabulary tests: Psychometric and attitudinal results. Journal of Educational Measurement, 37(1), 21–38. doi:10.1111/j.1745-3984.2000.tb01074.x

Wainer, H. (2000). CATs: Whither and whence. ETS Research Report Series, 2(2), i–15. doi:10.1002/j.2333-8504.2000.tb01835.x

Walter, D., Way, R. P. D., & Nichols, P. (2010). Psychometric challenges and opportunities in implementing formative assessment. In H. L. Andrade & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 297–315). Taylor & Francis.

Wang, X., Bradlow, E. T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and applications. Applied Psychological Measurement, 26(1), 109–128. doi:10.1177/0146621602026001007

White, E. (2009). Are you assessment literate? Some fundamental questions regarding effective classroom-based assessment. OnCUE Journal, 3(1), 3–25.

Wiliam, D. (2011). What is assessment for learning? Studies in Educational Evaluation, 37(1), 3–14. doi:10.1016/j.stueduc.2011.03.001

Wolfram, D. (2003). Applied informetrics for information retrieval research (No. 36). Greenwood Publishing Group.

Yılmaz, B. (2017). The impact of digital assessment tools on students' engagement in class: A case of two different secondary schools. Abant İzzet Baysal Üniversitesi Eğitim Fakültesi Dergisi, 17(3), 1606–1620. doi:10.17240/aibuefd.2017.17.31178-338850

Zarzycka-Piskorz, E. (2016). Kahoot it or not? Can games be motivating in learning grammar? Teaching English with Technology, 16(3), 17–36.

ADDITIONAL READING

Amante, L., Oliveira, I. R., & Gomes, M. J. (2019). E-assessment in Portuguese higher education: Framework and perceptions of teachers and students. In A. Azevedo & J. Azevedo (Eds.), Handbook of research on e-assessment in higher education (pp. 312–333). IGI Global.

Appiah, M., & Van Tonder, F. (2018). E-assessment in higher education: A review. International Journal of Business Management and Economic Research, 9(6).

Astalini, A., Darmaji, D., Kurniawan, W., Anwar, K., & Kurniawan, D. (2019). Effectiveness of using e-module and e-assessment. doi:10.3991/ijim.v13i09.11016

Azevedo, A., & Azevedo, J. (Eds.). (2018). Handbook of research on e-assessment in higher education. IGI Global. doi:10.4018/978-1-5225-5936-8

Benson, R., & Brack, C. (2010). Online learning and assessment in higher education: A planning guide. Elsevier. doi:10.1533/9781780631653

Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., & Lim, W. M. (2021). How to conduct a bibliometric analysis: An overview and guidelines. Journal of Business Research, 133, 285–296. doi:10.1016/j.jbusres.2021.04.070

Ellegaard, O., & Wallin, J. A. (2015). The bibliometric analysis of scholarly production: How great is the impact? Scientometrics, 105(3), 1809–1831. doi:10.1007/s11192-015-1645-z PMID:26594073

Guàrdia, L., Crisp, G., & Alsina, I. (2017). Trends and challenges of e-assessment to enhance student learning in higher education. Innovative Practices for Higher Education Assessment and Measurement, 36–56.

Iskander, M. (Ed.). (2008). Innovative techniques in instruction technology, e-learning, e-assessment and education. Springer Science & Business Media. doi:10.1007/978-1-4020-8739-4

Massimo, A., & Corrado, C. (2020). Biblioshiny: bibliometrix. https://www.bibliometrix.org/biblioshiny/

Nacheva-Skopalik, L., & Green, S. (2016). Intelligent adaptable e-assessment for inclusive e-learning. International Journal of Web-Based Learning and Teaching Technologies, 11(1), 21–34. doi:10.4018/IJWLTT.2016010102

Singh, V. K., Singh, P., Karmakar, M., Leta, J., & Mayr, P. (2021). The journal coverage of Web of Science, Scopus and Dimensions: A comparative analysis. Scientometrics, 126(6), 5113–5142. doi:10.1007/s11192-021-03948-5

Zhang, X. (2020). A bibliometric analysis of second language acquisition between 1997 and 2018. Studies in Second Language Acquisition, 42(1), 199–222. doi:10.1017/S0272263119000573

KEY TERMS AND DEFINITIONS

Bibliometrics: A common and thorough method for exploring and interpreting vast amounts of scientific data. It allows researchers to unpack the evolutionary details of a particular discipline while also offering insight into the field's growing aspects.

E-Assessment: Assessment methods and procedures that put an emphasis on the role of information technology in measuring students' learning.

355

356

Compilation of References

Acar-Erdol, T., & Yıldızlı, H. (2018). Classroom Assessment Practices of Teachers in Turkey. International Journal of Instruction, 11(3), 587–602. doi:10.12973/iji.2018.11340a ACTFL. (2011). World-readiness standards for learning languages. https://www.actfl.org/sites/default/files/publications/stand ards/World-ReadinessStandardsforLearningLanguages.pdf Ad Hoc Committee on Foreign Languages. (2007). Foreign languages and higher education: New structures for a changed world. Profession, 2007(1), 234–245. doi:10.1632/prof.2007.2007.1.234 Adams, T. E., Holman Jones, S., & Ellis, C. (2015). Autoethnography. Oxford University Press. Adom, D., Mensah, J. A., & Dake, D. A. (2020). Test, measurement, and evaluation: Understanding and use of the concepts in education. International Journal of Evaluation and Research in Education, 9(1), 109–119. doi:10.11591/ ijere.v9i1.20457 Adzima, K. (2020). Examining online cheating in higher education using traditional classroom cheating as a guide. Electronic Journal of E-Learning, 18(6), 476–493. doi:10.34190/JEL.18.6.002 Aftab, A. (2012). English language textbooks evaluation in Pakistan [Doctoral Dissertation]. University of Birmingham. Ahmad, N., Rahim, I. S. A., & Ahmad, S. (2021). Challenges in implementing online language assessment-a critical reflection of issues faced amidst Covid-19 pandemic. In F. Baharom, Y. Yusof, R. Romli, H. Mohd, M. A. Saip, S. F. P. Mohamed, & Z. M. Aji (Eds.), Proceedings of Knowledge Management International Conference (KMICe) 2021 (pp. 74–79). UUM College of Arts and Sciences. Ahsan, K., Akbar, S., & Kam, B. (2021). Contract cheating in higher education: A systematic literature review and future research agenda. Assessment & Evaluation in Higher Education, 47(4), 523–539. doi:10.1080/02602938.2021.1931660 Ahuvia, A. (2001). Traditional, interpretive, and reception based content analyses: Improving the ability of content analysis to address issues of pragmatic and theoretical concern. Social Indicators Research, 54(2), 139–172. doi:10.1023/A:1011087813505 Aindriú, S. (2022). The reasons why parents choose to transfer students with special educational needs from Irish immersion education. Language and Education, 36(1), 59–73. doi:10.1080/09500782.2021.1918707 Ajiferuke, M., & Boddewyn, J. (1970). “Culture” and Other Explanatory Variables in Comparative Management Studies. Academy of Management Journal, 13(2), 153–163. doi:10.2307/255102 Akbari, R. (2012). Validity in language testing. In C. Coombe, P. Davidson, B. O’Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 30–36). Cambridge University Press.

 

Compilation of References

Akhtar, S., Hussain, F., Raja, F., Ehatisham-ul-haq, M., Baloch, N., Ishmanov, F., & Zikria, Y. (2020). Improving mispronunciation detection of arabic words for non-native learners using deep convolutional neural network features. Electronics (Basel), 9(6), 963. doi:10.3390/electronics9060963 Akın, G. (2016). Evaluation of national foreign language test in Turkey. Asian Journal of Educational Research, 4(3), 11–21. Akram, M., & Mahmood, A. (2011). The need of communicative approach (in ELT) in teacher training programmes in Pakistan. Language in India, 11(5), 172–178. Al-Azawei, A., Serenelli, F., & Lundqvist, K. (2016). Universal design for learning (UDL): A content analysis of peer-reviewed journal papers from 2012 to 2015. The Journal of Scholarship of Teaching and Learning, 16(3), 39–56. doi:10.14434/josotl.v16i3.19295 Alderson, C. (1990). Learner-centered testing through computers: Institutional issues in individual assessment. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 20–37). Multilingual Matters Ltd. Alderson, J. C., & Wall, D. (1993). Does washback exist? Applied Linguistics, 14(2), 115–129. doi:10.1093/applin/14.2.115 Alemi, M., Meghdari, A., & Ghazisaedy, M. (2014). Employing humanoid robots for teaching English language in Iranian junior high schools. International Journal of Humanoid Robotics, 11, 1450022-1–1450022-25. Alemi, M., Meghdari, A., & Haeri, N. S. (2017). Young EFL learners’ attitude towards RALL: An observational study focusing on motivation, anxiety, and interaction. In A. Kheddar, E. Yoshida, S. S. Ge, K. Suzuki, J.-J. Cabibihan, F. Eyssel, & H. He (Eds.), Proceedings of the International Conference on Social Robotics (pp. 252–261). Springer. 10.1007/978-3-319-70022-9_25 Alenezi, S. M. (2022). Tertiary level English language teachers’ use of, and attitudes to alternative and online assessments during the Covid-19 outbreak. International Journal of Education and Information Technologies, 16, 39–49. doi:10.46300/9109.2022.16.4 Alessio, H. M., & Messinger, J. D. (2021). Faculty and student perceptions of academic integrity in technology-assisted learning and testing. Frontiers in Education, 6, 629220. Advance online publication. doi:10.3389/feduc.2021.629220 Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second language learning in the Zone of Proximal Development. Modern Language Journal, 78(4), 465–483. doi:10.1111/j.1540-4781.1994.tb02064.x Al-Jamal, D., & Ghadi, N. (2008). English language general secondary certificate examination washback in Jordan. Asian EFL Journal, 10(3), 158–186. Allen, P. (2009). Definition. A paper presented at a competition at the University of Washington, University of Washington Press. Allwright, R. L. (1981). What do we want teaching materials for? ELT Journal, 36(1), 5–18. doi:10.1093/elt/36.1.5 Almekhlafi, E., AL-Makhlafi, M., Zhang, E., Wang, J., & Peng, J. (2022). A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks. Computer Speech & Language, 71, 101274. doi:10.1016/j. csl.2021.101274 Alptekin, C. (1993). Target-language culture in EFL materials. ELT Journal, 47(2), 136–143. doi:10.1093/elt/47.2.136 Altshuler, L., Sussman, N. M., & Kachur, E. (2003). Assess ing changes in intercultural sensitivity among physician trainees using the intercultural development inventory. International Journal of Intercultural Relations, 27(4), 387–401. doi:10.1016/S0147-1767(03)00029-4 357

Compilation of References

Al-Zahrani, S. S. A., & Kaplowitz, S. A. (1993). Attributional biases in individualistic and collectivistic cultures: A comparison of Americans with Saudis. Social Psychology Quarterly, 56(3), 223–233. doi:10.2307/2786780 Amdal, I., Johnsen, M. H., & Versvik, E. (2009). Automatic evaluation of quantity contrast in non-native Norwegian speech. Proc. Int. Workshop Speech Lang. Technol. Educ., 21–24. American Federation of Teachers, National Council on Measurement in Education, & National Education Association. (1990). Standards for teacher competence in educational assessment of students. Educational Measurement: Issues and Practice, 9(4), 30-32. https://buros.org/standards-teacher-competence-educational-a ssessment-students Amigud, A., & Lancaster, T. (2019). 246 reasons to cheat: An analysis of students’ reasons for seeking to outsource academic work. Computers & Education, 134, 98–107. doi:10.1016/j.compedu.2019.01.017 Amin, M. E. (2011). Concepts of Testing and Principles of Test Construction. Paper Presented at the GCE Training Workshop on Multiple Choice Question, Bamenda. Amzalag, M., Shapira, N., & Dolev, N. (2022). Two sides of the coin: Lack of academic integrity in exams during the Corona pandemic, students’ and lecturers’ perceptions. Journal of Academic Ethics, 20(2), 243–263. doi:10.100710805021-09413-5 PMID:33846681 Anasse, K., & Rhandy, R. (2021). Teachers’ attitudes towards online writing assessment during Covid-19 pandemic. International Journal of Linguistics, Literature and Translation, 3(8), 65–70. doi:10.32996/ijllt.2021.4.8.9 Ander, T. (2015). Exploring communicative language teaching in a grade 9 nationwide textbook: New bridge to success [Master Thesis]. Bilkent University, Ankara, Turkey. Anderson, A., & Lynch, T. (1988). Listening. Oxford University Press. Anderson, L. (2006). Analytic autoethnography. Journal of Contemporary Ethnography, 35(4), 373–395. doi:10.1177/0891241605280449 Anderson, L. W., & Krathwohl, D. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives. Addison Wesley Longman, Inc. Anderson, L., & Glass-Coffin, B. (2013). I learn by going: Autoethnographic modes of inquiry. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 57–83). Left Coast Press. Anning, A., & Ring, K. (2004). Making sense of children’s drawings. Open University Press. Antes, T. A. (2017). Audio glosses as a participant in L2 dialogues: Evidence of mediation and microgenesis during information-gap activities. Language and Sociocultural Theory, 4(2), 101–123. doi:10.1558/lst.31234 Antón, M. (2003). Dynamic assessment of advanced foreign language learners. Paper presented at the American Association of Applied Linguistics, Washington, DC. Applefield, J. M., Huber, R., & Moallem, M. (2000). Constructivism in theory and practice: Toward a better understanding. High School Journal, 84(2), 35–53. Arias, C., Maturana, L., & Restrepo, M. (2012). Evaluación de los aprendizajes en lenguas extranjeras: Hacia practices justas y democráticas [Evaluation in foreign language learning: Towards fair and democratic practices]. Lenguaje, 40(1), 99–126. doi:10.25100/lenguaje.v40i1.4945 Ariyanti, A. (2016). Psychological factors affecting EFL students’ speaking performance. Asian TEFL Journal of Language Teaching and Applied Linguistics, 1(1), 77–88. doi:10.21462/asiantefl.v1i1.14 358
Arora, V., Lahiri, A., & Reetz, H. (2017). Phonological feature-based mispronunciation detection and diagnosis using multi-task DNNs and active learning. Proc. INTERSPEECH 2017, 1432-1436.
Arshad, A., & Mahmood, M. A. (2019). Investigating content and language integration in an EFL textbook: A corpus-based study. Linguistic Forum, 1(1), 8-17.
Artal-Sevil, J. S., Castel, A. F. G., & Gracia, M. S. V. (2020, June). Flipped teaching and interactive tools. A multidisciplinary innovation experience in higher education. In 6th International Conference on Higher Education Advances (HEAd’20). https://web.archive.org/web/20210427174137id_/https://zaguan.unizar.es/record/95592/files/texto_completo.pdf
Ash, S. L., & Clayton, P. H. (2009). Generating, deepening, and documenting learning: The power of critical reflection in applied learning. Journal of Applied Learning in Higher Education, 1, 25–48.
Assulaimani, T. (2021). Alternative language assessments in the digital age. Journal of King Abdulaziz University Arts and Humanities, 29(1), 597–609. doi:10.4197/Art.29-1.20
Atilano, M. (2017). Game on: Teaching research methods to college students using Kahoot! Library Faculty Presentations & Publications. Retrieved from https://digitalcommons.unf.edu/library_facpub/56
Attali, Y. (2007). Construct validity of e-rater® in scoring TOEFL® essays. ETS Research Report Series, 2007(1), i-22. doi:10.1002/j.2333-8504.2007.tb02063.x
Augusta, C., & Henderson, R. D. E. (2021). Student academic integrity in online learning in higher education in the era of COVID-19. In C. Cheong, J. Coldwell-Nielson, K. MacCallum, T. Luo, & A. Scime (Eds.), COVID-19 and education: Learning and teaching in a pandemic-constrained environment (pp. 409–423). Informing Science Press. doi:10.35542/osf.io/a3bnp
Austin, J. L. (1962). How to do things with words. Harvard University Press.
Awasthi, S. (2019). Plagiarism and academic misconduct: A systematic review. DESIDOC Journal of Library and Information Technology, 39(2), 94–100. doi:10.14429/djlit.39.2.13622
Ayafor, I. M. (2005). Official bilingualism in Cameroon: An empirical evaluation of the status of English in official domains [PhD thesis]. Albert-Ludwigs-Universität.
Bach, G. (2005). Will the real Madonna please reveal herself?! Mediating “self” and “other” in intercultural learning. In G. Hermann-Brennecke (Ed.), Anglo-American awareness: Arpeggios in aesthetics (pp. 15–28). LIT Verlag Münster.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F. (1991). What does language testing have to offer? TESOL Quarterly, 25(4), 671. doi:10.2307/3587082
Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge University Press. doi:10.1017/CBO9780511667350
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests (Vol. 1). Oxford University Press. doi:10.2307/328718
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice: Developing language assessment and justifying their use in the real world. Oxford University Press.
Bagarić, V., & Djigunović, J. M. (2007). Defining communicative competence. Metodika, 8(1), 94–103.
Bahari, A., Zhang, X., & Ardasheva, Y. (2021). Establishing a non-linear dynamic individual-centered language assessment model: A dynamic systems theory approach. Interactive Learning Environments, 29(7), 1–15. doi:10.1080/10494820.2021.1950769
Bailey, K. M. (1999). Washback in language testing. Educational Testing Service.
Baker, W., & Ishikawa, T. (2021). Transcultural communication through global Englishes: An advanced textbook for students. Routledge. doi:10.4324/9780367809973
Baldwin, M., & Mussweiler, T. (2018). The culture of social comparison. Proceedings of the National Academy of Sciences of the United States of America, 115(39). Advance online publication. doi:10.1073/pnas.1721555115 PMID:30201717
Banaeian, H., & Gilanlioglu, I. (2021). Influence of the NAO as teaching assistant on university students’ vocabulary learning and attitudes. Australasian Journal of Educational Technology, 37(3), 71–87. doi:10.14742/ajet.6130
Bardovi-Harlig, K. (2013). Developing L2 pragmatics. Language Learning, 63(1), 68–86. doi:10.1111/j.1467-9922.2012.00738.x
Bar-Ilan, J. (2010). Citations to the “Introduction to informetrics” indexed by WOS, Scopus and Google Scholar. Scientometrics, 82(3), 495–506. doi:10.1007/s11192-010-0185-9
Barnes, R. (2017). Kahoot! in the classroom: Student engagement technique. Nurse Educator, 42(6), 280. doi:10.1097/NNE.0000000000000419 PMID:29049160
Barry, N. H., & Lechner, J. V. (1995). Preservice teachers’ attitudes about and awareness of multicultural teaching and learning. Teaching and Teacher Education, 11(2), 149–161. doi:10.1016/0742-051X(94)00018-2
Bartneck, C., & Hu, J. (2008). Exploring the abuse of robots. Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 9(3), 415–433. doi:10.1075/is.9.3.04bar
Bartneck, C., Kulić, D., Croft, E., & Zoghbi, S. (2009). Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics, 1(1), 71–81. doi:10.1007/s12369-008-0001-3
Beacco, J.-C., Byram, M., Cavalli, M., Coste, D., Egli Cuenat, M., Goullier, F., & Panthier, J. (2016). Guide for the development and implementation of curricula for plurilingual and intercultural education. Council of Europe Publishing.
Bearman, M., Dawson, P., O’Donnell, M., Tai, J., & Jorre de St Jorre, T. (2020). Ensuring academic integrity and assessment security with redesigned online delivery. Deakin University. https://dteach.deakin.edu.au/2020/03/23/academic-integrity-online/
Beck, G., Tsaryk, O. M., & Rybina, N. V. (2020). Teaching and assessment strategies in online foreign languages distance learning. Медична Освіта [Medical Education], 2(2), 6–13. doi:10.11603/me.2414-5998.2020.2.11139
Beck, S. W., Jones, K., Storm, S., & Smith, H. (2020). Scaffolding students’ writing processes through dialogic assessment. Journal of Adolescent & Adult Literacy, 63(6), 651–660. doi:10.1002/jaal.1039
Bedore, L. M., Peña, E. D., Joyner, D., & Macken, C. (2011). Parent and teacher rating of bilingual language proficiency and language development concerns. International Journal of Bilingual Education and Bilingualism, 14(5), 489–511. doi:10.1080/13670050.2010.529102 PMID:29910668
Behforouz, B. (2022). Online assessment and the features in language education context: A brief review. Journal of Language and Linguistic Studies, 18(1), 564–576.
Bell, E., Bryman, A., & Harley, B. (2018). Business research methods. Oxford University Press.
Benckendorff, P., & Zehrer, A. (2013). A network analysis of tourism research. Annals of Tourism Research, 43, 121–149. doi:10.1016/j.annals.2013.04.005
Benitti, F. B. V. (2012). Exploring the educational potential of robotics in schools: A systematic review. Computers & Education, 58(3), 978–988. doi:10.1016/j.compedu.2011.10.006
Bennett, R. E. (1998). Reinventing assessment: Speculations on the future of large-scale educational testing. A policy information perspective. Retrieved from https://www.researchgate.net/publication/234731732_Reinventing_Assessment_Speculations_on_the_Future_of_Large-Scale_Educational_Testing_A_Policy_Information_Perspective
Bennett, M. J. (1993). Towards ethnorelativism: A developmental model of intercultural sensitivity. In R. Paige (Ed.), Education for the intercultural experience (pp. 21–71). Intercultural Press.
Bennett, R. E. (1999). Using new technology to improve assessment. Educational Measurement: Issues and Practice, 18(3), 5–12. doi:10.1111/j.1745-3992.1999.tb00266.x
Bennett, R. E. (2002). Using electronic assessment to measure student performance. The State Education Standard, National Association of State Boards of Education.
Bensel, N., & Weiler, H. N. (2000). Hochschulen für das 21. Jahrhundert zwischen Staat, Markt und Eigenverantwortung: Ein hochschulpolitisches Memorandum im Rahmen der „Initiative D21“ unter Federführung der DaimlerChrysler Services (debis) [Universities for the 21st century between state, market, and self-responsibility: A higher education policy memorandum within the framework of “Initiative D21,” led by DaimlerChrysler Services (debis)]. DaimlerChrysler Services (debis) AG. http://www.hochschul-management.de/HS-Politisches_Memorandum.pdf
Beran, T. N., Ramirez-Serrano, A., Kuzyk, R., Fior, M., & Nugent, S. (2011). Understanding how children understand robots: Perceived animism in child–robot interaction. International Journal of Human-Computer Studies, 69(7-8), 539–550. doi:10.1016/j.ijhcs.2011.04.003
Bercher, D. A. (2012). Self-monitoring tools and student academic success: When perception matches reality. Journal of College Science Teaching, 41(5), 26–32. https://web.s.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=0&sid=cfcdc8de-13bd-4621-bc38-1dea6df7f849%40redis
Berelson, B. (1952). Content analysis in communication research. Free Press.
Berkeley, S., Bender, W. N., Peaster, L. G., & Saunders, L. (2009). Implementation of response to intervention: A snapshot of progress. Journal of Learning Disabilities, 42(1), 85–95. doi:10.1177/0022219408326214 PMID:19103800
Berry, V., Sheehan, S., & Munro, S. (2019). What does language assessment literacy mean to teachers? ELT Journal, 73(2), 113–123. doi:10.1093/elt/ccy055
Berwick, G. (1994). Factors which affect pupil achievement: The development of a whole school assessment programme and accounting for personal constructs of achievement [Unpublished PhD thesis]. University of East Anglia.
Best, K., Scott, L. A., & Thoma, C. A. (2015). Starting with the end in mind: Inclusive education designed to prepare students for adult life. In R. G. Craven, A. J. S. Moren, D. Tracey, P. D. Parker, & H. F. Zhong (Eds.), Inclusive education for students with intellectual disabilities (pp. 45–72). Information Age Press.
Bhagat, K. K., & Spector, J. M. (2017). Formative assessment in complex problem-solving domains: The emerging role of assessment technologies. Journal of Educational Technology & Society, 20(4), 312–317.
Birjandi, P., & Alemi, M. (2010). The impact of test anxiety on test performance among Iranian EFL learners. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 1(4), 44–58.
Bishop, J., & Verleger, M. A. (2013, June). The flipped classroom: A survey of the research. Paper presented at the 2013 ASEE Annual Conference & Exposition, Atlanta, GA. doi:10.18260/1-2--22585
Bjelobaba, S. (2021). Deterring cheating using a complex assessment design: A case study. The Literacy Trek, 7(1), 55–77. doi:10.47216/literacytrek.936053
Black, P., & Wiliam, D. (2004). Assessment for learning in the classroom. In Assessment and learning (pp. 9–21). SAGE Publications Ltd.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. doi:10.1080/0969595980050102
Blau, I., Goldberg, S., Friedman, A., & Eshet-Alkalai, Y. (2020). Violation of digital and analog academic integrity through the eyes of faculty members and students: Do institutional role and technology change ethical perspectives? Journal of Computing in Higher Education, 33(1), 157–187. doi:10.1007/s12528-020-09260-0 PMID:32837125
Blinova, O. (2022). What Covid taught us about assessment: Students’ perceptions of academic integrity in distance learning. INTED2022 Proceedings, 6214-6218. doi:10.21125/inted.2022.1576
Block, D. (1991). Some thoughts on DIY materials design. ELT Journal, 45(3), 211–217. doi:10.1093/elt/45.3.211
Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives, Handbook 1: Cognitive domain (2nd ed.). Longman Publishing.
Bloom, B., Hastings, T., & Madaus, G. (1981). Evaluation to improve learning. McGraw-Hill.
Boardman, M. (2007). “I know how much this child has learned. I have proof!” Employing digital technologies for documentation processes in kindergarten. Australian Journal of Early Childhood, 32(3), 59–66. doi:10.1177/183693910703200309
Bochner, A. P. (2007). Notes toward an ethics of memory in autoethnographic inquiry. In N. K. Denzin & M. D. Giardina (Eds.), Ethical futures in qualitative research: Decolonizing the politics of knowledge (pp. 197–208). Left Coast Press.
Bochner, A. P., & Ellis, C. (2016). Evocative autoethnography: Writing stories and telling lives. Routledge. doi:10.4324/9781315545417
Boitshwarelo, B., Reedy, A. K., & Billany, T. (2017). Envisioning the use of online tests in assessing twenty-first century learning: A literature review. Research and Practice in Technology Enhanced Learning, 12(1), 1–16. doi:10.1186/s41039-017-0055-7 PMID:30595721
Bowman, B., Donovan, S., & Burns, S. (2001). Eager to learn: Educating our pre-schoolers. Report of Committee on Early Childhood Pedagogy. Commission on Behavioural and Social Sciences and Education National Research Council. Washington, DC: National Academy Press.
Boyle, A., & Hutchison, D. (2009). Sophisticated tasks in e‐assessment: What are they and what are their benefits? Assessment & Evaluation in Higher Education, 34(3), 305–319. doi:10.1080/02602930801956034
Boylorn, R. M., & Orbe, M. P. (Eds.). (2014). Critical autoethnography: Intersecting cultural identities in everyday life. Routledge.
Bretag, T., Harper, R., Burton, M., Ellis, C., Newton, P., van Haeringen, K., Saddiqui, S., & Rozenberg, P. (2019). Contract cheating and assessment design: Exploring the relationship. Assessment & Evaluation in Higher Education, 44(5), 676–691. doi:10.1080/02602938.2018.1527892
Brewster, J., Ellis, G., & Girard, D. (2002). The primary English teacher’s guide. Penguin.
Brindley, G. (2001). Language assessment and professional development. In C. Elder, A. Brown, E. Grove, K. Hill, N. Iwashita, T. Lumley, C. MacNamara, & K. O’Loughlin (Eds.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 126–136). Cambridge University Press.
Brislin, R. W. (2010). The undreaded job: Learning to thrive in a less-than-perfect workplace. Praeger.
British Council. (2013). Culture at work: The value of intercultural skills in the workplace. British Council. https://www.britishcouncil.org/sites/default/files/culture-at-work-report-v2.pdf
British Council. (2020). How can universities conduct online assessment that is secure and credible? https://www.britishcouncil.uz/sites/default/files/spotlight_report_how_can_universities_conduct_online_assessment_that_is_secure_and_credible_0.pdf
Broadfoot, P. (2009). Signs of change: Assessment past, present and future. In C. Wyatt-Smith & J. Cummings (Eds.), Educational assessment in the 21st century: Connecting theory and practice (pp. v–xi). Springer.
Brooks, F. B., & Donato, R. (1994). Vygotskian approaches to understanding foreign language learner discourse during communicative tasks. Hispania, 77(2), 262–274. doi:10.2307/344508
Brown, P., & Levinson, S. C. (1978). Politeness: Some universals in language usage. Cambridge University Press.
Brown, D. J. (2005). Testing in language programs. McGraw-Hill.
Brown, H. D. (2001). Teaching by principles: An interactive approach to language pedagogy. Pearson Education.
Brown, H. D. (2003). Language assessment: Principles and classroom practices. Pearson Education.
Brown, H. D. (2004). Language assessment: Principles and classroom practices. Pearson Education.
Brown, J. D. (1995). The elements of language curriculum: A systematic approach to program development. Heinle & Heinle Publishers.
Brown, J. D. (1997). Computers in language testing: Present research and some future directions. Language Learning & Technology, 1(1), 44–59.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32(4), 653–675. doi:10.2307/3587999
Brusokaitė, E. (2013). Gender representation in EFL textbooks [Doctoral dissertation]. Lithuanian University of Educational Sciences, Vilnius, Lithuania.
Buck, G. (1988). Testing listening comprehension in Japanese university entrance examinations. JALT Journal, 10(1), 15-42.
Buck, G. (2001). Assessing listening. Cambridge University Press. doi:10.1017/CBO9780511732959
Burstein, J., Frase, L., Ginther, A., & Grant, L. (1996). Technologies for language assessment. Annual Review of Applied Linguistics, 16, 240–260. doi:10.1017/S0267190500001537
Butler, D. L. (2002). Qualitative approaches to investigating self-regulated learning: Contributions and challenges. Educational Psychologist, 37(1), 59–63. doi:10.1207/S15326985EP3701_7
Butler, Y. G. (2014). Parental factors and early English education as a foreign language: A case study in Mainland China. Research Papers in Education, 29(4), 410–437. doi:10.1080/02671522.2013.776625
Butler, Y. G. (2015). Parental factors in children’s motivation for learning English: A case in China. Research Papers in Education, 30(2), 164–191. doi:10.1080/02671522.2014.891643
Byram, M. (1997). Teaching and assessing intercultural communicative competence. Multilingual Matters.
Byram, M. (2008). From foreign language education to education for intercultural citizenship: Essays and reflections. Multilingual Matters. doi:10.21832/9781847690807
Byram, M. (2014). Twenty-five years on – from cultural studies to intercultural citizenship. Language, Culture and Curriculum, 27(3), 209–225. doi:10.1080/07908318.2014.974329
Byram, M. (2021). Teaching and assessing intercultural communicative competence: Revisited (2nd ed.). Multilingual Matters. doi:10.21832/9781800410251
Byram, M., & Wagner, M. (2018). Making a difference: Language teaching for intercultural and international dialogue. Foreign Language Annals, 51(1), 140–151. doi:10.1111/flan.12319
Byram, M., & Zarate, G. (1996). Defining and assessing intercultural competence: Some principles and proposals for the European context. Language Teaching, 29(4), 239–243. doi:10.1017/S0261444800008557
Cameron, C., Tate, B., Macnaughton, D., & Politano, C. (1998). Recognition without rewards. Peguis Publishers.
Cameron, D. (2001). Working with spoken discourse. Sage.
Cameron, L. (2001). Teaching languages to young learners. Cambridge University Press. doi:10.1017/CBO9780511733109
Camilleri, B., Hasson, N., & Dodd, B. (2014). Dynamic assessment of bilingual children’s language at the point of referral. Educational and Child Psychology, 31(2), 57–72. doi:10.53841/bpsecp.2014.31.2.57
Can, E. (2017). English teachers’ classroom assessment practices and their views about the ethics of classroom assessment practices [Unpublished MA thesis]. Cağ University.
Can, H. (2020). A micro-analytic investigation into EFL teachers’ language test item reviewing interactions [PhD thesis]. Middle East Technical University.
Canagarajah, A. S. (2012). Teacher development in a global profession: An autoethnography. TESOL Quarterly, 46(2), 258–279. doi:10.1002/tesq.18
Canagarajah, A. S. (2013). Translingual practice: Global Englishes and cosmopolitan relations. Routledge. doi:10.4324/9780203120293
Canagarajah, S. (2006). Changing communicative needs, revised assessment objectives: Testing English as an international language. Language Assessment Quarterly, 3(3), 229–242. doi:10.1207/s15434311laq0303_1
Canale, M. (1983). From communicative competence to communicative language pedagogy. Language & Communication, 1(1), 1–47.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47. doi:10.1093/applin/1.1.1
Carr, M., & Claxton, G. (2002). Tracking the development of learning dispositions. Assessment in Education: Principles, Policy & Practice, 9(1), 9–37. doi:10.1080/09695940220119148
Carroll, J. B. (1961). Fundamental considerations in testing English proficiency of foreign students. In L. F. Bachman (Ed.), Testing the English proficiency of foreign students (pp. 30–40). Center for Applied Linguistics.
Carroll, J. B. (1968). The psychology of language testing. In A. Davies (Ed.), Language testing symposium: A psycholinguistic approach (pp. 46–69). London: Oxford University Press.
Caspari, D., & Schinschke, A. (2009). Aufgaben zur Feststellung und Überprüfung interkultureller Kompetenzen im Fremdsprachenunterricht—Entwurf einer Typologie [Tasks for identifying and assessing intercultural competences in foreign language teaching—Draft of a typology]. In M. Byram & A. Hu (Eds.), Interkulturelle Kompetenz und fremdsprachliches Lernen. Modelle, Empirie, Evaluation (pp. 299–315). Gunter Narr Verlag.
Causo, A., Win, P. Z., Guo, P. S., & Chen, I.-M. (2017). Deploying social robots as teaching aid in pre-school K2 classes: A proof-of-concept study. In Proceedings of International Conference on Robotics and Automation (ICRA) (pp. 4264–4269). IEEE. doi:10.1109/ICRA.2017.7989490
Cazden, C. (2001). Classroom discourse: The language of teaching and learning. Heinemann.
Celce-Murcia, M., Dörnyei, Z., & Thurrell, S. (1995). Communicative competence: A pedagogically motivated model with content specifications. Issues in Applied Linguistics, 6(2), 5–35. doi:10.5070/L462005216
Çelebi, N., Vuranok, T. T., & Turgut, İ. H. (2016). Zümre öğretmenlerinin işbirliği düzeyini belirleme ölçeğinin geçerlik ve güvenirlik çalışması [The validity and reliability study of the scale of determining the level of cooperation of branch teachers]. Kastamonu Eğitim Dergisi, 24(2), 803-820. https://dergipark.org.tr/en/download/article-file/209704
Celik, O., & Lancaster, T. (2021). Violations of and threats to academic integrity in online English language teaching: Revealing the attitudes of students. The Literacy Trek, 7(1), 34–54. doi:10.47216/literacytrek.932316
Cenoz, J., Genesee, F., & Gorter, D. (2014). Critical analysis of CLIL: Taking stock and looking forward. Applied Linguistics, 35(3), 243–262. doi:10.1093/applin/amt011
Chalhoub-Deville, M. (2003). Second language interaction: Current perspectives and future trends. Language Testing, 20(4), 369–383. doi:10.1191/0265532203lt264oa
Chambers, I., Costanza, R., Zingus, L., Cork, S., Hernandez, M., Sofiullah, A., & Kubiszewski, I. (2019). A public opinion survey of four future scenarios for Australia in 2050. Futures, 107, 119–132. doi:10.1016/j.futures.2018.12.002
Chambliss, M. J., & Calfee, R. C. (1998). Textbooks for learning: Nurturing children’s minds. Blackwell Publishers.
Chang, C. W., Lee, J. H., Chao, P. Y., Wang, C. Y., & Chen, G. D. (2010). Exploring the possibility of using humanoid robots as instructional tools for teaching a second language in primary school. Journal of Educational Technology & Society, 13, 13–24.
Chang, H. (2008). Autoethnography as method. Left Coast Press.
Chapelle, C. A., & Brindley, G. (2010). Assessment. In N. Schmitt (Ed.), An introduction to applied linguistics (2nd ed.). Hodder & Stoughton Ltd.
Chapelle, C. A., & Voss, E. (2017). Utilizing technology in language assessment. Language Testing and Assessment, 149–161. doi:10.1007/978-3-319-02261-1_10
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2009). Building a validity argument for the Test of English as a Foreign Language. Routledge, Taylor & Francis Group.
Chapelle, C., & Brindley, G. (2002). Assessment. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 267–286). Arnold.
Chavez, M. (2002). We say “culture” and students ask “what?”: University students’ definitions of foreign language culture. Die Unterrichtspraxis / Teaching German, 35(2), 129.
Cheng, L. (2005). Changing language teaching through language testing: A washback study (Vol. 21). Cambridge University Press.
Cheng, L., & Curtis, A. (2004). Washback or backwash: A review of the impact of testing on teaching and learning. In L. Cheng, Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 3–17). Routledge. doi:10.4324/9781410609731-9
Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors’ classroom assessment practices: Purposes, methods, and procedures. Language Testing, 21(3), 360–389. doi:10.1191/0265532204lt288oa
Cheng, L., Rogers, W. T., & Wang, X. (2007). Assessment purposes and procedures in ESL/EFL classrooms. Assessment & Evaluation in Higher Education, 33(1), 9–32. doi:10.1080/02602930601122555
Choi, N., Sheo, J., & Kang, S. (2020). Individual and parental factors associated with preschool children’s foreign language anxiety in an EFL setting. Elementary Education Online, 19(3), 1116–1126.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Chow, B. W. Y., McBride-Chang, C., & Cheung, H. (2010). Parent-child reading in English as a second language: Effects on language and literacy development of Chinese kindergarteners. Journal of Research in Reading, 33(3), 284–301. doi:10.1111/j.1467-9817.2009.01414.x
Cil, E. (2021). The effect of using Wordwall.net in increasing vocabulary knowledge of 5th grade EFL students. Language Education & Technology, 1(1), 21–28.
Cizek, G. (2010). An introduction to formative assessment: History, characteristics, and challenges. In H. Andrade & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 3–17). Routledge.
Clarke, J., & Clarke, M. (1990). Stereotyping in TESOL materials. In B. Harrison (Ed.), Culture and the language classroom (pp. 31–44). Modern English Publications/British Council.
Clarke, R., & Lancaster, T. (2006). Eliminating the successor to plagiarism? Identifying the usage of contract cheating sites. Proceedings of the 2nd International Plagiarism Conference. Northumbria Learning Press. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.5440&rep=rep1&type=pdf
Clegg, K., & Bryan, C. (2006). Reflections, rationales and realities. In C. Bryan & K. Clegg (Eds.), Innovative assessment in higher education (pp. 216–227). Routledge.
Colby-Kelly, C., & Turner, C. E. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64(1), 9–38. doi:10.3138/cmlr.64.1.009
Cole, M., John-Steiner, V., Scribner, S., & Souberman, E. (Eds.). (1978). Mind in society: The development of higher psychological processes. L. S. Vygotsky. Harvard University Press.
Collie, J., & Slater, S. (1990). Literature in the language classroom: A resource book of ideas and activities. Cambridge University Press.
Collier, M. J. (1989). Cultural and intercultural communication competence: Current approaches and directions for future research. International Journal of Intercultural Relations, 13(3), 287–302. doi:10.1016/0147-1767(89)90014-X
Committee for Economic Development. (2006). Education for global leadership: The importance of international studies and foreign language education for U.S. economic and national security. Committee for Economic Development.
Conole, G., & Warburton, B. (2005). A review of computer-assisted assessment. Research in Learning Technology, 13(1), 17–31. doi:10.3402/rlt.v13i1.10970
Cook, P. (2014). To actually be sociological: Autoethnography as an assessment and learning tool. Journal of Sociology, 50(3), 269–282. doi:10.1177/1440783312451780
Coombe, C., Troudi, S., & Al-Hamly, M. (2012). Foreign and second language teacher assessment literacy: Issues, challenges, and recommendations. In C. Coombe, P. Davidson, B. O’Sullivan, & S. Stoynoff (Eds.), The Cambridge guide to second language assessment (pp. 20-29). Cambridge University Press.
Coombe, C. (2018). An A to Z of second language assessment: How language teachers understand assessment concepts. British Council.
Coombe, C., Al-Hamly, M., & Troudi, S. (2009). Foreign and second language teacher assessment literacy: Issues, challenges and recommendations. Research Notes, 38, 14–18.
Coombs, A., DeLuca, C., LaPointe-McEwan, D., & Chalas, A. (2018). Changing approaches to classroom assessment: An empirical study across teacher career stages. Teaching and Teacher Education, 71, 134–144. doi:10.1016/j.tate.2017.12.010
Copland, F. (2020). To teach or not to teach: English in primary schools. In H. H. Uysal (Ed.), Political, pedagogical and research insights into early language education (pp. 10–18). Cambridge Scholars Publishing.
Copland, F., Garton, S., & Burns, A. (2014). Challenges in teaching English to young learners: Global perspectives and local realities. TESOL Quarterly, 48(4), 738–762. doi:10.1002/tesq.148
Cossa, J. (2013). Power dynamics in international negotiations towards equitable policies, partnerships, and practices: Why it matters for Africa, the developing world, and their higher education systems. African and Asian Studies, 12(1-2), 100–117. doi:10.1163/15692108-12341253
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Council of Europe, Modern Languages Division.
Council of Europe. (2020). Common European framework of reference for languages: Learning, teaching, assessment – Companion volume. Council of Europe Publishing.
Council of Europe. (Ed.). (2001). Common European framework of reference for languages: Learning, teaching, assessment (10th print). Cambridge Univ. Press.
Courey, S. J., Tappe, P., Siker, J., & LePage, P. (2012). Improved lesson planning with universal design for learning (UDL). Teacher Education and Special Education, 36(1), 7–27. doi:10.1177/0888406412446178
Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 62–101). Cambridge University Press. doi:10.1017/CBO9781139174909.006
Coyle, D., Hood, P., & Marsh, D. (2010). CLIL: Content and language integrated learning. Cambridge University Press. doi:10.1017/9781009024549
Creswell, J. W. (2012). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Merrill.
Crisp, G. (2011). Teacher’s handbook on e-assessment. Australian Learning and Teaching Council.
Crossman, J. E., & Clarke, M. (2010). International experience and graduate employability: Stakeholder perceptions on the connection. Higher Education, 59(5), 599–613. doi:10.1007/s10734-009-9268-z
Crystal, D. (1997). The Cambridge encyclopedia of language. Cambridge University Press.
Crystal, D. (2012). English as a global language. Cambridge University Press. doi:10.1017/CBO9781139196970
Cunningsworth, A. (1995). Choosing your coursebook. Heinemann.
Czura, A., & Dooly, M. (2021). Foreign language assessment in virtual exchange – The ASSESSnet project. Collated Papers for the ALTE 7th International Conference, 137–140.
Davies, A. (1990). Operationalizing uncertainty in language testing: An argument in favor of content validity. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 179–195). Multilingual Matters.
Davies, A. (2007). Storytelling in the classroom: Enhancing traditional oral skills for teachers and pupils. Questions Publishing.
Davies, A. (2008). Textbook trends in teaching language testing. Language Testing, 25(3), 327–347. doi:10.1177/0265532208090156
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). Dictionary of language testing (Vol. 7). Cambridge University Press.
Davin, K. J. (2013). Integration of dynamic assessment and instructional conversations to promote development and improve assessment in the language classroom. Language Teaching Research, 17(3), 303–322. doi:10.1177/1362168813482934
Davin, K. J. (2016). Classroom dynamic assessment: A critical examination of constructs and practices. Modern Language Journal, 100(4), 1–17. doi:10.1111/modl.12352
Davin, K. J., & Donato, R. (2013). Student collaboration and teacher-directed classroom dynamic assessment: A complementary pairing. Foreign Language Annals, 46(1), 5–22. doi:10.1111/flan.12012
Davison, W. F. (1976). Factors in evaluating and selecting texts for the foreign-language classroom. English Language Teaching Journal, 30(4), 310–314. doi:10.1093/elt/XXX.4.310
Dawadi, S. (2019). Students’ and parents’ attitude towards the SEE English test. Journal of NELTA, 24(1–2), 1–16. doi:10.3126/nelta.v24i1-2.27677
De Haan, H. (2014). Can internationalisation really lead to institutional competitive advantage? – A study of 16 Dutch public higher education institutions. European Journal of Higher Education, 4(2), 135–152. doi:10.1080/21568235.2013.860359
de Jong, J. A. L. (Ed.). (1990). Individualizing the assessment of language abilities. Multilingual Matters.
de Wit, J., Schodde, T., Willemsen, B., Bergmann, K., de Haas, M., Kopp, S., Kramer, E., & Vogt, P. (2018). The effect of a robot’s gestures and adaptive tutoring on children’s acquisition of second language vocabularies. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (pp. 50–58). New York, NY: ACM. doi:10.1145/3171221.3171277
Deardorff, D. K. (2004). The identification and assessment of intercultural competence as a student outcome of internationalization at institutions of higher education in the United States [Dissertation]. North Carolina State University.
Deardorff, D. K. (2020). Manual for developing intercultural competencies: Story circles. Routledge, Taylor & Francis Group.
Deardorff, D. K. (2006). Identification and assessment of intercultural competence as a student outcome of internationalization. Journal of Studies in International Education, 10(3), 241–266. doi:10.1177/1028315306287002
Deardorff, D. K. (2015). Intercultural competence: Mapping the future research agenda. International Journal of Intercultural Relations, 48, 3–5. doi:10.1016/j.ijintrel.2015.03.002
Debbağ, M. (2018). Öğretim İlke ve Yöntemleri dersi öğretim programı için hazırlanan ters-yüz edilmiş sınıf modelinin etkililiği [The effectiveness of a flipped classroom model designed for the Principles and Methods of Instruction course curriculum] [Unpublished doctoral dissertation]. Bolu Abant İzzet Baysal Üniversitesi.
Deeks, J. J., Macaskill, P., & Irwig, L. (2005). The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. Journal of Clinical Epidemiology, 58(9), 882–893. doi:10.1016/j.jclinepi.2005.01.016 PMID:16085191
Dellos, R. (2015). Kahoot! A digital game resource for learning. International Journal of Instructional Technology and Distance Learning, 12, 49–52.
Denzin, N. K., & Lincoln, Y. S. (Eds.). (2011). The SAGE handbook of qualitative research. SAGE.
Department for Education. (2015). Carter review of initial teacher training. Retrieved on August 25, 2022 from https://www.gov.uk/government/publications/carter-review-of-initial-teacher-training
Department for Education. (2019). Early career framework. Retrieved on August 25, 2022 from https://www.gov.uk/government/publications/early-career-framework
Deslandes, R., & Rivard, M.-C. (2013). A pilot study aiming to promote parents’ understanding of learning assessments at the elementary level. School Community Journal, 23(2), 9–31.
Dewi, S. S. (2017). Parents’ involvement in children’s English language learning. IJET, 6(1), 102–122. doi:10.15642/ijet.2017.6.1.102-122
Diaz, C., Acuña, N., Ravanal, B., & Riffo, I. (2020). Unraveling parents’ perceptions of English language learning. Humanities & Social Sciences Reviews, 8(2), 193–204. doi:10.18510/hssr.2020.8223
Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning, and Assessment, 5(1). Retrieved July 7, 2022, from http://www.jtla.org
Diment, A., Fagerlund, E., Benfield, A., & Virtanen, T. (2019). Detection of typical pronunciation errors in non-native English speech using convolutional recurrent neural networks. Proc. International Joint Conference on Neural Networks (IJCNN), 1-8. doi:10.1109/IJCNN.2019.8851963
Djurić, M. (2015). Dealing with situations of positive and negative washback. Scripta Manent, 4(1), 14–27.
Dlugosz, D. W. (2000). Rethinking the role of reading in teaching a foreign language to young learners. ELT Journal, 54(3), 284–290. doi:10.1093/elt/54.3.284
Dodrige, M. (1999). Generic skill requirements for engineers in the 21st century. Academic Press.
Doğançay‐Aktuna, S., & Kızıltepe, Z. (2005). English in Turkey. World Englishes, 24(2), 253–265. doi:10.1111/j.1467-971X.2005.00408.x
Donato, R. (2000). Sociocultural contributions to understanding the foreign and second language classroom. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning. Oxford University Press.
Dong, W., & Xie, Y. (2019). Correlational neural network-based feature adaptation in L2 mispronunciation detection. Proc. International Conference on Asian Language Processing (IALP), 121-125. doi:10.1109/IALP48816.2019.9037719
Douglas College. (2020). Academic integrity policy. https://www.douglascollege.ca/sites/default/files/docs/finance-dates-and-deadlines/Academic%20Integrity%20Policy%20w%20Flowchart.pdf
Douglas, K., & Carless, D. (2013). A history of autoethnography. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 84–106). Left Coast Press.
Doye, P. (1991). Authenticity in foreign language testing. In Current developments in language testing (pp. 103-110). Regional Language Centre.
Duff, P. (2007). Second language socialization as sociocultural theory: Insights and issues. Language Teaching, 40(4), 309–319. doi:10.1017/S0261444807004508
Duff, P. (2012). Second language socialization. In A. Duranti, E. Ochs, & B. Schieffelin (Eds.), Handbook of language socialization (pp. 564–586). Wiley-Blackwell.
Dufva, M., & Voeten, M. M. (1999). Native language literacy and phonological memory as prerequisites for learning English as a foreign language. Applied Psycholinguistics, 20(3), 329–348. doi:10.1017/S014271649900301X
Durrani, H. (2016). Attitudes of undergraduates towards grammar translation method and communicative language teaching in EFL context: A case study of SBK Women’s University Quetta, Pakistan. Advances in Language and Literary Studies, 7(4), 167–172.
Durrani, N. (2008). Schooling the ‘other’: The representation of gender and national identities in Pakistani curriculum texts. Compare: A Journal of Comparative Education, 38(5), 595–610. doi:10.1080/03057920802351374
East, M. (2016). Assessing foreign language students’ spoken proficiency: Stakeholder perspectives on assessment innovation. Springer. doi:10.1007/978-981-10-0303-5
Eckes, T. (2014). Examining testlet effects in the TestDaF listening section: A testlet response theory modeling approach. Language Testing, 31(1), 39–61. doi:10.1177/0265532213492969
Egalite, A. J., & Mills, J. N. (2019). Competitive impacts of means-tested vouchers on public school performance: Evidence from Louisiana. Education Finance and Policy, 1-45. Retrieved on the 29th of September, 2020 from https://www.mitpressjournals.org/doi/abs/10.1162/edfp_a_00286
Egan, A. (2018). Improving academic integrity through assessment design. Dublin City University, National Institute for Digital Learning (NIDL).
Egghe, L., & Rousseau, R. (1990). Introduction to informetrics: Quantitative methods in library, documentation and information science. Elsevier Science Publishers.
EHEA. (1999). The Bologna declaration of 19 June 1999: Joint declaration of the European Ministers of Education. https://www.eurashe.eu/library/modernising-phe/Bologna_1999_Bologna-Declaration.pdf
Eimler, S., von der Pütten, A., & Schächtle, U. (2010). Following the white rabbit—a robot rabbit as vocabulary trainer for beginners of English. In G. Leitner, M. Hitz, & A. Holzinger (Eds.), Lecture Notes in Computer Science: Vol. 6389. HCI in Work and Learning, Life and Leisure. USAB 2010. Springer. doi:10.1007/978-3-642-16607-5_22
Elkhatat, A., Elsaid, K., & Almeer, S. (2021). Teaching tip: Cheating mitigation in online assessment. Chemical Engineering Education, 55(2), 103. doi:10.18260/2-1-370.660-125272
Ellegaard, O., & Wallin, J. A. (2015). The bibliometric analysis of scholarly production: How great is the impact? Scientometrics, 105(3), 1809–1831. doi:10.1007/s11192-015-1645-z PMID:26594073
Elley, W. B. (1989). Vocabulary acquisition from listening to stories. Reading Research Quarterly, 24(2), 174–187. doi:10.2307/747863
Ellis, R. (2008). The study of second language acquisition (2nd ed.). Oxford University Press.
Ellis, R., Skehan, P., Li, S., Shintani, N., & Lambert, C. (2019). Cognitive-interactionist perspectives. In Task-based language teaching: Theory and practice (pp. 29-63). Cambridge University Press. doi:10.1017/9781108643689.006
Ellis, C. (1999). Heartful autoethnography. Qualitative Health Research, 9(5), 669–683. doi:10.1177/104973299129122153
Ellis, C. (2004). The ethnographic I: A methodological novel about autoethnography. AltaMira Press.
Ellis, C. (2009). Fighting back or moving on: An autoethnographic response to critics. International Review of Qualitative Research, 3(2), 371–378. doi:10.1525/irqr.2009.2.3.371
Ellis, C., & Bochner, A. P. (2006). Analyzing analytic autoethnography: An autopsy. Journal of Contemporary Ethnography, 35(4), 429–449. doi:10.1177/0891241606286979
Ellis, C., van Haeringen, K., Harper, R., Bretag, T., Zucker, I., McBride, S., Rozenberg, P., Newton, P., & Saddiqui, S. (2020). Does authentic assessment assure academic integrity? Evidence from contract cheating data. Higher Education Research & Development, 39(3), 454–469. doi:10.1080/07294360.2019.1680956
Ellis, G., & Brewster, J. (1991). The storytelling handbook for primary teachers. Penguin.
Ellis, R. (1990). Individual learning styles in classroom second language development. In J. A. L. de Jong (Ed.), Individualizing the assessment of language abilities (pp. 83–96). Multilingual Matters.
Ellis, R. (1997). The study of second language acquisition. Oxford University Press.
Enever, J., & Moon, J. (2010). A global revolution: Teaching English at primary school. Metropolitan University.
Epstein, J., Sanders, M., Simon, B., Salinas, K., Jansorn, N., & Van Voorhis, F. (2002). School, family, and community partnerships: Your handbook for action (2nd ed.). Corwin Press.
Erdodi, L. A., Nussbaum, S., Sagar, S., Abeare, C. A., & Schwartz, E. S. (2017). Limited English proficiency increases failure rates on performance validity tests with high verbal mediation. Psychological Injury and Law, 10(1), 96–103. doi:10.1007/s12207-017-9282-x
Erdoğan, P., & Savaş, P. (2022). Investigating the selection process for initial English teacher education: Turkey. Teaching and Teacher Education, 110, 1–18. doi:10.1016/j.tate.2021.103581
Erguvan, I. D. (2021). The rise of contract cheating during the COVID-19 pandemic: A qualitative study through the eyes of academics in Kuwait. Language Testing in Asia, 11(1), 34. Advance online publication. doi:10.1186/s40468-021-00149-y
Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing: Two approaches compared. System, 41(2), 257–268. doi:10.1016/j.system.2013.03.004
European Commission. (2014). The Erasmus impact study: Effects of mobility on the skills and employability of students and the internationalisation of higher education institutions. European Commission: Education and Culture. https://ec.europa.eu/assets/eac/education/library/study/2014/erasmus-impact-summary_en.pdf
European Commission. (2015). ECTS users’ guide 2015. http://bibpurl.oclc.org/web/75797 https://ec.europa.eu/education/library/publications/2015/ects-users-guide_en.pdf
Every Student Succeeds Act. (2015). PL 114-95, 114 U.S.C.
Fantini, A. E. (2000). A central concern: Developing intercultural competence. SIT Occasional Papers Series, Inaugural Issue, 25–43.
Fantini, A. E. (2012). Language: An essential component of intercultural communicative competence. In The Routledge handbook of language and intercultural communication (pp. 263–278). Routledge.
Fantini, A. E. (2009). Assessing intercultural competence: Issues and tools. In D. K. Deardorff (Ed.), The SAGE handbook of intercultural competence (pp. 456–476). SAGE Publications. doi:10.4135/9781071872987.n27
Fantini, A. E. (2018). Intercultural communicative competence in educational exchange: A multinational perspective (1st ed.). Routledge. doi:10.4324/9781351251747
Farhady, H. (2005). Language assessment: A linguametric perspective. Language Assessment Quarterly: An International Journal, 2(2), 147–164. doi:10.1207/s15434311laq0202_3
Fellmann, G. (2016). Interkulturelles Lernen sichtbar machen. Lernertagebücher [Making intercultural learning visible: Learner diaries]. Praxis Fremdsprachenunterricht, 5, 26–33.
Fetterman, D. M., Kaftarian, S., & Wandersman, A. (1996). Empowerment evaluation. Academic Press.
Feuerstein, R., Feuerstein, R. S., & Falik, L. H. (2010). Beyond smarter: Mediated learning and the brain’s capacity for change. Teachers College Press.
Flege, J. E., & Fletcher, K. L. (1992). Talker and listener effects on the degree of perceived foreign accent. The Journal of the Acoustical Society of America, 91(1), 370–389. doi:10.1121/1.402780 PMID:1737886
Florent, J., & Walter, C. (1989). A better role for women in TEFL. ELT Journal, 43(3), 180–184. doi:10.1093/elt/43.3.180
Fojkar, M. D., Skela, J., & Kovač, P. (2013). A study of the use of narratives in teaching English as a foreign language to young learners. English Language Teaching, 6(6), 21–28.
Fook, J., & Askeland, G. A. (2006). The ‘critical’ in critical reflection. In S. White, J. Fook, & F. Gardner (Eds.), Critical reflection in health and social care. Open University Press/McGraw-Hill Education.
Fook, J., & Gardner, F. (2007). Practicing critical reflection: A resource handbook. McGraw-Hill Education.
Forey, G., Besser, S., & Sampson, N. (2016). Parental involvement in foreign language learning: The case of Hong Kong. Journal of Early Childhood Literacy, 16(3), 383–413. doi:10.1177/1468798415597469
Fraser, N. (2009). Scales of justice: Reimagining political space in a globalizing world. Columbia University Press.
Frawley, J., Nguyen, T., & Sarian, E. (Eds.). (2020). Transforming lives and systems: Cultural competence and the higher education interface. Springer Singapore. doi:10.1007/978-981-15-5351-6
Fredericks, A. D., & Rasinski, T. V. (1990). Working with parents: Involving parents in the assessment process. The Reading Teacher, 44(4), 346–349.
French, L. M. (2006). Phonological working memory and L2 acquisition: A developmental study of Quebec Francophone children learning English. Edwin Mellen Press.
Frey, B. B., & Schmitt, V. L. (2007). Coming to terms with classroom assessment. Journal of Advanced Academics, 18(3), 402–423. doi:10.4219/jaa-2007-495
Frey, B. B., Schmitt, V. L., & Allen, J. P. (2012). Defining authentic classroom assessment. Practical Assessment, Research & Evaluation, 17(2), 1–18.
Fuchs, D., Mock, D., Morgan, P. L., & Young, C. L. (2003). Responsiveness to intervention: Definition, evidence, and implications for the learning disabilities construct. Learning Disabilities Research & Practice, 18(3), 157–171. doi:10.1111/1540-5826.00072
Fuchs, K. (2021). Innovative teaching: A qualitative review of flipped classrooms. International Journal of Learning, Teaching and Educational Research, 20(3), 18–32. doi:10.26803/ijlter.20.3.2
Fulcher, G. (2000). Computers in language testing. In P. Brett & G. Motteram (Eds.), A special interest in computers (pp. 93–107). IATEFL Publications.
Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113–132. doi:10.1080/15434303.2011.642041
Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. Routledge, Taylor & Francis Group.
Gacs, A., Goertler, S., & Spasova, S. (2020). Planned online language education versus crisis‐prompted online language teaching: Lessons for the future. Foreign Language Annals, 53(2), 380–392. doi:10.1111/flan.12460
Gagne, R. M., Wager, W. W., Golas, K. C., & Keller, J. M. (2004). Principles of instructional design (5th ed.). Thomson Wadsworth.
Gak, D. M. (2011). Textbook – an important element in the teaching process. Hatchaba Journal, 19(2), 78–82.
Gamage, K. A., Silva, E. K. D., & Gunawardhana, N. (2020). Online delivery and assessment during COVID-19: Safeguarding academic integrity. Education Sciences, 10(11), 301. doi:10.3390/educsci10110301
Gan, Z., Zhao, X., Zhou, S., & Wang, R. (2021). Improving mispronunciation detection of Mandarin for Tibetan students based on the end-to-end speech recognition model. Proc. International Symposium on Artificial Intelligence and its Application on Media (ISAIAM), 151-154. doi:10.1109/ISAIAM53259.2021.00039
Gannon, S. (2006). The (im)possibilities of writing the self-writing: French poststructural theory and autoethnography. Cultural Studies ↔ Critical Methodologies, 6(4), 474-495. doi:10.1177/1532708605285734
Garcia Mayo, M. P., & Garcia Lecumberri, M. L. (Eds.). (2003). Age and the acquisition of English as a foreign language: Theoretical issues and fieldwork. Multilingual Matters. doi:10.21832/9781853596407
Garg, M., & Goel, A. (2022). A systematic literature review on online assessment security: Current challenges and integrity strategies. Computers & Security, 113, 1–13. doi:10.1016/j.cose.2021.102544
Garrett-Rucks, P. (2016). Intercultural competence in instructed language learning: Bridging theory and practice. IAP.
Garton, S., & Copland, F. (Eds.). (2019). The Routledge handbook of teaching English to young learners. Routledge.
Garton, S., Copland, F., & Burns, A. (2011). Investigating global practices in teaching English to young learners. ELT Research Papers, 11(1), 1–24.
Gass, S. M. (1997). Input, interaction, and the second language learner. Erlbaum.
Gates, S. (1995). Exploiting washback from standardized tests. In J. D. Brown & S. O. Yamashita (Eds.), Language testing in Japan (pp. 107–112). Japan Association for Language Teaching.
Gaynor, B. (2014). From language policy to pedagogic practice: Elementary school in Japan. In S. Rich (Ed.), International perspectives on teaching English to young learners (pp. 66-86). Palgrave Macmillan.
GCE Board Regulations and Syllabuses. (2011). Cameroon General Certificate of Education Board.
GCE O/L English Language Syllabus. (2016). Cameroon General Certificate of Education Board.
Gerring, J. (2004). What is a case study and what is it good for? The American Political Science Review, 98(2), 341–354. doi:10.1017/S0003055404001182
Ghorbani, M. (2012). The washback effect of the university entrance examination on Iranian English teachers’ curricular planning and instruction. Iranian EFL Journal, 2, 60–87.
Ghosn, I. (2002). Four good reasons to use literature in primary schools. ELT Journal, 56(2), 172–179. doi:10.1093/elt/56.2.172
Giannakos, M. N. (2013). Enjoy and learn with educational games: Examining factors affecting learning performance. Computers & Education, 68, 429–439. doi:10.1016/j.compedu.2013.06.005
Gibbons, P. (2003). Mediating language learning: Teacher interactions with ESL students in a content-based classroom. TESOL Quarterly, 37(2), 247–272. doi:10.2307/3588504
Gibbs, G. (2006). How assessment frames student learning. In C. Bryan & K. Clegg (Eds.), Innovative assessment in higher education (pp. 23–36). Routledge.
Giorgio, G. (2013). Reflections on writing through memory in autoethnography. In S. Holman Jones, T. E. Adams, & C. Ellis (Eds.), Handbook of autoethnography (pp. 406–424). Left Coast Press.
Gipps, C. V. (1994). Beyond testing: Towards a theory of educational assessment. Routledge.
Giraldo, F. (2018). Language assessment literacy: Implications for language teachers. Profile: Issues in Teachers’ Professional Development, 20(1), 179–195. doi:10.15446/profile.v20n1.62089
Giraldo, F. (2021). Language assessment literacy and teachers’ professional development: A review of the literature. Profile: Issues in Teachers’ Professional Development, 23(2), 265–279. doi:10.15446/profile.v23n2.90533
Gisladottir, R. S., Bögels, S., & Levinson, S. C. (2018). Oscillatory brain responses reflect anticipation during comprehension of speech acts in spoken dialog. Frontiers in Human Neuroscience, 7, 46. doi:10.3389/fnhum.2018.00034 PMID:29467635
Godwin-Jones, B. (2008). Emerging technologies: Web-writing 2.0: Enabling, documenting, and assessing writing online. Language Learning & Technology, 12(2), 7–13.
Goodall, H. L. (2000). Writing the new ethnography. AltaMira Press.
Goodwin, A. L., & Macdonald, M. (1997). Educating the rainbow: Authentic assessment and authentic practice for diverse classrooms. In A. L. Goodwin (Ed.), Assessment for equity and inclusion: Embracing all our children (pp. 221–228). Routledge.
Gordon, G., Spaulding, S., Westlund, J. K., Lee, J. J., Plummer, L., Martines, M., Das, M., & Breazeal, C. (2016). Affective personalization of a social robot tutor for children’s second language skills. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) (pp. 3951–3957). doi:10.1609/aaai.v30i1.9914
Graddol, D. (2006). English next. British Council Publications.
Grant, N. (1987). Making the most of your textbook (Vol. 11, No. 8). Longman.
Graves, K. (2001). Teachers as course developers. Cambridge University Press.
Gray, K., Waycott, J., Clerehan, R., Hamilton, M., Richardson, J., Sheard, J., & Thompson, C. (2012). Worth it? Findings from a study of how academics assess students’ Web 2.0 activities. Research in Learning Technology, 20(1), 1–15. doi:10.3402/rlt.v20i0.16153
Green, A. (2007). IELTS washback in context: Preparation for academic writing in higher education (Vol. 25). Cambridge University Press.
Green, A. (2014). Exploring language assessment and testing: Language in action. Routledge.
Gresham, F. M. (2004). Current status and future directions of school-based behavioral interventions. School Psychology Review, 33(3), 326–343. doi:10.1080/02796015.2004.12086252
Griffith, R. L., Wolfeld, L., Armon, B. K., Rios, J., & Liu, O. L. (2016). Assessing intercultural competence in higher education: Existing research and future directions. ETS Research Report Series, 2016(2), 1–44. doi:10.1002/ets2.12112
Grigorenko, E. L. (2009). Dynamic assessment and response to intervention: Two sides of one coin. Journal of Learning Disabilities, 42(2), 111–132. doi:10.1177/0022219408326207 PMID:19073895
Guangul, F. M., Suhail, A. H., Khalit, M. I., & Khidhir, B. A. (2020). Challenges of remote assessment in higher education in the context of COVID-19: A case study of Middle East College. Educational Assessment, Evaluation and Accountability, 32(4), 519–535. doi:10.1007/s11092-020-09340-w PMID:33101539
Gudykunst, W. B. (2004). Bridging differences: Effective intergroup communication. Sage.
Guo, M., Rui, C., Wang, W., Lin, B., Zhang, J., & Xie, Y. (2019). A study on mispronunciation detection based on fine-grained speech attribute. Proc. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1197-1201. doi:10.1109/APSIPAASC47483.2019.9023156
Guskey, T. (2016). How classroom assessments improve learning. In M. Scherer (Ed.), On formative assessment: Readings from educational leadership (pp. 3–13). ASCD.
Gutierrez-Clellen, V. F., & Pena, E. (2001). Dynamic assessment of diverse children: A tutorial. Language, Speech, and Hearing Services in Schools, 32(4), 212–224. doi:10.1044/0161-1461(2001/019) PMID:27764448
Hakim, B. (2020). Technology integrated online classrooms and the challenges faced by the EFL teachers in Saudi Arabia during the COVID-19 pandemic. International Journal of Applied Linguistics and English Literature, 9(5), 33–39. doi:10.7575/aiac.ijalel.v.9n.5p.33
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Sage.
Hammer, M. R., Bennett, M. J., & Wiseman, R. (2003). Measuring intercultural sensitivity: The intercultural development inventory. International Journal of Intercultural Relations, 27(4), 421–443. doi:10.1016/S0147-1767(03)00032-4
Harmer, J. (2004). How to teach writing. Longman.
Harrison, C. J., Könings, K. D., Schuwirth, L. W., Wass, V., & van der Vleuten, C. P. (2017). Changing the culture of assessment: The dominance of the summative assessment paradigm. BMC Medical Education, 17(1), 73. doi:10.1186/s12909-017-0912-5 PMID:28454581
Hart Research Associates. (2015). Falling short? College learning and career success. Selected findings from online surveys of employers and college students conducted on behalf of the Association of American Colleges & Universities. Hart Research Associates. https://www.aacu.org/sites/default/files/files/LEAP/2015employerstudentsurvey.pdf
Hasan, S. A. (2009). English language teaching in Pakistan. Retrieved on June 9, 2018 from http://www.articlesbase.com/languages-articles/english-language-teaching-in-pakistan-1326181.html
Hashemi, S. Z., & Borhani, A. (2015). Textbook evaluation: An investigation into “American English File” series. International Journal on Studies in English Language and Literature, 3(5), 47–55.
Hasram, S., Nasir, M. K. M., Mohamad, M., Daud, M. Y., Abd Rahman, M. J., & Mohammad, W. M. R. W. (2021). The effects of wordwall online games (Wow) on English language vocabulary learning among year 5 pupils. Theory and Practice in Language Studies, 11(9), 1059–1066. doi:10.17507/tpls.1109.11
Hassan, A., Jamaludin, N. S., Sulaiman, T., & Baki, R. (2010). Western and Eastern educational philosophies. Paper presented at the 40th Philosophy of Education Society of Australasia Conference, Murdoch University.
Hasselgreen, A. (2005). Assessing the language of young learners. Language Testing, 22(3), 337–354. doi:10.1191/0265532205lt312oa
Hasson, N. (2018). The dynamic assessment of language learning. Routledge.
Hasson, N., Camilleri, B., Jones, C., Smith, J., & Dodd, B. (2013). Discriminating disorder from difference using dynamic assessment with bilingual children. Child Language Teaching and Therapy, 29(1), 57–75. doi:10.1177/0265659012459526
Hatipoğlu, Ç. (2010). Summative evaluation of an undergraduate ‘English Language Testing and Evaluation’ course by future English language teachers. English Language Teacher Education and Development (ELTED), 13, 40-51. http://www.elted.net/uploads/7/3/1/6/7316005/v13_5hatipoglu.pdf
Hatipoğlu, Ç. (2015). English language testing and evaluation (ELTE) training in Turkey: Expectations and needs of pre-service English language teachers. ELT Research Journal, 4(2), 111-128. https://dergipark.org.tr/en/pub/eltrj/issue/28780/308006
Hatipoğlu, Ç., & Erçetin, G. (2016). Türkiye’de yabancı dilde ölçme ve değerlendirme eğitiminin dünü ve bugünü [The past and present of foreign language testing and evaluation education in Turkey]. In S. Akcan & Y. Bayyurt (Eds.), 3. Ulusal Yabancı Dil Eğitimi Kurultayı Bildiri Kitabı [Proceedings of the 3rd National Conference on Foreign Language Education] (pp. 72-89). Istanbul: Boğaziçi Üniversitesi Press.

Compilation of References

Pfenninger, S. E., & Singleton, D. (2019). Starting age overshadowed: The primacy of differential environmental and family support effects on L2 attainment in an instructional context. Language Learning, 69(1), 207–234. Philips, M. (2020). Multimodal representations for inclusion and success. In H. H. Uysal (Ed.), Political, pedagogical and research insights into early language education (pp. 70–81). Cambridge Scholars Publishing. Philips, S. (1993). Young learners. Oxford University Press. Phillipson, R. (1992). Linguistic imperialism. Oxford University Press. Picciano, A. G. (2017). Theories and frameworks for online education: Seeking an integrated model. Online Learning, 21(3), 166–190. doi:10.24059/olj.v21i3.1225 Pike, K. L. (1954). Language in relation to a unified theory of the structure of human behavior, part 1 (Preliminary ed.). Summer Institute of Linguistics. Pimsleur-Levine, J., & Benaisch, A. (2022). Little Pim. Retrieved June 9, 2022, from https://www.littlepim.com Pinter, A. (2011). Children learning second languages. Palgrave Macmillan. Pisha, B., & Coyne, P. (2001). Smart from the start: The promise of universal design for learning. Remedial and Special Education, 22(4), 197–203. doi:10.1177/074193250102200402 Poehner, M. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting second language development. Springer. doi:10.1007/978-0-387-75775-9 Poehner, M. E., & Inbar-Lourie, O. (2020). An epistemology of action for understanding and change in L2 classroom assessment: The case for praxis. In M. E. Poehner & O. Inbar-Lourie (Eds.), Toward a reconceptualization of second language classroom assessment (1st ed., pp. 1–20). Springer. doi:10.1007/978-3-030-35081-9_1 Poehner, M. E., & Inbar-Lourie, O. (Eds.). (2020). Toward a reconceptualization of second language classroom assessment: Praxis and researcher-teacher partnership. Springer. doi:10.1007/978-3-030-35081-9 Poehner, M. E., & Infante, P. (2016). Dynamic assessment in the language classroom. In D. Tsagari & J. Banerjee (Eds.), The Handbook of Second Language Assessment. De Gruyter. doi:10.1515/9781614513827-019 Poehner, M. E., & Infante, P. (2019). Mediated development and the internalization of psychological tools in second language (L2) education. Learning, Culture and Social Interaction, 22, 1–14. doi:10.1016/j.lcsi.2019.100322 Poehner, M. E., & Lantolf, J. P. (2005). Dynamic assessment in the language classroom. Language Teaching Research, 9(3), 233–265. doi:10.1191/1362168805lr166oa Poehner, M. E., & Lantolf, J. P. (2013). Bringing the ZPD into the equation: Capturing L2 development during Computerized Dynamic Assessment (C-DA). Language Teaching Research, 17(3), 323–342. doi:10.1177/1362168813482935 Poehner, M. E., & Leontjev, D. (2018). To correct or to cooperate: Mediational processes and L2 development. Language Teaching Research, 1–22. doi:10.1177/1362168818783212 Poehner, M. E., & van Compernolle, R. A. (2020). Reconsidering time and process in L2 dynamic assessment. In E. P. Matthew & I. L. Ofra (Eds.), Toward a reconceptualization of second language classroom assessment: Praxis and researcher-teacher partnership (pp. 173–195). Springer. doi:10.1007/978-3-030-35081-9_9 Poehner, M. E., Zhang, J., & Lu, X. (2015). Computerized dynamic assessment (C-DA): Diagnosing L2 development according to learner responsiveness to mediation. Language Testing, 32(3), 337–357. doi:10.1177/0265532214560390

394

Compilation of References

Polisca, E., Stollhans, S., Bardot, R., & Rollet, C. (2022). How Covid-19 has changed language assessments in higher education: a practitioners’ view. In C. Hampton & S. Salin (Eds.), Innovative language teaching and learning at university: facilitating transition from and to higher education (pp. 81-91). Research-publishing.net. doi:10.14705/rpnet.2022.56.1375 Popham, J. W. (2009). Assessment literacy for teachers: Faddish or fundamental? Theory into Practice, 48(1), 4–11. doi:10.1080/00405840802577536 Popham, W. J. (2006). All about accountability / Needed: A dose of assessment literacy. Educational Leadership, 63(6), 84–85. Popham, W. J. (2006). Assessment bias: How to banish it. Routledge. Porreca, K. L. (1984). Sexism in current ESL textbooks. TESOL Quarterly, 18(4), 705–724. doi:10.2307/3586584 Prabhu, N. S. (1987). Second language pedagogy (Vol. 20). Oxford University Press. Prechtl, E., & Lund, A. D. (2007). Intercultural competence and assessment: Perspectives from the INCA Project. In H. Kotthoff & H. Spencer-Oatey (Eds.), Handbook of intercultural communication. Mouton de Gruyter. doi:10.1515/9783110198584.5.467 Puchta, H. (2019). Teaching grammar to young learners. In S. Garton & F. Copland (Eds.), The Routledge handbook of teaching English to young learners (pp. 203–219). Routledge. Puckett, M., & Black, J. (2000). Authentic assessment of the young child. Prentice Hall. Pufahl, I., & Rhodes, N. (2011). Foreign language instruction in U.S. schools: Results of a national survey of elementary and secondary schools. Foreign Language Annals, 44(2), 258–288. doi:10.1111/j.1944-9720.2011.01130.x Punjab Education and English Language Initiative. (2013). Can English medium education work in Pakistan? Retrieved June 6, 2019, from https://www.britishcouncil.org/peeli_report.pdf Purpura, J. E. (2016). Second and foreign language assessment. Modern Language Journal, 100(S1), 190–208. Advance online publication. doi:10.1111/modl.12308 Pu, S., & Xu, H. (2021). Examining changing assessment practices in online teaching: A multiple-case study of EFL school teachers in China. The Asia-Pacific Education Researcher, 30(6), 553–561. doi:10.100740299-021-00605-6 Raeff, C., Greenfield, P. M., & Quiroz, B. (2000). Developing interpersonal relationships in the cultural contexts of individualism and collectivism. In S. Harkness, C. Raeff, & C. R. Super (Eds.), Variability in the social construction of the child: New directions in child development (pp. 59–74). Jossey-Bass. Rahim, A. F. A. (2020). Guidelines for online assessment in emergency remote teaching during the COVID-19 pandemic. Education in Medicine Journal, 12(2), 59–68. doi:10.21315/eimj2020.12.2.6 Rahman, T. (2004). Denizens of alien worlds: A study of education, inequality and polarization in Pakistan. Oxford University Press. Rahman, T. (2007). The role of English in Pakistan. In A. B. Tsui & J. W. Tollefson (Eds.), Language Policy, Culture, and Identity in Asian Contexts (pp. 219–239). Lawrence Erlbaum. Raikes, N., & Harding, R. (2003). The Horseless Carriage Stage: Replacing conventional measures. Assessment in Education: Principles, Policy & Practice, 10(3), 267–277. doi:10.1080/0969594032000148136 Randall, N. (2019). A Survey of Robot-Assisted Language Learning (RALL). ACM Transactions on Human-Robot Interaction, 9(1), 1–36. doi:10.1145/3345506 395

Compilation of References

Razı, S. (2015). Development of a rubric to assess academic writing incorporating plagiarism detectors. SAGE Open, 5(2), 1–13. doi:10.1177/2158244015590162 Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing, 18(4), 429–462. doi:10.1177/026553220101800407 Rea-Dickins, P., & Gardner, S. (2000). Snares and Silver Bullets: Disentangling the Construct of Formative Assessment. Language Testing, 17(2), 215–243. doi:10.1177/026553220001700206 Reedy, A., Pfitzner, D., Rook, L., & Ellis, L. (2021). Responding to the COVID-19 emergency: Student and academic staff perceptions of academic integrity in the transition to online exams at three Australian universities. International Journal for Educational Integrity, 17(1), 1–32. doi:10.100740979-021-00075-9 Regalla, M., & Peker, H. (2016). Multimodal instruction in pre-kindergarten: An introduction to an inclusive early language program. The National Network for Early Language Learning (NNELL) Learning Languages Journal, 21(2), 11-14. Retrieved from https://files.eric.ed.gov/fulltext/EJ1124522.pdf Regalla, M., & Peker, H. (2015). Early language learning for all: Examination of a prekindergarten French program in an inclusion setting. Foreign Language Annals, 48(4), 618–634. doi:10.1111/flan.12156 Regalla, M., & Peker, H. (2017). Prompting all students to learn: Examining dynamic assessment in a pre-kindergarten, inclusive French program. Foreign Language Annals, 50(2), 323–338. doi:10.1111/flan.12261 Regalla, M., Peker, H., Llyod, R., & O’Connor-Mor in, A. (2017). To exempt or not to exempt: An examination of an inclusive pre-kindergarten French program. International Journal of TESOL and Learning, 6(3&4), 83–100. http://untestedideas.net/journal_article.php?jid=ijt201712&v ol=6&issue=4 Reich, J. (2012). Rethinking teaching and time with the f lipped classroom. EdTech Res e a r c h e r E d u c a t i o n We e k . h t t p s : / / w w w. e d we e k . o r g / e d u c a t i o n / o p i n i o n - r e t h i n k i n g - t e a c h i n g -and-time-with-the-flipped-classroom/2012/06 Reimann, D. (2018). Inter- und transkulturelle kommunikative Kompetenz. In D. Reimann & S. Melo-Pfeifer (Eds.), Plurale Ansätze im Fremdsprachenunterricht in Deutschland: State of the art, Implementierung des REPA und Perspektiven (pp. 247–296). Narr Francke Attempto. Renner, C. E. (1997). Women are “Busy, Tall, and Beautiful”: Looking at sexism in EFL materials. Retrieved on June 7, 2019 from https://files.eric.ed.gov/fulltext/ED411670.pdf Richards, J. C., & Rogers, T. S. (2007). Principles of communicative language teaching and task-based instruction. Retrieved on June 6, 2019 from https://www.pearsonhighered.com/assets/samplechapter/0/1/3/1 /0131579061.pdf Richards, J. C. (2001). The role of textbooks in a language program. RELC Guidelines, 23(2), 12–16. Richards, J., Platt, J., & Platt, H. (1992). Dictionary of language teaching & applied linguistics. Longman. Ridgway, J., McCusker, S., & Pead, D. (2004). Literature review of e-assessment. NESTA Futurelab. Risager, K. (2006). Language and culture: Global flows and local complexity. Multilingual Matters. doi:10.21832/9781853598609 Risager, K. (2007). Language and culture pedagogy: From a national to a transnational paradigm. Multilingual Matters. doi:10.21832/9781853599613 396

Compilation of References

Risager, K. (2015). LINGUACULTURE The language–culture nexus in transnational perspective. In F. Sharifian (Ed.), The Routledge Handbook of language and culture (pp. 87–99). Routledge. Rixon, S. (1995). The role of fun and games activities in teaching young learners. In C. Brumfit, J. Moon, & R. Tongue (Eds.), Teaching English to children: From practice to principle (pp. 33–48). Longman. Roediger, H. L. III, & Marsh, E. J. (2005). The Positive and Negative Consequences of Multiple-Choice Testing. Journal of Experimental Psychology. Learning, Memory, and Cognition, 31(5), 1155–1159. doi:10.1037/0278-7393.31.5.1155 PMID:16248758 Roever, C. (2001). Web-based language testing. Language Learning & Technology, 5(2), 84–94. Rofiah, N. L., & Waluyo, B. (2020). Using Socrative for vocabulary tests: Thai EFL learner acceptance and perceived risk of cheating. The Journal of Asia TEFL, 17(3), 966–982. doi:10.18823/asiatefl.2020.17.3.14.966 Rogier, D. (2014). Assessment literacy: Building a base for better teaching and learning. English Language Teaching Forum, 3, 2-13. Rosa-Lugo, L. I., Mihai, F., & Nutta, J. W. (2012). Language and literacy development: an interdisciplinary focus on English learners with communication disorders. Plural Pub. Roseberry-McKibben, C. (2008). Multicultural students with special language needs (3rd ed.). Academic Communication Associates. Rose, D. H., & Meyer, A. (2002). Teaching every student in the digital age: Universal design for learning. Academic Press. Rose, D., Harbour, W., Johnston, C. S., Daley, S., & Abarbanell, L. (2006). Universal design for learning in postsecondary education. Journal of Postsecondary Education and Disability, 19. Rosenthal-von der Pütten, A. M., Straßmann, C., & Krämer, N. C. (2016). Robots or agents—Neither helps you more or less during second language acquisition. Experimental study on the effects of embodiment and type of speech output on evaluation and alignment. In Proceedings of the International Conference on Intelligent Virtual Agents (pp. 256–268). New York, NY: Springer. 10.1007/978-3-319-47665-0_23 Rothman, J., González-Alonso, J., & Puig-Mayenco. (2019). Third language acquisition and linguistic transfer. Cambridge University Press. Rott, G., Diefenbach, B., Vogel-Heuser, B., & Neuland, E. (2003). The challenge of inter- and transdisciplinary knowledge: Results of the WISA Project. European Conference of Educational Research, University of Hamburg. https://www. leeds.ac.uk/educol/documents/00003520.htm Ruben, B. D. (1989). The study of cross-cultural competence: Traditions and contemporary issues. International Journal of Intercultural Relations, 13(3), 229–240. doi:10.1016/0147-1767(89)90011-4 Ryokai, K., Vaucelle, C., & Cassell, C. (2003). Virtual peers as partners in storytelling and literacy learning. Journal of Computer Assisted Learning, 19(2), 195–208. doi:10.1046/j.0266-4909.2003.00020.x Sabrina, F., Azad, S., Sohail, S., & Thakur, S. (2022). Ensuring academic integrity in online assessments: A literature review and recommendations. International Journal of Information and Education Technology (IJIET), 12(1), 60–70. doi:10.18178/ijiet.2022.12.1.1587 Şahin, S. (2019). An analysis of English Language Testing and Evaluation course in English Language Teacher Education Programs in Turkey: Developing language assessment literacy of pre-service EFL teachers [PhD Thesis]. Middle East Technical University. 397

Compilation of References

Santos, B. S. (2007). Epistemologies of the south: Justice against epistemicide. Routledge. Saraceni, M. (2008). Meaningful form: Transitivity and intentionality. ELT Journal, 62(2), 164–172. Sardabi, N., Mansouri, B., & Behzadpoor, F. (2020). Autoethnography in TESOL. In J. Liontas (Ed.), the TESOL encyclopedia of English language teaching (pp. 1–6). Wiley & Sons, Inc. Savignon, S. J. (1972). Communicative competence: An experiment in foreign-language teaching. Center for Curriculum Development. Savignon, S. J. (1997). Communicative competence: Theory and classroom practice: Texts and contexts in second language learning. McGraw-Hill. Sayın, B. A., & Aslan, M. M. (2016). The negative effects of undergraduate placement examination of English (LYS-5) on ELT students in Turkey. Participatory Educational Research, 3(1), 30–39. doi:10.17275/per.16.02.3.1 Scarino, A. (2013). Language assessment literacy as self-awareness: Understanding the role of interpretation in assessment and in teacher learning. Language Testing, 30(3), 309–327. doi:10.1177/0265532213480128 Schärer, R. (2000). European language portfolio pilot project phase (1998-2000): Final Report. Council of Europe. https://rm.coe.int/16804586bb Schimmack, U., Radhakrishnan, P., Oishi, S., Dzokoto, V., & Ahadi, S. (2002). Culture, personality, and subjective wellbeing: Integrating process models of life satisfaction. Journal of Personality and Social Psychology, 82(4), 582–593. doi:10.1037/0022-3514.82.4.582 PMID:11999925 Schleicher, I., Leitner, K., Juenger, J., Moeltner, A., Ruesseler, M., Bender, B., Sterz, J., Schuettler, K.-F., Koenig, S., & Kreuder, J. G. (2017). Examiner effect on the objective structured clinical exam–a study at five medical schools. BMC Medical Education, 17(1), 1–7. doi:10.118612909-017-0908-1 PMID:28056975 Schodde, T., Bergmann, K., & Kopp, S. (2017). Adaptive robot language tutoring based on Bayesian knowledge tracing and predictive decision-making. In Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction (pp. 128–136). Vienna, Austria. 10.1145/2909824.3020222 Schwerdt, G., & Woessmann, L. (2017). The information value of central school exams. Economics of Education Review, 56, 65–79. doi:10.1016/j.econedurev.2016.11.005 Scott, L. A., Bruno, L., Gokita, T., & Thoma, C. A. (2022). Teacher candidates’ abilities to develop universal design for learning and universal design for transition lesson plans. International Journal of Inclusive Education, 26(4), 333–347. doi:10.1080/13603116.2019.1651910 Scott, L. A., Thoma, C. A., Puglia, L., Temple, P., & D’Aguilar, A. (2017). Implementing a UDL framework: A study of current personnel preparation practices. Intellectual and Developmental Disabilities, 55(1), 25–36. doi:10.1352/19349556-55.1.25 PMID:28181884 Searle, J. R. (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press. doi:10.1017/ CBO9781139173438 Selvachandran, J., Kay-Raining Bird, E., DeSousa, J., & Chen, X. (2020). Special education needs in French Immersion: A parental perspective of supports and challenges. International Journal of Bilingual Education and Bilingualism, 25(3), 1120–1136. doi:10.1080/13670050.2020.1742650 Sen, A. (2011). The idea of justice. Harvard University Press.

398

Compilation of References

Serholt, S., Barendregt, W., Leite, I., Hastie, H., Jones, A., Paiva, A., ... Castellano, G. (2014). Teachers’ views on the use of empathic robotic tutors in the classroom. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication (955–960). IEEE. 10.1109/ROMAN.2014.6926376 Serholt, S., Barendregt, W., Vasalou, A., Alves-Oliveira, P., Jones, A., Petisca, S., & Paiva, A. (2017). The case of classroom robots: Teachers’ deliberations on the ethical tensions. AI & Society, 32(4), 613–631. doi:10.100700146-016-0667-2 Serholt, S., Pareto, L., Ekström, S., & Ljungblad, S. (2020). Trouble and repair in child-robot interaction: A study of complex interactions with a robot tutee in a primary school classroom. Frontiers in Robotics and AI, 7, 46. doi:10.3389/ frobt.2020.00046 PMID:33501214 Sert, O. (2015). Social Interaction and L2 Classroom Discourse. Edinburgh University Press. doi:10.1515/9780748692651 Service, E. (1992). Phonology, working memory, and foreign language learning. Quarterly Journal of Experimental Psychology, 45A(1), 21–50. doi:10.1080/14640749208401314 PMID:1636010 Service, E., & Kohonen, V. (1995). Is the relation between phonological memory and foreign language learning accounted for by vocabulary acquisition? Applied Psycholinguistics, 16(2), 155–172. doi:10.1017/S0142716400007062 Sevimel-Sahin, A., & Subasi, G. (2019). An overview of language assessment literacy research within English language education context. Kuramsal Eğitimbilim Dergisi, 12(4), 1340–1364. Shah, S. K., Hassan, S., & Iqbal, W. (2015). Evaluation of text-book as curriculum: English for 6 and 7 grades in Pakistan. International Journal of English Language Education, 3(2), 71–89. doi:10.5296/ijele.v3i2.8042 Shamim, F. (2008). Trends, issues and challenges in English language education in Pakistan. Asia Pacific Journal of Education, 28(3), 235–249. doi:10.1080/02188790802267324 Shamim, F. (2011). English as the language for development in Pakistan: Issues, challenges and possible solutions. In H. Coleman (Ed.), Dreams and Realities: Developing Countries and the English Language (pp. 291–310). British Council. Sharadgah, T. A., & Sa’di, R. A. (2020). Preparedness of institutions of higher education for assessment in virtual learning environments during the COVID-19 lockdown: Evidence of bona fide challenges and prag-matic solutions. Journal of Information Technology Education, 19, 755–774. doi:10.28945/4615 Shariffuddin, S. A., Ibrahim, I. S. A., Shaaidi, W. R. W., Syukor, F. D. M., & Hussain, J. (2022). Academic dishonesty in online assessment from tertiary students’ perspective. International Journal of Advanced Research in Education and Society, 4(2), 75–84. doi:10.55057/ijares.2022.4.2.8 Sharkey, A. J. C. (2016). Should we welcome robot teachers? Ethics and Information Technology, 18(4), 283–297. doi:10.100710676-016-9387-z Sharkey, A. J. C., & Sharkey, N. E. (2012). Granny and the robots: Ethical issues in robot care for the elderly. Ethics and Information Technology, 14(1), 27–40. doi:10.100710676-010-9234-6 Sharkey, N. E., & Sharkey, A. J. C. (2010). The crying shame of robot nannies: An ethical appraisal. Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 11(2), 161–190. doi:10.1075/is.11.2.01sha Sheldon, L. E. (1988). Evaluating ELT textbooks and materials. ELT Journal, 42(4), 237–246. doi:10.1093/elt/42.4.237 Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14. 
Shepard, L., Kagan, S., & Wurtz, E. (Eds.). (1998). Principles and recommendations for early childhood assessments. National Education Goals Panel.

399

Compilation of References

Shin, J., & Shin, D. H. (2015). Robots as a facilitator in language conversation class. In Proceedings of the 10th Annual ACM/IEEE International Conference on Human-Robot Interaction Extended Abstracts (pp.11–12). Associations for Computing Machinery. 10.1145/2701973.2702062 Shohamy, E. (2001). Democratic Assessment as an Alternative. Language Testing, 18(4), 373–391. doi:10.1177/026553220101800404 Shohamy, E., Donista-Schmidt, S., & Ferman, I. (1996). Test Impact revisited, washback effect over time. Language Testing, 13(3), 298–317. doi:10.1177/026553229601300305 Shohamy, E., & McNamara, T. (2009). Language tests for citizenship, immigration, and asylum. Language Assessment Quarterly, 6(1), 1–5. doi:10.1080/15434300802606440 Shrestha, P. N. (2017). Investigating the learner transfer of genre features and conceptual knowledge from an academic literacy course to business studies: Exploring the potential of dynamic assessment. Journal of English for Academic Purposes, 25, 1–17. doi:10.1016/j.jeap.2016.10.002 Shrestha, P. N. (2020). Dynamic Assessment of Students’ Academic Writing: Vygotskian and Systemic Functional Linguistic Perspectives. Springer. doi:10.1007/978-3-030-55845-1 Simon-Cereijido, G., & Gutierrez-Clellen, V. (2014). Bilingual education for all: Latino dual language learners with language disabilities. International Journal of Bilingual Education and Bilingualism, 17(2), 235-254. doi:10.1080/13 670050.2013.866630 Singh, B. (1999). Formative Assessment: which way now? Paper Presented at the British Educational Research Association Annual Conference, University of Sussex at Brighton. Singleton, D. (2001). Age and second language acquisition. Annual Review of Applied Linguistics, 21, 77–89. Singleton, D. (2005). The Critical Period Hypothesis: A coat of many colors. International Review of Applied Linguistics in Language Teaching, 43, 269–285. Sinicrope, C., Norris, J., & Watanabe, Y. (2007). Understanding and assessing intercultural competence: A summary of theory, research, and practice (technical report for the foreign language program evaluation project). Second Language Studies, 26(1), 58. Siren, T. (2018). Representations of men and women in English language textbooks: A critical discourse analysis of open road 1-7 [Master Thesis]. University of Oulu, Finland. Situmorang, K., Nugroho, D. Y., & Pramusita, S. M. (2020). English teachers’ preparedness in technology enhanced language learning during Covid-19 pandemic – Students’ voice. Jo-ELT (Journal of English Language Teaching). Fakultas Pendidikan Bahasa & Seni Prodi Pendidikan Bahasa Inggris IKIP, 7(2), 57–67. doi:10.33394/jo-elt.v7i2.2973 Skinner, B. F. (1961). Teaching machines. Scientific American, 205(5), 90–106. doi:10.1038cientificamerican1161-90 PMID:13913636 Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4), 265–269. doi:10.1002/asi.4630240406 Smith, K. (1995). Assessing and Testing Young Learners: Can we? Should we? IATEFL SIG Mini Symposium. Smith, R. M. (1991). IPARM: Item and Pearson Analysis with Rasch Model. Mesa Press. Solano-Flores, G. (2006). Language, dialect, and register: Sociolinguistics and the estimation of measurement error in the testing of English language learners. Teachers College Record, 108(11), 2354–2379. doi:10.1111/j.1467-9620.2006.00785.x 400

Compilation of References

Solomon, G., & Schrum, L. (2007). Web 2.0 new tools, new schools. International Society for Technology in Education (ISTE). Souto-Otero, M. (2020). Globalization of higher education, critical views. In P. N. Teixeira & J. C. Shin (Eds.), The international encyclopedia of higher education systems and institutions (pp. 568–572). Springer Netherlands. doi:10.1007/97894-017-8905-9_215 Sparks, R. L., & Ganschow, L. (1995). Parent perceptions in the screening of performance in foreign language courses. Foreign Language Annals, 28(3), 371–391. doi:10.1111/j.1944-9720.1995.tb00806.x Spector, J. E. (1992). Predicting progress in beginning reading: Dynamic assessment of phonemic awareness. Journal of Educational Psychology, 84(3), 353–363. Spencer-Oatey, H., & Franklin, P. (2009). Intercultural interaction: A multidisciplinary approach to intercultural communication. doi:10.1057/9780230244511 Spitzberg, B. H., & Changnon, G. (2009). Conceptualizing intercultural competence: Issue and tools. In D. K. Deardorff (Ed.), The SAGE handbook of intercultural competence (pp. 2–52). SAGE. doi:10.4135/9781071872987.n1 Spolsky, B. (1985). What does it mean to know how to use a language: An essay on the theoretical basis of language testing. Language Testing, 2(2), 180–191. doi:10.1177/026553228500200206 Stahl, G. K. (2001). Using assessment centers as tools for global leadership development: An exploratory study. In M. E. Mendenhall, G. K. Stahl, & T. M. Kühlmann (Eds.), Developing global business leaders: Policies, processes, and innovations (pp. 197–210). Quorum Books. Standing Conference of the Ministers of Education of the Federal States. (Ed.). (2004). Bildungsstandards für die erste Fremdsprache (Englisch/Französisch) für den Mittleren Schulabschluss. Beschlüsse Der Kultusministerkonferenz Vom 04.12.2003, Art.-Nr. 05966. https://www.kmk.org/fileadmin/veroeffentlichungen_beschluess e/2003/2003_12_04-BS-erste-Fremdsprache.pdf Standing Conference of the Ministers of Education of the Federal States. (Ed.). (2014). Bildungsstandards für die fortgeführte Fremdsprache (Englisch/Französisch) für die Allgemeine Hochschulreife. Beschlüsse Der Kultusministerkonferenz Vom 18.10.2012. https://www.kmk.org/fileadmin/veroeffentlichungen_beschluess e/2012/2012_10_18-Bildungsstandards-Fortgef-FS-Abi.pdf Starr, L. J. (2010). The use of autoethnography in educational research: Locating who we are in what we do. Canadian Journal for New Scholars in Education, 1(3), 1–9. Stevens, D. D., & Levi, A. J. (2013). Introduction to Rubrics: An Assessment Tool to Save Grading Time, Convey Effective Feedback, and Promote Student Learning (2nd ed.). Stylus Publishing, LLC. Stiggins, R. J. (1999). Evaluating classroom assessment training in teacher education programs. Educational Measurement: Issues and Practice, 18(1), 23–27. doi:10.1111/j.1745-3992.1999.tb00004.x Stiggins, R. J. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83(10), 758–765. doi:10.1177/003172170208301010 Stiggins, R. J. (2007). Classroom assessment for student learning. Pearson Education, Inc. Storch, N. (2018). Written corrective feedback from sociocultural theoretical perspectives: A research agenda. Language Teaching, 51(2), 262–277. doi:10.1017/S0261444818000034

401

Compilation of References

Strik, H., Truong, K., De Wet, F., & Cucchiarini, C. (2009). Comparing different approaches for automatic pronunciation error detection. Speech Communication, 51(10), 845–852. doi:10.1016/j.specom.2009.05.007 Suarta, I. M., Suwintana, I. K., Sudhana, I. G. P. F. P., & Hariyanti, N. K. D. (2017). Employability skills required by the 21st-century workplace: A literature review of labour market demand. Advances in Social Science, Education and Humanities Research, 102, 337–342. doi:10.2991/ictvt-17.2017.58 Sultana, N. (2019). Language assessment literacy: An uncharted area for the English language teachers in Bangladesh. Language Testing in Asia, 9(1), 1–14. doi:10.118640468-019-0077-8 Sunderland, J. (1992). Gender in the EFL classroom. ELT Journal, 46(1), 81–91. doi:10.1093/elt/46.1.81 Surahman, E., & Wang, T. (2022). Academic dishonesty and trustworthy assessment in online learning: A systematic literature review. Journal of Computer Assisted Learning, 38(6), 1–19. doi:10.1111/jcal.12708 Suzuki, Y., & DeKeyser, R. (2017). The interface of explicit and implicit knowledge in a second language: Insights from individual differences in cognitive aptitudes. Language Learning, 67(4), 747–790. doi:10.1111/lang.12241 Swader, C. S. (2019). Loneliness in Europe: Personal and societal individualism-collectivism and their connection to social isolation. Social Forces, 97(3), 1307–1336. doi:10.1093foy088 Swain, M., & Lapkin, S. (2000). Task-based second language learning: The uses of the first language. Language Teaching Research, 4, 251–274. Swan, M. (1985). A critical look at the communicative approach. ELT Journal, 39(1), 2–12. doi:10.1093/elt/39.1.2 Swan, M., & Walter, C. (2017). Misunderstanding comprehension. ELT Journal, 71(2), 228–236. doi:10.1093/elt/ccw094 Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics (6th ed.). Pearson Education. Taber, K. S. (2018). The use of Cronbach’s alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296. doi:10.100711165-016-9602-2 Takahashi, E. (1998). Language development in social interaction: A longitudinal study of a Japanese FLES program from a Vygotskyan approach. Foreign Language Annals, 31(3), 392–406. Tambo, L. I. (2012). Principles and Methods of Teaching (2nd ed.). ANUCAM Limbe. Taminiau, E. M., Kester, L., Corbalan, G., Spector, J. M., Kirschner, P. A., & Van Merriënboer, J. J. (2015). Designing on‐demand education for simultaneous development of domain‐specific and self‐directed learning skills. Journal of Computer Assisted Learning, 31(5), 405–421. doi:10.1111/jcal.12076 Tanaka, F., & Matsuzoe, S. (2012). Children teach a care-receiving robot to promote their learning: Field experiments in a classroom for vocabulary learning. Journal of Human-Robot Interaction, 1, 78–95. doi:10.5898/JHRI.1.1.Tanaka Tante, A. C. (2010a). Young learners’ classroom assessment and their performance in English language in Englishspeaking Cameroon primary school. Journal of Educational Assessment in Africa, 4, 175–189. Tante, A. C. (2010b). The Purpose of English Language Teacher Assessment in the English-speaking Primary School in Cameroon. English Language Teacher Education and Development Journal, 13, 27–39. Taras, M. (2005). Assessment-summative and formative-some theoretical reflections. British Journal of Educational Studies, 53(4), 466–478. doi:10.1111/j.1467-8527.2005.00307.x Tatsuya, N., Kanda, T., Kidokoro, H., Suehiro, Y., & Yamada, S. (2016). 
Why do children abuse robots? Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, 17(3), 348–370. 402

Compilation of References

Tauginienė, L., Gaižauskaitė, I., Glendinning, I., Kravjar, J., Ojsteršek, M., Ribeiro, L., Odiņeca, T., Marino, F., Cosentino, M., Sivasubramaniam, S., & Foltýnek, T. (2018). Glossary for academic integrity. European Network for Academic Integrity. http://www.academicintegrity.eu/wp/wp-content/uploads/2018/1 0/Glossary_revised_final.pdf Tauginienė, L., Ojsteršek, M., Foltınek, T., Marino, F., Cosentino, M., Gaižauskaitė, I., Glendinning, I., Sivasubramaniam, S., Razi, S., Ribeiro, L., Odiņeca, T., & Trevisiol, O. (2019). General Guidelines for Academic Integrity. ENAI Report 3A. https://www.academicintegrity.eu/wp/wp-content/uploads/2019/ 09/Guidelines_amended_version_1.1_09_2019.pdf Taylor, C. S., & Bobbit-Nolen, S. (2005). Classroom assessment: Supporting teaching and learning in real classrooms. Prentice Hall. Taylor, L. (2006). The Changing Landscape of English: Implications for Language Assessment. ELT Journal, 60(1), 51–60. doi:10.1093/elt/cci081 Taylor, L. (2009). Developing assessment literacy. Annual Review of Applied Linguistics, 29, 21–36. doi:10.1017/ S0267190509090035 Tezci, E., Dilekli, Y., Yıldırım, S., Kervan, S., & Mehmeti, F. (2017). Öğretmen adaylarının sahip olduğu öğretim anlayışları üzerine bir analiz. Education Sciences, 12(4), 163–176. Tharp, R. G., & Gallimore, R. (1991). The instructional conversation: Teaching and learning in social activity. Research Reports: 2. Paper rr02. Santa Cruz, CA: National Center for Research and Cultural Diversity and Second Language Learning. Retrieved from: http://escholarship. org/uc/item/5th0939d T h e I n t e r n a t i o n a l C e n t e r fo r Ac a d e m i c I n t e g r i t y [ I CA I ] . ( 2 0 2 1 ) . T h e f u n d a m e n t a l va l ues of academic integrity (3rd ed.). www.academicintegrity.org/the-fundamental-valuesof-academicintegrity The National Standards Collaborative Board. (2015). World-readiness standards for learning languages (4th ed.). Author. Thibault, P. J. (2000). The multimodal transcription of a television advertisement: Theory and practice. In A. Baldry (Ed.), Multimodality and multimediality in the distance learning age (pp. 311–385). Palladino. Thoma, C. A., Bartholomew, C. C., & Scott, L. A. (2009). Universal design for transition: A roadmap for planning and instruction. Paul H Brookes Publishing. Thomas, J., Allman, C. B., & Beech, M. (2005). Assessment for the diverse classroom: A handbook for teachers. Bureau of Exceptional Education and Student Services, Florida Department of Education. Thornbury, S., & Meddings, L. (1999). The roaring in the chimney. Retrieved on June 7, 2019 from http://www.hltmag. co.uk/sep01/Sartsep018.rtf Thornbury, S. (2002). How to teach vocabulary. Longman. Thornbury, S. (2006). How to teach grammar. Longman. Tickoo, M. L. (2003). Teaching and learning English: A sourcebook for teachers and teacher-trainers. Orient Longman. Tiong, L. C. O., & Lee, H. J. (2021). E-cheating prevention measures: Detection of cheating at online examinations using deep learning approach -- A case study. Journal of Latex Class Files, 1-9. doi:10.48550/arXiv.2101.09841 Tok, H. (2010). TEFL textbook evaluation: From teachers’ perspectives. Educational Research Review, 5(9), 508–517.

403

Compilation of References

Toksöz, I., & Kılıçkaya, F. (2018). Review of journal articles on washback in language testing in Turkey (2010-2017). Lublin Studies in Modern Languages and Literature, 41(2), 184. doi:10.17951/lsmll.2017.41.2.184 Tomlinson, M. (2004). 14-19 curriculum and qualifications reform: Interim report of the working group on 14-19 reform. London: DfES. www.14-19reform.gov.uk Tomlinson, B. (2010). Principles of effective materials development. In N. Harwood (Ed.), English language teaching materials: Theory and practice (pp. 81–98). Cambridge University Press. Torelli, C. J., Leslie, L. M., To, C., & Kim, S. (2020). Power and status across cultures. Current Opinion in Psychology, 33, 12–17. doi:10.1016/j.copsyc.2019.05.005 PMID:31336191 Torrance, H. (2001). Assessment for learning: Developing formative assessment in the classroom. International Journal of Primary. Elementary and Early Years Education, 29(3), 26–32. Torrance, H., & Pryor, J. (2002). Investigating formative assessment, teaching and learning in the classroom. Open University Press, McGraw Hill. Toylor, L. (2005). Washback and impact. ELT Journal, 59(2), 154–155. doi:10.1093/eltj/cci030 Tran, T. H. (2012). Second Language Assessment for Classroom Teachers. Paper presented at MIDTESOL 2012, Ames, IA. Treffers-Daller, J., & Silva-Corvalán, C. (Eds.). (2016). Language dominance in bilinguals: Issues of measurement and operationalization. Cambridge University Press. Tribble, C. (1996). Writing. Oxford University Press. Troudi, S., Coombe, C., & Al‐Hamliy, M. (2009). EFL teachers’ views of English language assessment in higher education in the United Arab Emirates and Kuwait. TESOL Quarterly, 43(3), 546–555. doi:10.1002/j.1545-7249.2009.tb00252.x Trumbull, E., & Lash, A. (2013). Understanding formative assessment: Insights from learning theory and measurement theory. WestEd. Tsigaros, T., & Fesakis, G. (2021). E-assessment and academic integrity: A literature review. In A. Reis, J. Barroso, J. B. Lopes, T. Mikropoulos, & C-W. Fan (Eds.), Technology and Innovation in Learning, Teaching and Education: Second International Conference, TECH-EDU 2020 Proceedings (pp. 313-319). Springer International Publishing. Tsulaia, N., & Adamia, Z. (2020). Formative assessment tools for higher education learning environment. International Scientific-Pedagogical Organization of Philologists “WEST-EAST” (ISPOP), 3(1), 86-93. doi:10.33739/2587-54342020-3-1-86-93 Tullis-Owen, J. A., McRae, C., Adams, T. E., & Vitale, A. (2009). Truth troubles. Qualitative Inquiry, 15(1), 178–200. doi:10.1177/1077800408318316 Tung, R. L. (1987). Expatriate assignments: Enhancing success and minimizing failure. The Academy of Management Perspectives, 1(2), 117–125. doi:10.5465/ame.1987.4275826 Tzuriel, D., & Shamir, A. (2002). The effects of mediation in computer assisted dynamic assessment. Journal of Computer Assisted Learning, 18(1), 21–32. doi:10.1046/j.0266-4909.2001.00204.x Ullah, H., & Skelton, C. (2013). Gender representation in the public sector schools textbooks of Pakistan. Educational Studies, 39(2), 183–194. doi:10.1080/03055698.2012.702892 Universal design for learning guidelines (Version 2.2). (2022). Retrieved from https://udlguidelines.cast.org/more/downloads Ur, P. (2007). A course in language teaching: Practice and theory. Cambridge University Press. 404

Compilation of References

Vaismoradi, M., Turunen, H., & Bondas, T. (2013). Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study. Nursing & Health Sciences, 15(3), 398–405. doi:10.1111/nhs.12048 PMID:23480423 Valette, R. (1994). Teaching, testing and assessment: Conceptualizing the relationship. In C. Hancock (Ed.), Teaching, testing and assessment: Making the connection (pp. 1–42). National Textbook Company. Valette, R. M. (1994). Teaching, testing, and assessment: Conceptualizing the relationship. National Textbook Company. Valizadeh, M. (2022). Cheating in online learning programs: Learners’ perceptions and solutions. Turkish Online Journal of Distance Education, 23(1), 195–209. doi:10.17718/tojde.1050394 van den Berghe, R., Verhagen, J., Oudgenoeg-Paz, O., van der Ven, S., & Leseman, P. (2019). Social robots for language learning: A review. Review of Educational Research, 89(2), 259–295. doi:10.3102/0034654318821286 van der Westhuizen, D. (2016). Guidelines for online assessment for educators. Commonwealth of Learning. doi:10.13140/ RG.2.2.31196.39040 Van Dyne, L., Ang, S., Ng, K. Y., Rockstuhl, T., Tan, M. L., & Koh, C. (2012). Sub-Dimensions of the four factor model of cultural intelligence: Expanding the conceptualization and measurement of cultural intelligence: cq: sub-dimensions of cultural intelligence. Social and Personality Psychology Compass, 6(4), 295–313. doi:10.1111/j.1751-9004.2012.00429.x Veintie, T. (2013). Coloniality and cognitive justice: Reinterpreting formal education for the indigenous peoples in Ecuador. International Journal of Multicultural Education, 15(3), 45–60. doi:10.18251/ijme.v15i3.708 Vispoel, W. P., Hendrickson, A. B., & Bleiler, T. (2000). Limiting answer review and change on computerized adaptive vocabulary tests: Psychometric and attitudinal results. Journal of Educational Measurement, 37(1), 21–38. doi:10.1111/j.1745-3984.2000.tb01074.x Vogt, K. (2018). Interkulturelle kommunikative Kompetenz fördern. In Basiswissen Lehrerbildung: Englisch unterrichten (pp. 80–95). Klett/Kallmeyer. Vogt, K. (2016). Teaching Practice Abroad for developing intercultural competence in foreign language teachers. Canadian Journal of Applied Linguistics, 19(2), 85–106. Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European Study. Language Assessment Quarterly, 11(4), 374–402. doi:10.1080/15434303.2014.960046 Volodina, A., Weinert, S., & Mursin, K. (2020). Development of academic vocabulary across primary school age: Differential growth and influential factors for German monolinguals and language minority learners. Developmental Psychology, 56(5), 922–936. doi:10.1037/dev0000910 PMID:32162935 Vygotsky, L. (1978). Mind in Society. Harvard University Press. Vygotsky, L. (1986). Thought and language. MIT Press. Vygotsky, L. S. (1978). Mind in society. Harvard University Press. Wainer, H. (2000). Cats: Whither and whence. ETS Research Report Series, 2(2), i-15. doi:10.1002/j.2333-8504.2000. tb01835.x Wallace, C., & Davies, M. (2009). Sharing Assessment in Health and Social Care: A Practical Handbook for Interprofessional Working. Sage Publications. doi:10.4135/9781446215999 Wall, S. (2008). Easier said than done: Writing an autoethnography. International Journal of Qualitative Methods, 7(1), 38–53. doi:10.1177/160940690800700103 405

Compilation of References

Walsh, S., & Sert, O. (2019). Mediating L2 learning through classroom interaction. In X. Gao (Ed.), Second handbook of English Language teaching (pp. 1–19). Springer., doi:10.1007/978-3-319-58542-0 35-1 Walter, D., Way, R. P. D., & Nichols, P. (2010). Psychometric challenges and opportunities in implementing formative assessment. In H. L. Andrade, & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 297–315). New York, NY: Taylor & Francis. Wana, Z., Hansen, J. H. L., & Xie, Y. (2020). A multi-view approach for Mandarin non-native mispronunciation verification. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 8079-8083. 10.1109/ ICASSP40776.2020.9053981 Wang, H., Yan, B., Chiu, H., Hsu, Y., & Chen, B. (2021). Exploring non-autoregressive end-to-end neural modeling for English mispronunciation detection and diagnosis. ArXiv, abs/2111.00844. Wang, X., Bradlow, E. T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and applications. Applied Psychological Measurement, 26(1), 109–128. doi:10.1177/0146621602026001007 Wang, Y. H., Young, S. S., & Jang, J. S. R. (2013). Using tangible companions for enhancing learning English conversation. Journal of Educational Technology & Society, 16(2), 296–309. Wang, Z., Zhang, J., & Xie, Y. (2018). L2 mispronunciation verification based on acoustic phone embedding and Siamese networks. Proc. 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), 444-448. 10.1109/ ISCSLP.2018.8706597 Warne, R. T., Yoon, M., & Price, C. J. (2014). Exploring the various interpretations of “test bias”. Cultural Diversity & Ethnic Minority Psychology, 20(4), 570–582. doi:10.1037/a0036503 PMID:25313435 Warsi, J. (2004). Conditions under which English is taught in Pakistan: An applied linguistic perspective. Sarid Journal, 1(1), 1–9. Watanabe, Y. (2004). Methodology in washback studies. In L. Cheng, Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 19–36). Lawrence Erlbaum Associates. Watanabe, Y. (2011). Teaching a course in assessment literacy to test takers: Its rationale, procedure, content and effectiveness. Research Notes, 46, 29–34. Webb, S. (2005). Receptive and productive vocabulary learning. Studies in Second Language Acquisition, 27(1), 33–52. doi:10.1017/S0272263105050023 Weigle, S. C. (2012). Assessing writing. In C. Coombe, P. Davidson, B. O’Sullivan, & S. Stoynoff (Eds.), The Cambridge Guide to Second Language Assessment (pp. 236–246). Cambridge University Press. Weinert, F. E. (Ed.). (2001). Leistungsmessungen in Schulen (1st ed.). Beltz. Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Palgrave MacMillan. doi:10.1057/9780230514577 Weleschuk, A., Dyjur, P., & Kelly, P. (2019). Online Assessment in Higher Education. Taylor Institute for Teaching and Learning Guide Series. Taylor Institute for Teaching and Learning at the University of Calgary. https://taylorinstitute. ucalgary.ca/resources/guides Wellsby, M., & Pexman, P. M. (2014). Developing embodied cognition: Insights from children’s concepts and language processing. Frontiers in Psychology, 5, 506. doi:10.3389/fpsyg.2014.00506 PMID:24904513

406

Compilation of References

Wenger, E. (2004). Knowledge management as a doughnut: Shaping your knowledge strategy through communities of practice. Ivey Business Journal, 68(3). We n g e r, E . ( 2 0 1 1 ) . C o m m u n i t i e s o f p ra c t i c e : A b r i e f i n t ro d u c t i o n . R e trieved from h t t p s : / / s c h o l a r s b a n k . u o r e g o n . e d u / x m l u i / b i t s t r e a m / h a n d l e / 1 7 9 4 /11736/A%20brief%20intoduction%20to%20CoP.pdf Wenger, E. (1998). Communities of practice: Learning meaning and identity. Cambridge University Press. doi:10.1017/ CBO9780511803932 Wertsch, J. V. (1979). The regulation of human action and the given‐new organization of private speech. In G. Zivin (Ed.), The development of self‐regulation through private speech (pp. 78–98). John Wiley & Sons. Wertsch, J. V. (1985). Vygotsky and the social formation of mind. Harvard University Press. White, E. (2009). Are you assessment literate? OnCue Journal, 3(1), 3–25. White, E. (2009). Are you assessment literate? Some fundamental questions regarding effective classroom-based assessment. OnCUE Journal, 3(1), 3–25. Whittemore, S. (2018). Transversal competencies essential for future proofing the workforce. Skilla. Wight, M. S. (2015). Negotiating language learner identities: Students with disabilities in the foreign language learning environment. Dissertation Abstracts International Section A, 76. Wiliam, D. (2011). What is assessment for learning? Studies in Educational Evaluation, 37(1), 3–14. doi:10.1016/j. stueduc.2011.03.001 PMID:22114905 Williams, D. (1983). Developing criteria for textbook evaluation. ELT Journal, 37(3), 251–255. doi:10.1093/elt/37.3.251 Winkler, I. (2018). Doing autoethnography: Facing challenges, taking choices, accepting responsibilities. Qualitative Inquiry, 24(4), 236–247. doi:10.1177/1077800417728956 Witte, A. E. (2012). Making the Case for a Post-National Cultural Analysis of Organizations. Journal of Management Inquiry, 21(2), 141–159. doi:10.1177/1056492611415279 Witt, S. M., & Young, S. J. (2000). Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication, 30(2–3), 95–108. doi:10.1016/S0167-6393(99)00044-8 Wixson, K. K., & Valencia, S. W. (2011). Assessment in RTI: What teachers and specialists need to know. The Reading Teacher, 64, 466–469. doi:10.1598/RT.64.6.13 Wolf, M. R. (1984). Evaluation in Education; Foundations of competency Assessment and Program Review (2nd ed.). Praeger Publishers. Wolfram, D. (2003). Applied informetrics for information retrieval research (No. 36). Greenwood Publishing Group. Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 17. World Tourism Organization. (Ed.). (2022). Yearbook of tourism statistics, data 2016 – 2020, 2022 Edition. World Tourism Organization (UNWTO). Wright, A. (1995). Storytelling with children. Oxford University Press. Wright, A. (1997). Creating stories with children. Oxford University Press. 407

Compilation of References

Wright, C., Antonios, A., Palaktsoglou, M., & Tsianika, M. (2013). Planning for authentic language assessment in higher education synchronous online environments. Journal of Modern Greek Studies, 246–258. Yadollahi, E., Johal, W., Paiva, A., & Dillenbourg, P. (2018). When deictic gestures in a robot can harm childrobot collaboration. In Proceedings of the 17th ACM Conference on Interaction Design and Children (pp.195–206). 10.1145/3202185.3202743 Yaman, İ. (2018). Türkiye’de İngilizce Öğrenmek: Zorluklar ve fırsatlar. RumeliDE Dil ve Edebiyat Araştırmaları Dergisi, 11, 161–175. doi:10.29000/rumelide.417491 Yamaoka, F., Kanda, T., Ishiguro, H., & Hagita, N. (2007). Interacting with a human or a humanoid robot? Proceeding of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Yan, B.-C., Wang, H.-W., Jiang, S.-W. F., Chao, F.-A., & Chen, B. (2022). Maximum F1-score training for end-to-end mispronunciation detection and diagnosis of L2 English speech. Proc. IEEE International Conference on Multimedia and Expo 2022. 10.1109/ICME52920.2022.9858931 Yaqoob, H. M. A., Ahmed, M., & Aftab, M. (2015). Constraints faced by teachers in conducting CLT based activities at secondary school sertificate (SSC) level in rural area of Pakistan. Education Research International, 4(2), 109–118. Yaşar, Ş. (1998). Yapısalcı Kuram ve öğrenme-öğretme süreci. In Vll. Ulusal Eğitim Bilimleri Kongresi Basılmış Bildiriler Kitabı (pp. 695-701). Konya: Selçuk Üniversitesi. https://www.academia.edu/24736887/YAPISALCI_KURAM_VE_%C3%96% C4%9ERENME-_%C3%96%C4%9ERETME_S%C3%9CREC%C4%B0 Yawkey, T. D., Gonzalez, V., & Juan, Y. (1994). Literacy and biliteracy strategies and approaches for young culturally and linguistically diverse children: Academic excellence P.I.A.G.E.T. comes alive. Journal of Reading Improvement, 31(3), 130–141. Yazan, B., & Keleş, U. (2022). A snippet of an ongoing narrative: A non-linear, fragmented, and unorthodox autoethnographic conversation. Applied Linguistics Inquiry, 1(1). https://doi.org/10.22077/ALI.2022.5561.1003 Yazan, B. (2018). TESL teacher educators’ professional self-development, identity, and agency. TESL Canada Journal, 35(2), 140–155. doi:10.18806/tesl.v35i2.1294 Yazan, B., & Keleş, U. (2022). A snippet of an ongoing narrative: A non-linear, fragmented, and unorthodox autoethnographic conversation. Applied Linguistics Inquiry, 1(1). Advance online publication. doi:10.22077/ALI.2022.5561.1003 Yıldırım, Ö. (2010). Washback effects of a high-stakes university entrance exam: Effects of the English section of the university entrance exam on future language teachers in Turkey. The Asian EFL Journal Quarterly, 12(2), 92–116. Yılmaz, B. (2017). The impact of digital assessment tools on students’ engagement in class: A case of two different secondary schools. Abant İzzet Baysal Üniversitesi Eğitim Fakültesi Dergisi, 17(3), 1606–1620. doi:10.17240/ aibuefd.2017.17.31178-338850 Yin, R. K. (2009). How to do better case studies. The SAGE Handbook of Applied Social Research Methods, 2, 254-282. Yoon, C., & Kim, S. (2007). Convenience and TAM in a ubiquitous computing environment: The case of wireless LAN. Electronic Commerce Research and Applications, 6(1), 102–112. doi:10.1016/j.elerap.2006.06.009 Youmans, R. J. (2011). Does the adoption of plagiarism-detection software in higher education reduce plagiarism? Studies in Higher Education, 36(7), 749–761. doi:10.1080/03075079.2010.523457 Zafar, S., & Mehmood, R. (2016). 
An evaluation of Pakistani intermediate English textbooks for cultural contents. Journal of Linguistics & Literature, 1(1), 124–136. 408

Compilation of References

Zarzycka-Piskorz, E. (2016). Kahoot it or not? Can games be motivating in learning grammar? Teaching English with Technology, 16(3), 17–36. Zhang, C., Yan, X., & Wang, J. (2021). EFL teachers’ online assessment practices during the COVID-19 pandemic: Changes and mediating factors. The Asia-Pacific Education Researcher, 30(6), 499–507. doi:10.100740299-021-00589-3 Zhang, F., Huang, C., Soong, F. K., Chu, M., & Wang, R. (2008). Automatic mispronunciation detection for Mandarin. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 5077–5080. Zhang, H., & van Compernolle, R. A. (2016). Learning potential and the dynamic assessment of L2 Chinese grammar through elicited imitation. Language and Sociocultural Theory, 3(1), 99–119. doi:10.1558/lst.v3i1.27549 Zhang, L., Ziping, Z., Chunmei, M., Linlin, S., Huazhi, S., Lifen, J., Shiwen, D., & Chang, G. (2020). End-to-End automatic pronunciation error detection based on improved hybrid CTC/attention architecture. Sensors (Basel), 20(7), 1809. doi:10.339020071809 PMID:32218379 Zhang, Z., & Burry-Stock, J. A. (2003). Classroom assessment practices and teachers’ self-perceived assessment skills. Applied Measurement in Education, 16(4), 323–342. doi:10.1207/S15324818AME1604_4 Zipp, J. F. (2007). Learning by exams: The impact of two-stage cooperative tests. Teaching Sociology, 35(1), 62–76. doi:10.1177/0092055X0703500105 Zownorega, S. J. (2013). Effectiveness of flipping the classroom in A honours level, mechanics-based physics class. MA Theses 1155. Eastern Illinois University. https://thekeep.eiu.edu/theses/1155

409

410

About the Contributors

Dinçay Köksal is a professor of English Language Teaching at Çanakkale Onsekiz Mart University. He worked as a teacher of English in the schools of the Ministry of Education for three years. He has presented and written many papers on language teaching and cross-cultural communication, and he has translated some books and edited many others. He is the editor of the Journal of Theory and Practice in Education (EKU) and ELT Research Journal. He founded the Educational Research Association and the International Association of Educational Researchers. He has been teaching courses such as Advanced Research Methods, Fieldwork in Applied Linguistics, and Philosophy of Educational Research in the master's and doctoral programmes in ELT.

Nurdan Kavaklı Ulutaş received her Ph.D. degree in English Language Teaching at Hacettepe University. During her Ph.D. years, she was awarded a graduate scholarship by the Scientific and Technological Research Council of Turkey. She is currently a full-time academic in the department of English Language Teaching at Izmir Demokrasi University. She divides her time between teaching undergraduate and graduate classes and academic research. She has published book chapters and articles in national and international academic journals. She has coordinated or participated in the steering committees of several national and international education projects. She is the editor-in-chief of a fledgling international academic journal, Futuristic Implementations of Research in Higher Education (FIRE). Her research interests include language teacher education, language testing and assessment, second language writing, and language attrition.

Sezen Arslan is an assistant professor of English Language Teaching at Bandırma 17 Eylül University in Turkey, where she serves as vice-director of the School of Foreign Languages. She received her MA degree from Çanakkale Onsekiz Mart University and her Ph.D. degree in English Language Teaching from Hacettepe University. She has taught undergraduate courses in language testing/assessment and material development. She has published in various refereed journals and presented papers at national and international conferences. She has also served on the editorial boards of international journals. She is currently the chief editor of the Futuristic Implementations of Research in Education (FIRE). Her main research interests are foreign language teacher training, professional development of language teachers, assessment, and intercultural awareness in language classrooms.

***

Lovelyn Abang is a qualified teacher with over six years' experience and an English language examiner with the Cameroon General Certificate of Education examination. She holds B.Ed and M.Ed degrees in Curriculum Studies and Teaching in English Language. She is currently completing her doctoral study in teacher development.

Muhammad Ahmad is working as a Secondary School Teacher (English) at Government High School, Hujra Shah Muqeem, Okara, Pakistan. Presently, he is enrolled as a PhD candidate at the Department of Applied Linguistics, Government College University, Faisalabad, Pakistan. His research areas include corpus linguistics, critical discourse analysis, and ELT.

Ezgi Akman is an M.A. student in the English Language Education Department at Pamukkale University.

Begum Atsan is a sophomore in the English Language Teaching department at İzmir Democracy University, İzmir, Turkey. Her research interests are assessment literacy and evaluation practices in L2, metacognition in young learners, and educational psychology. She is dedicated to helping improve L2 practices by acknowledging the voices of the least acknowledged stakeholders and to designing a system in which each learner receives a quality education. She attends conferences and projects for her professional development. She is one of the co-founders of the ELT club and has coordinated webinars bringing international faculty members together.

Moritz Brüstle was trained as a foreign language teacher and graduated from the University of Education Heidelberg, Germany, in 2018. In 2020, he began researching the intercultural competence development of tertiary students who spend a semester abroad. He focuses on developing and implementing a mixed-methods assessment tool and is pursuing his PhD in the course of this research. Additionally, he has been working as an instructional designer at several German universities, focusing mainly on developing and implementing digital formats.

Esma Can has been working as an English language instructor at Kütahya Dumlupınar University since 2012. Her teaching experience consists of teaching English, especially writing, at proficiency levels ranging from beginner to upper intermediate. Her research interests include teaching writing, assessment of writing, teaching reading, and teacher training. She is currently working on a PhD degree in the English Language Teaching program at Çanakkale Onsekiz Mart University.

Vasfiye Geçkin completed her BA and MA studies in the Foreign Language Education Department at Boğaziçi University, Turkey. She earned her PhD degree in Linguistics from the University of Potsdam, Germany, and Macquarie University, Australia. She worked as a postdoctoral research fellow on the project 'L2TOR: Second language tutoring using social robots' at Koç University, Turkey. She works as an assistant professor in the Foreign Language Education Department at İzmir Democracy University. Her research interests include language acquisition and bilingualism.

Çiler Hatipoğlu is a Full Professor at the Department of Foreign Language Education at METU (Middle East Technical University), Ankara, where she teaches various Linguistics and FLE courses at undergraduate and graduate levels. Her main research interests are foreign language assessment, pragmatics (cross-cultural and interlanguage), politeness, metadiscourse, and corpus linguistics. She has edited books and special issues and published various articles on these topics in many prestigious national and international journals (e.g., Journal of Pragmatics, Language Testing, System, South African Journal of Education, Educational Sciences: Theory and Practice, NALANS, Explorations in English Language and Linguistics (ExELL), Studies About Languages) and books (published by John Benjamins, Lexington, Peter Lang). Dr Hatipoğlu has also either led or been a member of the research teams of national and international projects on corpus linguistics (e.g., she is a member of the team that developed the first Spoken Turkish Corpus), language assessment (EU project), language learning and material production (COST EU project), the European Network on International Student Mobility: Connecting Research and Practice (EU project), address forms in foreign language education, and cross-cultural politeness. With a research team (in a MEB-UNICEF funded project), she investigated how and where formative assessment can best be integrated into primary and middle school English as a foreign language education in Turkey.

Belma Haznedar is Professor of Applied Linguistics at Boğaziçi University, Istanbul. She completed her MA and PhD studies at the University of Durham, UK, specializing in childhood bilingualism. Her research focuses on questions that bring together linguistics and language teaching. She is internationally known for her studies of simultaneous and successive bilingual children and for her investigation of mother-tongue development, bilingual language acquisition in early childhood, and reading acquisition. She is the author and co-editor of Haznedar, B. & Gavruseva, E. (2008), Current Trends in Child Second Language Acquisition: A Generative Perspective, Amsterdam: John Benjamins; Haznedar, B. & Uysal, H. H. (2010), Handbook for Teaching Foreign Languages to Young Learners in Primary Schools, Ankara: Anı Publications; and Haznedar, B. & Ketrez, N. (2016), The Acquisition of Turkish in Childhood, Amsterdam: John Benjamins.

Devrim Hol is currently an assistant professor in the Department of Foreign Language Education at Pamukkale University, Turkey. He holds a master's degree in English Language Teaching (ELT) from Pamukkale University and a Ph.D. in ELT from Çanakkale Onsekiz Mart University. His research interests are assessment, evaluation, and testing of English as a second language, as well as national and international standardized English language proficiency tests. Dr. Hol has been a researcher, language teacher, and language test developer and administrator since 2000. He worked as an English teacher in schools at different levels and was the head of department at the School of Foreign Languages, Pamukkale University, Turkey, between 2007 and 2017. He also worked as a national and international testing coordinator and as the Testing and Evaluation Unit coordinator at the same university. He is still a full-time member of the Foreign Language Teaching Department.

Lubana Isaoglu is a Ph.D. student in the computer engineering department at Istanbul University-Cerrahpasa. She received her B.Sc. from Open University, Kuwait, in 2008 and her M.Sc. from Kadir Has University, Istanbul, in 2019. Her main interest is artificial intelligence, especially neural networks. She has more than 10 years of teaching experience, in different languages, across several organizations: the Ministry of Education in Kuwait, the Higher Institute for Telecommunications & Navigation at Kuwait University, and Ibn Haldun University in Istanbul.

Dilşah Kalay was born in Kütahya, Turkey, in 1989. She earned her BA degree in Teaching English as a Foreign Language from Boğaziçi University, İstanbul, Turkey in 2011 and Ph.D.
degree in English Language Teaching from Anadolu University, Eskişehir, Turkey in 2019. In 2012, she first joined the School of Foreign Languages (SFL) in İzmir Katip Çelebi University, and then the SFL in Kütahya Dumlupinar University, as a Lecturer. Since August 2019, she has been working in the same university, 412

About the Contributors

where she is an Assistant Professor at the moment, as both the departmental head and the assistant principal. Her current research interests include applied linguistics, SLA, psycholinguistics, vocabulary acquisition, language processing, teacher education, and learner-teacher perceptions. Ufuk Keleş completed his PhD studies on a Fulbright grant at the Department of Curriculum and Instruction, College of Education, the University of Alabama, Tuscaloosa (2016-2020). Currently, he works as an assistant professor of English language teaching at Bahçeşehir University, Turkey. Before, he taught Advanced EAP, English for specific purposes, and EFL at varying levels at Sabancı University for a year (2021-2022), Yıldız Technical University for over ten years (2006-2017), and Ankara University for three years (2003-2006). He holds one non-thesis MA degree in Gender Studies at İstanbul University (2004 – 2012), and an MA degree with a thesis in TEFL at Bilkent University (2012-2013). Also, he has a BA degree in English Language and Literature at Boğaziçi University (1998-2003). His PhD dissertation was a critical autoethnography of socialization as an English language learner, teacher, and speaker. His present research interests include L2 Socialization, Social Justice in ELT, Multicultural Education, Transnational Socialization, Autoethnographic Research, and Qualitative Research. He has published papers in internationally renowned peer-reviewed journals including The Qualitative Report; Pedagogy, Culture, and Society; Language Teaching Research; Language Teaching; and International Multilingual Research Journal as well as chapters in edited book volumes published by globally reputable publication companies such as IGI Global, Routledge, Springer Nature, and IAP. He has also presented his work in multiple international conferences such as TESOL Int, AAAL, AERA, GlobELT, and FLEAT. His teaching interests include Ethics in Higher Education, Materials Development and Evaluation, Coursebook Evaluation, Qualitative Research Methods, and Social Justice in ELT. He is married with a son. Zeynep Orman received her BSc in Computer Science Engineering in 2001 and her MSc and PhD degrees in Computer Engineering in 2003 and 2007, respectively from Istanbul University, Istanbul, Turkey. She has studied as a postdoctoral research fellow in the Department of Information Systems and Computing, Brunel University, London, UK in 2009. She is currently working as an Associate Professor in the Department of Computer Engineering, Istanbul University-Cerrahpasa. She is also the head of the Computer Science branch. Her research interests include artificial intelligence, neural networks, nonlinear systems, machine learning, optimization and data science. Tuba Özturan is an EFL instructor at School of Foreign Languages at Erzincan Binali Yıldırım University. She received her master’s degree and Ph.D. in English Language Teaching from Hacettepe University. Her research interests lie in the areas of learning-oriented assessment, dynamic assessment, sociocultural theory, L2 writing, and computer-based testing. Nesrin Ozturk studied her B.A. and M.S. at Middle East Technical University. She received her doctoral degree in Reading Education from the Department of Teaching and Learning, Policy and Leadership, College of Education, University of Maryland, College Park. Her research interests focus on metacognition, second language reading and assessment, and educational philosophy. 
She serves on the editorial boards of various journals, conferences, and scientific committees, and she provides workshops or in-service modules for professional development. Dr. Ozturk’s passion for contributing to a democratic and just society drives her to empower the youth and celebrate freedom of mind. She currently works at Izmir Democracy University, Department of Educational Sciences, Turkey. 413

About the Contributors

Hilal Peker (Ph.D., University of Central Florida, 2016) is the Federal Projects Coordinator and the Federal Director of Title V Part B at the Bureau of Federal Educational Programs of Florida Department of Education. She is also a professor of TESOL and teaches a wide variety of courses at the University of Central Florida, Florida State University, Framingham State University, and Saint Leo University. Her research interests include inclusive dual-language immersion programs, reconceptualized L2 motivational self-system (R-L2MSS), bullying-victimization, L2 identity, simulation technology, and teacher training. Ali Raza Siddique is working as Visiting Lecturer at the Department of Applied Linguistics, Government College University, Faisalabad, Pakistan. He is also enrolled as a PhD Candidate at the same department. His research area involves corpus linguistics, and ELT. Aylin Sevimel-Sahin is currently working in English Language Teaching Department, Faculty of Education, Anadolu University, Eskisehir, Turkey. She holds a doctorate degree in the field of ELT testing and assessment. She has been teaching undergraduate courses on language assessment and evaluation, how to teach language skills and areas, approaches to language teaching/learning, materials development, adaptation, and evaluation in language education, and practicum within English language teacher education program at the same university. Her research interests are ELT teacher education, language testing and assessment, practicum/teaching practice, affective domain of ELT, the use of digital tools in language education, and research methodology. Aleem Shakir is working as Assistant Professor at the Department of Applied Linguistics, Government College University, Faisalabad, Pakistan. The area of his research involves corpus linguistics, and ELT. Achu Charles Tante has been in the field of education for over twenty-five years working at all the levels as teacher, teacher trainer, researcher and outreach participant. He is Associate Professor with PhD and MA from the University of Warwick, UK. He trained as teacher trainer in the Higher Teacher Training College, University of Yaoundé, Cameroon. His research interest includes end-of-course English language examination and classroom-based assessment, classroom pedagogy, curriculum design and development, teacher training and professional development, inclusive language classroom and qualitative research design. Ömer Gökhan Ulum is a faculty member in Mersin University. His research interests cover culture, language, foreign language education, and linguistics. Hacer Hande Uysal Gürdal is a professor at Hacettepe University. She received her master’s degree in English Education and her Ph.D. in Foreign Language and ESL Education from The University of Iowa, United States. Her research interests are second language writing, academic discourse, early language teaching, and language policies. Karin Vogt is a Professor for Teaching English as a Foreign Language at the University of Education Heidelberg, Germany. Her research interests include, among others, classroom-based language assessment, intercultural learning, teaching practicums abroad, vocationally oriented language learning, the Common European Framework of Reference for Languages and digital media in the foreign language classroom.


Index

A

Academic Dishonesty 307-308, 312-321, 325-326, 328
Academic Integrity 306-308, 310-328
Accuracy vs. Fluency 179
Alternative Assessment 4-6, 9-10, 152, 184, 198, 212, 259, 280, 331
Alternative Assessment (or Alternate Assessment) 5, 9
Analytic vs. Evocative Autoethnography 179
Apprenticeship of Observation 157, 164, 168, 170, 179
Arabic Corpus 48, 50
Assessment 1-11, 14-16, 19-20, 22-30, 32-39, 43-46, 69, 79, 82-83, 89-91, 97-109, 111-112, 114-117, 120-122, 124-131, 136-141, 143-155, 175-176, 178-179, 181-184, 186-187, 189-191, 194-202, 204, 207-208, 211-216, 222-223, 226-231, 254-256, 258-264, 266-267, 269-290, 292-296, 299-307, 309-340, 346, 349-355
Assessment for Learning 6, 29, 34-35, 128, 154, 354
Assessment Knowledge 8, 254-256, 260, 262-264, 266-267, 270, 277-281, 283
Assessment Literacy 2, 5, 8-10, 104-106, 112, 116, 125-130, 254-255, 259-260, 280-284, 289, 292-295, 303, 305, 308, 324, 326
Assessment Practices 1-3, 6, 109, 124, 254-264, 271-272, 280-289, 293, 299-301, 307, 311-313, 320, 325-327, 330, 334, 351
Assessment Principles 283, 287, 308, 317, 328, 350
Assessment Rubrics 204, 211, 214, 222-223, 226-227, 231
Autoethnography 156, 161-163, 174-179
Automated Speech Recognition (ASR) 88

B

Bibliometric Analysis 329, 331, 339-341, 344, 346, 348, 351, 355
Bibliometrics 340, 349, 355

C

Cheating 292, 306-308, 312-324, 326-327, 337
Classroom-Based Assessment 5-6, 11, 25, 35, 128, 281, 354
CLT (Communicative Language Teaching) 251
CLT Principles 232, 234, 236-237, 241-242, 244-245
Collectivism 36, 38-39, 41-42, 44-46
Communicative Effective 284, 296-297, 300
Communities of Practice (CoPs) 158, 160, 179
Competence Assessment 24-26
Computer-Adaptive Testing (CAT) 4, 9
Computer-Assisted Assessment 329, 334, 337, 351
Computer-Assisted Language Learning 70, 335
Content Analysis 197, 232, 236, 238, 241, 243, 246, 248, 304
Content Validity 8, 37, 130, 256, 287
Continuous Assessment 1-2, 144, 207, 229
Corpora 49-52, 70, 112
Corpus 49-54, 58, 67, 70, 75
Critical Autoethnography 156, 162, 175-176, 179
Cultural Bias 36-38, 46
Culturally Biased Language Assessment 36
Culture 15-20, 24, 27-33, 35-42, 44-46, 101, 154, 161, 186-187, 202, 234-235, 237, 240, 242-243, 245-247, 249, 282-283, 315, 328

D

Deep Learning 48, 327
Deep Neural Network 52-53, 55-57, 59, 68, 70
Differential Item Functioning (DIF) 9
Dishonesty 306-308, 310, 312-321, 325-326, 328
Dynamic Assessment 1, 6, 8-9, 89-91, 97, 99-103, 181, 184, 186, 189-191, 197-202, 212
Dynamic Assessment (DA) 6, 9, 89, 181

E

E-Assessment 327, 329-337, 339-355
EFL 8, 38-41, 43, 83, 89, 92, 100, 126-128, 156, 158-159, 166-167, 170-171, 174, 176, 178, 233-234, 238, 246-251, 254, 256, 262-267, 271-272, 281, 302-303, 324-327, 330, 337, 339, 342-344, 351, 353
Emergency Remote Teaching 306, 326, 328
Emergency Remote Teaching (ERT) 306, 328
English as a Foreign Language 7, 11, 75, 92, 127, 151, 176, 198, 238, 251, 284, 292, 303, 317, 325, 328-329, 331
English Corpus 51, 53
English Language 37, 39, 41-42, 45, 51, 74, 82, 85, 92, 102, 104-106, 109, 112-114, 125-126, 128-131, 138, 146, 151-153, 156, 160-161, 163-166, 168, 171-174, 176-177, 180, 202, 204-207, 209-215, 217, 220-223, 225-234, 236-238, 243-245, 247-252, 268, 281-283, 285-286, 290-292, 295-296, 298-299, 302-304, 318, 322-323, 326, 346-347, 352-353
English Language Testing and Evaluation Course (ELET) 130
ESL 8, 103, 198, 230-234, 238, 249, 251, 281, 330, 347, 351
ESL Materials 232
Ethics 87, 107, 116, 175, 280, 306, 308, 310-311, 315, 320, 322, 324-325, 328
Examination 44, 100, 112, 162, 174, 176-177, 194, 200, 204-207, 209-216, 220-222, 224-226, 228-229, 231, 237, 244, 314

F

Face Validity 37, 130, 287, 292
Fairness 4-5, 9, 260, 287, 306, 311-312, 328
Feature Extraction 48, 53, 55-57, 59
FLEX 181, 186-187, 196, 199, 202
Flexible Dynamic Assessment 202
Flipped Classroom Model (FCM) 112, 130
Foreign Language 1-7, 9, 11, 15-16, 18, 23, 25-30, 32-34, 44, 48, 66, 75, 88-90, 92, 99-100, 104-107, 110, 112, 115, 119, 125, 127, 129, 140, 151-153, 155, 167, 174, 176-177, 180-183, 186-188, 190-191, 194, 196-202, 235, 238, 244, 248, 251, 255-257, 261, 282-286, 289-297, 299-308, 310, 315, 317, 321, 323-325, 328-331, 333, 335, 339-340, 344, 348-349, 352
Foreign Language Assessment 1-2, 6, 104-105, 282, 284-286, 292, 294-295, 304-306, 308, 310, 315, 321, 323, 330
Foreign Language Assessment Literacy 104, 284, 292, 294-295, 305
Foreign Language Education 6, 11, 15-16, 23, 25, 29-30, 105, 284-285, 349
Foreign Language Exploratory (FLEX) Program 187, 202
Foreign Language Proficiency 9, 284-286, 290, 292-295, 299-300, 305, 335
Formative Assessment 1, 6, 27, 35, 126, 131, 136, 139-141, 144, 146, 149-150, 152, 154-155, 202, 212-213, 230, 260-261, 279-280, 285, 287, 302-303, 307, 309, 317, 319-320, 332, 337-339, 350-354
French 15, 23, 77-78, 176, 181-182, 185-193, 195, 198, 200, 205, 290

G

GCE Ordinary Level 204, 206, 211, 231
Globalization 11-14, 18, 23, 32-33, 35, 285
Goodness of Pronunciation 49, 54, 56-58, 70
Grammar 4-5, 71-72, 75-78, 95, 97, 102, 107, 109, 111, 119-120, 134, 138-139, 146, 152, 154, 157, 159-160, 164-171, 173-174, 179, 207-210, 214, 217, 224-225, 228, 234-237, 243-244, 247-248, 250, 258, 290-292, 295-296, 317, 335, 354

H

High-Stakes Test 3, 159, 179
Humanoid Social Robots (HSRs) 88
Human-Robot Interaction 71, 81-82, 84-87

I

Inclusive Education 181, 184, 197, 200, 202
Individualism 38-39, 41-42, 44-46
Intelligent Tutoring Systems 71
Intercultural 11-35, 82, 146, 285
Intercultural Competence 11-17, 19-21, 23-35

L

L2 Socialization 156, 158, 160-164, 172, 174, 180
L2 Writing 89, 91-92, 99-101
Language 1-2, 5-10, 15-16, 18-19, 23, 25-39, 41-45, 48-51, 54-56, 59, 66-90, 92, 94, 97, 99-107, 109-110, 112-116, 119, 121-122, 124-131, 136-137, 139-140, 143-148, 150-184, 186-188, 190-191, 194, 196-202, 204-207, 209-217, 220-223, 225-236, 238-252, 254-273, 275-310, 314-315, 317-318, 321-328, 330-340, 343-344, 346-353, 355
Language Assessment 1-10, 26, 35-36, 38, 43-44, 83, 101, 104-106, 112, 114-116, 122, 124-130, 138, 179, 197, 199, 213, 228-231, 254-257, 259-264, 266-267, 270-271, 275, 277-286, 292, 294-295, 302-310, 315, 317-318, 321, 323-325, 327-328, 330-332, 334-336, 350-353
Language Assessment Knowledge 8, 254-256, 260, 262-264, 266-267, 270, 277-279, 281, 283
Language Assessment Literacy (LAL) 5, 10, 104-105, 130, 308
Language Assessment Practices 2, 6, 124, 254-256, 263, 271, 283, 286
Language Portfolio 145, 153-155
Language Portfolios 136, 145-146
Language Teaching 2, 5, 16, 18, 26-29, 32, 39, 42-44, 74, 85, 88, 92, 100-103, 106, 113, 127, 138-140, 144, 146-147, 151-155, 161, 164, 168, 175-177, 183, 197-198, 201-202, 232, 234-241, 243-244, 246-251, 255-256, 260, 268, 273, 283, 303, 308, 323-326, 329-330, 332, 335, 339-340, 343-344, 346, 348-349, 352
Learning at Home 284, 288, 296-297, 300
Learning Outcomes 3, 6, 13, 15, 71, 74-75, 80, 252, 316
Learning-Oriented Assessment 89-90, 98, 103

M

Mediation 43, 91, 95, 99-100, 102-103, 182-183, 191, 195-196, 202
Misconduct 306-308, 310, 312-314, 316-319, 322
Mispronunciation 48-49, 51-59, 62-63, 66-70
Mispronunciation Detection System 48

N

Nationwide University Exam, High Stakes Exam 156
Neural Network 48-49, 52-57, 59, 63, 66-70

O

Online Assessment 149, 306-310, 312, 314-327, 329, 340
Online Language Assessment 307-310, 317, 328

P

Parental Involvement 284, 286, 288-291, 293-294, 296-301, 303-305
Parenting Helping 284, 296-297, 300
PEELI (Punjab Education and English Language Initiative) 252
Peer Assessment 111, 117, 130, 259, 261, 279-280
Performance-Based Assessment (PBA) 6, 10
Plagiarism 282, 306-308, 312-317, 319, 322-323, 325, 327
Pragmatics 6, 71-72, 79-81, 83, 88
Progress Achievement Tests 130
Punjab 233, 236-238, 243, 249, 252

Q

Qualitative Descriptive Research 292, 303

R

Reading 9, 34, 44, 46, 48, 70-72, 78-79, 84-86, 88, 91, 97, 102, 107, 109, 111, 117, 119-120, 123, 129, 131, 134, 138, 146-149, 151-153, 155, 157, 159-160, 164, 166-167, 173, 177, 179, 201-202, 207-210, 214-215, 221, 224-225, 230, 237-238, 240-243, 245, 251, 258, 263, 265-267, 272-275, 277-279, 281-282, 290, 302-304, 327, 334-335, 354
Regulation 99, 103, 121, 128, 197, 201, 216
Robot Assisted Second Language Learning 71
Robot-Assisted Language Learning (RALL) 71, 86, 88

S

Scripted Dynamic Assessment 202
Second Language 5, 7-9, 33, 44-45, 48-51, 56, 67, 71-88, 90, 99-103, 114, 126, 128, 136-137, 143, 148, 150, 152-154, 156, 165, 171, 175, 187, 197-198, 201-202, 205, 231, 233, 235, 238, 244, 246-247, 249-251, 281-282, 290, 302, 305, 323, 329, 331-332, 336, 338-340, 343, 346-349, 353, 355
Second Language Learning 48, 71-72, 78, 85, 88, 99-101, 137, 152, 197-198, 201, 250, 338, 346-347
Security 30, 70, 81, 306-307, 312, 315, 322, 324, 328, 337
Self-Assessment 2, 24, 104, 111-112, 117, 120, 127-128, 130, 136, 141, 144-145, 149, 155, 257-259, 261, 273-274, 278-280, 285, 353
Skill-Based Assessment 283
Sociocultural Theory 89-90, 99-103, 175, 182-183, 197-198, 202
Speaking and Listening 75, 148, 280, 335
Special Needs Students 181, 184, 186, 191, 194-196
Standardization in Assessment 283
Stories 79, 136, 138-139, 141, 147, 149-151, 155, 162-163, 172, 175-176, 237, 291
Students With Disabilities 181-182, 186, 201
Summative Assessment 2, 4, 44, 182, 202, 212, 261, 285, 287, 293, 308, 315, 318-320

T

Target Language Use (TLU) 10
Teaching to the Test 160, 164-165, 180
Test Development 204, 211, 214-216, 221-223, 226-227, 231
Test Objectives 204, 211, 215, 227, 231
Test Performance 36, 43-44, 214, 231, 334
Test-Driven 295, 299
Testing and Evaluation Education in FL Teacher Training Programs 104
Textbook 28, 126, 166, 178, 199, 232-234, 236-252, 281, 354
Textbook Evaluation 232, 236, 239, 247-252
The European Network for Academic Integrity 311
The International Center for Academic Integrity 311, 327
The Ministry of National Education 105, 121, 284
Transcendence 103
Turkey 1, 36-39, 41, 48, 71, 89, 92, 104-106, 110-112, 114, 119, 121-122, 125-130, 136, 156-160, 162, 165, 167, 170-178, 246, 254, 262, 280, 284-286, 291-293, 298-299, 306, 329, 340, 342, 349
Turkish Context 105, 112, 128
Typically Developing Students 182, 184, 187

U

UDL 181-182, 185-188, 191, 194-197, 200, 203
Universal Design for Learning 181, 184-186, 197, 199-203
Universal Design for Learning (UDL) 181, 184, 197, 203

W

Washback Effect 156, 158-161, 173, 176-177, 180, 257, 261
Word Learning 72, 80, 199

Y

Young Learners 83, 88, 113, 136-141, 144, 146-152, 154-155, 181-182, 229-230, 283

Z

Zone of Actual Development 90, 103
Zone of Proximal Development 89-90, 99, 103, 182, 197, 203
Zone of Proximal Development (ZPD) 182, 203