Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach 9811654484, 9789811654480

This book presents an empirical study to develop and validate a proficiency scale of business English writing in the Chinese tertiary context.


English Pages 189 [186] Year 2021



Table of contents:
Contents
Abbreviations
List of Figures
List of Tables
1 Core Issues in Business English Education in China
1.1 English as a Shared Language for International Business Communication
1.2 BE Education at the Tertiary Level in China
1.3 Problems with BE Education in China
1.3.1 The “Tower of Babel” Problem
1.3.2 Inadequate BE Assessment System
1.4 The Need for a Meta-Language for BE Education in China
1.5 Conclusion
References
2 Assessing Business English: The Role of Language Scales
2.1 Language Scales
2.1.1 Origins
2.1.2 Functions
2.1.3 Purposes
2.2 The CEFR
2.2.1 What is the CEFR?
2.2.2 Uses of the CEFR
2.2.3 Describing Business English Proficiency with the CEFR: Potential Problems
2.3 Scales Describing Business English Proficiency
2.3.1 Categorization of Scales Incorporating BE Proficiency
2.3.2 The ELTDU Stages of Attainment Scale (ELTDU Scale)
2.3.3 The ALTE Framework
2.3.4 The ESU Framework
2.3.5 The International Second Language Proficiency Ratings (ISLPR)
2.3.6 The Canadian Language Benchmarks (CLB)
2.4 Summary of the Review and Discussion
2.5 Conclusion
References
3 Understanding Business English Writing: A Genre Perspective
3.1 BE as an Offshoot of ESP
3.2 ESP Writing Competence
3.3 Genre as ESP Writing Competence
3.3.1 The Concept of Genre
3.3.2 Genre in the ESP Tradition
3.3.3 Genre Knowledge and ESP Writing
3.3.4 The Model of Genre Knowledge
3.4 Conclusion
References
4 Developing a Scale of Business English Writing Proficiency: Considerations and Methods
4.1 Important Considerations in ESP Assessment
4.1.1 EGP Versus ESP
4.1.2 Key Features of ESP Assessment
4.2 General Principles of Scale Development
4.2.1 Types of Measurement Scales
4.2.2 Essentials of Valid Measurement Scales
4.3 Common Methods of Scale Construction
4.3.1 Intuitive Versus Empirical Scale Development Methods
4.3.2 Performance-Based Versus Descriptor-Based Scale Development Methods
4.4 Four Phases of Developing a Scale of BE Writing Proficiency
4.5 Conclusion
References
5 The Establishment of a Descriptive Scheme for the Business English Writing Scale
5.1 Research Procedures
5.2 The Establishment of a BE Writing Descriptive Scheme
5.2.1 Theory-Based Descriptive Categories
5.2.2 Activity-Based Descriptive Categories
5.3 Conclusion
References
6 Evaluating the Quality of the Descriptive Scheme and Descriptors: Teacher Perspectives
6.1 Research Procedures
6.1.1 Instruments
6.1.2 Participants
6.1.3 Data Collection
6.1.4 Data Analysis
6.2 Initial Accumulation and Categorization of BE Writing Descriptors
6.3 Evaluating the Appropriateness of the Descriptive Scheme and Descriptors
6.3.1 Evaluating the Relevance of the Descriptive Framework to the Chinese Tertiary Context
6.3.2 Evaluating the Relevance of the Descriptors to the Chinese Tertiary Context
6.4 Conclusion
References
7 Developing the Business English Writing Proficiency Scale
7.1 Research Procedures
7.1.1 Instrument
7.1.2 Participants
7.1.3 Data Collection
7.1.4 Data Analysis
7.2 Scaling the Descriptors
7.2.1 Psychometric Properties of the Descriptors
7.2.2 Dimensionality Analysis
7.2.3 Differential Item Functioning (DIF) Analysis
7.2.4 Item Statistics
7.3 Constructing the BE Writing Scale
7.3.1 Creating Level Bands of Proficiency
7.3.2 Establishing a Global Scale
7.4 Conclusion
References
8 Validating the Business English Writing Scale
8.1 Research Procedures
8.1.1 Research Instruments
8.1.2 Participants
8.1.3 Data Collection
8.1.4 Data Analysis
8.2 Experts’ Perceptions of Scale Categories
8.2.1 Descriptor Categorization
8.2.2 Category Appropriateness
8.3 Experts’ Perceptions of Descriptor Levels
8.3.1 Inter-Rater Reliability: Rater Consistency and Severity
8.3.2 Bias Analysis
8.4 Scale Level Validation
8.4.1 Item Fit Analysis
8.4.2 Setting Cut-Offs
8.4.3 Validated Scale Versus Draft Scale: Level Assignment
8.5 Conclusion
References
9 Conclusions
9.1 A Recap of the Study
9.2 Presentation of the Scale
9.2.1 Presentation of the Global Scale
9.2.2 Presentation of the Illustrative Scales
9.3 Implications for ESP Scale Development
9.3.1 Theoretical Implications: Understanding the BE Writing Construct
9.3.2 Methodological Implications: ESP Scale Development
9.3.3 Practical Implications: Implementation of the BE Writing Scale
9.4 Limitations of the Study
9.5 Areas for Future Research
9.6 Concluding Remarks
References
Appendix: Sources for Descriptor Collection
References

Li Wang · Jason Fan

Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach

Li Wang
Xi’an International Studies University
Xi’an, China

Jason Fan
Language Testing Research Centre
University of Melbourne
Melbourne, VIC, Australia

ISBN 978-981-16-5448-0
ISBN 978-981-16-5449-7 (eBook)
https://doi.org/10.1007/978-981-16-5449-7

© Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore


Abbreviations

ALTE    The Association of Language Testers in Europe
ASLPR   Australian Second Language Proficiency Ratings
BE      Business English
BEC     Business English Certificate
BULATS  Business Language Testing Service
CCLB    The Centre for Canadian Language Benchmarks
CEFR    The Common European Framework of Reference
CET     College English Test
CLB     The Canadian Language Benchmarks
CSE     China’s Standards of English
CTT     Classical Test Theory
DIF     Differential Item Functioning
EAP     English for Academic Purposes
EBLF    English as a Business Lingua Franca
EBP     English for Business Purposes
EFL     English as a Foreign Language
EGP     English for General Purposes
ELTDU   The English Language Teaching Development Unit
EOP     English for Occupational Purposes
ESL     English as a Second Language
ESP     English for Specific Purposes
ESU     The English-Speaking Union
FSI     The Foreign Service Institute
ISLPR   The International Second Language Proficiency Ratings
LSP     Languages for Specific Purposes
MFRM    The Many-facets Rasch Measurement
MMR     Mixed Methods Research
NR      New Rhetoric
RSA     Royal Society of Arts
SFL     Systemic Functional Linguistics
TEM     Test for English Majors
TLU     Target Language Use
TOEIC   Test of English for International Communication

List of Figures

Fig. 2.1  The CEFR vertical levels
Fig. 2.2  Structure of the ALTE framework
Fig. 3.1  The model of genre knowledge (from Tardy, 2009, p. 22)
Fig. 4.1  The four phases of scale development and validation
Fig. 5.1  The initial descriptive scheme and categories. Notes: Minor revisions of the descriptive scheme were made in the later stages of the study, including: (1) Letter and E-mail were collapsed into one category as Letter/E-mail at Phase II of the study; (2) the categories of Documentary materials, News Release and Note were deleted from the scheme at Phase II and Phase IV of the study respectively; (3) the heading of the category Orthographical control was revised as Mechanics in Phase IV of the study
Fig. 6.1  Research procedures of teachers’ evaluation of the descriptive scheme and descriptors
Fig. 6.2  The revised descriptive scheme of the BE writing scale
Fig. 7.1  Wright map. Notes: M—the mean of the person or item estimates; S—one standard deviation from the mean; T—two standard deviations from the mean
Fig. 7.2  Category probability curves for the five categories in the questionnaire
Fig. 7.3  Plot of item estimates between BE majors and non-BE majors
Fig. 8.1  Wright map
Fig. 9.1  The descriptive scheme of the validated BE writing scale

List of Tables

Table 2.1   Common reference levels of the CEFR: global scale (from Council of Europe, 2001, p. 24)
Table 2.2   Types of BE proficiency scales
Table 2.3   A sub-scale of the ALTE framework in the category of work
Table 2.4   Yardsticks and stages of the ESU framework
Table 2.5   Selected information from writing benchmark 9
Table 3.1   ESP and SFL perspectives on genre
Table 4.1   Sample of Bachman and Palmer’s scale (1983) (adapted from Bachman, 1990, p. 327)
Table 5.1   Writing assessment criteria adopted by BE tests
Table 5.2   Categories of business lexis
Table 5.3   Structural description of a sales letter
Table 6.1   A sample of the handout for individual evaluation of descriptors
Table 6.2   Codes used for individual evaluation of descriptors
Table 6.3   A sample of the handout for group evaluation of descriptors
Table 6.4   Descriptor documentation format
Table 6.5   A sample section of the handout for session 3 of Workshop II
Table 6.6   Methods and examples of descriptor modification
Table 7.1   A sample section in the students’ questionnaire
Table 7.2   Summary statistics for Rasch analysis
Table 7.3   Category structure of the rating scale
Table 7.4   A summary of the descriptors with the largest contrast loadings
Table 7.5   Items demonstrating possible DIF effect
Table 7.6   A summary of misfitting items
Table 7.7   Levels and cut-offs of the scale
Table 7.8   The illustrative scale of grammar
Table 7.9   Analysis chart built for the illustrative scale of “grammar”
Table 7.10  BE writing proficiency: a global scale
Table 8.1   A sample section of the questionnaire
Table 8.2   Details of the participating teaching experts
Table 8.3   Details of the participating domain experts
Table 8.4   Details of potentially problematic descriptors
Table 8.5   Rater measurement report
Table 8.6   Group measurement report
Table 8.7   Rater group: item bias analysis
Table 8.8   A summary of the misfitting items
Table 8.9   Summary of item measurement report
Table 8.10  Levels and cut-offs based on experts’ ratings
Table 8.11  Summary of descriptors assigned to different levels
Table 8.12  Revisions made in the validation phase
Table 9.1   Global scale of the BE writing scale
Table 9.2   Illustrative scale—vocabulary
Table 9.3   Illustrative scale—grammar
Table 9.4   Illustrative scale—mechanics
Table 9.5   Illustrative scale—genre format
Table 9.6   Illustrative scale—cohesion and coherence
Table 9.7   Illustrative scale—strategic competence
Table 9.8   Illustrative scale—sociolinguistic competence
Table 9.9   Illustrative scale—intercultural competence
Table 9.10  Illustrative scale—business knowledge
Table 9.11  Illustrative scale—letter and e-mail
Table 9.12  Illustrative scale—report
Table 9.13  Illustrative scale—minutes
Table 9.14  Illustrative scale—memo
Table 9.15  Illustrative scale—resume
Table 9.16  Illustrative scale—summary

Chapter 1

Core Issues in Business English Education in China

1.1 English as a Shared Language for International Business Communication

As the world becomes increasingly globalized through technological innovation and economic exchange, the past three decades have witnessed exponential growth in communication between speakers of different languages. As a result, the importance of having one shared language around the globe has never been more keenly felt. Largely for political and historical reasons, the English language assumed the role of an international language and has functioned as the common discourse in the international arena. With more than 350 million people currently speaking English as their first language and more than 430 million speaking it as a second language, English has become the fastest-spreading language in human history, and the scale and scope of its influence are unprecedented and staggering (Crystal, 2003).

In the world of business, English is also gaining dominance and has been used as the default medium of communication as international trade continues to expand, bringing more people into international transactions across territorial, linguistic, cultural and socio-political boundaries. People conducting business in English include not only native English speakers from countries such as the USA, the UK and Australia, but also non-native English speakers. For instance, studies by Charles and Marschan-Piekkari (2002) in Finland, Akar (2002) in Turkey and Bilbow (2002) in Hong Kong all confirm that English is an intrinsic part of communication in multinational settings and has become a fact of life for many business people. Despite the controversy and criticism concerning the dominance of English over other languages, it is widely acknowledged that English is and will continue to be the dominant language of the international business community in the coming decades (Nickerson, 2005).
Various terms have been developed to describe the supremacy of English in international business communication, such as Business English (BE), English for Business Purposes (EBP), and English as a Business Lingua Franca (EBLF).

Different foci indicated by these terms notwithstanding, they share common ground, referring to the type of English used in business settings to carry out communication such as scheduling meetings, writing business reports and conducting negotiations. An understanding of the current and sustained supremacy of English in the business world has led to a growing demand for proficient BE users, and the language teaching community has been enthusiastic in responding to this need.

Since the 1960s, the teaching of BE has undergone several stages of development, with its content becoming more diversified and sophisticated. Initially, target specialist vocabulary was seen as the key feature of BE courses, which were preoccupied with teaching business-related words and terminology, with little consideration of how learners might apply what they had learned in real life. In the mid-1970s and 1980s, BE courses began to adopt the functional approach to language learning, laying emphasis on formulaic language for recommending, giving opinions, showing agreement, and so on. From the 1990s onwards, BE education experienced greater expansion, incorporating various approaches to tap into more specific skill areas such as presentation techniques, negotiating, and effective meeting skills, with a view to meeting the needs of a rapidly growing number of learners, who can be broadly divided into two types, namely job-experienced learners and pre-experience learners (Ellis & Johnson, 2005).

Job-experienced learners, as the name suggests, have gained practical experience of using English on the job and often have a set of highly specified language learning needs in relation to their job demands. BE courses designed for them typically focus on very specific tasks or activities, such as improving oral fluency or understanding clients with various English accents.
Pre-experience learners, by contrast, are mostly university undergraduates with no real-life business experience who aspire to enter the business world. They are often provided with courses focusing on the knowledge and skills of more general BE communication, aiming to equip them linguistically to cope with the typical communicative demands of the international business community.

1.2 BE Education at the Tertiary Level in China

The teaching of BE in China dates back to the early 1950s, when the College for Senior Cadres of Commerce, the first college of its kind specializing in training personnel for foreign trade, was established in Beijing. The college offered an “English Translation” program whose students were expected to graduate as qualified translators and interpreters in the field of foreign trade. To this end, these students were required to master a foreign language oriented to foreign trade and to be familiar with the techniques of conducting foreign trade as well as China’s trade policies. Three successors of this college, namely Beijing College of Foreign Trade, Beijing Institute of Foreign Trade, and the University of International Business and Economics, carried on its tradition by enrolling English majors and exposing them to BE courses, whose name evolved from “English Translation for Foreign Trade” in the


1950s and 1960s, to “English for Foreign Trade” in the 1970s and 1980s, “English for Specific Purposes (International Trade)” in the early 1990s, “English (Economics and Trade)” in the late 1990s, and “Business English” from 2000 onward (Zhang, 2007, p. 400). The core courses of this embryonic form of BE education in China included Correspondence for Foreign Trade, Oral English for Foreign Trade, and Selected Readings on International Business and Economics from Western Newspapers. Textbooks for these courses were compiled either by experienced teachers or by teachers in collaboration with businesspeople. A popular textbook entitled Correspondence for Foreign Trade, for example, was produced by veteran traders (Zhang, 2007). Its source materials were taken from real-life business transactions and were organized to mirror the procedures of foreign trade, such as establishing business relations, making inquiries, and making offers and counter-offers.

Along with China’s open-door policy, modernization, and socio-economic development, the enterprise of BE teaching has grown dramatically and is currently characterized by both scale and enthusiasm (Zhong et al., 2015). The content of BE education has concomitantly expanded from a few courses centred on foreign trade issues to a wider range of courses reflecting the broader scene of international business. To date, over 1,000 colleges and universities offer BE courses to two broad groups of students: Business English majors and students majoring in business-related subject areas such as economics, international trade, finance, and accounting. The BE major programs have been spawned from the traditionally linguistics- and literature-oriented English major programs.
Although both emphasize the importance of laying a solid foundation in English language proficiency, BE programs endeavor to integrate language learning with business content by providing more systematic and comprehensive training in business knowledge and skills (Liu, 2009). After years of persistent effort, a national curriculum for BE majors has been implemented, delineating that business knowledge and practices, business discourse, and English proficiency should all be incorporated in the BE program (Chen & Wang, 2009). Content-based courses such as international business culture, practices in international business communication, business ethics, and business writing are hence offered in addition to English language courses, preparing BE majors to be competent participants in the international business arena. In 2007, BE was officially recognized by the Chinese Ministry of Education as an independent Bachelor of Arts program, running parallel with the English linguistics and literature program. Since then, BE as a major has been gathering momentum, steadily nurturing each year an increasing number of graduates who are highly competitive in the job market (Wang & Li, 2011).

Different from BE majors, who typically cluster in universities originally specializing in foreign language education, the second group of BE learners comprises non-English majors studying in comprehensive universities. Prior to the 1990s, these students were required to take College English courses aimed at enhancing their general English proficiency. However, it was soon revealed that English for General Purposes (EGP) education was insufficient for these graduates to engage in international communication activities, especially in their disciplinary areas (Feng, 2009).


In response to this problem, a curriculum reform of College English education was carried out, featuring a series of top-down and bottom-up initiatives to promote ESP teaching (Feng, 2009). As a result, despite voices cautioning against such curricular reforms (e.g., Hu, 2009), it has become a trend for university students to receive English-medium instruction in disciplinary knowledge through so-called ESP courses. For those majoring in business-related subjects such as economics, international trade, finance, accounting, and marketing, BE is taught in the form of supplementary courses that complement the students’ core disciplinary studies.

1.3 Problems with BE Education in China

1.3.1 The “Tower of Babel” Problem

The boom of BE education in the Chinese tertiary context has gained widespread attention across the nation, and calls have constantly been made to further accelerate its growth. Scrutiny of empirical studies examining the effectiveness of BE education in China, however, reveals two glaring issues. Firstly, owing to the lack of a common yardstick, BE education in Chinese universities varies enormously across geographical and institutional divides, giving rise to a host of “Tower of Babel” problems (e.g., Ruan, 2005; Zhong et al., 2015). Descriptions of language competence in course standards or teaching requirements, for instance, are generally simple and vague, rendering it extremely hard, if not impossible, to evaluate and safeguard the quality of the programs (Guan, 2005). Even for BE majors, who have access to a national curriculum, different courses are offered depending on different conceptualizations of what BE is and the abilities and skills it entails (Wang et al., 2011). The situation is even more perplexing in settings where BE courses are offered as part of the hasty drive to promote ESP teaching. In the absence of clearly formulated guidelines at the macro level, the courses are delivered in a virtually arbitrary manner, and worrisome diversity in BE syllabuses and teaching objectives has been reported by studies investigating classroom teaching and learning practices (Gu, 2010). In a similar vein, the qualifications of BE teachers vary significantly: some are English teachers struggling with insufficient business knowledge, whereas others are discipline specialists, many of whom have inadequate English proficiency.
The language of instruction also varies: some institutions require the use of English throughout the course, while others allow frequent use of Chinese to ensure students’ understanding of the content knowledge, although the ratio of Chinese to English is seldom explicitly stipulated (Yang, 2011).


1.3.2 Inadequate BE Assessment System

An inadequate assessment system, often the least considered area in the formal sectors of BE education, constitutes another problem embedded in current BE education in China (e.g., Wang, 2007; Yan, 2012). Currently, institutions of higher learning as well as employers still have to rely on either the College English Test (CET)1 or the Test for English Majors (TEM)2 to infer Chinese learners’ BE proficiency (e.g., Liu, 2009; Xie, 2013). These two test batteries, although professionally developed and administered, are essentially oriented to the assessment of English for general or academic purposes, which renders them inappropriate instruments for assessing BE proficiency. In light of this problem, a growing number of undergraduates are now resorting to international or domestically developed commercial tests as more accurate indicators of their BE ability. Examples of international BE tests available to Chinese learners today include BEC (Business English Certificate), BULATS (Business Language Testing Service), and TOEIC (Test of English for International Communication), among others. BEC, in particular, has established its niche in the Chinese market and ranks as the second largest overseas exam administered in China (Chen et al., 2010). Domestically, numerous BE tests have also been devised, ranging from very specific ones such as the Test for Customs Declarants and the Test for Export Sales to comparatively general ones such as the Test for Business English Majors. Because these BE tests differ considerably in construct definition, proficiency level descriptions, and scoring systems, it is legitimate to raise concerns about issues such as qualification recognition and test comparability.

1.4 The Need for a Meta-Language for BE Education in China

English language education has long been a compulsory requirement at all levels of education in China. English curricula at different levels, however, have been increasingly criticized as highly segmented and inconsistent (Jin et al., 2017). In 2014, the National Education Examinations Authority (NEEA), the governmental institution in charge of educational examinations across the country, launched a project under the supervision of the Chinese Ministry of Education to develop a national framework of reference for English language education, known as China’s Standards of English (CSE). The CSE, officially unveiled in 2018, defines English language proficiency at nine levels in terms of different aspects of the construct (i.e., listening, speaking, reading, writing, translation, interpretation,

1 The CET is an English language test battery targeting university undergraduates in mainland China, consisting of two levels: CET Band 4 and CET Band 6.
2 The TEM is an English language test for English-major undergraduates in Chinese universities, consisting of two levels: TEM Band 4 and TEM Band 8.


organizational knowledge, and pragmatic knowledge) using specific, accurate, and easy-to-understand proficiency descriptors.3 The debut of the CSE is an important milestone in English educational reform in China. The CSE is in essence a proficiency scale intended to serve two major purposes (Fan & Frost, 2021; Jin et al., 2017). At the national level, the CSE is expected to act as a common point of reference for the alignment of teaching requirements, learning objectives, and assessment criteria, with a view to improving the consistency, coherence, and efficiency of English language education at all stages. At the international level, the CSE is intended to act as a benchmark whose levels can be linked to those of other well-established English standards (e.g., the Common European Framework of Reference for Languages, the Association of Language Testers in Europe Can-Do Statements, the American Council on the Teaching of Foreign Languages Guidelines, and the Canadian Language Benchmarks) and English language tests (e.g., IELTS, Aptis, TOEFL iBT, PTE Academic); by doing so, the mutual recognition of degrees, test results, and certificates between China and other countries will be substantially facilitated.

With the CSE being widely adopted across China, it is sobering to note that its influence is restricted to the sphere of general English education. Scale-building in the ESP domain, especially that of BE, currently the largest sector of growth within ESP (Belcher et al., 2011), remains a void to be filled. As such, we argue for an urgent need to develop a scale of BE proficiency in light of the rapidly growing number of BE learners and the “Tower of Babel” problems associated with BE education in the Chinese tertiary context. The primary purpose of such a proficiency scale is to provide a comprehensive system that can inform critical issues in BE education in China, such as the development of curricular requirements, teaching objectives, and assessment criteria.
Comprising a set of empirically calibrated, independently formulated, and user-friendly descriptors across different levels of BE proficiency, the scale will help enhance the transparency, coherence, and effectiveness of BE learning, teaching, and assessment in China by serving as a starting point to “bring curricula, pedagogy, and assessment into fruitful interaction with one another” (Little, 2007, p. 652). In an English as a Foreign Language (EFL) context like China, a valid and useful BE proficiency scale should encompass all essential language skills, including listening, reading, speaking, writing, and translating. However, developing a high-quality proficiency scale requires substantial resources, such as financial investment, manpower, and close collaboration between different organizations (Yang et al., 2012), which renders a project of such a nature too challenging to complete in a single study. We therefore focused on the writing skill in business communication, the most frequently used skill in daily business activities in EFL contexts (e.g., Chew, 2005; Evans, 2010).

3 The CSE is available at http://www.neea.edu.cn/res/Home/1908/0c96023675649ac8775ff3422f91a91d.pdf.


1.5 Conclusion

This chapter first provides an overview of the current supremacy of English as a shared language in the international business arena and the increasing demand for proficient BE learners. Next, a review explores the historical development of BE education in China’s tertiary settings, the context within which the study reported in this book is situated. After discussing the “Tower of Babel” problems identified in current BE education in Chinese universities, we argue for the need to develop a scale of BE writing with a view to enhancing the transparency, coherence, and effectiveness of BE learning, teaching, and assessment in China.

References

Akar, D. (2002). The macro contextual factors shaping business discourse: The Turkish case. International Review of Applied Linguistics in Language Teaching, 40(4), 305–322.
Belcher, D., Johns, A. M., & Paltridge, B. (Eds.). (2011). New directions in English for specific purposes research. The University of Michigan Press.
Bilbow, G. (2002). Commissive speech act use in intercultural business meetings. International Review of Applied Linguistics in Language Teaching, 40(4), 287–303.
Charles, M., & Marschan-Piekkari, R. (2002). Language training for enhanced horizontal communication: A challenge for MNCs. Business Communication Quarterly, 65(2), 9–29.
Chen & Wang. (2009). Developing national curriculum for BA program in Business English of China. Foreign Languages in China, 6, 4–11/21.
Chen, X., He, M., & Hugh, B. (2010). The use of BEC as a measurement instrument in higher education in China. Cambridge ESOL Research Notes, 40, 13–15.
Chew, K. (2005). An investigation of the English language skills used by new entrants in banks in Hong Kong. English for Specific Purposes, 24(4), 423–435.
Crystal, D. (2003). English as a global language. Cambridge University Press.
Ellis, M., & Johnson, C. (2005). Teaching business English. Oxford University Press.
Evans, S. (2010). Business as usual: The use of English in the professional world in Hong Kong. English for Specific Purposes, 29(3), 153–167.
Feng, A. (2009). English in China: Convergence and divergence in policy and practice. AILA Review, 22, 85–102.
Gu, Z. (2010). Instructional design of ESP curriculum—A constructivist perspective. Unpublished doctoral dissertation, Shanghai International Studies University.
Guan, C. (2005). The significance and methods of implementing needs analysis in business English courses. Foreign Languages and Their Teaching, 3, 37–40.
Hu, G. W. (2009). The craze for English-medium education in China: Driving forces and looming consequences. English Today, 25(4), 47–54.
Jin, Y., Wu, Z., Alderson, C., & Song, W. (2017). Developing the China Standards of English: Challenges at macropolitical and micropolitical levels. Language Testing in Asia, 7(1), 1–19.
Little, D. (2007). The Common European Framework of Reference for Languages: Perspectives on the making of supranational language education policy. The Modern Language Journal, 91(4), 645–655.
Liu, F. (2009). A review of the historical development of Business English studies in China. Foreign Language World, 6, 10–16.
Nickerson, C. (2005). English as a lingua franca in international business contexts. English for Specific Purposes, 24(4), 367–380.

8

1 Core Issues in Business English Education in China

Ruan, J. (2005). The teaching objectives and teaching principles of college business English courses. Foreign Language World, 107(3), 26–31.
Wang, L., & Li, L. (2011). An analysis of foreign business language teaching and research in China. Foreign Language World, 147(6), 6–14.
Wang, L., Chen, Z., & Zhang, Z. (2011). Developing national curriculum for BA program in Business English of China. The Asian ESP Journal, 7(1), 29–61.
Wang, X. X. (2007). An analysis of English majors’ needs in BE course design. Unpublished MA thesis, Northeast Normal University.
Xie, H. (2013). Curriculum design based on needs analysis for business English majors of an independent institute—A case study of Engineering and Technology College of Yangtze University. Unpublished MA thesis, Yangtze University.
Yan, M. (2012). A study of genre-based discursive competence of business English: Construct definition and test development. Unpublished doctoral thesis, Shanghai International Studies University.
Yang, H., Zhu, Z., & Fang, X. (2012). The development of the national common framework of language proficiency scales—Principles, approaches, and validation study. Shanghai Foreign Language Education Press.
Yang, N. W. (2011). An investigation on the professional quality of business English teachers. Unpublished MA thesis, South-Central University for Nationalities.
Zhang, Z. C. (2007). Towards an integrated approach to teaching business English: A Chinese experience. English for Specific Purposes, 26(4), 399–410.
Zhong, W., Zhang, W., & He, J. (2015). On positioning BA in business English program in higher education. Foreign Languages in China, 12(1), 4–10.

Chapter 2

Assessing Business English: The Role of Language Scales

In this chapter, we briefly discuss important issues concerning language scales, such as their origins, functions, and purposes. The CEFR, the most widely used language proficiency scale in Europe and indeed worldwide, is then introduced, along with a discussion of the problems associated with the prevalent practice of using CEFR descriptors to profile BE proficiency. After that, several language proficiency scales developed or implemented to gauge BE proficiency are reviewed, followed by a discussion of their strengths and limitations. Based on the review, we argue for the need to develop a scale of BE writing proficiency tailored to the teaching, learning, and assessment of BE in the Chinese tertiary context.

2.1 Language Scales

2.1.1 Origins

Language scales are also known as “band scores, band scales, profile bands, proficiency levels, proficiency scales, proficiency ratings” (Alderson, 1991, p. 71), or “guidelines, standards, levels, yardsticks, stages, scales, or grades” (North, 2000, p. 11). What they all have in common, nevertheless, is that they normally consist of “a series of ascending bands of proficiency” (Council of Europe, 2001, p. 40) and “characteristic profiles of the kinds and levels of performance which can be expected of representative learners at different stages” (Trim, 1978, p. 6). The many names given to language scales reflect their quite different backgrounds. According to North (2000), language scales mainly originate from three sources: rating scales, examination levels, and stages of attainment.



The majority of existing language scales derive from rating scales, which are typically used to facilitate the assignment of a grade in a test. The first significant rating scale in its modern form is the US Foreign Service Institute scale (the FSI scale), developed in the early 1950s (North, 2000). Ranging from level 0 to level 5, the FSI scale was designed to guide the rating of subjective skills, with each level associated with a label and a description of what that level meant in terms of language performance.

The second source of scale development is descriptions of examination levels. The suite of communicative examinations developed by the Royal Society of Arts (RSA) for English as a Foreign Language is a case in point. The exams define content and performance specifications for four levels, which share the same categories of performance criteria, called “degrees of skill”. For instance, the criteria for Oral Interaction include Accuracy, Appropriacy, Range, Flexibility, and Size. The set of descriptors, presented in the teachers’ guide as a 20-cell grid (5 categories × 4 levels), makes up an analytic scale of proficiency (North, 2000).

A third origin of language proficiency scales is the description of stages of attainment as part of the objectives for an educational system or course of instruction. Such scales are summaries of outcomes aligned to graded levels considered critical to end-users of the educational system or course. Being process- as well as product-motivated, they feature detailed content specifications and descriptors elaborating the quality of language performance. The first scale of this kind is the Stages of Attainment Scale developed by the English Language Teaching Development Unit, the ELT division of Oxford University Press (ELTDU, 1976, cited in North, 2000). Drawing inspiration from the Threshold Level, the ELTDU scale sets up a series of eight levels, with level descriptors drafted on the basis of task analysis.

2.1.2 Functions

Scales of language proficiency are typically criterion-referenced, offering descriptions of aspects of language knowledge or of what language learners can do in the real world along a proficiency continuum. They are noted for a number of advantages, as they can be used to:

(1) provide a “stereotype” with which learners can assess their own language proficiency (Trim, 1978);
(2) establish a framework of reference which can describe achievement in a system in terms meaningful to all stakeholders of that system (Brindley, 1991; Trim, 1978);
(3) enable comparison within one system in terms of syllabus planning, materials organisation, progress and exit assessment, and certification (North, 1991);
(4) provide a tool to track learner progress and increase motivation (Council of Europe, 2001);
(5) increase the reliability of subjective ratings, especially of the productive language skills, and provide a common standard and terminology for such judgments (Alderson, 1991);
(6) provide a common discourse for reporting results from teacher assessments, scored tests, rated tests, and self-assessment (Alderson, 1991; Griffin, 1989); and
(7) facilitate the establishment of comparisons between systems or populations using a common yardstick (Carroll & West, 1989).

Apart from the inherent attractions of language proficiency scales, two important influences have further accelerated the widespread application of scales in the educational and testing spheres over the past few decades. The first is a general movement towards greater transparency in educational systems than has traditionally been the case. The second comprises moves towards greater international integration, which underline the importance of describing language learning objectives and accomplishments in qualitative terms (North & Schneider, 1998). The result has been a proliferation of language scales developed after the 1950s, when the US FSI Scale made its appearance. Examples include the Eurocentres Scale of Language Proficiency (North, 1991), the British National Language Standards (Languages Lead Body, 1992), the ALTE Framework (Association of Language Testers in Europe, 1998), and the Common European Framework of Reference (CEFR, Council of Europe, 2001).

2.1.3 Purposes

With language scales being increasingly applied in various contexts, it is worth noting at this point that scales of language proficiency are designed for different purposes, and problems may arise when a scale designed for one purpose is used for another. According to Alderson (1991), scales can be distinguished as user-oriented, assessor-oriented, and constructor-oriented according to their intended purposes of use. User-oriented scales involve descriptions of levels of performance which provide worded information about the typical or likely behaviors of candidates at any given level. Such descriptions help test users, including employers, test-takers, and admissions officers, to interpret test results traditionally presented solely in the form of numerical scores. The second type, assessor-oriented scales, is often adopted for writing and speaking tests to facilitate the rating process. During an oral interview, for instance, an assessor elicits language behavior from candidates and then compares the elicited behavior with the scale descriptions, be they holistically or analytically formulated. The third type is designed to act as a set of specifications for test constructors, describing the sorts of texts, tasks, and items appropriate for a given level of test candidates; its major function is to guide the construction of tests at targeted levels. Following Alderson’s line of thought, Pollitt and Murray (1996) made an important addition to the classification


of scale purposes by pointing out that scales can also be diagnosis-oriented; that is, they can be used to identify strengths and weaknesses in learner performances.

In the early history of scale development, a scale was designed to serve one major function. The FSI scale, for instance, was created solely to aid the rating of oral performance. However, as the application of language scales gradually extended from the field of language assessment to language teaching and learning in general, a number of scales, oftentimes called “language proficiency scales”, “frameworks”, or “standards”, were developed to cater for all four scale functions outlined above, with a view to providing a common point of reference in a vast array of educational and testing contexts. Such proficiency scales are characterized by their comprehensiveness and usually consist of a host of sub-scales, each addressing one aspect of language performance such as grammar or pronunciation. A typical example is the Common European Framework of Reference for Languages (CEFR) (Council of Europe, 2001), which provides guidance for the learning, teaching, and assessment of more than 40 languages and has been exerting a far-reaching impact on countries and regions around the globe (Little, 2007). Another example similar to the CEFR is China’s Standards of English (CSE), officially unveiled in 2018, which serves as a unified standard for English learning, teaching, and assessment in China (e.g., Jin et al., 2017; Liu, 2019).

2.2 The CEFR

2.2.1 What is the CEFR?

In response to the long-recognised need for clearly defined proficiency levels, the CEFR was published by the Council of Europe in 2001, aiming to “promote and facilitate cooperation among educational institutions in different countries; provide a sound basis for the mutual recognition of language qualifications; [and] assist learners, teachers, course designers, examining bodies and educational administrators to situate and co-ordinate their efforts” (Council of Europe, 2001, p. 6).

A salient feature of the CEFR lies in its comprehensiveness. Containing more than 50 illustrative scales, the CEFR consists of both a horizontal and a vertical dimension. The horizontal dimension, referred to as its taxonomic descriptive scheme, maps out a sophisticated set of descriptive categories for profiling language proficiency. These fall into two broad categories: (a) “language activities” in which language users engage, and (b) “communicative language competences” derived from models of language use (Hulstijn, 2011). In Mislevy’s (1993, p. 343) terms, the descriptive scheme can be considered a “learner model”, namely “a simplified description of selected aspects of the infinite varieties of skills and knowledge [that] characterize real students”.

2.2 The CEFR

13

A

B

C

Basic User

Independent User

Proficient User

A1 Breakthrough

A2

B1

B2

C1

C2

Waystage

Theshold

Vantage

Effective Operational

Mastery

Proficiency

Fig. 2.1 The CEFR vertical levels

bands (A = Basic User; B = Independent User, C = Proficiency User). Each band is subdivided into upper and lower levels: (A1 = Breakthrough; A2 = Waystage; B1 = Threshold; B2 = Vantage; C1 = Effective Operational Proficiency; C2 = Mastery) (see Fig. 2.1). Adopting an action-oriented approach, the CEFR views learners as “members of society who have tasks to accomplish in a given set of circumstances” (Council of Europe, 2001, p. 9). CEFR level descriptors are formulated in “can-do statements” to characterize the act of language communication in terms of what language users are able to do and the corresponding competences (i.e. knowledge and skills) required. At A2, for example, learners can “understand sentences and frequently used expressions related to areas of most immediate relevance” (e.g. very basic personal and family information, shopping, local geography, employment). At B1, they can “understand the main points of clear standard input on familiar matters regularly encountered in work, school, leisure, etc.”; and “deal with most situations likely to arise whilst travelling in an area where the language is spoken”. At C1, they can “understand a wide range of demanding, longer texts, and recognize implicit meaning; and can “express him/herself fluently and spontaneously without much obvious searching for expressions”. These examples are selected from the “Global Scale” of the CEFR, which is presented in Table 2.1. The reference level descriptors are one of the important innovations brought about by the CEFR because of their reader-friendly manner of formulation, adhering to principles of positiveness (describing competence, not incompetence), definiteness (avoiding vagueness), clarity (jargon-free and written in simple syntax), brevity (up to about 25 words), and independence (rather than having meaning only in relation to other descriptors). 
The success of the CEFR is also attributed to the rigorous empirical and statistical procedures involved in developing the descriptors (see North (2000) for a detailed account of the development of the descriptors in the CEFR).

2.2.2 Uses of the CEFR

The CEFR has been gaining international recognition since its initial publication in 2001. Over the past 20 years or so, it has been increasingly visible as a reference tool


2 Assessing Business English: The Role of Language Scales

Table 2.1 Common reference levels of the CEFR: global scale (from Council of Europe, 2001, p. 24)

Proficient user

C2 Can understand with ease virtually everything heard or read. Can summarize information from different spoken and written sources, reconstructing arguments and accounts in a coherent presentation. Can express him/herself spontaneously, very fluently and precisely, differentiating finer shades of meaning even in more complex situations

C1 Can understand a wide range of demanding, longer texts, and recognize implicit meaning. Can express him/herself fluently and spontaneously without much obvious searching for expressions. Can use language flexibly and effectively for social, academic and professional purposes. Can produce clear, well-structured, detailed text on complex subjects, showing controlled use of organisational patterns, connectors and cohesive devices

Independent user

B2 Can understand the main ideas of complex text on both concrete and abstract topics, including technical discussions in his/her field of specialization. Can interact with a degree of fluency and spontaneity that makes regular interaction with native speakers quite possible without strain for either party. Can produce clear, detailed text on a wide range of subjects and explain a viewpoint on a topical issue giving the advantages and disadvantages of various options

B1 Can understand the main points of clear standard input on familiar matters regularly encountered in work, school, leisure, etc. Can deal with most situations likely to arise whilst travelling in an area where the language is spoken. Can produce simple connected text on topics which are familiar or of personal interest. Can describe experiences and events, dreams, hopes and ambitions and briefly give reasons and explanations for opinions and plans

Basic user

A2 Can understand sentences and frequently used expressions related to areas of most immediate relevance (e.g. very basic personal and family information, shopping, local geography, employment). Can communicate in simple and routine tasks requiring a simple and direct exchange of information on familiar and routine matters. Can describe in simple terms aspects of his/her background, immediate environment and matters in areas of immediate need

A1 Can understand and use familiar everyday expressions and very basic phrases aimed at the satisfaction of needs of a concrete type. Can introduce him/herself and others and can ask and answer questions about personal details such as where he/she lives, people he/she knows and things he/she has. Can interact in a simple way provided the other person talks slowly and clearly and is prepared to help

for language teaching, learning and assessment in Europe and beyond (Alderson, 2002; North, 2014). The CEFR level labels and reference level descriptions have become particularly ubiquitous at different educational levels, affecting adults as well as young learners, whether learning an L1 or an L2 (Figueras, 2012). For instance, words such as Beginner, Intermediate and Advanced were commonly used in the past to describe language learners’ proficiency, but criticisms were levelled at the interpretation of these terms, in that one person’s “intermediate” might be seen by another as “advanced” (Alderson, 2017).


Today, these familiar terms have largely been replaced by the six CEFR levels, which are substantiated by reader-friendly can-do statements specifying more explicitly what, and how well, a learner “can do” with the target language. In the fields of language teaching and learning, the comprehensiveness and clarity that the CEFR levels provide for profiling learner abilities are readily welcomed, since modern curricula and language programs are increasingly outcome- and process-oriented, emphasising real-life language use. As a result, the CEFR levels have been used as a key element to frame language policy, language syllabuses, curriculum guidelines and textbooks. With CEFR-labelled objectives, teachers can provide their students with a learning landscape, marking the starting point of their proficiency and the path towards their language-learning aims. The level descriptions can also be used to diagnose learners’ strengths and weaknesses in language learning, helping them gain insight into their learning processes. As such, more individualized learning trajectories can be achieved, with each learner clearer about the proficiency level that they should be aiming at.

Apart from the role the CEFR plays in language teaching and learning, its impact on language assessment is even more pronounced, affecting virtually every aspect of assessment practice, from test development, rating scale construction and test score interpretation to the comparison of test qualifications. The Council of Europe (2001) proposes three main ways in which the CEFR can be used in relation to language tests:

• “For the specification of the content of tests and examinations: WHAT IS ASSESSED.
• For stating the criteria to determine the attainment of a learning objective: HOW PERFORMANCE IS INTERPRETED.
• For describing the levels of proficiency in existing tests and examinations, thus enabling comparisons to be made across different systems of qualifications: HOW COMPARISONS CAN BE MADE” (p. 178, emphasis in original).

Of these three uses, the last has generated the greatest attention: it has been embraced by a host of testing agencies worldwide, which were quick to claim alignment of their tests with the CEFR’s six-level scales. In response to the rapid acceptance of the CEFR by testing organisations, the Council of Europe published a manual titled Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR), A Manual (Council of Europe, 2008) to assist examination providers in their efforts to align their language tests to the CEFR. The manual is designed to support four sets of procedures: (1) familiarisation with the CEFR’s common reference levels; (2) specification of examinations in terms of content, test development, marking and grading, and test analysis, and the relation of that specification to the CEFR scales; (3) standardisation in assessing performance and judging the difficulty of test items in relation to the CEFR scales; and (4) empirical validation through the analysis of test data (Figueras et al., 2005). The publication of this manual and its support materials has added fuel to CEFR-linking endeavors, as an increasing number of tests are being aligned to the CEFR (Martyniuk, 2010).


2.2.3 Describing Business English Proficiency with the CEFR: Potential Problems

Although originally developed to aid foreign language learning in the adult context in Europe, the CEFR is currently used in virtually all educational spheres, including languages for specific purposes (Figueras, 2012). In fact, quite a number of ESP tests, especially BE tests such as BEC, BULATS and TOEIC, have been mapped to the CEFR along with other EGP tests (e.g., Khalifa & Ffrench, 2009; Tannenbaum & Wylie, 2008). As can be seen below (UCLES, 2008), the BEC examination levels are explicitly defined in accordance with the CEFR levels.

BEC Preliminary assesses English language ability used in a business context at Council of Europe ‘Threshold’ level (B1).
BEC Vantage assesses English language ability used in a business context at Council of Europe ‘Vantage’ level (B2).
BEC Higher assesses English language ability used in a business context at Council of Europe ‘Effective Operational Proficiency’ level (C1).

Such practice, however, is not without controversy. In fact, many scholars have cautioned against extending the CEFR to the ESP domain. Alderson (2007), for instance, maintains that the CEFR in its current form is not suitable for the teaching of languages for specific purposes and that applying it to the ESP domain “would not be acceptable without extensive revision” of the proficiency scale (p. 662). Kim and Elder (2009) also argue that the CEFR is inadequate for capturing the uniqueness of language use features, strategies and tasks in the ESP context. Scrutiny of the proficiency scale and its development process reveals that direct application of the CEFR to the ESP domain is problematic for three reasons.

Firstly, the CEFR is largely grounded in theories of communicative competence and language use (e.g. Canale & Swain, 1980; Bachman, 1990), which by default define general English proficiency. While the EGP-oriented CEFR presumably enjoys a high degree of generalisability across different contexts, its direct relevance to the specific BE domain, an ESP sphere of interest, is likely to be low.

Secondly, the CEFR descriptors were validated in terms of teachers’ perceptions of how different levels of actual learner performance might best be described. Most of these teachers taught EGP courses and had a limited or vague understanding of how English is used in business contexts. Unsurprisingly, their interpretation of the difficulty level associated with work-related descriptors (those involving tasks such as “telephoning”, “attending formal meetings”, “giving formal presentations”, “writing reports and essays” and “formal correspondence”, precisely the ones most relevant to the context of business language use) turned out to be highly inconsistent.
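In Rasch calibration of the kind North used, inconsistency of this sort surfaces as item misfit: a descriptor whose observed endorsements depart sharply from what the model predicts. As a rough, purely illustrative sketch (a dichotomous simplification, not North's actual rating-scale procedure, and with invented data), an infit mean-square statistic might be computed like this:

```python
import math

def rasch_p(theta, b):
    """Rasch probability that a person of ability theta endorses an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def infit(responses, abilities, difficulties):
    """Infit mean-square per item; values well above 1 signal misfit.

    responses: one list per person, 0/1 entries per item;
    abilities and difficulties are logit estimates.
    """
    stats = []
    for i, b in enumerate(difficulties):
        sq_resid = 0.0   # sum of squared residuals for item i
        variance = 0.0   # sum of model variances p(1 - p)
        for theta, row in zip(abilities, responses):
            p = rasch_p(theta, b)
            sq_resid += (row[i] - p) ** 2
            variance += p * (1 - p)
        stats.append(sq_resid / variance)
    return stats

# Toy data: item 0 is endorsed by the more able, as the model expects;
# item 1 is endorsed erratically (by the less able only), so it misfits.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [[0, 1], [0, 1], [1, 0], [1, 0], [1, 0]]
fits = infit(responses, abilities, [0.0, 0.0])
# fits[1] comes out several times larger than fits[0]
```

Descriptors flagged this way become candidates for exclusion, which is what happened to the work-related CEFR descriptors.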
As a result, these descriptors were statistically calibrated as misfitting items and removed from the final version of the framework (North, 2000), distancing the CEFR further from an accurate description of BE proficiency.

Finally, the CEFR itself suffers from some inherent limitations. For instance, many descriptors incorporated into the scale were not empirically validated (Council of Europe, 2001). Besides, the fact that only language teachers’ views were


elicited to calibrate the difficulty level of the descriptors has given rise to criticism that what is being scaled is not necessarily learner proficiency but “teachers’ perception of that proficiency” (Hulstijn, 2007, p. 666). Most importantly, the CEFR was initiated as a means to encourage “language learning for European citizenship” (Council of Europe, 2001); hence, its validity in non-European contexts is open to question.

Despite the caveats raised above, the influence exerted by the CEFR has continued to accrue as it is increasingly adopted by examination boards and language testing agencies worldwide (Alderson, 2007). Its popularity reflects the paucity of attention paid to the benchmarking of ESP abilities and the urgency of developing more appropriate instruments to gauge ESP-oriented language proficiency. In what follows, we take a look at a few ESP scales with a clear focus on BE proficiency.

2.3 Scales Describing Business English Proficiency

2.3.1 Categorization of Scales Incorporating BE Proficiency

In the field of BE teaching, some business organizations have developed their own proficiency scales for the description and measurement of BE ability. However, most of these scales are confined to internal use, largely because the organisations insist on a degree of secrecy to maintain an edge over their competitors (Nelson, 2000, p. 48). To date, only a limited number of language scales incorporating the description of BE ability are available for public scrutiny. These can be broadly categorized into three types according to their original purposes of development (see Table 2.2). The first type is constructed to assess corporate employees’ proficiency in English in a business context, emphasizing job-oriented language skills. The second type is created by testing associations to facilitate test comparison and score interpretation; as BE tests are among the test batteries being considered, a section of these scales is devoted to the description of BE proficiency. The third type features a clear orientation to workplace English use; BE proficiency, in such a case, is alluded to via the description of English language ability in workplace settings.

Table 2.2 Types of BE proficiency scales

Type | Purpose | Typical examples | Characteristic
1 Scales devised to meet corporate needs | Training and assessing company staff | ELTDU Scale | The whole scale centres on the description of BE proficiency
2 Scales devised to chart language tests | Facilitating test choice and score interpretation | ESU framework; ALTE framework | A few sub-scales are devoted to the description of BE proficiency
3 Scales devised with a workplace orientation | Training and assessing adult immigrants | ISLPR; CLB | BE proficiency is described through workplace language use

As it is neither practicable nor necessary to review all the scales, five of the existing BE scales, namely the ELTDU Stages of Attainment Scale (ELTDU Scale), the Association of Language Testers in Europe (ALTE) Framework, the English-Speaking Union (ESU) Framework, the International Second Language Proficiency Ratings (ISLPR) and the Canadian Language Benchmarks (CLB), will be examined in more detail in this section. These scales are chosen because of (a) their availability in the public domain, (b) their international currency and impact, and (c) their representativeness of the three types of BE scales discussed above. The review examines how each scale was developed and what it contains, and then discusses its strengths and limitations, with a view to informing the development of a proficiency scale for BE teaching, learning and assessment in the Chinese context. Each scale is reviewed systematically with respect to (a) its purpose of development, (b) its structure and (c) its development and validation process. As the focus of the study is to develop a BE writing proficiency scale, the structure of each scale is displayed in such a way that only the content relevant to the writing skill is highlighted.

2.3.2 The ELTDU Stages of Attainment Scale (ELTDU Scale)

(a) Purpose

The ELTDU Stages of Attainment Scale (ELTDU scale), developed in 1976 by the English Language Teaching Development Unit in Colchester, England, is the first performance scale produced to facilitate ESP teaching and testing (Ellis & Johnson, 1996). Developed by the then R&D arm of the ELT division of Oxford University Press at the request of a large Swedish company (SKF), the scale represents a “bold undertaking” (McDonough, 1984, p. 114) to make precise statements about language levels in relation to the language skills necessary in particular business situations. As the ELTDU scale is “intended to be used in business, companies and similar concerns in which information is frequently required on the language ability of personnel performing various functions throughout the company” (ELTDU, 1976, p. 3, cited in Davies, 1990, pp. 40–41), language specifications at each level are stated in terms of what tasks employees at particular stages are expected to perform with the language at their disposal. In this way, the scale provides reference, in a user-friendly manner, for users to decide whether, for instance, an employee


can be sent to an English-speaking company, or whether further training in the language is required.

(b) Structure

The ELTDU scale is a complex document comprising 27 business skill areas, covering activities such as “dealing with visitors”, “use of the telephone” and “internal and external business meetings”. In each skill area there are definitions of what learners should be able to do at levels of attainment ranging from elementary Level A to native-like competence at Level G. All the level definitions are presented as specifications of function rather than solely of linguistic form, as exemplified by the descriptor for the skill “dealing with visitors” at Level B:

Can welcome visitors and perform introductions, giving a brief outline of personal details, the purpose of visit, etc. Can give a simple description of work in hand. But cannot ‘talk shop’ or make social chat, or answer unpredictable questions.

It is worth noting that although the level statements of this scale are formulated in a “can-do” format, the stages of attainment “do not describe what the learner does or can do but what the stage or level requires of him” (North, 1994, p. 47).

(c) Development and validation process

Publicly available information about how the ELTDU scale was developed and validated is very limited. North (1994) documented that the ELTDU scale was developed on the basis of a skills-oriented analysis of company and individual needs, as it was motivated by the need to produce an objective and explicit description of language proficiency for assessing employees’ English skills. Drawing inspiration from The Threshold Level (Van Ek, 1976), its developers also analyzed the language functions as well as the grammatical and vocabulary elements necessary to perform each of the identified tasks (North, 1994). The only validation study of the scale available for public scrutiny was conducted by Worth and Marshall (1981), who examined stakeholders’ perceptions of the scale in some German firms. The researchers reported the scale’s effectiveness in eliminating “undertraining and costly overtraining” (p. 131, cited in McDonough, 1984, p. 114) and concluded that increased precision about training targets led to increased commitment to those targets by both employers and employees.

2.3.3 The ALTE Framework

(a) Purpose

Being aware of the confusion faced by test-users regarding test choice and score interpretation, the Association of Language Testers in Europe (ALTE) launched a


“can-do” project, aiming to “establish a framework of ‘key levels’ of language performance, within which exams can be objectively described” (ALTE, 2002, p. 3). The resulting ALTE framework consists of a set of performance-related scales describing, through can-do statements, what learners can do in the foreign language. Based on an analysis of exam content, task types and candidate profiles, the tests developed by ALTE members are placed within this framework to facilitate communication among stakeholders in the testing process as well as the interpretation of test results by non-specialists. Beginning with general language proficiency tests, the framework has been extended to cover tests for business purposes, such as BULATS and BEC. Although the ALTE framework was originally designed to help demonstrate equivalences between the examination systems of ALTE members, its developers claim that the can-do statements can also stand as an independently useful product outside the ALTE context, offering reference points for teaching material compilation, curriculum design and in-class diagnostic task development, to mention just a few uses (ALTE, 2002).

(b) Structure

The ALTE framework comprises approximately 400 can-do statements, organized into a set of proficiency scales. The foci of these scales fall into three general categories: (1) Social and Tourist, (2) Work and (3) Study. Each category includes a number of more specific areas or activities, and language proficiency in each area is described by up to three scales, for the skills of Listening/Speaking, Reading and Writing. The structure of the framework is presented graphically in Fig. 2.2.

Fig. 2.2 Structure of the ALTE framework (three general categories: Social & Tourist, Work and Study; each is divided into more particular areas, such as “Understanding and writing reports” and “Following and giving a presentation or demonstration”, which are in turn described skill by skill)

Of the three general categories, it is the series of performance scales under Work that relates directly to the description of BE proficiency. A global description of BE proficiency is presented in the ALTE Work statements summary, in which all four skills (with listening and speaking considered together) are discussed at six levels in ascending order, from ALTE Breakthrough to ALTE Level 5. Apart from the holistic scale, sixteen illustrative sub-scales are devised, in which a more detailed task-based description of BE proficiency is provided along four dimensions under the headings of “concern”, “activity”, “environment” and “language skills required”. Guided by the belief that only basic or very high-level proficiency is required for the successful fulfillment of certain tasks, some cells of the framework representing the highest or lowest levels are intentionally left blank (ALTE, 2002). For instance, the activity “understanding and writing reports” (see Table 2.3) is considered relevant only to relatively high-proficiency learners, and descriptors for lower levels are not provided.

Table 2.3 A sub-scale of the ALTE framework in the category of Work

Area: Work
Concern: Reports (of substantial length and formality)
Activity: Understanding and writing reports
Environment: Workplace (office, factory, etc.)
Language skills: Writing

ALTE Breakthrough | N/A*
ALTE 1 | N/A
ALTE 2 | N/A
ALTE 3 | CAN write a simple report of a factual nature and begin to evaluate, advise, etc.
ALTE 4 | CAN write a report that communicates the desired message. WILL need more time to write the report than a native speaker would
ALTE 5 | CAN write quite lengthy reports with only the occasional, minor error, and without taking much longer than a native speaker

Note *N/A = Not available

(c) Development and validation process

Involving leading experts on language assessment and applied linguistics, the original development process of the ALTE framework went through the following stages (Council of Europe, 2001, pp. 245–246):

(1) Capturing the characteristics of users of ALTE language tests by means of questionnaires, reports from schools, etc.;
(2) Using the collected information to identify a list of candidate needs and major concerns;
(3) Using test specifications and internationally recognized proficiency statements from sources such as Waystage and Threshold to draw up initial statements;
(4) Evaluating the statements and assessing their relevance to the test takers;
(5) Trialing the statements among teachers and students to assess their relevance and transparency;
(6) Correcting, revising and simplifying the language of the statements in the light of the above information.

Being aware of the subjective nature of the level descriptions included in the ALTE framework, its developers carried out an extended process of empirical validation. Language learners’ self-reported information was the major source of data, which


was collected by presenting the proficiency scales in a set of questionnaires. Nearly ten thousand ALTE examination candidates whose test results were available participated in the survey, making the study the biggest data collection endeavor ever undertaken to validate a descriptive language proficiency scale (Council of Europe, 2001). It should be noted, however, that the validation was carried out mainly among European speakers, largely reflecting “European patterns of foreign language skills” (Neil, 2000, p. 12); as such, it is legitimate to raise concerns about its applicability to other contexts.

2.3.4 The ESU Framework

(a) Purpose

In recognition of the confusion caused by a bewildering variety of ELT examinations and their different score reporting systems, the English-Speaking Union (ESU) developed the ESU framework to enhance the transparency and comparability of EFL examinations. Comprising a set of 22 illustrative scales, the ESU framework features the use of real-world descriptions of language performance to assist non-specialists in interpreting and comparing a whole range of EGP and ESP tests. Although the original purpose of the ESU framework was to provide a system for describing and comparing English language examinations, its developers claim that the framework can also be applied to other areas of language education, providing guidance on issues such as course objective formulation and instructional materials development.

(b) Structure

The ESU framework is presented in a pattern of five stages, ranging from Stage I to Stage V. These five stages are substantiated by 22 sub-scales, called “yardsticks”, which describe language proficiency at nine levels spanning a proficiency continuum from an “extremely limited command” at Level 1 to a “fully competent command” at Level 9 (Carroll & West, 1989, p. 5). Progression through the stages does not, as might be assumed, indicate improvement in language ability; rather, the stages differ in the degree of detail with which language knowledge and skills are addressed. In other words, the specificity of the descriptors increases as one moves from Stage I, which presents an overall description of language proficiency irrespective of skill areas or context of language use, to Stage V, which goes into greater detail by elaborating on the specific enabling skills (i.e. linguistic skills, functional skills and test-taking skills) underlying a person’s performance in real-life contexts. The relationship between each stage and its yardstick(s) is presented in Table 2.4. As the table indicates, descriptions of BE proficiency appear at Stage IV, in yardsticks 9 (listening), 12 (speaking), 15 (reading) and 18 (writing) respectively.

Table 2.4 Yardsticks and stages of the ESU framework

Yardstick | Stage | Name
1 | Stage I | Overall language proficiency
2 | Stage II | Oral proficiency
3 | Stage II | Graphic proficiency
4 | Stage III | Listening
5 | Stage III | Speaking
6 | Stage III | Reading
7 | Stage III | Writing
8 | Stage IV | Listening for social and personal purposes
9 | Stage IV | Listening for business purposes
10 | Stage IV | Listening for study/training purposes
11 | Stage IV | Speaking for social and personal purposes
12 | Stage IV | Speaking for business purposes
13 | Stage IV | Speaking for study/training purposes
14 | Stage IV | Reading for social and personal purposes
15 | Stage IV | Reading for business purposes
16 | Stage IV | Reading for study/training purposes
17 | Stage IV | Writing for social and personal purposes
18 | Stage IV | Writing for business purposes
19 | Stage IV | Writing for study/training purposes
20 | Stage V | Linguistic skills
21 | Stage V | Functional skills
22 | Stage V | Examination skills

To ensure consistency from level to level and from one yardstick to another, all the yardsticks were produced from the same blueprint and all descriptors were formulated using the same criteria. The purpose is to make the production of new yardsticks relatively simple, as the developers explicitly encourage potential scale users to make additions catering to their contextual needs (Carroll & West, 1989). This style of descriptor formulation, however, is flawed by its reliance on qualifiers such as “a wide range of”, “moderate-level”, “well adjusted” and “adequately adjusted” to discriminate adjacent levels, making it difficult to interpret each descriptor independently, as demonstrated by the following two exemplar descriptors from the scale.

“Writes simple business texts with good confidence and competence, but some limitations on moderate-level texts. Conveys short business messages but with


a loss of detail and subtlety. Lack of clarity requires the reader to backtrack frequently. Marginally communicative as a business writer. Business text organization is haphazard. Little variation in business style or sense of appropriacy to context or audience. Frequent lapses of fluency when writing. Controls a limited range of business language and specialist terminology in own field. Frequent errors in usage and spelling impede clarity of message.” (Yardstick 18, Writing for business purposes, Level 4)

“Writes simple business texts with adequate confidence and competence, but many problems with moderate-level texts. Conveys the basic business message but with little detail or clarity. Communication breaks down as language limitations interfere with message. Little sense of readers’ expectations. Little organisation of business texts. Little appreciation of business style or display conventions. Restricted to handling basic facts in short paragraphs. Basic punctuation. Has a narrow range of business language and the most common specialist terms in own fields. Regular inaccuracies in basic usage and spelling cause very frequent problems with business communication in the written mode.” (Yardstick 18, Writing for business purposes, Level 3)

(c) Development and validation process

The ESU framework was developed on the basis of test analysis and discussions among experienced examiners and testing professionals. Little validation information on the framework is publicly available, except for one verification study in which L2 learners’ writing and speaking samples were rated both against the ESU framework and against the rating criteria developed by the ESU examination boards. The results showed no significant difference between the two sets of ratings, which the scale developers then used as empirical evidence for the alignment between ESU exam grades and the ESU framework (Carroll & West, 1989).

2.3.5 The International Second Language Proficiency Ratings (ISLPR)

(a) Purpose

The ISLPR, previously named the Australian Second Language Proficiency Ratings (ASLPR), was developed with a view to upgrading and assessing Australian immigrants’ English ability in relation to their job requirements. As such, the scale features a clear orientation towards the description of language use in workplace settings. Though the ISLPR was initially compiled in the context of the teaching and learning of English as a second language in Australia, foreign language versions have been produced in French, Italian, Japanese, Chinese, Indonesian and Korean, reflecting the increasing international currency of the scale. The ISLPR is mainly designed for three major purposes: (a) assessing the proficiency of individual learners; (b) research and policy-making; and (c) providing a framework for language curriculum development (Wylie, 2010).

(b) Structure of the scale

The ISLPR consists of a set of sub-scales for the speaking, listening, reading and writing skills and provides performance descriptors in terms of the practical tasks that learners can carry out and how well they carry them out, along a continuum from 0 (no ability to communicate in the target language) to 5 (indistinguishable from a native speaker of the same socio-cultural background). Counting the intermediate “plus” and “minus” levels within each band, each sub-scale encompasses up to 12 levels. The ISLPR comprises general proficiency versions and specified-purpose versions (including versions for academic purposes and for professional purposes). All versions follow the same format, with descriptors presented in three columns under the headings “general description of language behavior”, “examples of language behavior” and “comment”. The first column offers an overall description of learners’ semantic, linguistic and sociocultural proficiency. The second column contains examples of typical tasks that learners at each level can perform. The third column, which is largely common across versions, explains technical terms included in the descriptors.

Information pertinent to BE ability is presented in the second column, where typical examples of language use in workplace settings are discussed. As workplace English is an umbrella term covering the use of English in business settings as well as in other professional contexts such as education, medicine, law and tourism, BE proficiency in this scale is often described in juxtaposition with abilities associated with these professions. The writing component of ISLPR Level 4, for instance, states that “In Work situations, can write complex and relatively sensitive correspondence (e.g. proposals or reports to well-established clients). In particularly sensitive situations (e.g. delicate business or diplomatic communications) writing would need to be subjected to more careful editing than that of a comparably educated native speaker”.

(c) Development and validation process

According to its developers, the ISLPR framework was constructed by drawing on theories of linguistics, psycholinguistics and sociolinguistics, as well as on the intuitions and experience of experts in the field of scale construction (Ingram, 2003). Like the ALTE framework, the ISLPR has been subjected to a dynamic, ongoing process of validation, on the basis of which amendments are made to its original version. Specifically, a number of studies were carried out to investigate the concurrent validity (e.g., Ingram, 1984), predictive validity (e.g., Kellett & Cumming, 1995), construct validity (e.g., Lee, 1992) and inter- and intra-rater reliability of the ISLPR scales (e.g., Ingram, 1984). Most of these studies yielded positive findings, except for one conducted in the Chinese context. Involving third-year students at the Guangzhou Institute of Foreign Languages, Ingram (1984) investigated the concurrent validity of the ISLPR. However, the Spearman correlations between the (then) ASLPR ratings and a high-stakes test score were unexpectedly low, ranging from 0.30 to 0.58. This result is fairly alarming in that many scales developed in one context are now applied to different contexts in the absence of sufficient empirical support for such a practice.
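Concurrent-validity studies of this kind quantify the association between scale ratings and test scores with Spearman’s rank correlation. As a minimal illustration of how such a coefficient is computed, the sketch below uses invented ratings and scores, not data from Ingram’s study:

```python
# Illustrative sketch of Spearman's rank correlation, as used in
# concurrent-validity studies. All data below are invented.

def ranks(values):
    """Assign 1-based average ranks, so tied scores share a rank."""
    indexed = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(indexed):
        j = i
        while j + 1 < len(indexed) and values[indexed[j + 1]] == values[indexed[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 0-based positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            r[indexed[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical ASLPR-style ratings and test scores for eight learners
aslpr = [2.0, 2.5, 3.0, 3.5, 2.0, 4.0, 3.0, 2.5]
test = [55, 62, 58, 71, 60, 68, 64, 52]
print(round(spearman_rho(aslpr, test), 2))
```

A coefficient near 1 would indicate that the two instruments rank learners almost identically; values in the 0.30–0.58 range reported above indicate only weak-to-moderate agreement.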

2.3.6 The Canadian Language Benchmarks (CLB)

(a) Purpose

Developed under the auspices of the Government of Canada, the Canadian Language Benchmarks (CLB) (CCLB, 2012) is a descriptive scale of language ability in English as a Second Language (ESL), written as 12 benchmarks along a proficiency continuum and reflecting the progression of the knowledge and skills that underlie basic, intermediate and advanced ability among adult ESL learners. Specifically, the CLB describes what a learner should be able to do with the target language at a certain proficiency level (i.e. benchmark) under a specific set of performance conditions, such as length of text, time constraints and audience. Sample tasks and outcomes are also provided in the scale to determine whether the learner has succeeded in performing the task and/or achieved a given proficiency level. The intended uses of the CLB include: (1) describing and measuring, in a standard way, the communicative proficiency of ESL learners; (2) recognising ESL learners’ achievements in language learning anywhere in Canada and ensuring the portability of their ESL credentials; and (3) assisting the development of programs, curricula and materials that relate to a consistent set of competency descriptors for ESL learners in Canada (Pawlikowska-Smith, 2000).

(b) Structure

The CLB is structured into three proficiency Stages, covering language proficiency across the basic, intermediate and advanced levels. Each Stage consists of four language benchmarks (Initial, Developing, Adequate and Fluent), placing language learners on a continuum of 12 levels. The Stages are identical in structure, describing tasks that people can do in the listening, speaking, reading and writing skills. Each Benchmark contains: (1) Global performance descriptors, which provide a brief account of a learner’s general language ability; (2) Performance conditions, which describe the conditions of performance such as the purpose of communication, setting/place, audience, topic, time constraints, etc.; and (3) Competency outcomes and standards, which specify what a person can do and give examples of communication tasks and texts, in addition to the outcomes expected of learners at that benchmark level. Descriptions of BE ability appear in the fundamental dimension of the scale, i.e. the dimension of Competency outcomes and standards (Pawlikowska-Smith, 2000). Typical examples and tasks associated with each skill in this section are discussed in relation to three contexts: Community, Study, and Workplace. Similar to the ISLPR,

business-related information in the CLB is encapsulated in the description of workplace English ability, as can be seen from Writing Benchmark 9 in Table 2.5.

Table 2.5 Selected information from Writing Benchmark 9

What the person can do:
Business/service messages
• Write letters to request and to respond to requests for information, directions, service/product, clarification, permission
• Write short suggestions and reports as memos and pre-set form reports

Examples of tasks and texts:
Workplace
• Write a semi-formal confirmation/reminder letter of the arrangements made between you/your company and another business partner on a shared project
• Write purchase orders and invoices as pre-set forms, or as memos or letters
Study
• Write cover/transmittal letter, order letter, or an instructions letter
Community
• Write a report as a one- or two-paragraph memo or as pre-set form (e.g., incident/accident report; work progress or periodic progress report; service evaluation/inspection, intake, technician’s or lab report)

Performance indicators:
• Addresses the task and conveys the message with an adequate sense of audience, formality and genre; language, format and content of letter are appropriate to occasion and social context/relationship
• Conveys main ideas and gives detail
• Presents information in a customary document layout/format (e.g., letters contain a summary statement, details and a close), with occasional errors in grammar, punctuation/spelling and adequate paragraph structure
• Fills report forms with required information, with no omissions of details

(c) Development and validation process

Drawing widely on theories of communicative language proficiency, such as Bachman (1990) and Celce-Murcia et al. (1995), the CLB 2000 version was developed on the basis of the CLB Working Document published in 1996. From 2011 onwards, the Centre for Canadian Language Benchmarks (CCLB), a non-profit corporation established to take responsibility for CLB projects, committed itself to continual revision of the CLB by, for instance, seeking consultation involving 1,300 people representing multiple stakeholders, and holding a national forum with 40 representative stakeholders together with a team of writers and expert advisors (CCLB, 2012, p. II). This ongoing process yielded the newest version, the CLB 2010, which featured improvements in areas such as layout and presentation of information, as well as greater comprehensibility, clarity, consistency and relevance. In early 2011, the CCLB undertook a four-stage validation project. In the first stage, the CLB was validated in terms of its theoretical framework, which proved to be consistent not only with the theoretical concepts it drew upon, but also with the key principles underlying other language proficiency scales such as the CEFR. In Stage II, six independent experts mapped each descriptor of the scale onto its underlying theoretical framework and found that the theoretical models that served as a foundation for the CLB were adequately represented. In Stage III, exemplars and tasks developed or selected from the CLB were independently benchmarked by six experts, whose ratings demonstrated inter-rater agreement, thereby confirming the validity and reliability of the descriptors. In the final stage, two independent experts confirmed that the revised and validated CLB met the standards for reliability and validity set out in the Standards for Educational and Psychological Testing, supporting the use of the CLB as a national standard in Canada (CCLB, 2012).
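Inter-rater agreement of the kind reported for Stage III can be quantified in several ways; Fleiss’ kappa is one standard statistic when several raters each assign items (here, writing samples) to categories (benchmark levels). The CCLB report does not specify which statistic was used, and the six-rater data below are invented for demonstration:

```python
# Illustrative sketch of Fleiss' kappa for multi-rater categorical agreement.
# Not the CCLB's actual procedure; ratings are hypothetical.
from collections import Counter

def fleiss_kappa(ratings, categories):
    """ratings: list of per-item lists, each holding one label per rater."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]  # n_ij per item
    # Observed agreement: average pairwise agreement per item
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Chance agreement from the marginal category proportions
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)

# Six hypothetical raters benchmark five samples on levels 8-10
samples = [
    [9, 9, 9, 9, 9, 8],
    [8, 8, 8, 9, 8, 8],
    [10, 10, 9, 10, 10, 10],
    [9, 9, 9, 9, 9, 9],
    [8, 8, 9, 8, 8, 8],
]
print(round(fleiss_kappa(samples, [8, 9, 10]), 2))  # prints 0.57
```

Kappa corrects raw agreement for the agreement expected by chance, so a value well above 0 (here a moderate 0.57 on the invented data) is more informative than a simple percentage of identical ratings.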

2.4 Summary of the Review and Discussion

The review of the BE scales above demonstrates that although these scales represent groundbreaking endeavors in the field of ESP to assess BE proficiency, they arguably have the following limitations.

Firstly, these scales suffer from the lack of an ESP theoretical basis. Developers of the ELTDU, ALTE and ESU scales, in particular, fail to provide explicit information as to which theory of language or language learning they have drawn upon, creating the impression that these scales might have been developed on the basis of unjustified subjective preferences. Similarly, although the ISLPR acknowledges its reference to “psycholinguistic studies of second language development” (Ingram, 1984, p. 7), no details are provided regarding the specific psycholinguistic studies it referred to. The CLB stands out among these scales as the only one that makes explicit reference to its underlying theoretical frameworks, namely the model of communicative language ability proposed by Bachman (1990) and the pedagogical model of communicative competence advanced by Celce-Murcia et al. (1995). It should be noted, however, that these theories are by default EGP-oriented, and their effectiveness in guiding the development of a scale of BE proficiency, which is ESP in nature, is therefore questionable in light of the morphological, syntactic, discoursal and pragmatic differences between EGP and ESP (Douglas, 2000).

The second limitation is that these scales were all developed intuitively on the basis of, for instance, expert opinions. Whilst such committee authorship may work well for the development of in-house systems created for particular contexts with a familiar population of learners, it suffers from “a fundamental lack of demonstrated validity” (North, 2002, p. 88). The weaknesses of such an intuitive approach lie particularly in two aspects: (1) the placement of particular descriptors at certain levels is subjective; and (2) users from different sectors may differ considerably in their interpretation of the descriptors. Consequently, the acceptability of these intuitively developed scales has to “rely primarily on the authority of the scholars involved in their definition, or on the political status of the bodies that control and promote them” rather than on an inherent quality that enables them to function effectively in different contexts (De Jong, 1990, p. 72).

Thirdly, most of the BE scales lack openness and flexibility in that they do not provide a mechanism for prospective users to adapt the scale levels to local needs. The ESU framework is the only one which allows for changes to be made in accordance with real-life needs. However, it only allows for the horizontal addition of illustrative yardsticks, as opposed to a flexible extension of its proficiency levels. Such rigidity of scale levels is likely to cause inconvenience in contexts where narrower bands are required, for instance, to trace subtle progress in language learning programs.

The fourth problem concerns descriptor style. The CLB and the ISLPR, for instance, are flawed by the use of jargon such as “suprasegmental”, making them less accessible to non-specialist users. Scales like the ESU framework rely on qualifiers such as “a wide range of”, “moderate-level”, “well adjusted” and “adequately adjusted” to differentiate descriptors at adjacent levels. Such descriptors can only be interpreted in relation to other descriptors on the scale and do not convey much meaningful information about learner proficiency (North, 2000). The ALTE framework stands out as the only exception, as it resorts to descriptors formulated as can-do statements. These descriptors enjoy independent, stand-alone integrity and are briefly and positively worded in terms of what a learner is able to perform in the target language, even at the early stages of language learning.

In light of the limitations associated with existing BE scales as well as the CEFR, it would arguably be problematic to blindly adopt these scales to meet the needs of BE education in China. Besides, they are primarily oriented towards audiences living in English-speaking or ESL countries, which differ quite significantly from the target users in Chinese tertiary settings. Many of the communicative activities emphasized in these scales, as well as the primacy of the native-speaker model to which they subscribe, might not be pertinent to the Chinese context, which is marked by a vibrant business environment involving the use of English as an international language or as a lingua franca.
We therefore argue for the necessity of developing a BE writing proficiency scale directly oriented towards the features and objectives of BE teaching, learning and assessment in China. The resultant scale is envisaged to serve as a point of reference for discussions surrounding BE writing proficiency, and the process of scale development is expected to generate insights into critical issues in BE education in the Chinese tertiary context.

2.5 Conclusion

Following a discussion of the important issues concerning language scales, this chapter analysed the problems of using the CEFR, the most widely recognised and used language scale around the globe, to profile BE proficiency, as the CEFR was rooted in the EGP domain and may thus fail to portray the unique features of ESP discourses. Among the other language scales, five influential ones (i.e. the ELTDU scale, the ESU framework, the ALTE framework, the ISLPR and the CLB) that were written to gauge BE proficiency, or that include a section pertinent to its description, were reviewed in terms of their purpose of design, scale structure, and development and validation process. The analysis showed that although these scales were remarkable additions to the field of language scale development, which was predominantly EGP-oriented, they were marred by the following four limitations: (1) an ESP theoretical basis was lacking; (2) the scale levels were developed intuitively, thus potentially undermining their validity; (3) the scales lack flexibility in that they do not provide a mechanism for prospective users to adapt the scale levels to local needs; and (4) the descriptor style of the scales needs improvement, as some employed technical terms while others relied on qualifiers to differentiate descriptors at adjacent levels. In light of these limitations, as well as the fact that the existing scales were all developed in the context of English-speaking countries, it was argued that a BE writing proficiency scale should be developed which is oriented towards the features and objectives of BE teaching, learning and assessment in China. Detailed information concerning how the scale was developed and validated is reported in the following chapters.

References

Alderson, J. C. (1991). Bands and scores. In J. C. Alderson & B. North (Eds.), Language testing in the 1990s (pp. 71–86). Macmillan.
Alderson, J. C. (Ed.). (2002). Common European framework of reference for languages: Learning, teaching, assessment: Case studies. Council of Europe.
Alderson, J. C. (2007). The CEFR and the need for more research. The Modern Language Journal, 91(4), 659–663.
Alderson, J. C. (2017). Foreword to the special issue “The Common European Framework of Reference for Languages (CEFR) for English language assessment in China” of Language Testing in Asia. Language Testing in Asia, 7(20), 1–9.
ALTE. (1998). ALTE handbook of European language examinations and examination systems: Descriptions of examinations offered and examinations administered by members of the Association of Language Testers in Europe. University of Cambridge Local Examinations Syndicate.
ALTE. (2002). The ALTE can do project. ALTE.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Brindley, G. (1991). Defining language ability: The criteria for criteria. In L. Anivan (Ed.), Current developments in language testing (pp. 139–164). Anthology Series 25, SEAMEO/RELC.
Carroll, B. J., & West, R. (1989). ESU framework. Longman.
CCLB. (2012). Canadian language benchmarks: English as a second language for adults. Centre for Canadian Language Benchmarks.
Celce-Murcia, M., Dörnyei, Z., & Thurrell, S. (1995). Communicative competence: A pedagogically motivated model with content specifications. Issues in Applied Linguistics, 6(2), 5–35.
Council of Europe. (2001). Common European framework of reference for language learning, teaching and assessment. Cambridge University Press.
Council of Europe. (2008). Relating language examinations to the Common European Framework of Reference for Languages: Learning, teaching, assessment (CEFR): A manual. Council of Europe.
Davies, A. (1990). Principles of language testing. Basil Blackwell.
De Jong, H. A. L. (1990). Response to Masters: Linguistic theory and psychometric models. In H. A. L. De Jong & D. K. Stevenson (Eds.), Individualizing the assessment of language abilities (pp. 71–82). Multilingual Matters.
Douglas, D. (2000). Assessing language for specific purposes. Cambridge University Press.
Ellis, M., & Johnson, C. (1996). Teaching business English. Oxford University Press.
Figueras, N. (2012). Impact of the CEFR. ELT Journal, 66(4), 477–485.
Figueras, N., North, B., Takala, S., Verhelst, N., & Avermaet, P. V. (2005). Relating examinations to the Common European Framework: A manual. Language Testing, 22(3), 261–279.
Griffin, P. E. (1989). Monitoring proficiency development in language. Paper presented at the Annual Congress of the Modern Language Teachers Association of Victoria, Monash University, July 1989.
Hulstijn, J. H. (2007). The shaky ground beneath the CEFR: Quantitative and qualitative dimensions of language proficiency. The Modern Language Journal, 91(4), 663–667.
Hulstijn, J. H. (2011). Language proficiency in native and nonnative speakers: An agenda for research and suggestions for second-language assessment. Language Assessment Quarterly, 8, 229–249.
Ingram, D. E. (1984). Report on the formal trialing of the Australian Second Language Proficiency Ratings (ASLPR). Department of Immigration and Ethnic Affairs.
Ingram, D. E. (2003). English language proficiency in Australia. Paper presented at the 2003 Summer International Conference of the Korea Association of Teachers of English (KATE), Chungnam National University, Daejeon City, Korea, 26–28 June.
Jin, Y., Wu, Z., Alderson, C., & Song, W. (2017). Developing the China Standards of English: Challenges at macropolitical and micropolitical levels. Language Testing in Asia, 7(1), 1–19.
Kellett, M., & Cumming, J. (1995). The influence of English language proficiency on the success of non-English speaking background students in a TAFE vocational course. Australian and New Zealand Journal of Vocational Education Research, 3(1), 69–86.
Khalifa, H., & Ffrench, A. (2009). Aligning Cambridge ESOL examinations to the CEFR: Issues and practice. Cambridge ESOL Research Notes, 37, 10–14.
Kim, H., & Elder, C. (2009). Understanding aviation English as a lingua franca: Perceptions of Korean aviation personnel. Australian Review of Applied Linguistics, 32(3), 1–17.
Languages Lead Body. (1992). National standards for languages: Units of competence and assessment guidance. UK Languages Lead Body.
Lee, T. (1992). A many-faceted Rasch analysis of ASLPR ratings: Report to the steering committee for the Australian assessment of communicative English skills test. Centre for Applied Linguistics and Languages, Griffith University.
Little, D. (2007). The Common European Framework of Reference for Languages: Perspectives on the making of supranational language education policy. The Modern Language Journal, 91(4), 645–655.
Liu, J. (2019). China’s Standards of English Language Ability. Foreign Languages in China, 16(3), 10–12.
Martyniuk, W. (2010). Aligning tests with the CEFR: Reflections on using the Council of Europe’s draft manual. Cambridge University Press.
McDonough, J. (1984). ESP in perspective: A practical guide. Collins ELT.
Mislevy, R. J. (1993). Foundations of a new test theory. In N. Frederiksen, R. J. Mislevy, & I. I. Bejar (Eds.), Test theory for a new generation of tests (pp. 19–40). Lawrence Erlbaum Associates.
Neil, J. (2000). Background to the validation of the ALTE “Can-do” project and the revised Common European Framework. Cambridge ESOL Research Notes, 2, 11–13.
Nelson, M. (2000). The business English lexis site. Retrieved from https://user.utu.fi/micnel/business_englsih_lexis_site.htm
North, B. (1991). Standardization of continuous assessment grades. In J. C. Alderson & B. North (Eds.), Language testing in the 1990s (pp. 167–177). Macmillan.
North, B. (1994). Scales of language proficiency: A survey of some existing systems. Council of Europe.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
North, B. (2002). Developing descriptor scales of language proficiency for the CEF common reference levels. In J. C. Alderson (Ed.), Common European framework of reference for languages: Learning, teaching, assessment: Case studies (pp. 87–105). Council of Europe.
North, B. (2014). The CEFR in practice. Cambridge University Press.
North, B., & Schneider, G. (1998). Scaling descriptors for language proficiency scales. Language Testing, 15(2), 217–263.
Pawlikowska-Smith, G. (2000). Canadian language benchmarks 2000. Centre for Canadian Language Benchmarks.
Pollitt, A., & Murray, M. L. (1996). What raters really pay attention to. In M. Milanovic & N. Saville (Eds.), Language testing 3: Performance, testing, cognition and assessment (pp. 74–91). Cambridge University Press.
Tannenbaum, R. J., & Wylie, E. C. (2008). Linking English-language test scores onto the Common European Framework of Reference: An application of standard-setting methodology. ETS.
Trim, J. L. M. (1978). Some possible lines of development of an overall structure for a European unit/credit scheme for foreign language learning by adults. Council of Europe.
UCLES. (2008). Business English certificates: Handbook for teachers. http://www.CambridgeESOL.org
Van Ek, J. A. (1976). The threshold level in a European unit/credit system for modern language learning by adults. Council of Europe.
Worth, G., & Marshall, M. (1981). Organizing in-company language training. In R. Freudenstein, J. Beneke, & H. Ponisch (Eds.), Language incorporated (pp. 123–141). Pergamon.
Wylie, E. (2010). An overview of the ISLPR. ISLPR Language Service.

Chapter 3

Understanding Business English Writing: A Genre Perspective

© Springer Nature Singapore Pte Ltd. 2021
L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_3

It is well acknowledged that the development of a language scale should have a basis in linguistic theory (North, 2000). As this study aims to develop a BE writing proficiency scale, this chapter focuses on the nature of BE writing proficiency, positioned as a branch of ESP. Given the central role that genre plays in ESP teaching, learning and assessment, a genre perspective is adopted in this chapter to elucidate BE writing proficiency. Specifically, we adopted Tardy’s (2009) model of Genre Knowledge as the underlying theoretical framework for the development of the BE writing scale. This model consists of four dimensions, namely Formal knowledge, Process knowledge, Rhetorical knowledge and Subject-matter knowledge, each of which contributes to the development of expertise in producing disciplinary genres.

3.1 BE as an Offshoot of ESP

English for Specific Purposes (ESP) generally refers to the teaching of English as a second or foreign language where the goal of the learners is to use English in a particular domain. Since the 1960s, ESP has experienced dramatic development and become identifiable in its own right, significantly complementing the teaching of English for General Purposes (EGP). According to Dudley-Evans and St. John (1998), the boom of ESP can be largely attributed to three factors: (1) the constantly expanding use of English as an international means of communication in important areas such as science, technology and business; (2) developments in linguistics emphasizing the use of language in real-life communication; and (3) developments in educational psychology stressing the influence of learners’ needs and interests on the motivation and effectiveness of their learning. The maturity of ESP over time has given rise to multiple views concerning its subdivision, and a great number of acronyms have been coined to characterize the micro-branches of the field. Scrutiny of these terms reveals that ESP is generally viewed from three perspectives: discipline, domain and specificity.

Discipline-based branches of ESP highlight the integration between English language studies and discipline knowledge, such as science and technology, commerce, law, medicine and the social sciences. Typical names for courses of this kind are English for Medical Purposes, English for Legal Purposes, and English for Sociocultural Purposes. Sometimes the disciplines are indicated by means of job titles, as reflected in terms such as English for Chemists, English for Diplomats, English for Secretaries, English for Technicians and English for Pilots.

ESP can also be approached from the domain in which it is intended for use. Two major domains bearing distinctive features have been identified, namely the academic domain and the occupational domain, generating the terms English for Academic Purposes (EAP) and English for Occupational Purposes (EOP). When discipline-based ESP is juxtaposed with the two domains, new sub-categories emerge. Take English for Medical Purposes as an example. It can be further divided into English for Academic Medical Purposes and English for Occupational Medical Purposes, with the former emphasizing language skills associated with the academic community, such as thesis writing, and the latter centering on helping learners operate in real-life interactions in the medical context.

Further complicating the issue is that ESP can also be defined according to the level of specificity involved, hence differentiating highly specific ESP courses from more general ones. Following this line of thought, Language for Academic Purposes can be further divided into Language for General Academic Purposes and Language for Specific Academic Purposes (Jordan, 1997).
In the former case, students from a range of disciplines are grouped together and instruction focuses on their common academic needs and skills, such as note-taking and the writing of argumentative essays. In the latter case, students are grouped according to their disciplines and instruction focuses on features of language use critical for successful communication in their respective disciplines. Such a division can also be found in the occupational domain. For example, a general English for Legal Purposes course might focus on the general terminology of legal English, while specific English for Legal Purposes courses might be very different, meeting the highly specified needs of different legal professions such as judges, barristers and directors of public prosecutions.

Business English (BE), or English for Business Purposes, is seen as a branch of ESP in that it requires “the careful research and design of pedagogical materials and activities for an identifiable group of adult learners within a specific learning context” (Johns & Dudley-Evans, 1991, p. 298) and is “designed to meet specific needs of the learner” (Strevens, 1988, p. 1). Recent years have witnessed a rapid growth of BE, primarily as a consequence of globalization, which requires business people to transact across territorial, linguistic, cultural and socio-political boundaries. This new business climate is further bolstered by the use of English as a lingua franca in the vibrant international environment of e-commerce. As the largest sector for ESP publications, BE has been attracting burgeoning interest from teachers, learners, publishers and companies worldwide (Belcher et al., 2011; Swales, 2000).


As a flourishing field within ESP, BE is spinning off its own offshoots. Grounded in business-related disciplines, new sub-categories have been produced when the factor of domain is taken into account, hence generating English for Academic Business Purposes (EABP), which deals with the BE skills required by students in the academic context, such as thesis writing, and English for Occupational Business Purposes (EOBP), which deals with BE skills in the occupational context, such as making sales calls. The picture of BE is further complicated when the dimension of specificity is considered, giving rise to English for General Business Purposes and English for Specific Business Purposes in the academic context, in contrast with English for General Business Purposes and English for Specific Business Purposes in the occupational context (Belcher, 2009). In this book, BE is viewed from the perspective of English for General Business Purposes in the occupational context. This is because BE education in Chinese universities is intended to prepare students to be linguistically competent before entering the international business arena by exposing them to the most general types of BE tasks and activities, such as answering questions at job interviews, reading business letters, writing business reports and understanding telephone messages. As such, BE is here distinct from, yet belongs to, workplace English or English for professional purposes, as the latter encompasses a broader range of English varieties used in non-business settings, such as English for Medical Purposes or English for Legal Purposes.

3.2 ESP Writing Competence

The ESP approach to L2 writing draws primarily on Halliday’s (1970) functional account of language use. Unlike the traditional theory of language competence, which gives priority to abstract grammatical knowledge and prescribes correct and incorrect usages (Chomsky, 1965), functional linguistics is concerned with how language enables people to do things and how people use language to achieve social purposes. Language is therefore viewed as a resource for making meanings, and language use is always meaning-driven (Halliday, 1970). The functional model of language also posits that when people use language for communicative purposes, they do not operate with isolated words or sentences, but with texts, which constitute the basic unit of meaning (Alsagoff & James, 2007). In other words, language is not simply about constructing meanings from isolated grammatical structures; instead, it is understood in relation to its purpose, its audience and the contexts in which it occurs.

Central to this functional perspective is the recognition that language does not exist in a vacuum; rather, it is constructed in context. Context here involves both the broader context of culture and the immediate context of situation (Alsagoff & James, 2007). According to systemic functional linguists, a language system is dynamic and ever-evolving as it is situated within a particular cultural context. Language is enabled and constrained by the culture of which it is a part, and in turn reflects and shapes that culture. Within the broader cultural context, people use language in a great variety of specific situations, such as writing an application letter to prospective employers. In other words, a writer’s choices are always context-dependent, motivated by variations in social activity, in writer-reader relations, and in the purposes of the interaction.

Apart from the functional view of language, ideas such as communicative competence (Canale & Swain, 1980) and situated learning (Lave & Wenger, 1991) have also contributed to the ESP view that written texts are produced to accomplish particular purposes and that writers must conform to the writing conventions valued by the targeted communities (Bhatia, 2004). Writing from the ESP perspective is hence conceptualised as a social endeavor, reflecting the widely accepted view that communication comprises reciprocity and depends on the negotiation or co-construction of meaning between two or more persons. Pedagogically, the ESP approach holds that students’ writing difficulties are often caused not by the technical aspects of grammar and organization, but by their failure to recognize the deeper cultural and epistemological underpinnings of writing, which are rarely made explicit to them (Street, 1995). As such, the teaching of ESP writing “steadfastly concerns itself with communication, rather than isolated bits of language, and with the processes by which texts are created and used as much as with texts themselves” (Hyland, 2013, p. 96). Such ESP texts are usually categorized into different text types, termed “genres”, according to their similar discursive practices and communicative purposes. Concomitantly, the most important goal of ESP writing instruction is to study, analyze and help learners control a range of genres valued in the academic or professional contexts they are likely to operate within. ESP writing ability hence becomes essentially synonymous with one’s competence in producing particular target genres (Hyland, 2007; Johns, 2002).

3.3 Genre as ESP Writing Competence

3.3.1 The Concept of Genre

The term “genre” dates back to the study of classical rhetoric and has been traditionally associated with literature and literary criticism. Modern conceptions of genre have extended its use beyond literary texts into professional, academic, and everyday forms of speech and writing. Today, genre has become “one of the most important and influential concepts in language education” (Hyland, 2004, p. 5) and the concept has been interpreted in a variety of ways. According to Hyon (1996), genre theorists and practitioners can be grouped into three broad, overlapping schools: the New Rhetoric (NR) School, the Systemic Functional Linguistics (SFL) School, and the ESP School. The NR approach is advocated by composition researchers in North America, who are more interested in the social and ideological significance of genres than in their rhetorical organization and linguistic features (Freedman & Medway, 1994).


The SFL approach, which originated in the Australian school context, puts emphasis on the broad genres of schooling, such as description, narration, exemplification and argumentation (Paltridge, 2002). The ESP approach—the major theoretical and pedagogical underpinning of the present study—is largely concerned with the use of genre as a tool for teaching and researching discipline-specific writing to L2 users in professional or academic settings (Bhatia, 2004; Swales, 2000). In a simpler taxonomy, Flowerdew (2011) dichotomized genre theorists into linguistic (the ESP School and SFL School) and non-linguistic (the New Rhetoric School) camps. The New Rhetoric group, influenced by post-structuralism, rhetoric and first language composition, defines genre “as the motivated, functional relationship between text type and rhetorical situation” (Coe, 2002, p. 195). Accordingly, the New Rhetoric group gives prominence to the rhetorical contexts in which genres are employed, as opposed to a detailed analysis of linguistic elements, and as a result, “its contribution to L2 writing instruction has been minimal” (Hyland, 2003, p. 22). SFL and ESP genre approaches are grouped under the linguistic camp in that they both concentrate on pedagogical analysis of the lexico-grammatical and rhetorical realization of communicative purposes embodied in a genre. As such, they have been closely engaged with issues of L2 teaching, and are unswerving in their efforts to equip students with knowledge of relevant genres so as to enable them to produce appropriate discourses in their Target Language Use (TLU) domain.

3.3.2 Genre in the ESP Tradition

To gain a better understanding of genre from the ESP perspective, the distinction between ESP and SFL approaches is worth discussing. In fact, despite the pedagogical commitments that ESP and SFL genre approaches share, they differ in subtle but important ways. Most obviously, they are different in their target learners, with SFL genre approaches generally targeting economically and culturally disadvantaged school-age children in Australia and ESP genre approaches generally targeting non-native speaker (NNS) learners in academic and occupational contexts. This difference in their target learners has important implications for how SFL and ESP approaches perceive and analyze valued genres. SFL scholars and teachers, for instance, tend to characterize genres in terms of broad rhetorical patterns, such as narratives, descriptions and arguments. ESP scholars and teachers, in contrast, are interested in the communicative needs of particular academic and professional groups, the specific writing practices of these groups and the conventions group members observe for those practices. Accordingly, community-specific genres used within academic or professional settings, such as “research article” and “literature review” in the case of EAP, and “e-mail” and “business report” in the case of EBP, are the focus of ESP learning and assessment. The differences in target learners and valued genres between SFL and ESP approaches highlight a related difference in the two schools’ understanding of context. Scholars in the SFL tradition locate genre at the macro level of context of culture (Martin, 1992), whereas ESP advocates


Table 3.1 ESP and SFL perspectives on genre

Orientation            ESP approach                         SFL approach
Target audience        NNS learners                         School-age children; migrant adults
Education context      Academic or occupational training    L1 school, adult training
Concept of context     Discourse community                  Context of culture
Sample genres          Report, letter, proposal             Explanations, recounts, description

Notes ESP = English for Specific Purposes; SFL = Systemic Functional Linguistics; NNS = Non-native Speaker

view genres within more specifically defined contexts, or discourse communities, where the genres’ communicative purposes are more specified and attributable. The differences between the two approaches are summarized in Table 3.1. In the ESP tradition, genres are defined as purposeful, staged and structured communicative events, motivated by various communicative purposes shared within specific discourse communities, and performed by members of those communities (Bhatia, 1993, 2004; Swales, 1990). These commonly occurring communicative purposes give rise to viable correlations between cognitive, rhetorical, and linguistic features in a genre that is sufficiently conventionalized in institutionalized academic or professional settings (Bhatia, 2004). Influenced by the seminal works of Swales (2000) and Bhatia (1993), the ESP genre approach features the analysis of the “move”, namely a “bounded communicative act that is designed to achieve one main communicative objective” (Swales & Feak, 2004, p. 35), as well as the linguistic choices undergirding a particular move. Concomitantly, ESP practitioners have emphasized the balance between generic features, such as move-associated linguistic features, and the underpinning disciplinary and rhetorical contexts. Such a context-sensitive approach to genre analysis has generated numerous descriptions of the rhetorical features as well as the recurring patterns of discipline-specific genres, which are considered fundamental to ESP writing tasks. The findings are then fed into ESP pedagogy in the form of syllabus or textbook design (Johns, 2003; Swales & Feak, 2004).

3.3.3 Genre Knowledge and ESP Writing

In response to student writers’ needs to adroitly perform a range of genres in academic or occupational settings, ESP teaching is committed to exposing learners to a detailed and explicit analysis of these genres. The written genres valued by the learners’ prospective discourse communities, for instance, often constitute the basis of ESP writing courses (e.g., Belcher, 2004; Cheng, 2008; Flowerdew, 2002). The assessment of ESP writing ability, accordingly, is largely concerned with measuring the extent to


which L2 learners are able to construct these genres effectively (Hyland, 2007). As such, ESP writing ability has become virtually synonymous with the ability to produce the target genres that learners might encounter to survive or thrive in various English-using contexts outside the ESL classroom, highlighting therefore the importance for ESP learners to develop genre knowledge. Defined as “an individual’s repertoire of situationally appropriate responses to recurrent situations” (Berkenkotter & Huckin, 1995, p. 4), genre knowledge is especially important to ESP writing for two reasons. Firstly, it stresses that genres are specific to particular cultures and communities, and such knowledge might be new or unfamiliar to the learners. Secondly, it urges ESP teachers to go beyond vocabulary and structures by incorporating into teaching the ways language is used in specific contexts to achieve various purposes (Hyland, 2007). Teaching genre knowledge, as such, is crucial to initiating L2 learners into the TLU domain as well as bridging the gap between classroom and workplace realities.

3.3.4 The Model of Genre Knowledge

3.3.4.1 The Four Dimensions

Genre is complex; hence genre knowledge “is necessarily multidimensional” (Johns et al., 2006, p. 239). Contemporary ESP studies that investigate what genre knowledge entails commonly distinguish among its various dimensions, such as the formal, subject-matter and rhetorical components. Adopting a view of genre that foregrounds the socially situated nature of writing knowledge, Tardy (2009) has gone a step further and explored how genre knowledge develops over time on the part of non-native English speakers. Based on a two-year longitudinal study of four L2 students studying in an American university (all from Asian countries, namely China, Japan, Thailand and South Korea), the study examined these writers’ developing expertise in composing disciplinary genres in instructional and non-instructional settings, collecting multiple sources of data including the writers’ written texts, feedback from mentors, student-mentor conferences, direct observations of the writing classroom and oral reflections on the texts and tasks in which the writers engaged. A fine-grained analysis of such ethnographic data enabled her to propose a dynamic model of Genre Knowledge (Fig. 3.1), conceiving genre knowledge as a four-dimensional entity that can be decomposed into Formal knowledge, Process knowledge, Rhetorical knowledge and Subject-matter knowledge (Tardy, 2009, pp. 20–22). The dimension of Formal knowledge mainly entails surface-level knowledge of generic forms, such as a genre’s organization, salient lexico-grammatical conventions and structural moves. Compared with the other three knowledge dimensions, Formal knowledge is always prioritized in L2 instructional as well as assessment contexts (Tardy, 2006). Process knowledge encompasses the procedural practices that writers go through to carry out the genre. It is particularly related to the writers’ composing processes for producing written genres, including


Fig. 3.1 The model of genre knowledge (from Tardy, 2009, p. 22)

how they maneuver their repertoire of writing knowledge to meet the demands of the writing task. Rhetorical knowledge is concerned with the writer’s knowledge of the discourse community and the socio-rhetorical context in which the genre occurs. Awareness of factors such as the reader-writer relationship is highlighted in this dimension. Rhetorical knowledge also includes cultural knowledge, a dimension which may pose particular challenges for L2 writers whose prior experiences are shaped by different sociocultural contexts. The Subject-matter dimension involves the disciplinary or professional knowledge required to produce a genre. Subject-matter knowledge is a dynamic concept whose content changes along with the domain for which a genre is written. For instance, in the case of business genres, subject-matter knowledge is equivalent to knowledge of business-related subjects. According to Tardy (2009), genre knowledge consists of these four dimensions, whose interaction gives rise to the sophisticated knowledge that allows expert writers to manipulate genres for various purposes. In the learning context, however, these dimensions can be relayed to learners sequentially as independent entities. For instance, at novice stages, L2 learners may focus on one or two comparatively easier dimensions (such as the generic moves or lexico-grammatical features of a genre) to the relative exclusion of others. When the learners become competent in these knowledge dimensions, they can attempt the more challenging task of mastering the other generic elements. Over time, learners will be able to integrate these knowledge dimensions, demonstrating automaticity as well as the kind of tacit knowledge characteristic of expert genre users.

3.3.4.2 Using the Model of Genre Knowledge to Develop the BE Writing Scale

It is well acknowledged that the underlying construct of a language scale should have a base in linguistic theory (North, 2000). Of the existing linguistic theories available to be adopted or adapted for the development of a language scale, the models of communicative competence proposed by Canale (1983), Canale and Swain (1980), Bachman (1990) or Bachman and Palmer (1996) have been chosen by many language scale developers (e.g., Council of Europe, 2001; Hawkey & Barker, 2004). Models of communicative competence, noted for being generic and context-independent, are considered effective in enhancing a scale’s generalizability and transferability across task types (Knoch, 2011). On the flip side, however, as these models are by default EGP-oriented, their applicability to the ESP domain is arguably limited. Making a case for the uniqueness of Language for Specific Purposes (LSP), with ESP being viewed as one of its branches, Douglas (2000) proposed a model of LSP ability which consists of three main components: Language knowledge (such as knowledge of vocabulary, syntax and cohesion), Strategic competence (such as the ability to evaluate the appropriateness of a task response) and Background knowledge (such as topic knowledge relevant to the language tasks). Douglas (2000) suggested that LSP assessment practices, including ESP test design and scale construction, should reflect the typical tasks in the TLU domain as well as the interaction between language knowledge and specific purpose content knowledge. Although models of communicative competence and LSP ability both offer important insights into what linguistic competence entails, they fail to address writing competence per se. In fact, several attempts have been made to develop a theory of writing, although an overarching model of L2 writing is still unavailable (Cumming, 1998).
Grabe and Kaplan (1996) advanced a model of text construction that is potentially useful for scale construction due to its emphasis on the observable elements of writing performance (Knoch, 2011). Specifically, this model outlines seven essential areas for writing: syntax, semantics, lexicon, cohesion, coherence, functional dimensions and non-linguistic resources, shedding light on the various aspects and variables involved in writing. The model of text construction is rigorous in its explanatory power as far as the construct of writing is concerned; however, as it is primarily targeted at general writing proficiency, it has limited bearing on ESP writing. Overall, despite their influence and popularity, the models of communicative competence, LSP ability and text construction mentioned above are considered insufficient to serve as the theoretical base for the development of a BE writing scale, as BE writing prioritizes the expertise to produce a range of genres valued by the TLU domain. The model of Genre Knowledge (Tardy, 2009, p. 5) was thus chosen as the underlying theoretical framework to illuminate the construct of BE writing in the study. Specifically, the reasons for choosing this model are elaborated below. First, given the inextricable relationship between BE and ESP, and the fundamental role played by genre in the teaching, learning and assessment of BE writing, BE writing proficiency is considered synonymous with “knowledge and skills required


to produce written genres valued by the business community”. Compared with other linguistic models that theorize language, language proficiency and communicative competence in second language studies, the model of Genre Knowledge, currently “the most elaborated and explicitly articulated theory of genre knowledge” (Driscoll et al., 2020, p. 74), bears the closest relevance to BE writing and stands out as an independent and useful theoretical tool to examine the components essential to the mastery of genre knowledge from the perspective of non-native writers. Second, one important aim of developing the model of Genre Knowledge is to support the task of genre instruction, especially where the writing challenges faced by multilingual students are concerned. By conceptually untangling the interrelated dimensions of knowledge that contribute to expert production of genres, the model sheds important pedagogical light on genre learning in a range of populations as well as “contexts in which English is a second or additional language” (Tardy, 2009, p. 5). Finally, although Tardy (2009) cautions that the hypothesized knowledge dimensions are meant to “merely serve a heuristic purpose” rather than represent “any kind of epistemic reality” (p. 20), evidence for the modular or componential nature of genre knowledge is accumulating (Gentil, 2011). In addition, the validity of this model has been empirically verified by a number of studies (e.g., Kim & Belcher, 2018; Zhang, 2013). For example, Zhang (2013) examined how Chinese EFL learners developed BE writing expertise in Chinese tertiary settings, demonstrating the model’s relevance to the Chinese context. Therefore, the nature of BE writing was approached from the perspective of genre, and the model of Genre Knowledge was adopted as the theoretical base for the establishment of the descriptive scheme of the BE writing scale under development.
Specifically, the four dimensions incorporated in the model, namely Formal knowledge, Process knowledge, Rhetorical knowledge and Subject-matter knowledge, were operationalized as the descriptive categories of the BE writing scale. An elaboration of these dimensions and how they were operationalized as the multi-layered descriptive categories of the BE writing scale is presented in Chap. 5.

3.4 Conclusion

This chapter illustrates the choice of a theoretical framework which underpins the development of a scale of BE writing proficiency. First, the relationship between BE and ESP was examined, with the former being considered a branch of the latter. Next, ESP writing competence was conceived from the perspective of genre, given the pivotal role played by genre in the teaching, learning and assessment of ESP writing. As an illuminating contribution to demystifying the components involved in the development of genre knowledge on the part of L2 writers, Tardy (2009) proposed the model of Genre Knowledge with its four dimensions. Thanks to the theoretical and empirical rigor of the model, it was adopted in the study as the theoretical framework


for the conceptualization of BE writing proficiency. How the model was employed as the theoretical base for the establishment of the descriptive scheme of the BE writing scale in focus will be detailed in Chap. 5.

References

Alsagoff, L., & James, J. E. (2007). From grammar to texts: A grammarian’s approach to text types. In T. R. F. Tupas, Y. Yi, & C. Nur (Eds.), Changing perspectives on pedagogical grammar (pp. 29–44). Singapore Association for Applied Linguistics.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F., & Palmer, A. (1996). Language testing in practice. Oxford University Press.
Belcher, D. (2004). Trends in teaching English for specific purposes. Annual Review of Applied Linguistics, 24(1), 165–186.
Belcher, D. (Ed.). (2009). English for specific purposes: Theory and practice. University of Michigan Press.
Belcher, D., Johns, A. M., & Paltridge, B. (Eds.). (2011). New directions in English for specific purposes research. University of Michigan Press.
Berkenkotter, C., & Huckin, T. (1995). Genre knowledge in disciplinary communication. Lawrence Erlbaum.
Bhatia, V. K. (1993). Analyzing genre: Language use in professional settings. Longman.
Bhatia, V. K. (2004). Worlds of written discourse: A genre-based view. Continuum International.
Canale, M. (1983). From communicative competence to communicative language pedagogy. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication (pp. 2–27). Longman.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47.
Cheng, A. (2008). Analyzing genre exemplars in preparation for writing: The case of an L2 graduate student in the ESP genre-based instructional framework of academic literacy. Applied Linguistics, 29(1), 50–71.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Coe, R. M. (2002). The new rhetoric of genre: Writing political briefs. In A. M. Johns (Ed.), Genre in the classroom: Multiple perspectives (pp. 197–210). Lawrence Erlbaum.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
Cumming, A. (1998). Theoretical perspectives on writing. Annual Review of Applied Linguistics, 18, 61–78.
Douglas, D. (2000). Assessing language for specific purposes. Cambridge University Press.
Driscoll, D. L., Paszek, J., Gorzelsky, G., Hayes, C. L., & Edmund, J. (2020). Genre knowledge and writing development: Results from the writing transfer project. Written Communication, 37(1), 69–103.
Dudley-Evans, T., & St John, M. (1998). Developments in ESP: A multi-disciplinary approach. Cambridge University Press.
Flowerdew, J. (2002). Genre in the classroom: A linguistic approach. In A. M. Johns (Ed.), Genre in the classroom: Multiple perspectives (pp. 91–102). Lawrence Erlbaum.
Flowerdew, J. (2011). Action, content and identity in applied genre analysis for ESP. Language Teaching, 44, 516–528.
Freedman, A., & Medway, P. (Eds.). (1994). Learning and teaching genre. Boynton/Cook.
Gentil, G. (2011). A biliteracy agenda for genre research. Journal of Second Language Writing, 20(1), 6–23.
Grabe, W., & Kaplan, R. B. (1996). Theory and practice of writing. Longman.
Halliday, M. A. K. (1970). Language structure and language function. In J. Lyons (Ed.), New horizons in linguistics (pp. 140–165). Penguin.
Hawkey, R., & Barker, F. (2004). Developing a common scale for the assessment of writing. Assessing Writing, 9, 122–159.
Hyland, K. (2003). Genre-based pedagogies: A social response to process. Journal of Second Language Writing, 12(1), 17–29.
Hyland, K. (2004). Genre and second language writing. University of Michigan Press.
Hyland, K. (2007). English for specific purposes: Some influences and impacts. In J. Cummins & C. Davison (Eds.), International handbook of English language teaching (pp. 391–402). Springer.
Hyland, K. (2013). ESP and writing. In B. Paltridge & S. Starfield (Eds.), The handbook of English for specific purposes (pp. 95–113). Wiley.
Hyon, S. (1996). Genre in three traditions: Implications for ESL. TESOL Quarterly, 30(4), 693–722.
Johns, A. M. (Ed.). (2002). Genre in the classroom: Multiple perspectives. Lawrence Erlbaum.
Johns, A. M. (2003). Genre and ESL/EFL composition instruction. In B. Kroll (Ed.), Exploring the dynamics of second language writing (pp. 195–217). Cambridge University Press.
Johns, A. M., & Dudley-Evans, T. (1991). English for specific purposes: International in scope, specific in purpose. TESOL Quarterly, 25(2), 297–314.
Johns, A. M., Bawarshi, A., Coe, R., Hyland, K., Paltridge, B., Reiff, M., & Tardy, C. (2006). Crossing the boundaries of genre studies: Comments by experts. Journal of Second Language Writing, 15(3), 234–249.
Jordan, R. R. (1997). English for academic purposes: A guide and resource book for teachers. Cambridge University Press.
Kim, M., & Belcher, D. D. (2018). Building genre knowledge in second language writers during study abroad in higher education. Journal of English for Academic Purposes, 35, 56–69.
Knoch, U. (2011). Rating scales for diagnostic assessment of writing: What should they look like and where should the criteria come from? Assessing Writing, 16, 81–96.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.
Martin, J. R. (1992). English text: System and structure. John Benjamins.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
Paltridge, B. (2002). Genre, text type and the EAP classroom. In A. M. Johns (Ed.), Genre in the classroom: Multiple perspectives (pp. 73–90). Lawrence Erlbaum.
Street, B. V. (1995). Social literacies: Critical approaches to literacy in development, ethnography and education. Longman.
Strevens, P. (1988). ESP after twenty years: A re-appraisal. In M. Tickoo (Ed.), ESP: State of the art (pp. 1–13). SEAMEO Regional Language Centre.
Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge University Press.
Swales, J. M. (2000). Languages for specific purposes. Annual Review of Applied Linguistics, 20, 59–76.
Swales, J. M., & Feak, C. B. (2004). Academic writing for graduate students: Essential tasks and skills. University of Michigan Press.
Tardy, C. M. (2006). Researching first and second language genre learning: A comparative review and a look ahead. Journal of Second Language Writing, 15(2), 79–101.
Tardy, C. M. (2009). Building genre knowledge. Parlor Press.
Zhang, Z. C. (2013). Business English students learning to write for international business: What do international business practitioners have to say about their texts? English for Specific Purposes, 32(3), 144–156.

Chapter 4

Developing a Scale of Business English Writing Proficiency: Considerations and Methods

Literature on language scale development is largely situated within the EGP domain, which is in practice often contrasted with ESP. When developing an ESP scale, we contend that the unique features of ESP should be taken into account besides drawing on valuable insights from the EGP domain. This chapter starts with an elaboration of the important issues concerning ESP assessment. Next, the fundamental principles for scale development are discussed, followed by an analysis of the methods commonly employed for language scale construction. In the final section, the multi-phased research design of the study is presented, outlining the procedures involved in the scale development process.

4.1 Important Considerations in ESP Assessment

4.1.1 EGP Versus ESP

Although there is considerable overlap between EGP and ESP, in practice they are often contrasted with each other. The essential difference between the two, according to Widdowson (1983, p. 6), lies in the way “purpose” is defined and the manner of its implementation: “In ESP ‘purpose’ refers to the eventual practical use to which the language will be put in achieving occupational and academic aims”. EGP, on the other hand, “seeks to provide learners with a general capacity to cope with undefined eventualities in the future.” Such differences in purposes are mirrored pedagogically. EGP programs place priority on the mastery of the English language, aiming to develop students’ general language ability. As such, the pedagogical objective in EGP classrooms is to enable a wide range of learners to apply the linguistic system in the most general types

© Springer Nature Singapore Pte Ltd. 2021 L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_4


of potential communicative circumstances. The classroom is essentially language-centered, where “language itself is the subject matter and the purpose of the course” (Robinson, 1980, p. 6). ESP programs, by contrast, do not aim at improving students’ overall English proficiency indiscriminately. The starting point for ESP curriculum development is an analysis of learners’ specialized language needs, which then leads to a detailed and selective specification of objectives geared to learners’ competence to complete language tasks in a specified context. As such, in ESP classrooms, language is not learnt for its own sake, but “to smooth the path to entry or greater linguistic efficiency in academic, professional or workplace environments” (Basturkmen, 2006, p. 18).

4.1.2 Key Features of ESP Assessment

In line with ESP teaching, ESP assessment also demonstrates “a move away from discrete-item linguistically oriented approaches in the direction of better contextualized performance-based measures” (Basturkmen & Elder, 2004, p. 672). Generally used to refer to the assessment of English oriented to specified communicative needs, ESP assessment derives its test content and methods from specified domains of language use, such as English for Business, English for Tour Guides, or English for Pilots (Douglas, 2001). According to Douglas (2001), ESP assessment demonstrates three key features: (1) authenticity, (2) specificity, and (3) the interaction between language knowledge and background knowledge.

(1) Authenticity

Authenticity refers to “the degree of correspondence of the characteristics of a given language test task to the features of a TLU task” (Bachman & Palmer, 1996, p. 23). For general purpose assessment, authenticity is among the several aspects contributing to test quality and “may not necessarily be the prime consideration” (Elder, 2016, p. 148). In the field of specific purpose assessment, however, authenticity is regarded as a critical validity requirement and is of paramount importance to the construct validity of a test (Douglas, 2000). Emphasis on authenticity also renders specific purpose assessment tools more motivating due to the closer alignment between test tasks and tasks in the TLU domain (Basturkmen & Elder, 2004). A key requirement for authenticity in specific purpose assessment is establishing what constitutes effective communication in the specified domain of interest, whether in the business domain (the focus of the current study) or other communicative settings. With regard to the development and/or validation of an ESP assessment tool, be it an ESP test or an ESP proficiency scale, Douglas (2000) argues that the critical characteristics of the assessment tasks should be matched to task features of the TLU situation. To this end, collaboration among stakeholders of the assessment tool, such as applied linguists and subject specialist informants with insider knowledge of the domain, is advocated, as all can bring invaluable insights to the process of


assessment (Elder & McNamara, 2016). The involvement of subject specialists is particularly emphasized, as their unique understanding of effective ESP communication is supposed to contribute to “greater authenticity for a test in terms of its capacity to represent the domain” (O’Hagan, Pill & Zhang, 2016, p. 196). Empirical studies, however, have shown that domain experts and language professionals may not always view the essential elements of successful ESP communication in the same light (Jacoby & McNamara, 1999; Knoch, 2009). It is therefore advised that carefully weighed decisions should be made to balance the expertise of language professionals and domain experts in ESP assessment endeavors so as to achieve “commensurability (or otherwise) of each party’s perspective on language and communication” (Elder, 2016, p. 151).

(2) Specificity

The second distinguishing feature of ESP assessment lies in specificity. According to Basturkmen and Elder (2004), in specified contexts of communication, language is used in constrained and fairly predictable ways and demonstrates a repertoire of context-specific features. Take English for Nursing as an example. Phonetically, differences between the phonemes /l/ and /n/, as in low blood pressure versus no blood pressure, are of paramount importance, and serious consequences can arise if a nurse confuses the two (Cameron, 1998). At the lexical level, synonyms such as stomach, belly, tummy, abdomen, and gut can be used fairly interchangeably in general purpose contexts as they refer to the same part of the body; however, nurses are expected to be acutely aware of the subtle differences among these words, whose connotations may vary significantly depending on variables such as the age, gender and social class of the patients (Cameron, 1998). Grammatically, as the ability to describe procedures is accentuated in nurses’ workplace communication, syntactic structures such as imperatives, passives and expressions of time are prioritized in a nurse’s grammatical repertoire (Marston & Hansen, 1985). A central consideration in developing and/or validating ESP assessment tools, therefore, is to identify and reflect such field-specific characteristics of language use valued in the TLU situations (Flowerdew & Peacock, 2001). In fact, a number of empirical findings have confirmed that there do exist lexical, semantic, syntactic and phonological characteristics, in addition to rhetorical, pragmatic, and sociolinguistic ones, unique to a specific domain of language use (e.g., Hyland, 2008; Nelson, 2006; Paltridge, 2009).
Corpus and genre studies, in particular, have brought about increasingly sophisticated characterizations of language use in ESP branches such as BE (e.g., Nelson, 2006; Zhang, 2005), English for Science and Technology (e.g., Parkinson, 2013; Ward, 2009) and English for Academic Purposes (Bruce, 2008; Coxhead, 2010). As these features “allow for people in that field to speak and write more precisely about the field in a way that outsiders sometimes find impenetrable” (Douglas, 2001, p. 48), they are expected to be closely studied and adequately reflected in ESP assessment tools to offer the kind of precision rarely found in EGP tests or scales.


(3) The Interaction Between Specific Purpose Language and Background Knowledge

The third distinguishing feature, and arguably the “defining feature” (Douglas, 2001, p. 46) of specific purpose assessment, is the recognition that language ability and background knowledge are inextricably related. “Background knowledge”, also called “content knowledge”, “content-specific knowledge” or “subject-matter knowledge”, is non-linguistic knowledge retrieved to complete language tasks (Douglas, 2000). For example, performing an e-mail writing task requires knowledge of what e-mails are and how they work. In general purpose assessment, background knowledge is considered a confounding variable that might contribute to measurement error and construct-irrelevant variance (Douglas, 2000; Fulcher, 2000; Messick, 1989). However, when it comes to ESP assessment, it is contended that language knowledge and background knowledge are indistinguishable and both act as legitimate variables to be assessed (e.g., Douglas, 2000; O’Hagan, Pill & Zhang, 2016). In fact, a host of studies have reported that background knowledge plays a significant role in the process of performing ESP tasks (e.g., Alderson & Urquhart, 1988; Clapham, 1996). Clapham (1996), for instance, examined the effect of test candidates’ background knowledge on a test of English for Academic Purposes. She found a correlation between content knowledge and test score, although the extent varied with the specificity of the test and the language proficiency of the candidates. In a similar vein, Fox et al. (1997) examined the role of background knowledge in the listening section of an ESP test and also identified a significant interaction between language proficiency and background knowledge. In light of these empirical findings, Douglas (2001) contends that if background knowledge plays no part in candidates’ performance, it is legitimate to question whether the tests are ESP tests at all.
The three features of specific purpose assessment not only differentiate it from general purpose assessment, but also serve as essential guidelines for the development and/or validation of ESP assessment tools, and in the case of this study, the construction of a proficiency scale of BE writing. As such, the three features of ESP assessments will be given due consideration throughout the process of developing and validating the proficiency scale. In the meantime, it is also important to note that ESP assessment is not the diametric opposite of EGP assessment; instead, there is considerable overlap between the two. Therefore, insights gained from the EGP domain in developing assessment tools will be utilized or adapted to guide ESP assessment, where appropriate. In what follows, we review general principles of scale development and in particular, methods for language scale development as they have implications for the design of the study.

4.2 General Principles of Scale Development


4.2.1 Types of Measurement Scales

As defined by Stevens (1946), measurement involves the assignment of values or numbers to objects according to certain specified rules. Scales are used to indicate the numerical characteristics of measurement. Traditionally, measurement scales are grouped into four types, namely nominal, ordinal, interval, and ratio scales, which differ from one another in the meaning given to the assigned values and in the relationship between the values for a given variable. In the field of language scale development, North (2000) proposed a new type of scale, the “linear scale”, which is discussed at the end of this section after the four conventional types.

(1) Nominal Scales

Qualitative in nature, nominal scales are typically used for classification, and the numbers assigned to the objects serve as tags to categorize. As such, a variable is usually divided into two or more categories in nominal scales. A typical example of a nominal scale is the scale used to collect gender information, where an individual can be categorized as either male (e.g., 1) or female (e.g., 2). The numbers assigned to the objects do not represent an amount or quantity, nor do they indicate superiority or seniority; they are used simply for the purpose of classification. As such, any numbers could be used in the scale to indicate categorization. Nominal scales enjoy high reliability as they are often created by generating responses to closed-ended questions. The downside of such scales, however, is that participants are not able to express degrees of response.

(2) Ordinal Scales

The dominant feature of the ordinal scale is order, where numbers are used to indicate the order of quality or quantity. Common examples of ordinal scales include those employed in surveys that involve several points (e.g., disagree, strongly disagree), depicting non-mathematical ideas such as satisfaction, happiness, or sympathy. Ordinal scales posit that a hierarchical order exists among the values; however, the degree of variation on ordinal scales is not fixed. In other words, the difference between the first and second values does not necessarily have the same meaning as that between the second and third. For instance, in a running race, the runners finishing first and second might be separated by a matter of milliseconds, whereas those finishing second and third might be separated by seconds. Therefore, on an ordinal scale, the rank of values counts, but the difference or distance between the values does not matter much.
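The race example can be made concrete in a few lines of Python (an illustrative sketch; the finishing times are invented):

```python
# Finishing times in seconds (hypothetical): the ranks are ordinal,
# but the gaps between adjacent ranks are far from equal.
times = {"Runner A": 9.812, "Runner B": 9.815, "Runner C": 12.304}

ranking = sorted(times, key=times.get)  # the order is meaningful
gap_1_2 = round(times[ranking[1]] - times[ranking[0]], 3)
gap_2_3 = round(times[ranking[2]] - times[ranking[1]], 3)

print(ranking)            # ['Runner A', 'Runner B', 'Runner C']
print(gap_1_2, gap_2_3)   # 0.003 vs 2.489: equal rank steps, unequal distances
```

The rank difference between first and second place equals that between second and third (one place each), yet the underlying time differences are wildly unequal, which is exactly why ordinal values cannot be meaningfully subtracted.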


(3) Interval Scales

The interval scale possesses all the characteristics of an ordinal scale. Its distinguishing feature lies in the constant or equal interval between the values on the scale. In other words, interval scales are characterized not only by order but also by equal distances between the values on the scale. As such, the differences between values can be compared. However, one major disadvantage of interval scales is the lack of an absolute zero, which refers to the complete absence of the characteristic being measured. Since the interval scale has no true zero point, it is impossible to draw conclusions about how many times higher or lower one value is than another. The most popular example of an interval scale is the temperature scale. The difference between 40 degrees F and 50 degrees F, for instance, is the same as the difference between 70 degrees F and 80 degrees F. However, although there is a value on the scale named zero degrees F, it does not mean the complete absence of temperature, so it is not a true zero. As a result, it cannot be claimed that one feels twice as warm at a temperature of 40 degrees F as at 20 degrees F.

(4) Ratio Scales

Ratio scales have all of the characteristics of interval scales as well as a true zero: they have order and equal distances between values. Whereas an interval scale describes differences between scale values in scale points, a ratio scale can compare values by ratios. On ratio scales, values can be systematically added, subtracted, multiplied and divided. Physical characteristics of persons and objects such as height and weight can all be measured with ratio scales. For instance, if two desks are measured in terms of height and Desk A is 10 inches tall while Desk B is 20 inches tall, it can be said that Desk B is twice as tall as Desk A. Due to their inherent precision, ratio scales are widely used in the physical sciences.
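The contrast between the last two scale types can be checked numerically. Converting Fahrenheit readings to Kelvin, a ratio scale for temperature whose true zero is absolute zero, shows why "twice as warm" claims fail on the Fahrenheit scale (a minimal sketch):

```python
def f_to_kelvin(f: float) -> float:
    """Convert degrees Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (f - 32.0) * 5.0 / 9.0 + 273.15

# Equal intervals are preserved across the conversion ...
assert abs((f_to_kelvin(50) - f_to_kelvin(40)) -
           (f_to_kelvin(80) - f_to_kelvin(70))) < 1e-9

# ... but Fahrenheit ratios are meaningless: 40 F is only about 4%
# "warmer" than 20 F in true (Kelvin) terms, not twice as warm.
print(round(f_to_kelvin(40.0) / f_to_kelvin(20.0), 3))  # 1.042
```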

(5) Linear Scales

In most cases, the interval and ratio scales are preferred to nominal and ordinal scales because the former two can quantify the differences between values, hence providing more information. However, it is highly unlikely for scales of language proficiency to be interval scales on which the steps are of equal size, like a ruler or thermometer; nor can the kind of precision demanded by ratio scales be applied to the description of language proficiency. North (2000) hence argued that the kind of scale that can serve as a common framework for the teaching, learning and assessment of language proficiency does not fall into any of the four conventional types discussed above. He proposed a fifth type of scale, lying somewhere between ordinal and equal-interval scales, and named it the “linear scale”, a type not usually mentioned in traditional classifications of scales. According to North (2000), Rasch measurement theory is the only statistical method currently available to convert raw data on ordinal scales into linear scales. In fact, the Rasch model has been used extensively in the field of language assessment (e.g., Fan & Bond, 2019; McNamara, 1996; McNamara & Knoch, 2012), and it has proved useful for calibrating the difficulty levels of the descriptors in language scales on linear scales, as exemplified by the development of high-stakes language scales such as the CEFR (North, 2000) and the CSE (Liu, 2019).
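The core of the Rasch model referred to here can be sketched in a few lines of Python (a simplified dichotomous form; operational calibration with programs such as Winsteps involves full estimation procedures well beyond this illustration):

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch model: probability that a person endorses
    (or succeeds on) an item. Both parameters are in logits, so a
    given difference (ability - difficulty) yields the same log-odds
    anywhere on the scale -- the linearity property discussed above.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals descriptor difficulty, the probability is 0.5,
# regardless of where on the logit scale the two values sit.
print(rasch_probability(-1.5, -1.5))  # 0.5
print(rasch_probability(2.5, 2.5))    # 0.5
```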

4.2.2 Essentials of Valid Measurement Scales

In recognition of the complexity involved in the assessment of human characteristics, Thorndike (1904), a major contributor to the modern understanding of educational and psychological measurement, argued for the need to “measure things, differences, changes and relations” in the social sciences (1904, p. 4). In order to develop scales comparable to those in the physical sciences, Thorndike (1904) advanced five essentials of valid scales: (1) objectivity, (2) consistency, (3) definiteness of the facts, (4) comparability, and (5) reference to a defined zero point. These essentials also apply to language proficiency scales (North, 2000); each is briefly described below.

(1) Objectivity

One of the major measurement problems encountered in the social sciences is the lack of objectivity in scales. Thorndike (1904) proposed that a scale would be considered “objective” if it meets two criteria: (1) the scale should be sample-free and its function must be independent of the objects of measurement; (2) the values attached to the scale must be as free as possible, and preferably entirely free, from the actual opinions of individuals or groups. When it comes to language proficiency scale development, many scales developed using classical test theory (CTT) failed to meet the criterion of “objectivity” because statistics based on CTT are sample-dependent (Henning, 1984). By contrast, the Rasch model meets both criteria, producing measures that are sample-free for the agents (here: the scale) and test-free for the objects (here: the people) (North, 2000). This unique strength of Rasch measurement theory explains why an increasing number of language proficiency scales have been developed using the Rasch model (e.g., Council of Europe, 2001).

(2) Consistency

In CTT, consistency refers to the reliability of measurement. A measurement tool is considered consistent if it can reliably separate a sample along an ability continuum. The concept is therefore related to psychometric unidimensionality: a consistent scale distinguishes high- and low-achieving individuals in the sample reliably on the single construct being measured.


Table 4.1 Sample of Bachman and Palmer’s scale (1983) (adapted from Bachman, 1990, p. 327)

0  Extremely limited vocabulary (a few words and formulaic phrases; not possible to discuss any topic, because of limited vocabulary)
1  Small vocabulary (difficulty in talking with examinee because of vocabulary limitations)
2  Vocabulary of moderate size (frequently misses or searches for words)
3  Large vocabulary (seldom misses or searches for words)
4  Extensive vocabulary (rarely, if ever, misses or searches for words; almost always uses appropriate word)

When developing a language proficiency scale, consistency is an important property because the scale is intended to distinguish language learners at different levels reliably on the language construct that the scale captures, in this study, Business English writing proficiency. Most computer programs for the Rasch model (e.g., Winsteps, ConQuest) provide separation statistics which are similar to a KR-20 estimate in CTT analysis (Wright & Masters, 1982), and can be interpreted as evidence of reliability. In addition, the property of unidimensionality can be further assessed through other indices such as item and person fit statistics and/or statistical procedures such as Rasch factor analysis (e.g., Fan & Bond, 2019).
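The KR-20 estimate mentioned above has a simple closed form; a minimal sketch for dichotomous (0/1) data follows. (The separation and reliability indices reported by Winsteps or ConQuest are computed from Rasch measures rather than raw scores, but they are interpreted analogously.)

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    `responses`: one list of 0/1 item scores per examinee.
    KR20 = k/(k-1) * (1 - sum(p_i * q_i) / var(total scores)),
    where p_i is the proportion of examinees scoring 1 on item i.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(person) for person in responses]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n  # population variance
    pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)
```

For a small Guttman-ordered dataset such as `[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]`, this yields a reliability of 0.75.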

(3) Definiteness of Facts

The criterion of “definiteness of facts” places emphasis on a scale’s ability to produce definite descriptions of facts. Language scales are accordingly expected to be behaviorally based and to describe proficiency in terms of real-world performance in specific contexts. For instance, the descriptions of the levels should specify the particular tasks associated with each level in the scale. Language scales devoid of concrete descriptions of language behaviors may give rise to problems such as overreliance on qualifiers (Alderson, 1991). The decontextualized type of scale, as exemplified by the scale proposed by Bachman and Palmer (1983, cited in Bachman, 1990) in Table 4.1, might seem straightforward in application. However, by using terms like “small vocabulary” and “vocabulary of moderate size”, the scale becomes normative in nature and fails to meet the criterion of definiteness.

(4) Comparability

Comparability addresses measurement issues related to the use of calibrated scales. According to Engelhard (1992), the usefulness of a scale is achieved when the objects to be measured can be compared easily with the calibrated objects that define the scale. Specifically, scales that meet the criterion of comparability should be appropriate to the TLU domain and the context in which they are used (Brindley, 1991). In addition, the orientation of the scale must fit the purpose for which it is designed, whether the scale is constructor-oriented, user-oriented, assessor-oriented or diagnosis-oriented (see Chap. 2 for a detailed discussion of scale purposes).


(5) A Defined Zero

Thorndike (1904) also points out that it is essential to know how the zero point of a measurement scale is defined, since its definition and location influence the interpretation of the scores obtained with the scale. When using the Rasch model in scale development, the mean of the descriptor difficulty values is usually set by default at zero, so that participants’ abilities can range, in principle, from negative to positive infinity on either side of this origin. When performing many-facets Rasch analysis involving the severity levels of judges, one can likewise set the mean of the judges’ severity levels at zero, so that the descriptor difficulties and the participants’ abilities are allowed to vary.
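The zero-setting convention described above amounts to centering: subtracting the mean descriptor difficulty so that the logit scale's origin is defined by the item set rather than by any external standard (a trivial sketch with invented values):

```python
def center_difficulties(difficulties):
    """Re-anchor a set of Rasch descriptor difficulties (in logits) so
    that their mean is zero -- the default origin convention noted above."""
    mean = sum(difficulties) / len(difficulties)
    return [d - mean for d in difficulties]

centered = center_difficulties([0.8, 1.2, 2.2, 3.8])
print(centered)       # mean-centered logits
print(sum(centered))  # 0.0 (up to floating-point rounding)
```

The relative spacing of the descriptors, which carries the measurement information, is untouched; only the arbitrary origin moves.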

4.3 Common Methods of Scale Construction

4.3.1 Intuitive Versus Empirical Scale Development Methods

In the field of language scale development, a distinction has been made between intuitively and empirically developed scales (Council of Europe, 2001). Intuitive methods rely primarily on expert judgment and the “principled interpretation of experience” (Council of Europe, 2001, p. 208). Empirical methods, by contrast, are characterized by their emphasis on data collected from empirical sources. Previously, most scales were developed by appeal to the intuition of an individual or a committee perceived to be expert(s) in the teaching and assessment of the construct of interest. Drawing largely upon experience and knowledge, the experts decide what a language scale should incorporate and what it should look like. A number of researchers (e.g., Fulcher, 1996; Knoch, 2009; Turner & Upshur, 2002) have maintained that such intuitive design of language scales, relying exclusively on expert judgment, is problematic in that the underlying rationale for placing certain descriptors at a certain level is often implicit, if not unjustified. In a similar vein, North (2002) has pointed out that whilst intuitively developed scales may be appropriate as in-house, low-stakes scales for a familiar population of learners, they suffer from “a fundamental lack of demonstrated validity” (p. 88). As a result, the acceptability of such scales has to “rely primarily on the authority of the scholars involved in their definition, or on the political status of the bodies that control and promote them” (De Jong, 1990, p. 72). Recognition of the limitations of intuitively derived language scales has given rise to empirically based methods of scale development, which can be divided into “quantitative” and “qualitative” ones according to the type of data they draw on.
Quantitative methods feature the use of statistical tools in data analysis such as G-theory (e.g., Lynch & McNamara, 1998; Sawaki, 2007), multi-dimensional scaling (Chalhoub-Deville, 1995), and in particular, Many-facets Rasch Measurement (MFRM), which, in the context of language assessment, provides a means for exploring interactions between the ability of individual test-takers, the severity and internal consistency of raters, and the difficulty of tasks or assessment criteria (Sawaki, 2007). Qualitative approaches to scale development, on the other hand, include analysis of learner performance data, stakeholder perceptions and “think-aloud” data collected during the assessment process (e.g., Galaczi et al., 2011; Knoch, 2009; Lazaraton, 2002). In recent decades, mixed-methods designs incorporating both quantitative and qualitative data collection methods have been adopted with a view to gathering multiple sources of evidence for scale development and validation (e.g., Cumming et al., 2001; Shaw & Weir, 2007).
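The many-facets Rasch model mentioned here is, in its common rating-scale formulation (following Linacre's standard presentation rather than any formula in this book), an additive model in log-odds, which makes explicit how rater severity enters alongside ability and difficulty:

\[
\log\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
\]

where \(P_{nijk}\) is the probability that test-taker \(n\) receives rating category \(k\) rather than \(k-1\) from rater \(j\) on task or criterion \(i\); \(B_n\) is the test-taker's ability, \(D_i\) the task or criterion difficulty, \(C_j\) the rater's severity, and \(F_k\) the threshold of category \(k\), all expressed in logits.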

4.3.2 Performance-Based Versus Descriptor-Based Scale Development Methods

Given the pivotal role played by descriptors in language scales, scale development methods can also be divided into performance-based and descriptor-based ones, according to how the scale descriptors are arrived at. The performance-based method places priority on learner performance data, in most cases samples representing test-takers’ language use. It features the identification of performance features by conducting a discriminant evaluation of test-takers’ language samples on the basis of conversation or discourse analysis (Fulcher, 1987), constructing a Yes/No binary algorithm to identify key components that separate performances into adjacent levels (Turner & Upshur, 2002; Upshur & Turner, 1995), or a combination of the two (Fulcher et al., 2011). The performance-based method is clearly effective in producing scales characterized by rich and elaborate descriptions of language performance on typical tasks in a given communicative context. On the flip side, however, the resulting scales suffer from a lack of generalizability to contexts other than the one they were developed for (Fulcher et al., 2011). Consequently, although the performance-based method might work well for specific rating occasions, it is not ideal for capturing a comprehensive picture of learners’ overall language proficiency. The other type of scale development method is measurement-driven, involving the direct scaling of performance descriptors. Currently, the most influential measurement-driven approach is the direct scaling of descriptors using Rasch measurement theory, as demonstrated by the development of the CEFR (Council of Europe, 2001).
Descriptors of the scale are no longer restricted by the analysis of actual learner performance data, usually derived from a very limited number of language tasks; rather, descriptors can be collected from an enormous repository of resources pertaining to descriptions of the construct being examined. With regard to the selection and assignment of descriptors to different proficiency levels, the measurement model (e.g., Rasch) acts as an external arbiter, ensuring that only descriptors with robust statistical properties survive the scale development process. As such, the method stands out as a more desirable approach for producing a language proficiency scale to be used as a common point of reference across a range of teaching, learning and assessment contexts.
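The "external arbiter" role described above can be illustrated with a hypothetical post-calibration step: dropping misfitting descriptors and banding the survivors by their scaled difficulty. All descriptor names, fit bounds and cut-offs below are invented for illustration; real studies derive them from the Rasch analysis itself:

```python
def band_descriptors(calibrations, cutoffs, fit_range=(0.5, 1.5)):
    """Assign well-fitting descriptors to proficiency bands.

    calibrations: (name, difficulty_in_logits, infit_mean_square) triples.
    cutoffs: ascending logit boundaries between adjacent levels.
    Returns {level: [descriptor names]}; misfitting descriptors are dropped.
    """
    bands = {level: [] for level in range(len(cutoffs) + 1)}
    for name, logit, infit in calibrations:
        if not fit_range[0] <= infit <= fit_range[1]:
            continue  # statistically problematic descriptor is eliminated
        level = sum(logit >= cut for cut in cutoffs)  # count cut-offs passed
        bands[level].append(name)
    return bands

pool = [("greets client appropriately", -1.2, 1.0),
        ("drafts a routine enquiry e-mail", 0.3, 0.9),
        ("negotiates contract terms in writing", 2.0, 1.1),
        ("noisy descriptor", 0.1, 2.4)]
print(band_descriptors(pool, cutoffs=[0.0, 1.0]))
```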

4.4 Four Phases of Developing a Scale of BE Writing Proficiency

As the study was aimed at developing a BE writing proficiency scale, intended to be used in the Chinese tertiary context as a point of reference for BE teaching, learning and assessment, the descriptor-based scale development method was adopted due to its capacity to accommodate a much wider range of descriptions of the construct than the performance-based method, as discussed above. The distinctive advantages of the method have been summarized by North and Schneider (1998, pp. 222–223):

(1) Criterion statements: Each descriptor of the scale is formulated as a standalone criterion statement that can serve as an independent statement of language proficiency.
(2) Curriculum relevance: The exploitation of various sources of information facilitates the formulation of a pool of descriptors that are potentially useful in the teaching, learning and assessment contexts of a construct.
(3) Consensus interpretation: The exploitation of various sources of information facilitates the formulation of comprehensible and relevant descriptors. Using questionnaires to elicit stakeholders’ ratings of descriptor difficulty helps base the descriptor levels on the collective interpretation of potential scale users.
(4) Quality control: Poorly drafted or irrelevant descriptors can be eliminated during the initial descriptor development stage by consulting with informants. Besides, Rasch analysis of the questionnaire data helps differentiate high-quality descriptors from problematic ones.
(5) Objectively scaled descriptors: By resorting to the Rasch measurement model in data analysis, scale developers can obtain stable and objective descriptor values calibrated through empirical data.

Situated within the descriptor-oriented scale development approach, the study also took advantage of the strengths of a range of intuitive, quantitative and qualitative approaches through the adoption of a mixed-methods research (MMR) design. MMR, known for the use of “more than one kind of technique for gathering, analyzing, and representing human phenomena” (Greene, 2007, p. 119), examines the complex nature of a social world from multiple perspectives and through multiple methodological lenses. In fact, MMR has recently been increasingly employed in the field of language scale development and validation due to its power to integrate and triangulate valuable insights derived from multiple sources of evidence (e.g., Cumming et al., 2001; Shaw & Weir, 2007).


Fig. 4.1 The four phases of scale development and validation

MMR may take several forms; in this study, the “developmental MMR design” (Jang et al., 2014) integrating intuitive, qualitative and quantitative data collection methods was employed, as it allows the researcher to collect different types of data independently and explore them in a complementary and cumulative manner. In addition, the research process is iterative, involving both inductive and deductive reasoning, and the preliminary results from one method can be consolidated to inform facets of the other (Jang, 2013). Specifically, the study was carried out in four phases, as diagrammatically presented in Fig. 4.1.

Phase I: Establishing an initial descriptive scheme and collecting descriptors

In the first phase of the study, intuitive data were explored to (a) establish a descriptive scheme of the BE writing scale, and (b) create a pool of descriptors tentatively placed under the categories of the descriptive scheme. Drawing on the experience of developing the CEFR (Council of Europe, 2001), a descriptive scheme of the BE writing scale was established, incorporating two broad descriptive categories: one theory-based, to enhance the scale’s theoretical value, and the other activity-based, to enhance the scale’s practical utility. As discussed in Chap. 3, the construct of BE writing was conceptualized from the perspective of genre, due to the pivotal role that genres play both in BE writing classrooms and in actual communication in the business arena. The model of “Genre Knowledge” proposed by Tardy (2009) was hence adopted as the framework for developing the theory-based categories of the descriptive scheme. In addition to the theory-informed descriptive categories, activity-based categories were determined by identifying the most typical written genres valued by the international business community.
Identification of the genres was carried out by reviewing relevant documents (e.g., curricular requirements, tests, textbooks, empirical studies) exploring the written genres that Chinese BE learners are likely to encounter in the TLU domain. Besides establishing the descriptive scheme, a pool of descriptors was also collected. Sources for descriptor accumulation included the assessment criteria used by BE tests, descriptions of BE writing competence from pedagogical sources (e.g., syllabi, curricular requirements) as well as recruitment websites. The descriptors were then tentatively put under the descriptive categories of the descriptive scheme. For categories where few existing descriptors were available, new ones were drafted by portraying the salient performance characteristics associated with different proficiency levels. Detailed information concerning Phase I of the study is presented in Chap. 5.

Phase II: Examining the relevance of the descriptive scheme and descriptors to China’s tertiary context

In Phase II of the study, the descriptive scheme and descriptors were qualitatively validated by a group of experienced university teachers of BE writing in China to enhance their relevance to the Chinese tertiary context. Specifically, two workshops were held in which the teachers discussed whether the categories of the descriptive scheme served as useful parameters for describing BE writing proficiency, and whether the descriptors were informative enough to gauge the proficiency levels of BE learners in Chinese universities. The first workshop was devoted to the qualitative validation of the descriptive scheme. Training was first provided to familiarize the teachers with the concept and functions of proficiency scales, using the CEFR as an example. Next, the descriptive scheme of the BE writing scale derived from Phase I of the study was shown to the teachers, who went through all the descriptive categories, commented on their relevance to BE education in Chinese tertiary settings, and discussed to what extent these categories could be described across proficiency levels. In view of the findings from the first workshop, the initial descriptive scheme was revised. The theme of the second workshop, which spanned three sessions, was to select the descriptors of the best quality to be used in the next phase of the study. The first session of Workshop II was held one month after Workshop I.
During the interval, the teachers evaluated all the descriptors in terms of descriptor categorization and descriptor quality on an individual basis. In the second and third sessions, the teachers were divided into small groups of five and compared their individual evaluations of the descriptors with their group members. For descriptors considered problematic by one or more teachers (with respect to either descriptor categorization or descriptor quality), the group members worked together to determine whether the descriptors should be retained as they were, reworded, or deleted. In the last session of Workshop II, all the teachers together went through the best descriptors, demonstrating satisfactory category fit and descriptive rigor, selected by the small groups in the previous two sessions. Revisions and selections were made again where necessary, yielding a pool of high-quality descriptors to be used in the next phase of the study. Detailed information concerning the second phase of the study is presented in Chap. 6.

Phase III: Scaling the descriptors

The descriptors that survived qualitative validation in Phase II were subsequently used to develop a questionnaire in Phase III of the study. The questionnaire was then administered to Chinese university students, who were asked to rate the difficulty levels of the descriptors. The student participants were chosen because (a) they had received systematic and comprehensive training in BE writing, (b) they were familiar with the writing tasks in the descriptors, and (c) they constituted an important group of stakeholders of the BE writing scale under construction. The questionnaire data were then analyzed using the Rasch model, which produced stable and objective descriptor values calibrated through empirical data (North, 2000). After checking analysis results such as data-model fit and unidimensionality, misfitting items were identified and removed from the descriptor pool. Cut-offs were then set to differentiate proficiency levels on the basis of the descriptors’ scaled values. Such quantitative analysis was complemented by a thorough check of content consistency to ensure content coherence and logical progression across the bands of descriptions, leading to the draft version of the BE writing scale. A “global scale”, offering a panoramic view of the multi-dimensional BE writing scale, was also established to help potential users gain a general overview of the descriptions of BE writing proficiency. A detailed account of Phase III of the study is presented in Chap. 7.

Phase IV: Validating the scale

In the final phase of the study, the draft scale was validated by a group of experts consisting of both teaching experts and domain specialists. Specifically, the experts were asked to evaluate the appropriateness of the descriptor categorization and descriptor level assignment in the draft scale through quantitative and qualitative methods. The participants first completed a questionnaire developed on the basis of the draft scale’s descriptors. They were asked to (a) evaluate whether the descriptors were appropriately placed under their respective categories, and (b) rate the difficulty level associated with each descriptor.
The purpose was to examine whether the descriptors of the draft scale were properly categorized and whether the level assignment of the descriptors was reasonable. In addition, semi-structured interviews were conducted after the experts' completion of the questionnaire to explore in greater depth their perceptions of the scale. An interview protocol was designed to probe into questions such as the experts' overall impression of the scale and their perceptions of the quality of the descriptors. The questionnaire data were submitted to many-facet Rasch analysis. Correspondence between the analysis results and the initial scale structure, both vertically and horizontally, was taken as validity evidence for the newly developed scale. The interviews were digitally recorded, transcribed verbatim, and thematically coded following the procedures recommended by Corbin and Strauss (2008). In light of the findings, necessary revisions were made to further enhance the validity of the scale. Detailed information on Phase IV of the study is presented in Chap. 8.
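The Rasch procedures used in Phases III and IV can be made concrete with the standard model equations. The sketch below uses the common rating-scale and many-facet parameterizations (see Wright & Masters, 1982, for the former); the text does not spell out the exact model variants applied, so these equations should be read as an illustrative summary rather than the study's own specification.

```latex
% Phase III (rating scale model): probability that student n chooses
% response category k rather than k-1 for descriptor i
\log \frac{P_{nik}}{P_{ni(k-1)}} = \theta_n - \delta_i - \tau_k
% theta_n : position of student n on the latent proficiency continuum
% delta_i : calibrated difficulty of descriptor i (the scaled value
%           used to set cut-offs between proficiency levels)
% tau_k   : threshold between adjacent rating categories k-1 and k

% Phase IV (many-facet extension): an extra facet lambda_j captures
% the severity of expert judge j
\log \frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \delta_i - \lambda_j - \tau_k
```

On this reading, the delta values place all descriptors on a common logit scale, and separating judge severity (lambda) from descriptor difficulty ensures that level assignments are not confounded by unusually harsh or lenient experts.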

4.5 Conclusion


This chapter has focused on the critical issues that should be taken into account when developing a language proficiency scale. A review of the features of ESP assessment demonstrates that, despite the considerable overlap between EGP and ESP, the former places priority on students' general mastery of the English language while the latter aims to enable learners to perform language tasks in more specified contexts. Such a performance-oriented approach stressing the characteristics of language use in a specified context highlights three key features of ESP ability: (a) authenticity, (b) specificity, and (c) recognition of the intricate relationship between language ability and background knowledge. These distinguishing features of ESP are attended to in developing the BE writing scale, in addition to the general principles of scale development (i.e. objectivity, consistency, definiteness of facts, comparability and a defined zero).

We also heeded the methods commonly employed to develop language proficiency scales. These methods can be approached from two perspectives: (a) the inclusion or absence of empirical data, and (b) performance-based versus descriptor-based methods of scale construction. A descriptor-oriented scale development approach was adopted in this study, drawing on an MMR design encompassing a range of intuitive, quantitative and qualitative data collection methods. Specifically, the study was implemented across four phases, aimed at (1) establishing an initial descriptive scheme and collecting or drafting descriptors, (2) qualitatively validating the appropriateness of the descriptive scheme and descriptors to the Chinese tertiary context, (3) scaling the descriptors via Rasch analysis, and (4) validating the scale by eliciting qualitative and quantitative data from a group of BE experts in the pedagogical and occupational domains.

References

Alderson, J. C. (1991). Bands and scores. In J. C. Alderson & B. North (Eds.), Language testing in the 1990s (pp. 71–86). Macmillan.
Alderson, J. C., & Urquhart, A. H. (1988). This test is unfair: I'm not an economist. In P. Carrell, J. Devine, & D. Eskey (Eds.), Interactive approaches to second language reading (pp. 168–182). Cambridge University Press.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F., & Palmer, A. S. (1983). Oral interview test of communicative proficiency in English. University of Illinois.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.
Basturkmen, H. (2006). Ideas and options in English for specific purposes. Lawrence Erlbaum.
Basturkmen, H., & Elder, C. (2004). The practice of LSP. In A. Davies & C. Elder (Eds.), Handbook of applied linguistics (pp. 672–694). Blackwell.
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences. Erlbaum.
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences. Routledge.


Brindley, G. (1991). Defining language ability: The criteria for criteria. In L. Anivan (Ed.), Current developments in language testing (pp. 139–164). SEAMEO/RELC (Anthology Series 25).
Bruce, I. (2008). Academic writing and genre. Continuum.
Cameron, R. (1998). Language-focused needs analysis for ESL-speaking nursing students in class and clinic. Foreign Language Annals, 31, 203–218.
Chalhoub-Deville, M. (1995). A contextualized approach to describing oral proficiency. Language Learning, 45(2), 251–281.
Clapham, C. M. (1996). The development of IELTS: A study of the effect of background knowledge on reading comprehension. Cambridge University Press.
Corbin, J., & Strauss, A. (2008). Basics of qualitative research: Techniques and procedures for developing grounded theory (3rd ed.). Sage.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34, 213–238.
Cumming, A., Kantor, R., & Powers, D. E. (2001). Scoring TOEFL essays and TOEFL 2000 prototype writing tasks: An investigation into raters' decision making and development of a preliminary analytic framework (TOEFL Monograph Series 22). Educational Testing Service.
De Jong, H. A. L. (1990). Response to Masters: Linguistic theory and psychometric models. In H. A. L. De Jong & D. K. Stevenson (Eds.), Individualizing the assessment of language abilities (pp. 71–82). Multilingual Matters.
Douglas, D. (2000). Assessing language for specific purposes. Cambridge University Press.
Douglas, D. (2001). Language for specific purposes assessment criteria: Where do they come from? Language Testing, 18(2), 171–185.
Elder, C. (2016). Exploring the limits of authenticity in LSP testing: The case of a specific-purpose language test for health professionals. Language Testing, 33(2), 147–152.
Elder, C., & McNamara, T. (2016). The hunt for "indigenous criteria" in assessing communication in the physiotherapy workplace. Language Testing, 33(2), 153–174.
Engelhard, G. (1992). Historical views of invariance: Evidence from the measurement theories of Thorndike, Thurstone, and Rasch. Educational and Psychological Measurement, 52(2), 275–291.
Jang, E. E., Wagner, M., & Park, G. (2014). Mixed methods research in language testing and assessment. Annual Review of Applied Linguistics, 34, 123–153.
Fan, J., & Bond, T. (2019). Unidimensionality and local independence. In V. Aryadoust & M. Rachelle (Eds.), Quantitative data analysis for language assessment (Volume I): Fundamental techniques (pp. 83–102). Routledge.
Flowerdew, J., & Peacock, M. (2001). Issues in EAP: A preliminary perspective. In J. Flowerdew & M. Peacock (Eds.), Research perspectives on English for academic purposes (pp. 8–24). Cambridge University Press.
Fox, J., Pychyl, T., & Zumbo, B. (1997). An investigation of background knowledge in the assessment of language proficiency. In A. Huhta, V. Kohonen, L. Kurki-Suonio, & S. Luoma (Eds.), Current developments and alternatives in language assessment (pp. 367–383). University of Jyväskylä.
Fulcher, G. (1987). Tests of oral performance: The need for data-based criteria. English Language Teaching Journal, 41(4), 287–291.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing, 13, 208–238.
Fulcher, G. (2000). The "communicative" legacy in language testing. System, 28, 483–497.
Fulcher, G., Davidson, F., & Kemp, J. (2011). Effective rating scale development for speaking tests: Performance decision trees. Language Testing, 28, 5–30.
Galaczi, E., French, A., Hubbard, C., & Green, A. (2011). Developing assessment scales for large scale speaking tests: A multiple-method approach. Assessment in Education: Principles, Policy & Practice, 18(3), 217–237.
Greene, J. C. (2007). Mixed methods in social inquiry. Jossey-Bass.


Henning, G. (1984). Advantages of latent trait measurement in language testing. Language Testing, 1(2), 123–133.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27, 4–21.
Jacoby, S., & McNamara, T. (1999). Locating competence. English for Specific Purposes, 18(3), 213–241.
Jang, E. E. (2013). Mixed methods research in SLA. In P. Robinson (Ed.), The Routledge encyclopedia of SLA (pp. 429–431). Routledge.
Knoch, U. (2009). Collaborating with ESP stakeholders in rating scale validation: The case of the ICAO rating scale. Spaan Fellow Working Papers in Second or Foreign Language Assessment, 7, 21–46.
Lazaraton, A. (2002). A qualitative approach to the validation of oral language tests. Cambridge University Press.
Liu, J. (2019). China's standards of English language ability. Foreign Languages in China, 16(3), 10–12.
Lynch, B. K., & McNamara, T. F. (1998). Using G-theory and many-facet Rasch measurement in the development of performance assessments of the ESL speaking skills of immigrants. Language Testing, 15(2), 158–180.
Marston, J., & Hansen, A. (1985). Clinically speaking: ESP for refugee nursing students. MinneTESOL Journal, 5, 29–52.
McNamara, T. (1996). Measuring second language performance. Addison Wesley Longman.
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29(4), 555–576.
McNamara, T., Knoch, U., & Fan, J. (2019). Fairness, justice, and language assessment: The role of measurement. Oxford University Press.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan.
Nelson, M. (2006). Semantic associations in Business English: A corpus-based analysis. English for Specific Purposes, 25, 217–234.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
North, B. (2002). Developing descriptor scales of language proficiency for the CEF common reference levels. In J. C. Alderson (Ed.), Common European framework of reference for languages: Learning, teaching, assessment: Case studies (pp. 87–105). Council of Europe.
North, B., & Schneider, G. (1998). Scaling descriptors for language proficiency scales. Language Testing, 15(2), 217–263.
O'Hagan, S., Pill, J., & Zhang, J. (2016). Extending the scope of speaking assessment criteria in a specific-purpose language test: Operationalizing a health professional perspective. Language Testing, 33(2), 195–216.
Paltridge, B. (2009). Afterword: Where have we come from and where are we now? In D. Belcher (Ed.), English for specific purposes in theory and practice (pp. 289–296). University of Michigan Press.
Parkinson, J. (2013). English for science and technology. In B. Paltridge & S. Starfield (Eds.), The handbook of English for specific purposes (pp. 155–174). John Wiley & Sons.
Robinson, P. (1980). ESP (English for specific purposes). Pergamon Press.
Sawaki, Y. (2007). Construct validation of analytic rating scales in a speaking assessment: Reporting a score profile and a composite. Language Testing, 24(3), 355–390.
Shaw, S., & Weir, C. (2007). Examining writing: Research and practice in assessing second language writing. Cambridge University Press.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Tardy, C. M. (2009). Building genre knowledge. Parlor Press.
Thorndike, E. L. (1904). An introduction to the theory of mental and social measurements. Teachers College, Columbia University.


Turner, C. E., & Upshur, J. (2002). Rating scales derived from student samples: Effects of the scale marker and the student sample on scale content and student scores. TESOL Quarterly, 36(1), 49–70.
Upshur, J. A., & Turner, C. E. (1995). Constructing rating scales for second language tests. ELT Journal, 49(1), 3–12.
Ward, J. (2009). A basic engineering English word list for less proficient foundation engineering undergraduates. English for Specific Purposes, 28, 170–182.
Widdowson, H. (1983). Learning purpose and language use. Oxford University Press.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. MESA Press.
Zhang, Z. C. (2005). Business English: A discourse approach. University of International Business and Economics Press.

Chapter 5

The Establishment of a Descriptive Scheme for the Business English Writing Scale

This chapter presents how a descriptive scheme for the BE writing proficiency scale was established. According to North (2000), the descriptive scheme of a scale should have both theory-based and activity-based categories so as to enhance its theoretical rigor and practical utility. As illustrated in Chap. 3, BE writing proficiency was conceptualized from the perspective of genre, and the Model of Genre Knowledge (Tardy, 2009) was adopted as the theoretical framework of the BE writing scale, on the basis of which several categories in the descriptive scheme were determined. In addition to these theory-informed descriptive categories, activity-based categories named after typical BE written genres were also identified and incorporated into the descriptive scheme. Along with the establishment of the descriptive scheme, a pool of descriptors was collected by consulting sources such as existing language scales, BE tests and curricular requirements. The descriptors were tentatively placed under the categories of the descriptive scheme of the BE writing scale under construction. For the descriptive categories where no existing descriptors were available, new ones were drafted by portraying the salient performance characteristics associated with different proficiency levels.

© Springer Nature Singapore Pte Ltd. 2021. L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach. https://doi.org/10.1007/978-981-16-5449-7_5

5.1 Research Procedures

When a proficiency scale is intended to be used as a common point of reference across teaching, learning and assessment contexts, its descriptive scheme is expected to contain multilayered descriptive categories that fall into two major types: theory-based categories, with a view to enhancing the scale's theoretical value, and activity-based categories, aiming to enhance the scale's practical utility (North, 2000). Due to the fundamental role that genres play in both the pedagogical and professional contexts associated with BE writing, the Model of Genre Knowledge proposed


by Tardy (2009), the first serious attempt to theorize how L2 writers' expertise in genre writing develops, was adopted as the framework for developing the theory-based categories of the BE writing scale. Specifically, a review of the literature concerning the Model of Genre Knowledge was conducted. The four dimensions underlying the model, namely Formal knowledge, Process knowledge, Rhetorical knowledge, and Subject-matter knowledge, were scrutinized before being operationalized as the descriptive categories of the BE writing scale. According to North (2000), using familiar terms as scale categories is conducive to users' interpretation and acceptance of a new scale. Therefore, widely recognized terms used in models of language proficiency or in language scales were used as category names in the new scale, where appropriate. The resultant scale categories were all put under the heading of "Genre Knowledge", the first major category of the descriptive scheme of the BE writing scale.

In addition to the theory-informed descriptive categories, activity-oriented descriptive categories were pinpointed by means of a systematic review of documents pertinent to the language tasks that Chinese BE learners are likely to encounter in the target language use (TLU) domain. Specifically, BE textbooks, curricular descriptions, and BE tests were examined. In addition, given the possible gap between pedagogical settings and workplace realities (Bremner, 2010), empirical studies exploring the writing needs and activities emphasized by the business community were also reviewed. The findings were then synthesized and translated into scale categories, which were put under the heading of "BE Writing Activities", the second major category of the descriptive scheme of the BE writing scale.
While the literature and documents consulted for the establishment of the descriptive scheme were being examined, descriptions of BE writing proficiency were simultaneously collected and tentatively placed under the descriptive categories by the researchers. For categories where no existing descriptors were available, new descriptors were formulated. In the following sections, the scale categories, along with exemplar descriptors, are presented to facilitate readers' understanding of the scale development process.

5.2 The Establishment of a BE Writing Descriptive Scheme

5.2.1 Theory-Based Descriptive Categories

According to Davies (1990), defining the construct to be assessed has an important bearing on the decisions to be made on the basis of the assessment results. Given the ESP nature of BE writing proficiency, traditional terms such as competence and performance (Chomsky, 1965) or communicative competence (Bachman & Palmer, 1996) were not adopted in this study because they are too general to pin down exactly what


makes it possible for expert professionals to manipulate a broad range of sophisticated disciplinary genres (Bhatia, 2004). Recent developments in ESP (both EAP and EOP) studies demand a more precise specification of BE writing ability. In light of the fundamental role played by genre in the teaching, learning and assessment of BE writing, as elaborated in Chap. 3, BE writing ability in the current study is operationalized as "the knowledge and skills required to produce written genres valued by the business community". Concomitantly, the Model of Genre Knowledge proposed by Tardy (2009) was adopted as the theoretical basis for the descriptive scheme of the BE writing scale under development. The dimensions in this model, namely Formal knowledge, Process knowledge, Rhetorical knowledge, and Subject-matter knowledge, were conceptualized as the four basic categories of the descriptive scheme against which BE writing proficiency was benchmarked. Most of these categories were further operationalized as several subcategories, depicting in greater detail the aspects of proficiency involved in the production of BE written genres. Detailed information on how these multi-layered categories were pinpointed, along with their exemplar descriptors, is presented in Sects. 5.2.1.1, 5.2.1.2, 5.2.1.3 and 5.2.1.4.

5.2.1.1 Descriptive Categories of Formal Knowledge

According to Tardy (2009), Formal knowledge is concerned with the prototypical form and textual instantiation of a genre. Specifically, it entails knowledge of the lexico-grammatical rules, discoursal conventions, and structural moves of a genre, as well as the capacity to manipulate these features to produce acceptable written texts. Compared with the other dimensions contained in the Model of Genre Knowledge, the Formal knowledge dimension tends to be prioritized in instruction as well as in assessment contexts (Tardy, 2006).

As Tardy (2009) did not specify the exact parameters included in this dimension, the rating criteria adopted for the writing sections of five renowned international BE tests (i.e. BEC, BULATS, TOEIC, English for Business Communication, English for Commerce) were reviewed and analyzed in this study to pinpoint the specific subcategories that fall under the parameter of Formal knowledge. The rationale underlying this decision was that assessment criteria in high-stakes tests usually reflect important aspects of writing performance which tend to be addressed in language classrooms. Moreover, relating the new scale to rating scale categories already applied in the BE testing community could also help enhance its adoption rate (North, 2000). The process of analysis yielded five descriptive categories: Vocabulary, Genre format, Grammar, Cohesion and coherence, and Orthographical control. As summarized in Table 5.1, the scale category "Vocabulary" was derived from the rating criteria "range & accuracy of vocabulary" (BEC), "accuracy & appropriacy of language"¹

¹ Accuracy and appropriacy refer to different aspects of language use, with the former falling under the vocabulary category and the latter under the sociolinguistic aspect, which will be discussed in Sect. 5.2.1.3.


Table 5.1 Writing assessment criteria adopted by BE tests

| Category | BEC | BULATS | TOEIC | English for Business Communication | English for Commerce |
|---|---|---|---|---|---|
| Vocabulary | Range and accuracy of vocabulary | Accuracy and appropriacy of language | Vocabulary | N/A | N/A |
| Genre format | N/A | N/A | N/A | N/A | Length and format |
| Grammar | Range and accuracy of grammatical structures | N/A | Quality and variety of sentences | Grammar | Grammar |
| Cohesion and coherence | Organization and cohesion | Organization of ideas | Organization | N/A | N/A |
| Orthographical control | N/A | N/A | N/A | Spelling and punctuation | Accuracy in spelling; punctuation |
| Others^a | Content; effect on the target reader | Task achievement | N/A | Tone and fluency | Tone; style; appropriacy of content; clarity and suitability of communication |

Notes: ^a Factors under this category fall outside the dimension of Formal knowledge and hence will not be discussed in this section. N/A = not available.

(BULATS) and "vocabulary" (TOEIC), which all deal with the lexical aspect of language use. In addition, "range & accuracy of grammatical structures" (BEC), "quality & variety of sentences" (TOEIC), and "grammar" (English for Business Communication, English for Commerce) are concerned with the syntactical aspect of language use, so the scale category "Grammar" was determined. Emphasis on "length and format", an element in the rating scale of English for Commerce, is reflected in the category "Genre format". The fourth scale category was termed "Cohesion & coherence" to mirror the rating criteria "organization & cohesion" in BEC, "organization of ideas" in BULATS and "organization" in TOEIC. Finally, "spelling & punctuation", emphasized by English for Business Communication, and "accuracy in spelling; punctuation", stressed by English for Commerce, are reflected in the scale category "Orthographical control". The other rating criteria included in the rating scales of these BE tests, such as "content", "task achievement" and "tone", were considered either too task-specific or as entailing elements beyond what is covered by the dimension of Formal knowledge, and were therefore not included in the categories at this stage.


With the descriptive categories established, descriptors were collected or formulated to portray BE writing proficiency in these aspects. Exemplar descriptors for each scale category are presented below to shed light on the process.

Vocabulary: As illustrated in Chap. 2, existing language scales mainly approach the category of "Vocabulary" from the EGP perspective, with few descriptors available for describing proficiency in BE vocabulary. Empirical studies, however, reveal that specialized or technical vocabulary is the most salient area marking the difference between ESP and EGP, as ESP writers have very distinct lexical needs in terms of their community practices, their discourses and their types of documentation (e.g., Li et al., 2015; Nelson, 2000; Northcott & Brown, 2006). Therefore, in this study, new descriptors under the sub-category of Vocabulary were formulated by highlighting the unique features of the BE lexicon. According to Horner and Strutt's (2004) taxonomy of ESP lexis, BE vocabulary can be put under four categories, with the level of specificity increasing as one moves from category 1 (words whose meanings are the same in the business field and elsewhere, but which are used more commonly in the business field) to category 4 (words that rarely occur outside the business domain). The exact definitions of the four categories of words and their corresponding examples are presented in Table 5.2. Arguably, it is critically important for BE writers to demonstrate mastery of all four types of lexical resources so that they can be recognized as professional members of the business discourse community. The third category (words used both inside and outside the business field, but with a radical change in meaning), in particular, is highlighted in BE classrooms in China (Zhou, 2013).
The descriptor "Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when they are used in the non-business context" was thus formulated to reflect this salient feature of the business lexicon. Another prominent feature of ESP vocabulary is its emphasis on the use of lexical resources which may be alien even to non-specialist native English speakers (Cao, 2008; Dudley-Evans & St John, 1998). Such lexical resources, including the nomenclature of business and its acronyms or abbreviations, are not only necessary but also desired for precise and accurate communication among insiders of academic, technical or vocational fields (Douglas, 2001). As this lexical feature also applies to the domain of BE, the descriptor "Demonstrates familiarity with common acronyms and abbreviations used in business documents such as HR and SWOT" was drafted.

Table 5.2 Categories of business lexis

| Category | Examples |
|---|---|
| The word rarely if ever occurs outside the business domain | Dead cat bounce, bill of lading, cost–benefit analysis |
| The word is used both inside and outside the business field, but with a radical change in meaning | Bull (on the stock market), interest (as in rate of interest), chair(person) |
| The word is used both inside and outside the business field; the specialized meaning can be inferred through its meaning outside the business field | Option (to buy or sell shares), drain (on resources), margin (profit margin) |
| The word is more common in the business field than elsewhere (there is little or no specialization of meaning, though a BE specialist would have a more in-depth idea of its meaning) | Bank, discount, credit |

Adapted from Horner and Strutt (2004, p. 6)

Genre Format: Essentially, genre is a term for grouping texts according to their similarities, representing how writers typically use language to respond to recurring situations in familiar contexts (Hyland, 2004). The most important aspect of genre lies in its highly recognizable and sufficiently standardized format, which is enabled by collective observation of sets of mutually accessible conventions shared within professional, academic or institutional organizations (Swales, 1990). Internalizing these generic conventions significantly facilitates new members' access to the target discourse community. The ESP approach to genre highlights the analysis of the "move", a "bounded communicative act that is designed to achieve one main communicative objective", as well as the linguistic choices undergirding a particular move (Swales & Feak, 2004, p. 35). Using sales letters as an example, Bhatia (1993) demonstrated how each move in a genre is purposefully created and contributes to the realization of the genre's overall communicative goal. As illustrated in Table 5.3, the writing of a sales letter can be approached through seven consecutive moves, namely "establishing credentials", "introducing the offer", "offering incentives", "enclosing documents", "soliciting response", "using pressure tactics" and "ending politely", each associated with a particular communicative purpose realized through the use of conventional expressions.

Table 5.3 Structural description of a sales letter

| Move | Purpose | Sample expressions |
|---|---|---|
| 1. Establishing credentials | To convince the reader of the company's reputation | We are expertly aware that financial managers need confidence to … |
| 2. Introducing the offer | To describe the offer and its value to the reader | We now provide a training course |
| 3. Offering incentives | To create interest through a promotional offer | There is a discount if six members enroll |
| 4. Enclosing documents | To provide information and a way to respond | I enclose a reservation form |
| 5. Soliciting response | To encourage further communication | Please call if you have any questions |
| 6. Using pressure tactics | To encourage a quick decision and response | Places are limited so act quickly |
| 7. Ending politely | To maintain cordial relations with the reader | Thank you for your kind attention |

Adapted from Bhatia (1993, pp. 46–56)

In a similar vein, Swales (1990) also stressed the importance of sticking to the conventionalized generic structure and format of a genre. He added, importantly, that not all the moves that might appear in a genre are obligatory: some moves can be removed as long as the


overall effectiveness of communication remains intact. It is therefore critical for BE learners to be familiar with the conventions associated with genre format, an area which, unfortunately, is rarely addressed in currently available language scales. A new descriptor was thus drafted to reflect this parameter of BE writing competence: "Is aware of the essential and optional components of different business genres and can make appropriate selections in response to a variety of situations".

Grammar, Cohesion & coherence, Orthographical control: Unlike the subcategory of Genre format, the other three subcategories in the dimension of Formal knowledge, namely Grammar, Cohesion and coherence, and Orthographical control, are dealt with quite elaborately by existing scales of language proficiency. It was therefore decided to adapt descriptors for these categories from available scale sources such as the CEFR. Some descriptors were adapted by highlighting the language use context of the business domain. The descriptor "Can create coherent and cohesive business text, making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices" in the category of "Cohesion & coherence", for instance, was adapted from the CEFR descriptor "Can create coherent and cohesive text making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices" (CEFR, Coherence and Cohesion, C2) by changing "text" into "business text". For the formulation of other descriptors, findings of empirical studies on the features of business language were consulted. Take the scale category of "Grammar" as an example. Research indicates that, for the sake of achieving their communicative purposes, effective business writers may prioritize the use of certain syntactical structures over others (e.g., Nelson, 2000; Nickerson, 2005; Yeung, 2007).
For instance, Yeung (2007) examined the commonalities of business reports and found that nominalization, rational argument, and impersonality were syntactic preferences of effective writers of BE reports. Business e-mails, by contrast, are often characterized by the use of syntactically simple and succinct structures (Jensen, 2009). These findings highlight the importance for BE writers of adjusting their grammatical resources to the writing task at hand. A descriptor was thus formulated as "Can employ a full range of grammatical structures with full flexibility and accuracy. For instance, can make skillful grammatical choices to enhance the overall communicative effectiveness of a business text", enriching the ISLPR descriptor "Employs a full range of structures" (ISLPR, Writing 3, general professional proficiency) by providing business-related examples.

Besides research findings concerning the features of business language, studies examining Chinese students' BE writing performance were also reviewed to capture proficiency descriptions relevant to the Chinese EFL context. Zhang (2017), for instance, found that Chinese students often make mistakes in their use of punctuation. The descriptor "Punctuation is reasonably accurate" was thus drafted, drawing also on the CEFR descriptor "Spelling and punctuation are reasonably accurate but may show signs of mother tongue influence" (CEFR, Orthographical Control, B2).
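The move-based view of genre presented under "Genre Format" above treats a genre as an ordered sequence of obligatory and optional communicative acts. As a purely illustrative aside, that idea can be captured in a few lines of code. The move inventory below follows Bhatia's (1993) sales letter quoted earlier, but the obligatory/optional flags and all names in the sketch are hypothetical, chosen only to make the idea concrete:

```python
# Illustrative sketch only: modeling an ESP genre as an ordered list of
# moves (Swales, 1990; Bhatia, 1993). The obligatory/optional flags are
# hypothetical assumptions, not taken from the study.
from dataclasses import dataclass

@dataclass(frozen=True)
class Move:
    name: str
    obligatory: bool

# Bhatia's (1993) seven moves for a sales letter, in conventional order.
SALES_LETTER = [
    Move("establishing credentials", True),
    Move("introducing the offer", True),
    Move("offering incentives", False),
    Move("enclosing documents", False),
    Move("soliciting response", True),
    Move("using pressure tactics", False),
    Move("ending politely", True),
]

def realizes_genre(draft_moves, genre=SALES_LETTER):
    """True if draft_moves follow the genre's conventional move order
    and include every obligatory move."""
    order = [m.name for m in genre]
    pos = 0
    for name in draft_moves:
        try:
            # each move must occur at or after the previous move's slot
            pos = order.index(name, pos) + 1
        except ValueError:
            return False  # unknown move, repeated move, or out of order
    required = {m.name for m in genre if m.obligatory}
    return required <= set(draft_moves)
```

A checker of this kind mirrors Swales's (1990) point quoted above: a draft may freely omit optional moves, but it fails the genre test if an obligatory move is missing or the conventional order is violated.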

5.2.1.2 Process Knowledge

Process knowledge encompasses the process(es), or all the procedural practices, that users of a given genre go through in order to complete their intended action or to carry out the genre (Tardy, 2006). These processes include endeavors that "might facilitate the actual composing processes that aid the writer in text completion" (Tardy, 2009, p. 21). This dimension clearly stresses the writers' ability to mobilize and balance their resources and to activate skills in order to meet communicative demands. It is hence reminiscent of strategic competence in traditional linguistic terms, which has been widely regarded as a general ability, be it psycholinguistic or interactional, as it "enables an individual to make the most effective use of abilities in carrying out a given task" (Bachman, 1990, p. 106). Process knowledge in the current study was hence operationalized as Strategic competence, which involves the ability to marshal and balance various resources (exploiting strengths as well as underplaying weaknesses) in order to maximize communicative effectiveness when performing a genre. Despite the central position attributed to strategic competence in many linguistic models (e.g., Bachman & Palmer, 1996; Canale & Swain, 1980), "much remains unclear in regard to the exact nature of strategic competence" (Phakiti, 2008, p. 238). There is considerable disagreement as to whether strategies should be considered a particular type of psycholinguistic process (Selinker, 1972), a particular type of psycholinguistic plan (Faerch & Kasper, 1983), or a particular type of interactional process (Tarone, 1980). The picture is further complicated by the co-existence of a plethora of taxonomies, such as learning strategies (learning to learn) and communication strategies (an aspect of proficiency). Perhaps as a result of the confusion surrounding the definitions of strategy, few existing scales of language proficiency treat strategic competence as a separate category.
For instance, among the 42 language proficiency scales surveyed by North (1994), only two attempted explicit descriptions of strategic competence, and their treatment of this category was largely confined to strategies involved in spoken interaction. As a recent contribution to the literature on language scale development, the CEFR stood out by incorporating illustrative scales that tapped into the area of strategy use. However, scrutiny of the CEFR descriptors reveals that they are mostly related to oral communication, as shown in descriptors such as "Can use an inadequate word from his/her repertoire and use gesture to clarify what he/she wants to say (CEFR, Compensating, A2)", "Can plan what is to be said and the means to say it, considering the effect on the recipient/s (CEFR, Planning, B2)", and "Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of speech (CEFR, Monitoring and Repair, C1)". Apart from reviewing existing scales, other sources that might contain descriptions of BE writing strategies, such as BE writing courses, textbooks, and empirical studies examining BE writing, were also explored to create descriptors for the dimension of Process knowledge. The underlying principle was to formulate descriptors that accentuate the learners' ability to adroitly regulate available resources, linguistic or non-linguistic, to maximize communicative effectiveness.


In the business domain, a message conveyed via the written medium is usually made up of essential parts and less essential ones (Bhatia, 2004). For example, in bad-news letters, the sentence that actually gives the bad news is critical; in many sales letters, the sentence that gives the price is critical; and in persuasive letters, the sentence that actually makes the request is critical (Ober, 1992). It is therefore of paramount importance for BE writers to develop the strategic competence of highlighting the critical parts of their messages. This feature of BE writing informed the formulation of the following descriptor: "Can design and present information in a way that the most important messages are highlighted". In a similar vein, Xian (2009), in his textbook on teaching BE writing to Chinese university students, advocated the strategic presentation of messages in business written communication, as "attractively designed and presented messages give an impression of competence and care, and they build credibility for the writer" (p. 17). Factors affecting the effective delivery of the message include the choice of font, size variations, underlining, and italics, as well as the use of graphics. This pedagogical advice was then utilized to create a new descriptor for the scale category: "Can make proper use of such visual aids as bar graphs, tables, and line charts to complement or reinforce information loaded in words". It should be noted at this juncture that there are fewer sub-categories in the Process dimension than in the dimension of Formal knowledge, which might be attributable to the elusive nature of the Process dimension that renders it "less easy to capture" (Zhang, 2013, p. 151). As such, empirical studies conducted in the tertiary context of China were also examined to draw insights for descriptor formulation.
Zhang (2013), for instance, has shown that referential intertextuality, which mainly concerns patterns of direct reference to other texts within a given text (Devitt, 1991), is an important linguistic realization of one’s Process knowledge. Given the importance of intertextuality in business communication (Cheng & Mok, 2008; Flowerdew & Wan, 2006), the following descriptor was thus created: “Can locate and incorporate available materials to enhance the overall effectiveness of the writing. For instance, can make appropriate use of quotations from an authorized organization to enhance the effectiveness of the writing without losing its apparent objectivity”.

5.2.1.3 Rhetorical Knowledge

According to Tardy (2009), the dimension of Rhetorical knowledge is concerned with a writer’s knowledge of the genre’s intended purposes as well as the power relationship between the writer and the reader enacted by the genre. As genre learning and writing oftentimes involve people whose prior experiences are situated within various socio-cultural contexts, Rhetorical knowledge also entails cultural knowledge, an aspect that may pose particular challenges for L2 writers (Leki, 2003). Seen in this light, the dimension of Rhetorical knowledge contained both linguistic and
non-linguistic components. The linguistic element emphasized the social dimension of language use, such as the reader-writer relationship, and was operationalized as Sociolinguistic competence in the study; the non-linguistic element accentuated intercultural knowledge and was operationalized as Intercultural competence accordingly. Sociolinguistic competence: Sociolinguistic competence refers to the proper use of linguistic codes in a social context and emphasizes the use of language in accordance with social rules, that is, the norms, values, beliefs and behavior patterns expected by society (Canale & Swain, 1980). Underscoring language use in a specific context, sociolinguistic competence stresses important variables affecting communication such as contextual variation, the relationship between the communicators, and the purpose of the interaction (Hedge, 2000). As the first empirically developed scale dealing with sociolinguistic competence, the CEFR approached the description of this parameter from five aspects: "linguistic markers of social relations, politeness conventions, expressions of folk-wisdom, register differences, and dialect and accent" (Council of Europe, 2001, p. 118). The CEFR's experience was then drawn on, and its descriptor "Can express him or herself confidently, clearly and politely in a formal or informal register, appropriate to the situation and person(s) concerned (CEFR, Sociolinguistic appropriateness, B2)" was adapted as "Can express him or herself confidently, clearly and politely in a formal or informal business register, appropriate to the purposes and audience(s) concerned" by narrowing the language use context described in the descriptor to the business domain.
Apart from consulting language scales, studies germane to BE writing (e.g., Bremner, 2006; Zhang, 2007) were also reviewed to identify the linguistic realizations of sociolinguistic issues such as politeness conventions, sensitivity to register, and norms of stylistic appropriateness in international business settings. Zhang (2013), for instance, investigated how university BE students' writing was perceived by international business practitioners, using Tardy's (2009) model of Genre Knowledge. He reported that BE students should be purposefully trained in two aspects under the rhetorical dimension of genre knowledge. The first salient aspect concerned the ability to convey information in a positive manner so as to avoid negative feelings. This feature of BE writing inspired the descriptor "Is aware of the salient politeness conventions and acts appropriately. For instance, can keep discriminatory or derogatory language (e.g., sexist salutation) out of one's business texts". The second aspect of sociolinguistic competence in business settings stressed the ability to choose an appropriate tone of communication in accordance with the reader-writer relationship, which served as a basis for the formulation of the descriptor "Shows a high level of appreciation of the register. For instance, can adopt a conversational tone where the situation desires informal communication". Intercultural competence: Intercultural competence refers to the ability to function effectively and appropriately in intercultural communication contexts by drawing on one's intercultural knowledge, skills, and attitudes (Deardorff, 2006). As more and more business transactions take place across national boundaries, business people are expected to demonstrate intercultural competence so as to carry out smooth intercultural business communication. Language teaching, as a result, is
attaching greater importance to intercultural competence so as to prepare L2 learners for the increasingly globalized business world. In the Chinese tertiary context, for instance, a host of language programs, courses and textbooks have been developed to train students to be competent intercultural communicators (e.g., Hu, 2004; Schulz, 2007). Numerous studies have been conducted to pinpoint the components constituting intercultural competence (e.g., Deardorff, 2006; Jiao et al., 2020), and it is generally acknowledged that intercultural competence is a multidimensional construct involving three basic dimensions: cognitive, behavioral, and motivational or affective, though different terminologies are used (Jiao et al., 2020). Chen and Starosta (1998), for instance, provided a model of intercultural competence focusing on the affective, cognitive and behavioral dimensions. Affectively speaking, an effective interculturalist possesses a positive self-concept and is open-minded, non-judgmental and relaxed in cross-cultural interactions. Cognitively speaking, the effective interculturalist shows cultural awareness and knowledge of how culture might affect language use in intercultural communication. Behaviorally speaking, the effective interculturalist demonstrates behavioral flexibility and adroit social skills, both verbal and nonverbal. Examining what intercultural competence entails in the business domain, Xu and Jiao (2017) posit that the cognitive dimension involves business practitioners' knowledge of areas such as cultures and different business practices. The behavioral dimension includes the skills to establish rapport with people from other cultures and to handle international business conflicts. The affective dimension is manifested mainly as open-mindedness, the ability to get rid of negative emotions caused by ethnocentrism and stereotypes, and tolerance for cultural differences.
It is important to note that though introduced as separate entities, the three dimensions seldom work in isolation in actual intercultural business communication; rather, they function as an integrated and coordinated competence system that produces a synergistic effect (Jiao et al., 2020). Descriptors under the scale category entitled Intercultural competence were then written by exploring descriptions associated with the cognitive, behavioral and motivational or affective dimensions of intercultural competence from various sources. For instance, the descriptor "Demonstrates comprehensive understanding of the political, economic, historical, geographical, scientific and technological, and cultural characteristics of different nations" was designed by synthesizing relevant information from the National Curriculum for BE majors in China, such as "are familiar with Chinese and Western politics, economies, and cultures" and "have fine intercultural communication abilities" (Zhang & Wang, 2011), as well as a scale descriptor "can usually discuss a variety of issues and subject matter that refer to the culture, such as history, politics, literature, and the arts (Interagency Language Roundtable Skill Level Descriptions for Competence in Intercultural Communication, Level 3: Professional Competence)". The CEFR descriptor "Is aware of, and looks out for signs of, the most significant differences between the customs, usages, attitudes, values and beliefs prevalent in the community concerned and those of his or her own (CEFR, Sociolinguistic appropriateness, B1)" was also consulted, and modifications were made to underline the integrated nature of the behavioral and affective aspects
of intercultural competence in the business domain, thus yielding the descriptor "Can consciously look out for signs of the most significant differences between the customs, attitudes, values and beliefs prevalent in those cultures and his/her own, to ensure smooth intercultural communication". In light of the importance of non-linguistic factors in intercultural competence, especially awareness of different cultural perceptions of time and space (Dou, 2007), a new descriptor, "Is aware of the effect of non-linguistic elements on intercultural business written communication, including time, space, etc.", was created from scratch. Finally, to reflect the affective dimension of intercultural competence, which highlights the control of anxiety and the ability to handle conflict confidently (e.g., Bennett, 2015; Nicole et al., 2011), the descriptor "Can overcome fear and anxiety and cope with cultural conflicts by means of understanding and analyzing cultural differences" was formulated.

5.2.1.4 Subject-Matter Knowledge

The dimension of Subject-matter knowledge refers to the disciplinary or professional knowledge required to produce a given genre (Tardy, 2006). The recognition that subject-matter knowledge is an indispensable element in genre production is consistent with the emphasis on Background knowledge in Douglas's (2000) model of ESP ability, which regards the interaction between language knowledge and specific purpose content knowledge as a unique feature of ESP proficiency. Empirically, a positive correlation between subject-matter knowledge and L2 writing performance has also been reported by a wealth of studies (e.g., He & Shi, 2012; Lee, 2004; Tedick, 1990). He and Shi (2012), for instance, investigated the effects of topical knowledge on ESL writing performance and found that all students, regardless of their proficiency levels, performed significantly better on a general writing topic than they did on an unfamiliar field-specific writing topic. Exploring business practitioners' perceptions of the quality of Chinese university students' BE written texts, Zhang (2013) also reported that the professionals "attached much importance to the subject-matter dimension of genre knowledge" and emphasized knowledge of fundamental theories, knowledge, and skills in international economics, management, and law (p. 152). Despite growing recognition of the integral role played by subject-matter knowledge in language performance, no existing language proficiency scale incorporates Subject-matter knowledge as a descriptive category. This has perhaps resulted from the fact that virtually all the scales are derived from theories of general language proficiency, wherein background knowledge is treated as a construct-irrelevant variable (Douglas, 2001). Consequently, statements describing Subject-matter knowledge, namely Business knowledge in the study, were drafted from scratch.
This task turned out to be quite challenging, partially because of the all-encompassing nature of business, and partially because of the difficulty of pinning down the appropriate level of specificity for the descriptions concerned. Nevertheless, several descriptors for this category were carefully written by (1) referring to the curricular requirements for BE learners' disciplinary knowledge in
Chinese universities, and (2) analyzing descriptions of business knowledge in BE tests. For instance, the National Curriculum developed for BE majors in Chinese universities explicitly stipulates that BE majors are supposed to acquire business-related knowledge in areas such as economics, management, international law, international finance, international marketing, human resources, financial management, principles of business operations, and information technology (Wang et al., 2015). This broad description of the business knowledge requirement was phrased as a descriptor in the study, which reads "Has a systematic knowledge of business subjects such as economics, human resources, international business management, international business law, international finance, and business ethics, etc.". Descriptions of disciplinary business knowledge in BE tests also served as a source of inspiration. When introducing the business issues that test takers are expected to master, BEC, in its Handbook for Teachers, for example, specifies that its test candidates should be able to understand and talk about "general business topics, such as management skills, promotional strategies, training courses, conferences, business services, recruitment, rewarding staff, sponsorship, business premises, company budgets, use of IT in business, etc." (UCLES, 2008, p. 4). This statement was shortened by retaining the information most relevant to pre-experience BE learners at Chinese universities, thus yielding the following descriptor: "Has a sound understanding of general business topics, such as promotional strategies, conferences, business services, recruitment, etc.".

5.2.2 Activity-Based Descriptive Categories

5.2.2.1 The Importance of Incorporating BE Writing Activities

To meet the pragmatic needs of potential scale users, descriptive categories under the heading of "BE Writing Activities" were also determined, in addition to the theory-informed categories under the heading of "Genre Knowledge" discussed above. It was believed that stating language level in terms of the typical activities that BE learners at particular levels are expected to perform with the language at their disposal would bring the following advantages:

(1) Identifying such prototypical writing activities would help instantiate, to the extent possible, a set of learning objectives for BE learners at different proficiency levels.

(2) Incorporating activity-based proficiency descriptions would help promote the authenticity of the scale by relating its content to the TLU domain, in this case written communication in the business community.

(3) Focusing on concrete activities is instrumental in providing more concrete descriptors, according to which an activity-based linguistic audit could be carried out to facilitate language training and recruitment.

(4) Describing achievements in terms of task success would render the scale more acceptable and useful to its potential users. For instance, the descriptors can be used to formulate job descriptions and specify language requirements for new positions.

Therefore, it was decided to add business writing activities to the scale's descriptive scheme. Arguably, such a balanced scale format, encompassing both theoretically informed and activity-oriented categories, would allow the scale to function more effectively as a common yardstick of reference across a wide range of teaching, learning and assessment contexts.

5.2.2.2 The Identification of Typical BE Writing Activities

With respect to the writing skill in the business world, typical activities and tasks are usually envisaged in terms of genres. In the professional domain, for instance, one's ability to effectively manipulate a range of genres in written communication with clients, colleagues, superiors, subordinates and suppliers has become increasingly important (e.g., Bhatia & Candlin, 2000; Jameson, 2006). These professional needs naturally feed back into BE classrooms, where the central goal is to prepare students to be linguistically competent before entering the international business arena, prioritizing the objective of training students to be proficient in producing the most typical types of BE written genres (e.g., Johns, 2002; Zhang, 2013). Typical BE writing activities in this study were thus determined by identifying the most important types of BE written genres that Chinese university students are supposed to master before entering the business world. As discussed in Chap. 3, genres are conceptualized in quite different ways by the three influential genre schools: the New Rhetoric (NR) School, the Systemic Functional Linguistics (SFL) School, and the ESP School. Of the three, the SFL School and the ESP School are more widely credited for their contributions to L2 writing instruction, as both aspire to help students produce appropriate written discourses in the TLU domain. Unlike SFL genrists, who consider genres linguistic strategies for achieving rhetorical goals such as "narrative", "argument" and "recount", the ESP School gives prominence to the communicative needs of particular academic and professional groups and focuses on the teaching and learning of community-specific genres, such as "research article" and "literature review" in the case of EAP, and "E-mail" and "business report" in the case of BE.
In this study, the ESP approach to genre was adopted as it provides teachers with “a powerful way of understanding the writing needs of their students” and can help students unpack the requirements of the target communities that they will operate within (Hyland, 2004, p. 43). The task of identifying typical business written genres turned out to be challenging, given the fact that a generic genre could be subdivided into various subgenres depending on the purpose of writing (Swales, 1990). Take the genre of letter as an example. It could be further divided into apology letters, appreciation letters,
congratulation letters, invitation letters, referral letters, and resignation letters, to mention just a few. Taking into account the factor of formality, letters could also be classified as formal or informal. Similarly, report is an umbrella term that encompasses the feasibility report, trip report, and progress report, among others. In this study, it was decided that only broad macro-genres (e.g., report), rather than more specific sub-genres (e.g., feasibility report, progress report), would be incorporated in the descriptive scheme, as (1) it would be practically infeasible to adopt an overly detailed categorization of genre types, and (2) commonalities do exist within each generic type (Johns, 2002). Guided by this principle, the selection of typical genres was conducted by means of document analysis, which allowed for the synthesis of a gamut of publications exploring the written genres that could be incorporated into the descriptive scheme. Specifically, available documents were carefully examined to identify the language tasks that Chinese BE learners are likely to encounter in the TLU domain. The documents reviewed fall into four major types: (1) the writing tasks included in the writing sections of 10 professionally developed BE tests in China and abroad (see Appendix); (2) popular BE writing textbooks used in China (see Appendix) as well as relevant studies summarizing the genres included in BE textbooks used both in China and internationally at the time of this study (e.g., Mo & Sun, 2010; Zhou, 2013); (3) the genre types included in BE scales (see Chap. 2 for the five scales reviewed in this book); and (4) empirical studies examining the writing needs of the international business community (e.g., Bhatia & Candlin, 2000, 2001; Taillefer, 2007; Wang et al., 2014; Yu, 2014). To ensure the representativeness of the genres selected, only genres that appeared in at least two of the four aforementioned types of documents (i.e.
BE tests, BE writing textbooks, BE scales, and empirical studies) were included. During the review process, some genre types, such as "press release", appearing in both a BE test (i.e., English for Business, Level 4) and a BE writing textbook (Zhou, 2013), turned out to be quite straightforward and were chosen directly as categories in the scale's descriptive scheme. As for the other genres, which were referred to with varying levels of specificity in the documents reviewed, subjective decisions were made to pin down the genre to be added to the descriptive scheme. For instance, "formal reports and informal reports" in a language scale (CLB) and "internal company report" in a BE test (English for Business, Level 3) were consolidated under the heading of "Report" as a scale category. The genres termed "company leaflet" in a BE test (English for Business, Level 2) and "brochure, instruction & manual" in a BE textbook (Dong, 2016) were lumped together under a newly created scale category of "Documentary materials". After several rounds of comparison and iteration, the process finally yielded ten genres: Letter, E-mail, Report, Memo, Minutes, Note, Press release, Resume, Summary and Documentary materials. All these genre-based categories were then incorporated into the scale's descriptive scheme under the heading of "BE Writing Activities". A multi-tiered descriptive scheme of the scale was hence generated, consisting of two broad categories, Genre Knowledge and Typical Writing Activities, each further divided into several sub-categories (see Fig. 5.1).

[Fig. 5.1 The initial descriptive scheme and categories. Genre Knowledge comprises Formal knowledge (Vocabulary, Grammar, Orthographical control*, Genre format, Cohesion & coherence), Process knowledge (Strategic competence), Rhetorical knowledge (Sociolinguistic competence, Intercultural competence) and Subject-matter knowledge (Business knowledge); Typical Writing Activities comprises Letter*, E-mail*, Report, Note*, Minutes, Memo, Press release, Summary, Resume and Documentary materials*. Notes: Minor revisions of the descriptive scheme were made in the later stages of the study: (1) Letter and E-mail were collapsed into one category, Letter/E-mail, at Phase II of the study; (2) the categories of Documentary materials, Press release and Note were deleted from the scheme at Phase II and Phase IV of the study respectively; (3) the heading of the category Orthographical control was revised to Mechanics in Phase IV of the study.]

This descriptive scheme, together with a pool of relevant proficiency descriptors accumulated along the review process, was subsequently validated by a group of experienced university teachers of BE writing; the details are reported in Chap. 6.
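The inclusion criterion described in this subsection (a genre enters the scheme only if it appears in at least two of the four document types reviewed) amounts to a simple frequency filter. The sketch below illustrates that rule with hypothetical observations; the genre-to-document mappings shown are invented for illustration and are not the study's actual coding records.

```python
# Sketch of the genre-selection rule: a genre is retained only if it
# appears in at least two of the four document types reviewed.
# The observations below are hypothetical, for illustration only.

MIN_SOURCES = 2  # inclusion threshold stated in the text

# document type -> genres observed in that type (illustrative data)
observations = {
    "BE tests": {"report", "press release", "letter", "memo"},
    "BE textbooks": {"report", "press release", "e-mail", "resume"},
    "BE scales": {"report", "letter", "minutes"},
    "empirical studies": {"e-mail", "letter", "summary"},
}

def select_genres(obs, threshold=MIN_SOURCES):
    """Return genres that appear in at least `threshold` document types."""
    counts = {}
    for genres in obs.values():
        for genre in genres:
            counts[genre] = counts.get(genre, 0) + 1
    return sorted(g for g, n in counts.items() if n >= threshold)

selected = select_genres(observations)
# With the data above, "report" and "letter" appear in three document
# types, "press release" and "e-mail" in two, and the rest in only one,
# so only the first four survive the filter.
```

Raising the threshold tightens the filter in the obvious way; for example, `select_genres(observations, threshold=3)` would keep only the genres attested in three document types.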

5.3 Conclusion

Guided by the principle that a scale of language proficiency should be both theoretically grounded and practically useful, this chapter has reported on how the descriptive scheme of a BE writing scale was established. Decisions on the scale categories were informed by the model of Genre Knowledge (Tardy, 2009) as well as by reference to typical genre-based writing activities, yielding a multi-tiered descriptive scheme incorporating 19 descriptive categories. Specifically, the four dimensions of the model of Genre Knowledge were operationalized as nine sub-categories of the descriptive scheme: Vocabulary, Grammar, Orthographical control, Genre format, Cohesion & coherence, Strategic competence, Sociolinguistic competence, Intercultural competence and Business knowledge. An examination of typical business written genres contributed the other ten scale categories: Letter, E-mail, Report, Memo, Minutes, Note, Press release, Resume, Summary and Documentary materials. It should be noted that the scale categories are solely conceptual artifacts, whereas ESP writing is a highly complicated process involving multiple dimensions. Distinguishing and clustering these categories is therefore not meant to suggest that they operate in isolation; rather, the purpose is to highlight the most salient aspects that might affect L2 learners' BE writing performance, so as to facilitate the teaching, learning, and assessment of BE writing.
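The 19-category scheme summarized above can be transcribed as a nested mapping, which makes the two-tier structure and the category count easy to verify. This is simply a restatement of the categories named in the chapter, not part of the study's own methodology.

```python
# The initial multi-tiered descriptive scheme, transcribed from the
# chapter: two broad categories, each divided into sub-categories.

scheme = {
    "Genre Knowledge": {
        "Formal knowledge": [
            "Vocabulary", "Grammar", "Orthographical control",
            "Genre format", "Cohesion & coherence",
        ],
        "Process knowledge": ["Strategic competence"],
        "Rhetorical knowledge": [
            "Sociolinguistic competence", "Intercultural competence",
        ],
        "Subject-matter knowledge": ["Business knowledge"],
    },
    "Typical Writing Activities": [
        "Letter", "E-mail", "Report", "Memo", "Minutes", "Note",
        "Press release", "Resume", "Summary", "Documentary materials",
    ],
}

# The chapter counts 19 descriptive categories: nine genre-knowledge
# sub-categories plus ten activity-based ones.
n_genre = sum(len(subs) for subs in scheme["Genre Knowledge"].values())
n_activities = len(scheme["Typical Writing Activities"])
assert n_genre + n_activities == 19
```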

References

Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F., & Palmer, A. (1996). Language testing in practice. Oxford University Press.
Bennett, J. M. (2015). The Sage encyclopedia of intercultural competence. Sage.
Bhatia, V. K. (1993). Analyzing genre: Language use in professional settings. Longman.
Bhatia, V. K. (2004). Worlds of written discourse: A genre-based view. Continuum International.
Bhatia, V., & Candlin, C. (2000). Multifaceted needs analysis in business education: Dimensions of professional discourse: New challenges for ESP. Paper presented at the Annual TESOL Convention, Vancouver, March 2000.
Bhatia, V., & Candlin, C. (2001). Teaching English to meet the needs of business education in Hong Kong. Center for English Language Education and Communication Research, Department of English, Faculty of Humanities and Social Sciences.
Bremner, S. (2006). Politeness, power, and activity systems: Written requests and multiple audiences in an institutional setting. Written Communication, 23(4), 397–423.
Bremner, S. (2010). Collaborative writing: Bridging the gap between the textbook and the workplace. English for Specific Purposes, 29(2), 121–132.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47.
Cao, H. J. (2008). A corpus-driven study of business English. University of International Business and Economics Press.
Chen, G.-M., & Starosta, W. J. (1998). Foundations of intercultural communication. Allyn & Bacon.
Cheng, W., & Mok, E. (2008). Discourse processes and products: Land surveyors in Hong Kong. English for Specific Purposes, 27(1), 57–73.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
Davies, A. (1990). Principles of language testing. Basil Blackwell.
Deardorff, D. K. (2006). Identification and assessment of intercultural competence as a student outcome of internationalization. Journal of Studies in International Education, 10(3), 241–266.
Devitt, A. J. (1991). Intertextuality in tax accounting: Generic, referential and functional. In C. Bazerman & J. Paradis (Eds.), Textual dynamics of the professions: Historical and contemporary studies of writing in professional communities (pp. 336–357). University of Wisconsin Press.
Dong, X. (2016). Business English writing. University of International Business and Economics Press.
Dou, W. (2007). Fundamentals of intercultural communication. University of International Business and Economics Press.
Douglas, D. (2000). Assessing language for specific purposes. Cambridge University Press.
Douglas, D. (2001). Language for specific purposes assessment criteria: Where do they come from? Language Testing, 18(2), 171–185.
Dudley-Evans, T., & St John, M. (1998). Developments in ESP: A multi-disciplinary approach. Cambridge University Press.
Faerch, C., & Kasper, G. (Eds.). (1983). Strategies in interlanguage communication. Longman.
Flowerdew, J., & Wan, A. (2006). Genre analysis of tax computation letters: How and why tax accountants write the way they do. English for Specific Purposes, 25(2), 133–153.
He, L., & Shi, L. (2012). Topical knowledge and ESL writing. Language Testing, 29(3), 443–464.
Hedge, T. (2000). Teaching and learning in the language classroom. Oxford University Press.
Horner, D., & Strutt, W. (2004). Analyzing domain-specific lexical categories: Evidence from the BEC written corpus. Cambridge Research Notes, 5, 6–8.
Hu, W. Z. (2004). Crossing cultural barriers. Foreign Language Teaching and Research Press.
Hyland, K. (2004). Genre and second language writing. University of Michigan Press.
Jameson, D. (2006). Teaching graduate business students to write clearly about technical topics. Business Communication Quarterly, 69(1), 76–81.
Jensen, A. (2009). Discourse strategies in professional e-mail negotiation: A case study. English for Specific Purposes, 28(1), 4–18.
Jiao, J., Xu, D., & Zhao, X. (2020). The development and validation of an intercultural business communication competence scale: Evidence from mainland China. SAGE Open, 1, 1–14.
Johns, A. M. (Ed.). (2002). Genre in the classroom: Multiple perspectives. Lawrence Erlbaum.
Lee, H. (2004). Constructing a field-specific writing test for an ESL placement procedure. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Leki, I. (2003). Living through college literacy: Nursing in a second language. Written Communication, 20(1), 81–98.
Li, C. F., Yu, J., & Lin, J. (2015). Business English lexis: Description, identification and teaching of business English vocabulary—Based on frequency data of business English vocabulary collected on the Fortune website. Foreign Language World, 2, 83–96.
Mo, Z. S., & Sun, W. J. (2010). A study of current business English teaching materials. Foreign Language Education, 31(5), 81–85.
Nelson, M. (2000). The Business English lexis site. Retrieved from https://users.utu.fi/micnel/business_english_lexis_site.htm
Nickerson, C. (2005). English as a lingua franca in international business contexts. English for Specific Purposes, 24(4), 367–380.
Nicole, L. G., Mary, L. T., Garth, C., & Scott, B. W. (2011). Examining intercultural growth for business students in short-term study abroad programs: Too good to be true? Journal of Teaching in International Business, 2, 91–106.
North, B. (1994). Scales of language proficiency: A survey of some existing systems. Council of Europe.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
Northcott, J., & Brown, G. (2006). Legal translator training: Partnership between teachers of English for legal purposes and legal specialists. English for Specific Purposes, 25(3), 358–375.
Ober, S. (1992). Microwriting: A guidance approach to developing business writing skills. Bulletin of the Association for Business Communication, 55(1), 42–46.
Phakiti, A. (2008). Construct validation of Bachman and Palmer's (1996) strategic competence model over time in EFL reading tests. Language Testing, 25(2), 237–272.
Schulz, R. A. (2007). The challenge of assessing intercultural understanding in the context of foreign language instruction. Foreign Language Annals, 40(1), 9–26.
Selinker, L. (1972). Interlanguage. IRAL, 10(4), 209–230.
Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge University Press.
Swales, J. M., & Feak, C. B. (2004). Academic writing for graduate students: Essential tasks and skills. University of Michigan Press.
Taillefer, G. F. (2007). The professional language needs of economics graduates: Assessment and perspectives in the French context. English for Specific Purposes, 26(2), 135–155.
Tardy, C. M. (2006). Researching first and second language genre learning: A comparative review and a look ahead. Journal of Second Language Writing, 15(2), 79–101.
Tardy, C. M. (2009). Building genre knowledge. Parlor Press.
Tarone, E. (1980). Communication strategies, foreigner talk and repair in interlanguage. Language Learning, 30(2), 417–428.
Tedick, D. J. (1990). ESL writing assessment: Subject-matter knowledge and its impact on performance. English for Specific Purposes, 9(2), 123–143.
UCLES. (2008). Business English certificates: Handbook for teachers. Retrieved from http://www.CambridgeESOL.org
Wang, Y. Y., Wang, G. L., & Zheng, L. N. (2014). A study of the needs and cultivation pattern of business English majors. Foreign Language World, 2, 34–41.
Wang, L., Ye, X., Yan, M., Peng, Q., & Xu, D. (2015). Interpretation of the key points of the national standards for business English majors. Foreign Language Teaching and Research, 47(2), 297–302.
Xian, X. (2009). An English course in business writing. Higher Education Press.
Xu, D. J., & Jiao, J. (2017). A theoretical model of intercultural business communication competence. In L. Hale, R. Q. Liang, W. Cheng, & J. Zhang (Eds.), Conference proceedings of the international conference on Business Linguistics and Intercultural Business Communication (BL-IBC) (pp. 17–24).
Yeung, E. (2007). In search of commonalities: Some linguistic and rhetorical features of business reports as a genre. English for Specific Purposes, 26, 156–179.
Yu, J. Y. (2014). A study of students' felt needs for BE courses: Implications for BE curriculum reconstruction. Foreign Language World, 2, 25–57.
Zhang, Z. C. (2007). Towards an integrated approach to teaching Business English: A Chinese experience. English for Specific Purposes, 26(4), 399–410.
Zhang, Z. C. (2013). Business English students learning to write for international business: What do international business practitioners have to say about their texts? English for Specific Purposes, 32(3), 144–156.


Zhang, Z. C. (2017). Learning business English in China: The construction of professional identity. Palgrave Macmillan.
Zhang, Z. C., & Wang, L. F. (2011). Curriculum development for business English students in China: The case of UIBE. The Asian ESP Journal, 7(1), 10–28.
Zhou, W. (2013). The study on teaching materials for business writing: Textbook design based on genre theory. Unpublished doctoral dissertation, Shanghai International Studies University.

Chapter 6

Evaluating the Quality of the Descriptive Scheme and Descriptors: Teacher Perspectives

This chapter reports on how a group of experienced BE writing teachers at Chinese universities examined the relevance to the Chinese tertiary context of the descriptive scheme developed in the first phase of the study and of a pool of descriptors accumulated by the authors. Two workshops were organized for this purpose. In Workshop I, the teachers discussed whether the categories of the descriptive scheme served as useful parameters for describing BE writing proficiency. In Workshop II, they examined whether the descriptors were informative enough to gauge the proficiency levels of BE learners in Chinese universities. The process yielded a revised descriptive scheme of 17 categories and 93 descriptors to be used in the next phase of the study.

6.1 Research Procedures

Descriptive schemes and descriptors are two essential components of a language proficiency scale (Council of Europe, 2001). As illustrated in Fig. 5.1 in Chap. 5, in the first phase of this study a descriptive scheme for the BE writing scale was established by drawing on the theoretical model of Genre Knowledge (Tardy, 2009) and a review of the literature on typical BE writing activities. In this phase of the study (i.e. Phase II), a pool of descriptors was first collected by the authors from sources such as existing language scales, BE curricular requirements, textbooks and recruitment advertisements for BE graduates. The collected descriptors were tentatively placed under the descriptive scheme (see Fig. 5.1) developed in the first phase of this study. For the scale categories where few existing descriptors were directly usable, new ones were drafted by the authors. Next, a group of 15 BE teachers from Chinese universities was invited to examine the relevance of the descriptive scheme and the descriptors to the Chinese tertiary context.

© Springer Nature Singapore Pte Ltd. 2021 L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_6


Specifically, the teachers attended two workshops, with the first devoted to the evaluation of the descriptive scheme and the second to the evaluation of the descriptors. In Workshop I, the teachers examined the appropriateness of the categories included in the descriptive scheme. Based on the discussion results, decisions were made concerning which scale categories should be retained or modified, yielding a revised version of the descriptive scheme. Following Workshop I, the teachers were given one month to review all the collected descriptors based on a handout provided to them (see the sample in Table 6.1). In particular, they were asked to examine whether the descriptors were (1) placed under the correct category, and (2) effective in reflecting Chinese university students' BE writing proficiency. After that, they attended Workshop II, which spanned three sessions. In the first and second sessions, the teachers worked in three small groups of five to select the well-written descriptors. In the third session, they examined each of the descriptors selected by the three groups again to single out those of higher quality. This process yielded a pool of 93 descriptors to be used in the next phase of this study. Figure 6.1 below illustrates the research procedures of the teachers' evaluation of the descriptive scheme and descriptors through the two workshops.

Fig. 6.1 Research procedures of teachers’ evaluation of the descriptive scheme and descriptors


6.1.1 Instruments

6.1.1.1 Guiding Questions

To facilitate the teachers’ evaluation of the descriptive scheme, several pre-designed questions were prepared, aimed at exploring their opinions about the feasibility to substantiate the descriptive categories with descriptors across different proficiency levels. Specifically, these guiding questions were formulated to explore (1) whether the descriptive categories represented the essential parameters for profiling BE writing proficiency, and (2) whether these categories could be described in terms of different levels of university BE learners’ writing proficiency.

6.1.1.2 Handouts Facilitating Teacher Discussion

Two handouts containing all the potentially useful descriptors were developed to facilitate the teachers' evaluation of these proficiency statements. The first handout (see the sample in Table 6.1) was designed for the teachers' individual evaluation of the descriptors along two dimensions: Category Appropriateness and Descriptor Quality. Space was also provided for the teachers to note down their opinions, as well as their rationales for the descriptors that they considered problematic. The codes used for descriptor evaluation are presented in Table 6.2.

Table 6.1 A sample of the handout for individual evaluation of descriptors

Descriptor (category) | Category appropriateness | Descriptor quality | Rationale
Demonstrates some control of basic syntactic patterns. (Grammar) | √ | × | The word "syntactic" is too technical
Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when they are used in the non-business context. (Vocabulary) | | |

Table 6.2 Codes used for individual evaluation of descriptors

Code | Category appropriateness | Descriptor quality
√ | Correct category | Clear and useful
× | Wrong category | Unclear/unhelpful
? | Uncertain | Uncertain

The second handout was created for recording the teachers' discussion results in the first and second sessions of Workshop II, in which the teachers were divided into three small groups of five to carry out group evaluation of the descriptors. This handout presented all the descriptors under their respective categories, and each group was asked to indicate whether a descriptor should be retained for later use, revised or deleted by ticking "use", "revise" or "delete" under the "Treatment" column. A "Rationale" column was also provided for the teachers to note down their reasons for any suggested revisions, and a "New descriptor" column for recording any revised wording. A sample of the handout with three exemplar descriptors is presented in Table 6.3.

Table 6.3 A sample of the handout for group evaluation of descriptors

Scale category: vocabulary
Original descriptor | Treatment | Rationale | New descriptor
1. Has a basic vocabulary repertoire of isolated words and phrases related to general business topics | Use/revise/delete | |
2. Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings | Use/revise/delete | |
3. Demonstrates an awareness of current business terminology | Use/revise/delete | |

6.1.2 Participants

Altogether 15 teachers participated in this phase of the study (denoted as T1–T15). Of them, 10 held a master's or doctoral degree in Applied Linguistics, and 5 held a bachelor's degree in English Language and Linguistics and a master's degree in a business-related subject such as Marketing or International Trade. All the teacher participants had over 10 years' experience in teaching business-related English courses (e.g., BE reading, BE listening) and at least 5 years' experience in teaching BE writing. Recommended by their deans as outstanding teachers of BE writing, all of them demonstrated a keen understanding of the interdisciplinary nature of BE writing and had experience collaborating with teachers of business-related disciplines in delivering business courses in English. They were familiar with the teaching and learning objectives for both BE and non-BE majors, as their students included both types. In addition, all the teachers had ample experience in marking students' BE writing assignments, which sensitized them to the strengths and weaknesses of BE learners in Chinese universities.


6.1.3 Data Collection

Both workshops took place in a conference room and were facilitated by one of the authors. In Workshop I, the facilitator first introduced the purpose of the study, which was to develop a proficiency scale of BE writing for BE learners in Chinese universities. As most of the teacher participants considered language proficiency scales equivalent to the rating scales used in standardized tests such as the CET, the facilitator provided training to familiarize them with the concept of language proficiency scales. Specifically, she explained the functions that language scales might serve and highlighted in particular the different functions performed by rating scales and proficiency scales. She also showed examples of language proficiency scales, such as the CEFR, to help the teachers understand what a proficiency scale might look like. Next, the descriptive scheme of the BE writing scale was shown to the teachers, who commented on the appropriateness of its descriptive categories and made necessary revisions. After Workshop I, the teachers were asked to evaluate individually, based on a handout (see the sample in Table 6.1), (1) whether the descriptors were placed under the correct category, and (2) whether the descriptors were effective in reflecting Chinese university students' writing proficiency. In Workshop II, the teachers carried out their discussions in small groups of five (in the first two sessions) and as a whole group (in the third session). The facilitator observed all the discussions and encouraged the teachers to express their opinions on descriptor quality, particularly where different views were voiced. All the discussions were conducted in Chinese, the participants' first language, though they were free to use English as desired. Workshop I and each session of Workshop II lasted approximately 3 to 4 hours.
With the consent of the participants, both workshops were digitally recorded and subsequently transcribed verbatim for analysis.

6.1.4 Data Analysis

The qualitative data were thematically coded following the procedures recommended by Corbin and Strauss (2008), which allowed the generation of themes reflecting the uniqueness of the research context. Several preliminary readings were first conducted to gain an overview of the data. The analysis focused on the teachers' comments on the descriptive scheme as well as on descriptor quality. The transcripts were then scrutinized to compile a list of categories summarizing the major themes that emerged from the analysis, including (1) the teachers' comments on the usefulness of the descriptive scheme; (2) the teachers' suggestions for modifications of the descriptive categories; and (3) the teachers' suggestions for modifications of the descriptors. To ensure the trustworthiness of the coding results, the authors went through a second round of coding about two weeks later. In addition, a research


assistant was invited to act as the second coder. When discrepancy surfaced, the coders discussed the coding results until they reached a consensus.
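The chapter reports that coding discrepancies were resolved through discussion rather than quantified, but a two-coder check of this kind is straightforward to express as an agreement statistic. The sketch below is purely illustrative and not part of the authors' procedure: it computes Cohen's kappa, a chance-corrected agreement index, for two hypothetical coders assigning the three theme labels above to the same ten transcript segments.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders labeling the same items."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed proportion of items on which the two coders agree
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement by chance, from each coder's marginal label frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: 1 = scheme usefulness, 2 = category changes, 3 = descriptor changes
author = [1, 1, 2, 3, 2, 1, 3, 3, 2, 1]
assistant = [1, 1, 2, 3, 2, 1, 3, 2, 2, 1]
print(round(cohens_kappa(author, assistant), 2))  # prints 0.85
```

Here raw agreement is 9/10, but kappa (0.85) discounts the agreement the two coders would reach by chance given how often each uses each label, which is why it is commonly preferred over simple percent agreement in reporting coding reliability.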

6.2 Initial Accumulation and Categorization of BE Writing Descriptors

Along with the establishment of the descriptive scheme (see Chap. 5 for detailed information), a pool of descriptors was constructed and tentatively placed under the respective categories before being evaluated in the teacher workshops. According to Pan and Zou (2020), literature review and descriptor formulation are the two major approaches to collecting scale descriptors. Literature review refers to the direct appropriation of descriptors from existing documents such as language proficiency scales, curricular requirements, and teaching syllabi. Descriptor formulation is characterized by the drafting or adaptation of descriptors. In this study, both methods were employed. Using the first method, literature review, a host of sources (see Appendix) were explored that fell into the following four categories:

(1) Existing language proficiency scales

When constructing a scale of language proficiency, it is customary to use descriptors from existing scales as a starting point, as they reflect the consensus on achievement in foreign language learning at different levels of competence (North, 2000). Since the aim of this study was to create a transparent and coherent scale describing BE writing proficiency, we conducted a comprehensive survey of 12 existing language proficiency scales, such as the ELTDU scale and the CEFR, from which we identified and collected descriptors related to BE writing.

(2) BE tests

Language tests usually offer tasks with graded difficulty and corresponding performance descriptions, making them an important avenue for extracting descriptors. In this study, we reviewed ten internationally and nationally renowned business English tests, such as BEC and TOEIC, and related materials in which descriptions of BE writing proficiency might be found. In particular, we paid attention to the rating criteria of the writing sections of these tests, the level descriptions associated with different score ranges, and the handbooks for test candidates in which ability descriptions of BE learners could be found.

(3) Pedagogical materials

BE teachers rely on various documents to guide classroom teaching, and an important aspect addressed by such pedagogical sources is the teaching and learning objectives, which can be readily translated into proficiency descriptors. As such, we scrutinized 13 websites (e.g., www.businessenglishonline.net) that offered pedagogical guidance on the teaching and learning of BE writing, 15 business textbooks (e.g., Xian, 2009) published in China or abroad that were commonly used in Chinese universities, and 3 curricula designed for BE-related language programs or courses (e.g., The National Curriculum for BA Program in Business English of China) to gather descriptions of BE writing.

(4) Recruitment advertisements

Recruitment advertisements constituted another source where job-related demands concerning BE writing proficiency could be located. We thus examined two major recruitment websites in China (i.e. www.51job.com and www.zhaopin.com), searching keywords such as "Business English" and "English writing ability" to identify usable descriptors of BE writing.

As discussed in Chap. 5, the descriptive scheme was built on the basis of the model of Genre Knowledge (Tardy, 2009) and an informed selection of typical BE writing activities. After collecting descriptors from the above-mentioned sources, we recorded information about each potentially useful descriptor in a consistent format: the descriptor itself, the type of source where it was found, the exact source from which it was retrieved, and the scale category under which it should be placed. Table 6.4 exemplifies how information about descriptors selected from the four types of sources was documented in the study.

Table 6.4 Descriptor documentation format

Descriptor | Type of source | Source | Scale category
Can write notes including times, dates and places | Language scales | ALTE overall breakthrough | Typical writing activities: note
CAN write a simple report of a factual nature and begin to evaluate, advise, etc. | BE tests | BEC can-do summary (B2) | Typical writing activities: report
Acquire knowledge about other cultures, especially the culture from which your readers may come | Pedagogical sources | BE textbook (Xian, 2009, p. 8) | Genre knowledge: Intercultural competence
Is familiar with foreign trade operation process and related laws and regulations | Recruitment advertisements | Recruitment website (www.51job.com) | Genre knowledge: Business knowledge

After the first round of descriptor collection and categorization, we found that some scale categories of the descriptive scheme were inadequately addressed in these sources; that is, few existing descriptors could be directly retrieved to substantiate them. We therefore adopted the second approach to descriptor collection,


namely descriptor formulation, to draft more descriptors. As exemplified in Chap. 5, the new descriptors were formulated using the following two methods. The first method was to incorporate business elements into EGP-oriented descriptors. For instance, the CEFR descriptor "Can create coherent and cohesive text making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices" (CEFR, Coherence and Cohesion, C2) was adapted by changing "text" into "business text", thus producing a new descriptor for this study: "Can create coherent and cohesive business text, making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices" (Scale category: Cohesion and coherence). The second method was to convert descriptions of BE writing proficiency from BE tests, pedagogical sources, and the findings of empirical studies on the features of BE writing into scale descriptors. For instance, a new descriptor was developed on the basis of the BEC Handbook for Teachers, which specifies that BEC candidates should be able to handle "general business topics, such as management skills, promotional strategies, training courses, conferences, business services, recruitment, rewarding staff, sponsorship, business premises, company budgets, use of IT in business, etc." (UCLES, 2008, p. 4). We then formulated the descriptor "Has a sound understanding of general business topics, such as promotional strategies, conferences, business services, recruitment, etc." by retaining the information in the original description most relevant to pre-experience BE learners at Chinese universities. In addition, we formulated new descriptors by consulting empirical studies on BE writing. For instance, we drafted the descriptor "Can locate and incorporate available materials to enhance the overall effectiveness of the writing. For instance, can make appropriate use of quotations from an authorized organization to enhance the effectiveness of the writing without losing its apparent objectivity" (Scale category: Strategic competence) by consulting Zhang (2013), who highlighted the role played by referential intertextuality as an important linguistic realization of the BE writer's sociolinguistic competence. This initial process of descriptor accumulation yielded 246 descriptors, which, as mentioned, were evaluated by a group of 15 BE writing teachers through two workshops, together with the descriptive scheme developed in the previous phase of the study.
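The documentation format exemplified in Table 6.4 amounts to a four-field record per descriptor. As a minimal sketch under our own assumptions (the class name, field names and validation are ours, not the authors'), such a record could be represented as:

```python
from dataclasses import dataclass

# The four categories of sources surveyed in Sect. 6.2
SOURCE_TYPES = {"Language scales", "BE tests", "Pedagogical sources",
                "Recruitment advertisements"}

@dataclass(frozen=True)
class DescriptorRecord:
    """One row of the descriptor documentation format (cf. Table 6.4)."""
    descriptor: str      # the proficiency statement itself
    source_type: str     # which of the four categories of sources it came from
    source: str          # the exact document it was retrieved from
    scale_category: str  # the scale category it is tentatively placed under

    def __post_init__(self):
        # Reject source types outside the four surveyed categories
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {self.source_type!r}")

# First row of Table 6.4 as an example record
record = DescriptorRecord(
    descriptor="Can write notes including times, dates and places",
    source_type="Language scales",
    source="ALTE overall breakthrough",
    scale_category="Typical writing activities: note",
)
```

Keeping every collected descriptor in one uniform record like this is what makes the later steps mechanical: descriptors can be grouped by scale category for the workshops, and gaps (categories with few records) become immediately visible.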

6.3 Evaluating the Appropriateness of the Descriptive Scheme and Descriptors

With the descriptive scheme and the pool of descriptors in place, two workshops were held, as mentioned, in which the 15 BE college teachers evaluated the appropriateness of the descriptive scheme as well as the quality of the descriptors.


6.3.1 Evaluating the Relevance of the Descriptive Framework to the Chinese Tertiary Context

In the first workshop, all the teacher participants worked together to evaluate (1) whether the categories included in the descriptive scheme were appropriate for profiling Chinese college students' BE writing ability; and (2) whether these descriptive categories could be described in terms of different levels of university BE learners' writing proficiency. The facilitator first briefed the teachers on the purpose of the study and unpacked important concepts such as language proficiency scales. The teachers were then provided with the descriptive scheme developed in the first phase of the study (see Fig. 5.1), consisting of two broad categories: Genre Knowledge (further divided into the sub-categories of Vocabulary, Grammar, Orthographical Control, Genre Format, Cohesion and Coherence, Strategic Competence, Sociolinguistic Competence, Intercultural Competence and Business Knowledge) and Typical BE Writing Activities (further divided into the sub-categories of Letter, E-mail, Report, Memo, Minutes, Note, News release, Resume, Summary and Documentary materials). The teachers were asked to scrutinize and comment on the relevance of the descriptive categories to the Chinese tertiary context. They were also encouraged to offer rationales for any categories they considered inappropriate or problematic. Qualitative analysis of their opinions revealed three themes, relating to (1) the usefulness of the theory-based scale categories under the heading of "Genre Knowledge"; (2) the inclusion of workplace BE written genres rather than academic BE genres; and (3) the need to revise three scale categories: "Letter", "E-mail" and "Documentary materials". Details of these findings are presented below.

(1) The usefulness of the theory-based scale categories under the heading of "Genre Knowledge"

After examining the sub-categories under Genre Knowledge, the teacher participants commented positively on the multi-dimensional structure of the descriptive scheme. In particular, they deemed it necessary to incorporate all these dimensions for portraying Chinese university students' BE writing proficiency. For example, T7 noted that "it is valuable to decompose BE into its constituents, and this can serve as a detailed map to orient teachers in the process of teaching BE writing"; in a similar vein, T1 remarked that "students can have a clearer idea as to the areas that they should improve upon in order to be proficient BE writers". The teachers especially favored the inclusion of categories such as Intercultural competence and Business knowledge, which the National Curriculum stipulated for BE undergraduates across China also highlights as essential to BE education (Wang et al., 2011). T4, for example, commented that "BE learners should be different from English majors by stressing such non-linguistic knowledge". Echoing T4, T8 argued that "BE writing differs from general English writing as it demands a more nuanced understanding of sociocultural matters and business content knowledge".

(2) The inclusion of workplace BE written genres rather than academic genres


The teacher participants voiced some concerns over the sub-categories under BE Writing Activities. The initial descriptive scheme included ten written genres that BE learners are likely to engage with in occupational contexts. During the workshop discussion, T2 raised the issue of whether academic BE genres should be added to the descriptive scheme, since "BE learners in some Chinese universities are asked to write academic theses as a precondition for graduation". After in-depth discussion, the teachers concluded that it would be unnecessary to include academic BE genres, for two reasons. First, workplace genre writing constituted the predominant content of BE teaching in the Chinese tertiary context, and "a growing number of universities have started to replace academic theses with practical reports as requirements for graduation" (T13). Second, BE thesis writing belongs to an academic genre that "should be dealt with by academic writing scales rather than BE writing scales" (T6). Interestingly, the teachers' comments corresponded largely with Dudley-Evans and St John (1998), who contended that English for Academic Purposes (EAP) tasks demonstrate significant commonalities across disciplines. Therefore, EAP genres such as thesis writing were not added to the BE writing scale in this study.

(3) The need to revise the scale categories "Letter", "E-mail" and "Documentary materials"

In Workshop I, the teachers also weighed in on the appropriateness of the activity-based scale categories. Specifically, they discussed (1) whether the most important BE written genres were included, and (2) whether it was feasible to describe these categories using descriptors reflective of different proficiency levels. On the whole, the teachers considered the inclusion of Report, Memo, Minutes, Note, News release, Resume and Summary in the descriptive scheme appropriate since "they are the genres that students are very likely to encounter in the workplace and therefore should be taught to prepare the students for meeting their future writing demands" (T9). By contrast, the three scale categories of "Letter", "E-mail" and "Documentary materials" were considered problematic. Most teachers commented that in the digitalized world of business, "Letter" and "E-mail" had become increasingly indistinguishable in that "most letters are now sent by means of E-mails" (T10). In light of these overlapping characteristics, the teacher participants suggested combining the two in one genre category. The third problematic scale category, "Documentary materials", was originally conceptualized as a genre containing a set of sub-genres such as leaflets and company brochures. However, the teachers found it difficult to identify the commonalities of these text types, let alone provide proficiency descriptors spanning several levels. T11, for instance, mentioned that "this is a vague category that can refer to any documented file in a company. It would conflate too many elements for it to be described in terms of progression in proficiency". Following T11's suggestion, the teachers agreed that this category be deleted from the descriptive scheme.

6.3 Evaluating the Appropriateness of the Descriptive Scheme …


Findings that emerged in the first workshop thus served as the basis for revising the descriptive scheme. Specifically, all nine descriptive categories under the heading of Genre Knowledge were considered appropriate and were thus left unrevised. As to the descriptive categories under the heading of BE Writing Activities, the sub-categories of “Letter” and “E-mail” were collapsed into the single category “Letter & E-mail”, owing to their commonalities in the increasingly digitalized world of business, as indicated by the teachers. In addition, the category “Documentary materials” was deleted because of the difficulty involved in providing graded proficiency descriptors for it. Consequently, eight genres (i.e. Letter & E-mail, Report, Memo, Minutes, Note, News release, Resume and Summary) were retained for the broad category of BE Writing Activities. The revised descriptive scheme hence contained 17 scale categories, as graphically presented in Fig. 6.2. It should be noted that, as a result of the workshop discussions, some descriptors were eliminated or combined when the corresponding scale categories were deleted or integrated.

6.3.2 Evaluating the Relevance of the Descriptors to the Chinese Tertiary Context

Workshop II was held one month after the completion of Workshop I (see Fig. 6.1). During the one-month interval, we gave each teacher a handout (a sample is presented in Table 6.2) for them to examine (1) whether the descriptors were put under the correct category, and (2) whether the descriptors were effectively drafted to reflect Chinese university students’ writing proficiency. Workshop II was implemented across three sessions. During the first and second sessions, the teachers examined the quality of the descriptors in small groups of five, making the selections and revisions necessary to identify high-quality descriptors for later use. Specifically, the teachers compared their individual evaluations of the descriptors with those of their group members. The descriptors considered properly categorized and well written by all group members were kept for use in the third session of Workshop II. For the descriptors considered problematic, either in terms of categorization or wording, the group members carried out further discussion and collectively determined whether each descriptor should be retained as it was, reworded, or deleted. The group discussion results from the two sessions were recorded in a handout (see examples in Table 6.3). After the first two sessions of Workshop II, the descriptors selected by the small groups were submitted to the first author, who synthesized the group findings and created a new handout (a sample section is presented in Table 6.5) recording the recommendations from the three groups for each descriptor as well as their underlying rationales. In the third session of Workshop II, the teacher participants were given a copy of this handout presenting the group evaluations of the descriptors.
Descriptors that all groups deemed appropriate were retained for later use as they were; for descriptors on which the groups failed to reach an agreement,

6 Evaluating the Quality of the Descriptive Scheme …

Fig. 6.2 The revised descriptive scheme of the BE writing scale. The scheme comprises nine scale categories under Genre Knowledge (Vocabulary; Grammar; Mechanics; Genre format; Cohesion & coherence; Strategic competence; Sociolinguistic competence; Intercultural competence; Business knowledge, organized under formal, process, rhetorical and subject-matter knowledge) and eight under Business Writing Activities (Letter & E-mail; Report; Memo; Minutes; Note; Press release; Resume; Summary)

further discussions were carried out to determine how they should be addressed. These iterative rounds of discussion finally yielded a pool of 93 high-quality descriptors to be used in the next phase of this study. Analysis of the data collected from the three sessions of Workshop II revealed that, in revising the problematic descriptors for later use, the teachers in general employed six methods of descriptor modification to enhance descriptor quality for the development of a BE writing scale for Chinese university students. The six methods


Table 6.5 A sample section of the handout for Session 3 of Workshop II

Scale category: Vocabulary. For each descriptor, the handout provided spaces for the Recommendation and Rationale of Groups I, II and III (left blank in this sample):

1. Has a basic vocabulary repertoire of isolated words and phrases related to general business topics
2. Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings
3. Demonstrates an awareness of current business terminology

identified during the workshops, together with exemplar descriptors, are illustrated below.

(1) Removing technical language

While scrutinizing the descriptors, several teachers reported that some descriptors contained technical terms, which might hinder non-specialists’ comprehension. The descriptor “Demonstrates some control of basic syntactic patterns. (ISLPR, writing 1+, elementary proficiency plus; CEFR: A2)” was a case in point, as it contained the word “syntactic”, considered by many teachers as too technical for laymen to understand. T1, for instance, maintained that “the word ‘syntactic’ might only be familiar to people who have linguistics or language teaching background”. Acknowledging T1’s concern, the teachers decided to replace such technical terms with more reader-friendly ones. T5’s suggestion to change “syntactic patterns” into “grammar and sentence patterns” was followed, yielding the revised descriptor “Shows control of a few simple grammatical structures and sentence patterns”.

(2) Splitting or simplifying descriptors that can be put under more than one category

In Workshop II, the teachers also flagged some descriptors on the grounds that they juxtaposed too many factors and thus failed to serve as definite criterion statements. The descriptor “Spelling, punctuation and layout are accurate enough to be followed most of the time (CEFR, Orthographic control, B1)” was a typical example. As T6 pointed out, “spelling, punctuation and layout touch upon different aspects of BE writing and should not be placed in the same descriptor”. T7 and T9 echoed her view, saying that “‘layout’ should be addressed under the scale category of ‘Genre format’” (T7) and that “even though ‘punctuation’ and ‘spelling’ could be put under the category of ‘Orthographic control’, they should be treated in two separate descriptors” (T9). These concerns led the teachers to revise the descriptor as “Spelling is accurate enough to be followed most of the time”, retaining “Spelling” as the only subject of description.

(3) Integrating or deleting descriptors expressing similar ideas

Two descriptors taken from two language proficiency scales, namely the ISLPR and the CEFR, were considered by the teachers to contain overlapping information. The two descriptors read “Demonstrates consistent control of compound and complex structures. (ISLPR, writing 3, general professional proficiency)” and “Can consistently maintain a high degree of grammatical accuracy; errors are rare and difficult to spot. (CEFR, Grammatical accuracy, C1)” respectively. As T10 pointed out, “both descriptors emphasize consistent control of English grammar, which naturally involve factors such as grammatical accuracy and complexity”. Following T11’s argument that “control of grammar often entails accuracy and absence of errors”, the teachers combined the two descriptors into one, which reads “Demonstrates consistent control of complex structures”.

(4) Providing examples to improve descriptor clarity

While trying to collect descriptors for the scale category “Strategic competence”, the authors explored numerous BE teaching materials and found that the ability to present written information adroitly to maximize its communicative effectiveness featured prominently in these materials. Since no directly usable descriptors could be retrieved, the authors drafted the descriptor “Can design and present information in a way that the most important messages are highlighted” to reflect this feature of BE writing (see Sect. 5.2.1 for details). In the workshop, several teachers observed that this descriptor was somewhat vague because, according to T3, “the expression ‘design and present information in a way’ does not specify the exact approaches commonly employed to highlight written messages”. The teachers then decided to add an example to facilitate readers’ understanding, and the descriptor was revised as “Can design and present information in a way that the most important messages are highlighted. For instance, can selectively apply bullets or numbers to call attention to important points”.

(5) Revising descriptors to enhance their relevance to the business context

As mentioned earlier, many of the descriptors were collected from existing language scales. One problem with these descriptors was that they were mainly EGP-oriented (see Chap. 2 for a detailed discussion). Take the descriptor “Has a basic vocabulary repertoire of isolated words and phrases. (CEFR, Vocabulary range, A1)” as an example: it only touched upon learners’ lexical competence in the general communication context. T12 therefore advised modifying the descriptor “to reflect the lexical requirements specific to the business domain”. Following his suggestion, the teachers adapted the original descriptor as “Has a basic vocabulary repertoire of isolated words and phrases related to general business topics” to highlight the importance for BE learners in China of mastering business-related vocabulary.

(6) Reformulating descriptors to highlight their relevance to the Chinese context

As discussed in Chap. 2, existing language scales were mostly developed in English-speaking or ESL contexts, with their descriptors reflecting the features of those contexts, which differ from the Chinese EFL context. The teachers thus made collective efforts to enhance the relevance of many potentially useful descriptors to Chinese tertiary settings. For instance, the descriptor “Communicates with reasonable accuracy in familiar contexts; generally good control though with noticeable mother tongue influence. (CEFR, Grammatical accuracy, B1)” was collected from the CEFR, which was originally designed to promote plurilingualism in Europe (Council of Europe, 2001). In the workshop, T14 pointed out the need to replace the general term “mother tongue influence” in this descriptor with “Chinese language influence”. In addition, as the descriptor was put under the scale category of Grammar, the word “grammatical” was added to avoid confusion, as “accuracy can also refer to lexical accuracy” (T11). The descriptor was therefore revised in the workshop sessions as “Communicates with reasonable grammatical accuracy in familiar contexts; apparent Chinese language influence does not lead to misunderstanding”. The six descriptor modification methods, together with examples, are summarized in Table 6.6.

6.4 Conclusion

This chapter presents how the descriptive scheme and descriptors developed by the authors for the development of a BE writing scale were evaluated by a group of experienced BE writing teachers from Chinese universities. Regarding the descriptive scheme, the teacher evaluation results revealed that the scale categories under the heading of “Genre Knowledge” were all well constructed, while those under the heading of “BE Writing Activities” needed revision: “Letter” and “E-mail” were collapsed into one scale category, and the category of “Documentary materials” was deleted. As a result, the revised descriptive scheme contained 17 scale categories, placed under the headings of Genre Knowledge and BE Writing Activities respectively (see Fig. 6.2). After the evaluation of the descriptive scheme, a second workshop was conducted across three sessions, in which the teachers examined the quality of the descriptors in small groups of five and then as a whole. Scrutinizing all the descriptors through iterative rounds of discussion, the teachers made selections, revisions, deletions and additions to


Table 6.6 Methods and examples of descriptor modification

(1) Removing technical language (category: Grammar). Original: “Demonstrates some control of basic syntactic patterns”. Revised: “Shows control of a few simple grammatical structures and sentence patterns”.

(2) Splitting or simplifying descriptors that can be put under more than one category (category: Orthographic control). Original: “Spelling, punctuation and layout are accurate enough to be followed most of the time”. Revised: “Spelling is accurate enough to be followed most of the time”.

(3) Integrating or deleting similar descriptors (category: Grammar). Originals: “Demonstrates consistent control of compound and complex structures”; “Can consistently maintain a high degree of grammatical accuracy; errors are rare and difficult to spot”. Revised: “Demonstrates consistent control of complex structures”.

(4) Providing examples to improve descriptor clarity (category: Strategic competence). Original: “Can design and present information in a way that the most important messages are highlighted”. Revised: “Can design and present information in a way that the most important messages are highlighted. For instance, can selectively apply bullets or numbers to call attention to important points”.

(5) Revising descriptors to enhance their relevance to the business context (category: Vocabulary). Original: “Has a basic vocabulary repertoire of isolated words and phrases”. Revised: “Has a basic vocabulary repertoire of isolated words and phrases related to general business topics”.

(6) Reformulating descriptors to highlight their relevance to the Chinese context (category: Grammar). Original: “Communicates with reasonable accuracy in familiar contexts; generally good control though with noticeable mother tongue influence”. Revised: “Communicates with reasonable grammatical accuracy in familiar contexts; apparent Chinese language influence does not lead to misunderstanding”.

enhance the descriptors’ relevance to the Chinese tertiary context. Six modification methods were employed by the teachers during this process, yielding a pool of 93 descriptors to be used in the next phase of the study.


References

Corbin, J., & Strauss, A. (2008). Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
Dudley-Evans, T., & St John, M. (1998). Developments in ESP: A multi-disciplinary approach. Cambridge University Press.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
Pan, M., & Zou, S. (2020). China English Standards – Development of the writing scale. Higher Education Press.
Tardy, C. M. (2009). Building genre knowledge. Parlor Press.
UCLES. (2008). Business English certificates: Handbook for teachers. Retrieved from http://www.CambridgeESOL.org.
Wang, L., Chen, Z., & Zhang, Z. (2011). Developing national curriculum for BA program in Business English of China. The Asian ESP Journal, 7(1), 29–61.
Xian, X. (2009). An English course in business writing. Higher Education Press.
Zhang, Z. C. (2013). Business English students learning to write for international business: What do international business practitioners have to say about their texts? English for Specific Purposes, 32(3), 144–156.

Chapter 7

Developing the Business English Writing Proficiency Scale

This chapter focuses on how the draft version of a BE writing proficiency scale was developed based on the descriptors that were carefully selected by 15 Chinese university teachers, as reported in Chap. 6. To empirically determine the difficulty levels of the descriptors, a questionnaire was constructed and distributed to Chinese university students who had received systematic and comprehensive training on BE writing. The Rasch measurement model was utilized to calibrate the difficulty values of the descriptors. This was followed by an assessment of the psychometric properties of the scale as well as a scrutiny of its content, resulting in the draft version of the BE writing scale. A global scale was also constructed to present a holistic overview of the multi-dimensional BE writing scale.

7.1 Research Procedures

As reported in Chap. 6, two workshops were conducted in which 15 experienced teachers of BE writing in Chinese universities examined the relevance of the descriptive scheme and descriptors to BE education in the Chinese tertiary context. During this process, the teachers selected the most relevant descriptors and made revisions where necessary, thereby yielding a pool of 93 descriptors. To empirically assign the selected descriptors to appropriate proficiency levels, the descriptors were incorporated into a questionnaire, and Chinese university students were invited to evaluate the difficulty level of the proficiency description indicated by each descriptor.

© Springer Nature Singapore Pte Ltd. 2021 L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_7


7.1.1 Instrument

The questionnaire consisted of two parts. The first part was designed to collect the participants’ demographic data, including their age, major and grade. The second part comprised the 93 descriptors identified in the teacher workshops as clear, well written, and relevant to the Chinese tertiary context (see Chap. 6). In the questionnaire, these descriptors were presented under their respective categories, such as Vocabulary and Report (a sample section is presented in Table 7.1). The respondents were asked to rate the perceived difficulty level associated with each descriptor on a five-point Likert scale, with 1 denoting “very easy” and 5 “very difficult”. The questionnaire was presented in Chinese, the participants’ first language, to facilitate their understanding of the items. As many descriptors were collected from language scales written in English, they were translated into Chinese by the first author. Two College English teachers were invited to check the clarity of the Chinese translations of the descriptors and their equivalence to the original English versions. Prior to the main study, the questionnaire was piloted among 30 university students whose characteristics were similar to those of the target population, and minor revisions were made based on the students’ feedback.

7.1.2 Participants

Purposive and convenience sampling was adopted, and students from four universities were selected to participate in the study (Patton, 2002). The four universities, located in different parts of China, were chosen as representative of the four types of universities that offered BE courses to their students. The first was a key comprehensive university (i.e. a 211/985 university) in Hainan Province, where BE courses were delivered to students majoring in business-related subjects. The second, a foreign language university located in Xi’an, and the third, a non-key university in Heilongjiang Province, were among the earliest institutions granted the qualifications to confer BA degrees on BE majors. The fourth was a community college in Sichuan Province, which had been endeavoring to develop ESP programs including BE courses. Altogether, 572 students from the four universities participated in the study. The students were selected because: (1) they constituted an important group of stakeholders of the BE writing scale under construction, as the scale was intended to be used for the teaching, learning, and assessment of BE writing in the Chinese tertiary context; and (2) all of them had attended BE writing courses which featured the adoption of genre-based approaches and encompassed the typical BE written genres valued by the business community; as such, they were expected to be familiar with the content and tasks of the descriptors in the questionnaire. The participants included both BE majors and non-BE majors specializing in business-related disciplines, accounting


Table 7.1 A sample section of the students’ questionnaire (category: Vocabulary). Each descriptor was rated on a five-point scale: 1 = very easy, 2 = moderately easy, 3 = neither easy nor difficult, 4 = moderately difficult, 5 = very difficult.

1. Has a basic vocabulary repertoire of isolated words and phrases related to general business topics
2. Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings
3. Demonstrates an awareness of current business terminology
4. Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when they are used in the non-business context
5. Demonstrates familiarity with common acronyms and abbreviations used in business documents such as HR and SWOT
6. Has a good command of a very broad business lexical repertoire, including colloquial and idiomatic expressions
7. Demonstrates satisfactory control over synonyms such as merger, consolidation and acquisition, especially their connotative levels of meaning in the business environment

for 56% (n = 320) and 44% (n = 252) of the total number respectively. We intended to include more BE majors in the survey, given the leading role played by the discipline in shaping BE education across China (Wang et al., 2011). At the time of the investigation, all participants were in the third or fourth year of their degree programs and had completed their BE writing courses, which lasted at least six months. All participants reported that they had passed the CET4 (College English Test, Band 4) at the time of data collection.


7.1.3 Data Collection

The questionnaires were distributed and collected by the students’ course instructors, and the first author was present in person to facilitate the administration. Prior to distribution, the first author introduced to the students the purpose of the study and the important concepts included in the questionnaire. The students were also told that their participation was voluntary and that anonymity was guaranteed. Altogether, 590 questionnaires were administered, and the response rate was 98%. The data were then input into Excel files by the first author and two research assistants. Data cleansing was conducted to identify outliers, such as out-of-range values falling outside the lowest and highest possible ratings (Tabachnick & Fidell, 2001). After removing the outliers, 572 responses were considered valid and were subsequently submitted to Rasch analysis using Winsteps version 3.70.0.3.

7.1.4 Data Analysis

7.1.4.1 The Rasch Model

As mentioned previously, the questionnaire data in this study were analyzed using the Rasch model. In this section, we briefly introduce the Rasch model for readers who are not familiar with this analysis method. The Rasch model is a probabilistic model which calibrates person ability and item difficulty on the same interval scale (Bond & Fox, 2015). A fundamental principle underlying the Rasch model is that a person with a stronger ability than another on a particular trait or construct has a higher likelihood of getting an item correct; likewise, the higher an item’s difficulty level, the lower the likelihood of getting it correct. The Rasch model is an umbrella term which encompasses a family of models, including the basic (or dichotomous) Rasch model (Rasch, 1960), the Rating Scale Model (Andrich, 1978), the Partial Credit Model (Masters, 1982), and the Many-Facets Rasch Model (Linacre, 1989). The basic Rasch model was developed by Georg Rasch (Rasch, 1960) and is typically used to analyze dichotomous data (i.e. responses that are either correct or incorrect). The other members of the Rasch family represent various extensions of the basic model. For example, the Rating Scale Model (Andrich, 1978), the one employed in this study to analyze the questionnaire data, was developed to analyze Likert-scale data. The Many-Facets Rasch Model, a more recent development, is typically used to analyze data from rater-mediated assessments (Eckes, 2011; Linacre, 2017a, 2017b). The Rasch model has been used extensively in language studies, particularly in the field of language assessment (see e.g., Fan & Knoch, 2019; McNamara & Knoch, 2012; McNamara et al., 2019 for an overview of the application of the Rasch model in language assessment), due to several advantages as compared to the traditional


data analysis methods based on classical test theory (CTT; Bond & Fox, 2015). A salient advantage of the Rasch model is “specific objectivity” (McNamara et al., 2019, p. 176), which means that item difficulty estimates are independent of the person samples included in the analysis. This advantage makes the model particularly useful for calibrating the difficulty levels of the descriptors in language proficiency scales or frameworks, which are commonly developed on the basis of data collected from a wide range and variety of stakeholder groups (e.g., teachers, learners). In addition to item difficulty calibration, the Rasch model can also identify the items (i.e. descriptors) which do not fit the model’s expectations and therefore do not function as intended. Due to their aberrant response patterns, such misfitting descriptors need to be either revised or removed from the descriptor bank (North, 2000). Indeed, the Rasch model was used in several high-profile projects aiming to develop language proficiency scales. In the development of the CEFR, for instance, the essential scale-construction stage involved teaching practitioners assessing their learners against a host of descriptors of language proficiency in an overlapping series of questionnaires for the purpose of anchoring. The questionnaire data were then analyzed using the Rasch rating scale model in order to (1) mathematically scale the difficulty values of the descriptors and (2) identify descriptors with high stability of values across different contexts which could be used as common reference levels in a wide variety of contexts (Council of Europe, 2001, p. 219).
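For readers who prefer code to prose, the response probabilities underlying the basic Rasch model and the Rating Scale Model can be sketched in a few lines of Python. This is an illustrative simplification, not the Winsteps implementation, and all parameter values below are invented:

```python
import math

def rasch_prob(theta, b):
    """Dichotomous Rasch model: probability that a person with ability
    theta (in logits) answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rating_scale_probs(theta, delta, thresholds):
    """Andrich's Rating Scale Model: probabilities of the m+1 ordered
    categories of a Likert item with overall difficulty delta and
    Rasch-Andrich thresholds tau_1..tau_m shared across items."""
    numerators = [1.0]      # category 0
    log_num = 0.0
    for tau in thresholds:
        log_num += theta - delta - tau
        numerators.append(math.exp(log_num))
    total = sum(numerators)
    return [n / total for n in numerators]

# A person at 0 logits facing an item of equal difficulty: a 50% chance.
print(round(rasch_prob(0.0, 0.0), 2))   # 0.5

# Five-point Likert item (four thresholds): category probabilities sum to 1.
probs = rating_scale_probs(1.0, 0.0, [-2.0, -0.5, 0.5, 2.0])
print(round(sum(probs), 2))             # 1.0
```

The second function is the model used for the Likert data in this study: as ability rises relative to the item's difficulty, probability mass shifts toward the higher categories.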
In a more recent project to develop China’s Standards of English (CSE), descriptors of language proficiency covering listening, speaking, reading, writing, translation, interpretation, and linguistic knowledge (grammar and pragmatics) were accumulated for the design of about 50 questionnaires, which were distributed to around 10,000 teachers and 100,000 students in over 1,000 schools across the nation. Descriptors in the questionnaires were rated by teachers against their students’ English language ability on a five-point scale; in addition, students also evaluated the descriptors’ difficulty levels. The data were analyzed using the many-facets Rasch model (MFRM) to achieve vertical scaling of the descriptors, based on which the cutoff points between levels were determined (He & Chen, 2017).

7.1.4.2 Data Analysis Procedure

In our analysis of the questionnaire data, we followed the best practice of Rasch analysis as delineated in Bond and Fox (2015) and McNamara et al. (2019). First, we examined the item-person map (or Wright map), which provides a wealth of useful information about the functioning of the descriptors and the psychometric properties of the scale as a whole. We also examined item and person reliability estimates and separation indices as important indicators of the statistical properties of the scale. Next, we performed unidimensionality analysis in Winsteps, which is essentially a factor analysis of the Rasch residuals representing the differences between the actual observations and the model expectations (also known as Rasch factor analysis; see Fan & Bond, 2019). To further investigate the psychometric quality of the scale,


we performed Differential Item Functioning (DIF) analysis to explore whether the descriptors functioned equivalently across the two subgroups of participants (i.e. BE majors and non-BE majors). Finally, we examined the fit statistics of each descriptor in the questionnaire, through which we identified the misfitting descriptors, that is, the descriptors whose data deviated significantly from the expectations of the model. Based on the difficulty values generated by the Rasch analysis, cut-offs were determined and proficiency level bands were created. Following a content analysis of the descriptors assigned to the different level bands, we developed the draft version of a multi-dimensional BE writing scale with several illustrative scales. To offer a holistic overview of the BE writing scale, a global scale was also constructed, which could render the scale more accessible to non-specialist users and provide orientation points for its potential users (Council of Europe, 2001).
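To make the fit statistics mentioned above concrete, the following Python sketch computes infit and outfit mean squares for a single item under the dichotomous Rasch model. This is a simplification of what Winsteps reports for the Likert data in this study, and the response data are invented:

```python
import math

def item_fit(responses, abilities, difficulty):
    """Infit and outfit mean squares for one item under the dichotomous
    Rasch model (polytomous data are weighted analogously).
    responses: 0/1 answers; abilities: person measures in logits."""
    sq_resid, variances, z_sq = [], [], []
    for x, theta in zip(responses, abilities):
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))  # model expectation
        w = p * (1.0 - p)           # model variance of the response
        r2 = (x - p) ** 2           # squared residual
        sq_resid.append(r2)
        variances.append(w)
        z_sq.append(r2 / w)         # squared standardized residual
    infit = sum(sq_resid) / sum(variances)   # information-weighted fit
    outfit = sum(z_sq) / len(z_sq)           # outlier-sensitive fit
    return infit, outfit

# Highly predictable (Guttman-like) responses give mean squares below 1;
# values far above 1 would flag a misfitting descriptor.
infit, outfit = item_fit([1, 1, 0, 0], [2.0, 1.0, -1.0, -2.0], 0.0)
print(infit < 1.0 and outfit < 1.0)   # True
```

Both statistics have an expected value of 1.0 under the model; infit weights each residual by its information, so it is less affected by unexpected responses from persons far from the item's difficulty than outfit is.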

7.2 Scaling the Descriptors

7.2.1 Psychometric Properties of the Descriptors

7.2.1.1 The Item-Person Map

A distinctive advantage of Rasch analysis lies in that it can graphically illustrate the distribution of person ability relative to item difficulty in the form of an item-person variable map (or Wright map), where person ability and item difficulty are located on the same interval scale. In this study, the Wright map can provide valuable information as to whether the perceived difficulty of the descriptors is appropriate for the sample. The Wright map, including 93 items (i.e. descriptors) and 572 participants, is presented in Fig. 7.1. The left- and right-most columns of Fig. 7.1 show the common interval scale (in logits), with each # on the left side of the vertical line representing two participants and each X on the right side representing an item. Items at the top were perceived by the participants as the most difficult, while those at the bottom were perceived as the easiest. As indicated in the figure, the cluster of persons was located broadly opposite the cluster of items. The figure also revealed that the distribution of items failed to some extent to cover the full range of person abilities; for example, a few participants were located at both ends of the scale with no items targeting them. Nevertheless, it can be observed that, on the whole, the difficulty levels of the descriptors targeted the ability levels of the participants reasonably well.
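To convey what such a map looks like, here is a toy, text-only sketch in Python; the person and item measures below are invented, and the actual map in Fig. 7.1 was produced by Winsteps:

```python
def wright_map(person_measures, item_measures, top=3.0, bottom=-3.0, step=1.0):
    """Crude text Wright map: persons ('#') and items ('X') binned on a
    shared logit scale, with the ablest persons and hardest items at the top."""
    lines = []
    upper = top
    while upper > bottom - 1e-9:
        lower = upper - step
        n_p = sum(1 for m in person_measures if lower < m <= upper)
        n_i = sum(1 for m in item_measures if lower < m <= upper)
        lines.append(f"{upper:5.1f} | {'#' * n_p:<8}| {'X' * n_i}")
        upper -= step
    return "\n".join(lines)

# Four hypothetical persons and four hypothetical items on one logit scale.
print(wright_map([-0.5, 0.2, 0.4, 1.3], [-1.2, 0.1, 0.3, 2.5]))
```

Rows where the # and X columns align indicate items well targeted at the sample; a row of persons with no facing items corresponds to the coverage gaps noted above.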

7.2.1.2 Reliability Estimates and Separation Indices

To further examine the psychometric quality of the descriptors, we examined the reliability estimates and separation indices of the questionnaire. In Rasch analysis,

Fig. 7.1 Wright map. Notes M—the mean of the person or item estimates; S—one standard deviation from the mean; T—two standard deviations from the mean

the person reliability index indicates the extent to which the person ordering could be replicated if the same sample of persons were given another set of items measuring the same construct; similarly, the item reliability index indicates the extent to which the same hierarchy of item difficulty would be established if the same items were given to another sample with comparable ability levels (Bond & Fox, 2015). According to Linacre (2012), acceptable item and person reliability estimates should be above the threshold of 0.8. The item reliability estimate is used to verify the item difficulty hierarchy, whereas a high person reliability estimate suggests that the instrument is sensitive enough to distinguish between high- and low-ability participants. In addition to reliability estimates, Winsteps also provides item and person separation indices, which indicate the spread of items and persons along the measured variable. The commonly accepted criterion for the item and person separation indices is at least 2.0 (Bond & Fox, 2015). The

108

7 Developing the Business English Writing Proficiency Scale

Table 7.2 Summary statistics for Rasch analysis Measures

Items

Persons

Mean

0.00

−0.26

SD

0.07

1.00

N Infit MnSq Outfit MnSq Separation index Reliability

93

572

Mean

0.99

0.99

SD

0.20

0.45

Mean

1.00

1.00

SD

0.20

0.45

12.17

6.58

0.99

0.98

summary statistics including item and person reliability estimates and separation indices are presented in Table 7.2. As shown in Table 7.2, item and person reliability estimates in the study were 0.99 and 0.98 respectively, both well above the threshold value of 0.8, suggesting that there were enough spread of item difficulty and person ability; in addition, the high item and person reliability estimates gave us confidence in the reproducibility of relative measure locations (Linacre, 2017a). Similarly, the item and person separation reliability indices were also high. Specifically, the item separation index was 12.17 (SD = 0.70) and the person separation index was 6.58 (SD = 1.00), both well above the criterion of 2.0. Similar to the interpretations of reliability estimates, the high person separation index suggests that the instrument was sensitive enough to distinguish between high- and low-performers whereas the high item separation implies that the person sample was large enough to confirm the item difficulty hierarchy (Boone et al., 2014). In addition, item and person infit statistics were both 0.99, within the range of −2 to +2, suggesting that the overall response patterns observed fit the expectations of the Rasch model well and such data-model fit bolstered confidence in the validity of the scale.
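As a quick numerical check, the standard Rasch relationship between the separation index (G) and reliability (R), R = G²/(1 + G²), reproduces the reported reliability estimates from the separation indices in Table 7.2. The helper functions below are our own sketch, not part of the study's toolchain:

```python
import math

def separation_to_reliability(sep: float) -> float:
    # R = G^2 / (1 + G^2): the standard Rasch separation-reliability relation
    return sep ** 2 / (1 + sep ** 2)

def reliability_to_separation(rel: float) -> float:
    # G = sqrt(R / (1 - R)): the inverse relation
    return math.sqrt(rel / (1 - rel))

print(round(separation_to_reliability(12.17), 2))  # item separation  -> 0.99
print(round(separation_to_reliability(6.58), 2))   # person separation -> 0.98
```

The conversion also makes the 2.0 criterion interpretable: a separation of 2.0 corresponds to a reliability of 0.8, which is exactly Linacre's threshold.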

7.2.1.3 Utility of Response Categories

Next, we examined the functionality of the five response categories in the questionnaire (see Table 7.3). Six criteria were applied to verify the utility of each response category (see McNamara et al., 2019 for the guidelines): (1) a minimum of 10 observations is needed for each category; (2) the average category measures must increase monotonically with the categories; (3) the outfit mean square statistic for a given rating category should not exceed 2.0; (4) the category thresholds (i.e. Rasch-Andrich thresholds) should advance monotonically with the categories; (5) the increase between adjacent category threshold values should be within the 1.4–5.0 logit range; and (6) the probability curve for each category should show a distinct peak.


Table 7.3 Category structure of the rating scale

Category                         Observed count (%)   Average measure   Outfit MnSq   Threshold calibration
1. Very easy                     2597 (5)              1.33             1.13           2.59
2. Moderately easy               12,107 (23)           0.71             0.94           0.72
3. Neither easy nor difficult    19,271 (36)          −0.15             0.94          −0.85
4. Moderately difficult          14,298 (27)          −0.99             0.95          −2.45
5. Very difficult                4860 (9)             −1.76             1.06           NONE

The statistics in Columns 2 and 3 of Table 7.3 show that each category had well over 10 observations (e.g., n = 4860 for Category 5), and that the average measures increased monotonically (from −1.76 to 1.33 logits) as the categories moved from 1 to 5, suggesting that a larger category represented a higher ability. None of the outfit mean square statistics (0.94–1.13) exceeded the recommended upper limit of 2.0, suggesting that little unexpected noise was introduced into the measurement process by these categories. In addition, the distances between the calibrations of adjacent categories were all within the range of 1.4–5.0 logits, indicating that the five categories in the questionnaire defined distinct positions on the measured variable. The probability curve for each category is graphically illustrated in Fig. 7.2, where the horizontal axis represents the item difficulty scale and the vertical axis the probability of each category being selected. As shown in this graph, there is a separate peak for each category, and such distinctness, together with the statistics reported above, supports the conclusion that the five categories in the questionnaire were properly ordered and functioned as intended.

Fig. 7.2 Category probability curves for the five categories in the questionnaire
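The first five criteria can be checked mechanically. The sketch below is our own code, not the study's; the category statistics are copied from Table 7.3 and listed in scored order (from "very difficult" to "very easy"), matching the direction of increase reported above:

```python
# Category diagnostics for the five-point rating scale (values from Table 7.3).
observed = [4860, 14298, 19271, 12107, 2597]        # counts, scored order
avg_measures = [-1.76, -0.99, -0.15, 0.71, 1.33]    # average measures, scored order
outfit_mnsq = [1.06, 0.95, 0.94, 0.94, 1.13]
thresholds = [-2.45, -0.85, 0.72, 2.59]             # Rasch-Andrich thresholds

enough_obs = all(n >= 10 for n in observed)
monotonic_measures = all(a < b for a, b in zip(avg_measures, avg_measures[1:]))
outfit_ok = all(m <= 2.0 for m in outfit_mnsq)
gaps = [b - a for a, b in zip(thresholds, thresholds[1:])]
spacing_ok = all(1.4 <= g <= 5.0 for g in gaps)

print(enough_obs, monotonic_measures, outfit_ok, spacing_ok)  # True True True True
```

The threshold gaps work out to roughly 1.60, 1.57 and 1.87 logits, all inside the 1.4–5.0 window cited in the criteria.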

7.2.2 Dimensionality Analysis

The assumption of unidimensionality, considered essential to Rasch analysis, explains why Rasch measurement theory was initially resisted by language testers: the construct of language proficiency was arguably not unidimensional (McNamara & Knoch, 2012). Skehan (1989), in particular, voiced reservations about the feasibility of using Rasch measurement in the context of ESP testing by highlighting "the dimensions of proficiency or enabling skills in ESP" (p. 4). These misgivings, however, reflected misconceptions of the notion of unidimensionality in Rasch measurement theory (Henning, 1992; McNamara, 1991). With the gradual uptake of Rasch measurement as a valid tool by the language testing community, it is now widely accepted that unidimensionality as defined in Rasch analysis is not pre-determined by an underlying knowledge or ability derived on theoretical grounds, as originally assumed (McNamara & Knoch, 2012); instead, it is a psychometric property established on the basis of test data. McNamara (1991), for instance, drawing on data from a listening test designed for health professionals, demonstrated that a single dimension could be constructed for an ESP test which seemingly contained several dimensions. According to Linacre (2015, p. 560), "unidimensionality is never perfect"; in other words, multidimensionality always exists to a lesser or greater extent as a faithful reflection of reality. Consequently, the question is not whether the data are perfectly unidimensional, but whether the multidimensionality in the data is significant enough to merit dividing the items (descriptors) into separate tests (subscales), or constructing new tests (scales) for each dimension. A statistical procedure known as Rasch factor analysis is commonly used to investigate data dimensionality in Rasch analysis (Fan & Bond, 2019).
Rasch factor analysis usually starts with a regular Rasch analysis, which is followed by a factor analysis of the measurement residuals, that is, the part of the data not explained by the Rasch dimension. Variance of 50% or more explained by the Rasch dimension can be regarded as valid evidence for scale unidimensionality (Linacre, 2012); if a significant amount of variance is found in a second dimension, the scale may be considered multidimensional. In this study, Principal Component Analysis (PCA) of the residuals was performed to examine whether they were significant enough to form a sub-dimension unexplained or unmodeled by the Rasch analysis. As shown in Table 7.4, nine descriptors showed contrast loadings above 0.5, indicating that they might correlate on an additional dimension (Bond & Fox, 2015). However, an analysis of the descriptors' content revealed that the nine descriptors were scattered across eight sub-categories (i.e. Note, Report, Summary, Resume, Genre format, Minutes, Vocabulary, Sociolinguistic competence) and as a result did not form a meaningful sub-dimension. The descriptors were thus retained at this stage of the analysis.

Table 7.4 A summary of the descriptors with the largest contrast loadings

No.   Descriptor                                                                                                                                     Category                     Loading
A     Descriptor 73 Can write notes including times, dates and places                                                                                Note                         0.74
B     Descriptor 68 Can write very simple reports which pass on routine information                                                                  Report                       0.73
C     Descriptor 90 Can reproduce keywords and phrases from a short text on subjects related to general business topics                              Summary                      0.72
D     Descriptor 89 Can fill out job application forms with short comments on previous experience, abilities and strengths                           Resume                       0.71
E     Descriptor 24 Is aware of business writing conventions regarding layout and paragraphing                                                       Genre format                 0.68
F     Descriptor 77 Can write minutes on a routine meeting about the most important points, even though some information might be missing            Minutes                      0.67
G     Descriptor 1 Has a basic vocabulary repertoire of isolated words and phrases related to general business topics                                Vocabulary                   0.57
H     Descriptor 75 Can write a short, comprehensible note on daily business activities. For example, can write a note to a colleague expressing     Note                         0.57
      gratitude, apology and condolence, etc.
I     Descriptor 40 Can establish basic business contact by simply but effectively using common polite expressions                                   Sociolinguistic competence   0.56
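The residual-PCA step can be illustrated schematically. The sketch below is our own toy code with random data, not the study's Winsteps run: it fits no real Rasch model, but standardizes a response matrix against a stand-in expectation and extracts the loadings on the first residual contrast, the quantity reported in Table 7.4.

```python
import numpy as np

rng = np.random.default_rng(0)
observed = rng.integers(1, 6, size=(200, 30)).astype(float)  # toy persons x items ratings

# Stand-ins for the model-expected score and variance of each response;
# in a real analysis these come from the fitted Rasch model.
expected = observed.mean(axis=0, keepdims=True)
variance = observed.var(axis=0, keepdims=True)

z_resid = (observed - expected) / np.sqrt(variance)          # standardized residuals

# "Rasch factor analysis": PCA of the residual correlation matrix
corr = np.corrcoef(z_resid, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
first_contrast = eigvecs[:, -1]                              # loadings on the largest residual component

# Items with |loading| > 0.5 would be inspected for a shared secondary
# dimension, as was done for the nine descriptors in Table 7.4.
flagged = np.flatnonzero(np.abs(first_contrast) > 0.5)
print(first_contrast.shape, len(flagged))
```

With purely random data the first contrast carries little structure; in the study's data it was the content overlap (or lack of it) among the flagged descriptors that decided whether a genuine sub-dimension existed.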

7.2.3 Differential Item Functioning (DIF) Analysis

Differential item functioning (DIF) occurs when subgroups with equal ability on a construct show a differential probability of success on an item. Test providers commonly perform DIF analysis to identify items that exaggerate the true differences between subgroups (Pae, 2004). Statistically speaking, the presence of DIF signals multidimensionality (Camilli & Shepard, 1994; Roussos & Stout, 1996): a large DIF value suggests the presence of additional constructs that function differently from one group to another. In this respect, DIF analysis provides researchers with powerful tools to identify construct-irrelevant factors that unfairly discriminate against a particular group (Pae, 2004). In this study, two groups of participants, that is, BE and non-BE majors, rated the difficulty levels of the descriptors presented in the questionnaire. As this study aimed to develop a proficiency scale targeting all BE learners in the Chinese tertiary context, we compared the Rasch measures for the two groups to investigate whether their academic backgrounds had any impact on the perceived difficulty of the descriptors. As shown in Fig. 7.3, some items appeared to be slightly more difficult for one group than for the other. According to Linacre (2012), the DIF contrast, namely the difference in item measures between the subgroups in a DIF analysis, should be at least 0.50 logits for the DIF to have an impact. An inspection of the magnitude of the DIF contrasts in our analysis revealed little variation between BE and non-BE majors: only two items (descriptors 61 and 83) displayed DIF contrasts of 0.51 and 0.52 respectively (see Table 7.5). Specifically, non-BE majors perceived descriptors 61 and 83 as more difficult than BE majors did.

Fig. 7.3 Plot of item estimates between BE majors and non-BE majors

Table 7.5 Items demonstrating possible DIF effect

Descriptor                                                                                                         DIF measure (non-BE)   DIF measure (BE)   DIF contrast
Descriptor 61 Has a systematic knowledge of business subjects such as economics, human resources, international    0.57                   1.08               0.51
business management, international business law, international finance, and business ethics, etc.
Descriptor 83 Can create a press release to introduce a new product or service                                     0.30                   0.82               0.52

The DIF effect identified on descriptor 61, which reads "Has a systematic knowledge of business subjects such as economics, human resources, international business management, international business law, international finance, and business ethics, etc.", might be due to the different curricular requirements for BE and non-BE majors. In Chinese universities, BE majors are typically required to have a comparatively wide range of knowledge across different subject areas, attending courses such as international trade, marketing and accounting (Wang et al., 2011), whereas non-BE majors are more concerned with mastery of knowledge specific to one disciplinary area, such as accounting or finance. As to descriptor 83, "Can create a press release to introduce a new product or service", the DIF result might be explained in two ways: (1) the textbooks used by non-BE majors did not include the genre of news release, or (2) the textbooks did include it, but in actual BE writing classrooms the genre was skipped by instructors because of its complexity and high demands on language skills. According to North (2002), a certain degree of instability across subgroups in scale development is not in itself an argument for dropping the descriptor(s) concerned, since it is not surprising that learners in certain academic sectors might be relatively stronger in some aspects of proficiency than those from other sectors. The two descriptors were therefore retained, given that their DIF contrasts were only slightly above the 0.50 threshold.
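The screening rule itself is a one-liner. This small sketch is our own code; the item measures are those reported in Table 7.5:

```python
# Flag items whose between-group difference in item measures (the DIF
# contrast) reaches Linacre's 0.50-logit rule of thumb.
dif_measures = {
    "Descriptor 61": {"non_BE": 0.57, "BE": 1.08},
    "Descriptor 83": {"non_BE": 0.30, "BE": 0.82},
}

flagged = {
    item: round(abs(m["BE"] - m["non_BE"]), 2)
    for item, m in dif_measures.items()
    if abs(m["BE"] - m["non_BE"]) >= 0.50
}
print(flagged)  # {'Descriptor 61': 0.51, 'Descriptor 83': 0.52}
```

Both items clear the threshold only marginally, which is the quantitative basis for retaining them.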

7.2.4 Item Statistics

In addition to the DIF analysis, we also checked the item statistics generated by the Rasch analysis, including item measures, infit and outfit mean square statistics, and point-measure (PTMEA) correlations. The PTMEA correlations of all the items in the questionnaire were positive, with the strength of the relationship ranging from moderate to strong (0.39–0.70), suggesting that the items were correlated on a unidimensional latent trait (Linacre, 2015).

Next, we checked the item fit statistics (reported as infit and outfit mean square statistics) and flagged the items on which the data failed to match the expectations of the Rasch model. Generally speaking, a mean square statistic greater than 1 indicates underfit, whereas a value less than 1 indicates overfit. Underfit is caused by unmodeled noise or other sources of variance in the data; overfit, on the other hand, means that the response patterns on an item fit the model expectations too well. In Rasch analysis, both infit and outfit mean square statistics should be used to assess item fit, but more emphasis is usually placed on infit values when identifying misfitting persons or items (Bond & Fox, 2015), because the outfit mean square statistic is susceptible to outliers (i.e. highly unexpected responses), whereas infit is less affected by outliers and more accurately reflects the response patterns. Different criteria have been proposed to assess item fit. In this study, the infit mean square range of 0.70–1.30 was adopted as indicating satisfactory data-model fit (Bond & Fox, 2015). The items with infit mean square values below 0.70 overfit the model, but as explained earlier, they did not pose a significant threat to the validity of the scale and were therefore not removed from the descriptor bank. The items with infit mean square values above 1.30, on the other hand, underfit the model and were identified as misfitting. In this analysis, five misfitting items (Descriptors 28, 10, 19, 57 and 84) were detected; detailed information on them is presented in Table 7.6.

Table 7.6 A summary of misfitting items

Descriptor                                                                                                           Infit
Descriptor 28 Can access a limited range of cohesive devices which are often inadequately and inaccurately used      2.00
Descriptor 10 Demonstrates mastery of very basic structures with only rare use of subordinate clauses                1.77
Descriptor 19 Writing is orthographically free of error                                                              1.41
Descriptor 57 Demonstrates a thorough understanding of at least one business-related discipline such as              1.39
international trade
Descriptor 84 Can write a press release to foster goodwill for a company, project or campaign                        1.38

In an attempt to explain why some descriptors misfit in Rasch analysis, North and Schneider (1998) advanced five possible reasons: (1) the concept was a separate construct from the intended one, (2) the descriptor was vague in meaning, (3) raters of the descriptor provided inconsistent responses, (4) raters were unfamiliar with the activities or tasks described by the descriptor, and (5) negative formulations were included in the descriptor. Content analysis of descriptors 28, 19 and 10 revealed that they all involved words or phrases with negative connotations, such as "inadequately", "inaccurately", "rare" and "free of", which probably contributed to inconsistent interpretations on the part of the participants.
As to descriptors 57 and 84, they might have tapped into knowledge areas (i.e. international trade, news release) with which not all participants were familiar. Since the five items posed potential threats to the validity of the scale, they were all removed from the descriptor pool. At this point, only two items were left in the sub-category of News release, and one of the two (i.e. Can create a press release to introduce a new product or service) had also proved problematic in the DIF analysis. The scale category News release was therefore removed. We contend that this deletion was appropriate and would not undermine the content validity of the scale, because news release writing occurs with much lower frequency than other genres, such as letters and reports, for BE learners at Chinese universities (Wang, 2020).
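The fit-based screening amounts to a simple filter. In this sketch (our own code; the infit values for the five flagged descriptors come from Table 7.6, and the two well-fitting descriptors at the end carry purely illustrative values), items above the 1.30 infit ceiling are flagged while better-fitting items are kept:

```python
# Infit MnSq screening: flag items above 1.30 (underfit); keep the rest,
# since overfit does not threaten validity in this study.
infit = {28: 2.00, 10: 1.77, 19: 1.41, 57: 1.39, 84: 1.38,
         1: 0.95, 40: 1.02}   # descriptors 1 and 40: illustrative values only

misfitting = sorted(d for d, v in infit.items() if v > 1.30)
retained = sorted(d for d, v in infit.items() if v <= 1.30)
print(misfitting)  # [10, 19, 28, 57, 84]
print(retained)    # [1, 40]
```
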

7.3 Constructing the BE Writing Scale

7.3.1 Creating Level Bands of Proficiency

After the removal of the seven items, as explained above, the scale contained a total of 86 descriptors calibrated on a common logit scale. We then divided the descriptors into several proficiency bands, or levels. According to North (2000), in order to counterbalance the inherent subjectivity involved in creating band levels for language scales, scale developers can (1) refer to the logit values of the descriptors to create a scale of more or less equal intervals, and (2) examine patterns, clusters and apparently natural gaps in the vertical scale of descriptors to ensure that the differentiation of levels makes sense. In this study, we adopted both methods: first, we marked out approximately equal distances on the logit scale to identify cut-off points among the calibrated descriptors; next, we conducted a content analysis of the descriptors across the graded band levels to ensure that the progression of proficiency was apparent.

7.3.1.1 Setting Cut-Off Points

Based on the Rasch analysis results, three levels were established to differentiate the descriptor bands. The main argument for adopting three levels was largely a psychometric one, in that "fewer, broader bands usually lead to clearer distinctions in the criterion descriptions", producing in turn "higher reliability in the calibration of qualifications" (North, 2000, p. 279). In addition, according to Knoch (2011), for a scale used at a certain proficiency level, "three to four bands would be sufficient" (p. 92). As this BE writing scale was developed for Chinese BE learners in collegiate settings, who had at least 12 years of English-learning experience before college and at least two years of EGP/ESP study at university, a narrower proficiency range was considered more reasonable for this specific group of learners. As shown in Table 7.7, the difficulty estimates of the 86 descriptors ranged from −1.65 to 1.45 logits, and three proficiency bands, termed Advanced (Level C), Intermediate (Level B) and Lower Intermediate (Level A), were demarcated according to the difficulty levels of the descriptors. Cut-off points between the levels were set at almost equal intervals of approximately 1 logit (0.92 in the middle band; 1.09 and 0.94 at the two ends).

Table 7.7 Levels and cut-offs of the scale

Level                          Cutoff   Logit range       Logit difference   Number of descriptors
Advanced (Level C)              0.36    0.36 to 1.45      1.09               27
Intermediate (Level B)         −0.59    −0.59 to 0.33     0.92               40
Lower intermediate (Level A)   −1.65    −1.65 to −0.71    0.94               19

Level C was substantiated by 27 descriptors (with logit values ranging from 0.36 to 1.45), and Level A by 19 descriptors (with logit values ranging from −1.65 to −0.71). Level B had the largest number of descriptors (n = 40), whose logit values ranged from −0.59 to 0.33. The large number of descriptors at Level B could reflect the tendency of a Rasch logit scale to distort towards the ends, leaving more descriptors clustered in the middle (North, 2000); it could also be the case that the descriptor pool failed to take adequate account of BE writing proficiency at the higher and lower levels. To address such a skewed structure, a "branching approach" is advocated, which allows prospective users of the scale to create relevant descriptions of competence to suit local or sector-specific needs and "yet still relate back to a common system" (Council of Europe, 2001, p. 32). For instance, if more descriptors were desired at the Lower Intermediate level in a certain BE learning context, the BE writing scale could serve as a starting point of reference for developing additional descriptors to suit local needs. Following the cut-off setting procedure described above, the 86 empirically calibrated descriptors were organized into the 16 scale categories of the descriptive scheme (i.e. Vocabulary, Grammar, Orthographical control, Genre format, Cohesion & coherence, Strategic competence, Sociolinguistic competence, Intercultural competence, Business knowledge, Letter/E-mail, Report, Memo, Minutes, Note, Resume, Summary), thus yielding 16 illustrative scales of the BE writing proficiency scale.
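The banding step can be expressed as a small lookup. The function below is a hypothetical helper of ours that assigns a calibrated descriptor to a level using the cut-offs reported in Table 7.7; the sample logit values are illustrative:

```python
def band(logit: float) -> str:
    """Assign a descriptor to a proficiency band using the Table 7.7 cut-offs."""
    if logit >= 0.36:
        return "C (Advanced)"
    if logit >= -0.59:
        return "B (Intermediate)"
    return "A (Lower intermediate)"

print(band(1.45), band(0.00), band(-0.96))
# C (Advanced) B (Intermediate) A (Lower intermediate)
```
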

7.3.1.2 Checking Content Coherence of the Illustrative Scales

According to North (2000), producing a proficiency scale with equal intervals purely from statistical data is not sufficient for establishing a valid scale; content analysis of the descriptors across the proficiency bands should also be carried out. In this study, the Rasch analysis results were therefore interpreted in conjunction with an examination of descriptor content coherence across the three proficiency levels of the 16 illustrative scales. The method used for checking descriptor content was to identify the salient elements highlighted by the descriptors in each illustrative scale. Specifically, the elements singled out for each illustrative scale were put into a chart to examine whether there was an apparent progression of proficiency. The illustrative scale of Grammar (see Table 7.8) is used here as an example to illustrate the process.

Table 7.8 The illustrative scale of Grammar

Level                 Logit    Descriptor
Advanced               0.46    Demonstrates consistent grammatical control of complex structures
Advanced               0.29    Can employ a full range of grammatical structures with full flexibility and accuracy. For instance, can make skillful grammatical choices to enhance the overall communicative effectiveness of a business text
Intermediate          −0.60    No grammatical structures are missing from the learner's repertoire; errors of grammar are fairly rare, and are often picked up and corrected if he/she reviews the text
Intermediate          −0.65    Communicates with reasonable grammatical accuracy in familiar contexts; apparent Chinese language influence does not lead to misunderstanding
Lower intermediate    −0.76    Shows control of a few simple grammatical structures and sentence patterns in a learned repertoire
Lower intermediate    −0.96    Can use some simple structures correctly, but still systematically makes basic mistakes—for example, tends to mix up tenses and forget to mark agreement

As shown in Table 7.8, the descriptors of the scale category Grammar were assigned to the three proficiency levels according to their logit values, yielding two descriptors for each proficiency band in this illustrative scale. A scrutiny of the content of the descriptors revealed that they could be compared along two dimensions: "language" and "limitation". An analysis chart (see Table 7.9) was hence created for the scale category of Grammar.

Table 7.9 Analysis chart built for the illustrative scale of "Grammar"

Level                 Language                                                                                              Limitation
Advanced              consistent grammatical control of complex structures; a full range of grammatical structures          Nil
                      with full flexibility and accuracy
Intermediate          no grammatical structures are missing; reasonable grammatical accuracy in familiar contexts           grammatical errors are often corrected under review; Chinese language influence which does not lead to misunderstanding
Lower intermediate    control of a few simple grammatical structures and sentence patterns; some simple structures          systematically makes basic mistakes (e.g., tense, agreement)
                      used correctly

The two descriptors falling into the Advanced level of the illustrative scale (i.e. Demonstrates consistent grammatical control of complex structures; Can employ a full range of grammatical structures with full flexibility and accuracy. For instance, can make skillful grammatical choices to enhance the overall communicative effectiveness of a business text) were thus condensed into "consistent grammatical control of complex structures" and "a full range of grammatical structures with full flexibility and accuracy" respectively, by extracting the most salient parts of these descriptors related to the description of language. No limitation was mentioned in these two descriptors, so "Nil" was entered under the "Limitation" column of the chart. Similarly, the descriptors at the Intermediate level (No grammatical structures are missing from the learner's repertoire; errors of grammar are fairly rare, and are often picked up and corrected if he/she reviews the text; Communicates with reasonable grammatical accuracy in familiar contexts; apparent Chinese language influence does not lead to misunderstanding) were condensed into "no grammatical structures are missing" and "reasonable grammatical accuracy in familiar contexts" under the "Language" column, while the descriptions of the learners' linguistic limitations or potential problems were entered under the "Limitation" column as "grammatical errors are often corrected under review" and "Chinese language influence which does not lead to misunderstanding". As to the descriptors at the Lower Intermediate level (Shows control of a few simple grammatical structures and sentence patterns in a learned repertoire; Can use some simple structures correctly, but still systematically makes basic mistakes—for example, tends to mix up tenses and forget to mark agreement), they were simplified as "control of a few simple grammatical structures and sentence patterns" and "some simple structures used correctly" under the "Language" column, and as "systematically makes basic mistakes (e.g., tense, agreement)" under the "Limitation" column. The chart thus revealed a clear visual progression across the three proficiency bands: a learner first masters "some simple grammatical structures" while "systematically making basic mistakes" at the Lower Intermediate level; then becomes able to "communicate with reasonable grammatical accuracy in familiar contexts" and to correct mistakes in retrospect at the Intermediate level; and finally reaches the Advanced level, marked by consistent control of "a full range of complex grammatical structures used with full flexibility and accuracy", at which no linguistic limitations are conspicuously present. Following the process delineated above, content analysis of the descriptors of all 16 illustrative scales was conducted, which demonstrated apparent proficiency differences across the three level bands.

7.3.2 Establishing a Global Scale

Language proficiency scales developed to serve as a common point of reference across a range of settings, such as the CEFR and the CSE, typically contain several illustrative scales. One major advantage of such a multi-categorical structure is that it offers a meta-language for detailed, analytical discussion of central issues concerning language proficiency. The disadvantage, however, is that a wealth of descriptions is likely to hinder scale users from forming a holistic judgment of language proficiency, as they might "get cognitive overload" when faced with too many scale categories (Council of Europe, 2001, p. 190). North (2000) therefore recommended the development of a "global scale", which can help scale users gain a quick and holistic overview of a language scale. Language proficiency scales therefore typically consist of a global scale and several illustrative scales, a structure that has been compared to a hypertext. The Council of Europe (2001, p. 40) summarizes the benefits of this structure as follows:

The user is presented with an information pyramid and can get an overview by considering the top layer of the hierarchy (here the 'global' scale). More detail can be presented by going down layers of the system, but at any one point, what is being looked at is confined to one or two screens – or pieces of paper. In this way, complexity can be presented without blinding people with irrelevant detail, or simplifying to the point of banality. Detail is there – if it is required.

In the case of this study, a global scale of the BE writing scale was accordingly developed. As presented in Table 7.10, the global scale contains three proficiency bands and holistically summarizes, in a single paragraph per level, what each level of the BE writing scale entails. It is believed that such a global representation of the BE writing scale will make it easier to communicate the system to non-specialist users and will also provide scale users with orientation points. For instance, a learner can browse the global scale first and decide which level they might be at, or which level they should be attempting, before going into the greater detail of the proficiency descriptions provided by the illustrative scales. The global scale was developed by selecting descriptors from the 16 illustrative scales. According to North (2000), descriptors to be used in a global scale should demonstrate sounder statistical rigor in the Rasch analysis results than the other descriptors. As explained in Sect. 7.2.4, an infit and outfit mean square range of 0.7–1.3 was adopted as the standard for selecting the descriptors included in the illustrative scales; for the descriptors to be included in the global scale, a stricter standard (0.8–1.2) was observed, and only descriptors whose fit values fell within this range were considered potentially useful for the global scale. Besides this quantitative criterion, qualitative analysis of descriptor content was also carried out, and representative descriptors from different scale categories were consistently and coherently included at each proficiency level of the global scale to ensure an apparent progression of proficiency. For instance, "Vocabulary" was a parameter included at all three proficiency levels of the global scale, and three descriptors from the illustrative scale of Vocabulary that demonstrated robust statistical rigor (i.e. Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings; Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when used in the non-business context; Has a basic vocabulary repertoire of isolated words and phrases related to general business topics) were included to indicate ability progression. At this juncture, it is worth noting that while the global scale describes an aligned set of knowledge and abilities at each proficiency level that reflects a seemingly linear

Table 7.10 BE writing proficiency: a global scale Advanced

Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings. Can employ a full range of grammatical structures with full flexibility and accuracy. Can create coherent and cohesive business text making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices. Is aware of the essential and optional components of different business genres and can make appropriate selections in response to a variety of situations. Can adopt appropriate strategies to handle messages that are likely to arouse negative feelings in the readers. For instance, can provide additional expository information beyond a simple negative message to help readers better connect with the intended message. Appreciates fully the sociolinguistic implications of language used by speakers of other cultures and can react accordingly. Demonstrates comprehensive understanding of the political, economic, historical, geographical, scientific and technological, and cultural characteristics of different nations. Can write letters/E-mails to persuade the receiver to take certain actions. Can write a formal report that is analytical in nature to present a case or give a critical appreciation of proposals. Can write detailed minutes during a meeting on topics in his/her field of interest, recording the information so accurately and so close to the original that the minutes could also be useful to other people. Can summarize a wide range of texts of general commercial nature, commenting on and discussing contrasting points of view and the main themes

Intermediate

Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when used in the non-business context. No grammatical structures are missing from the learner’s repertoire; errors of grammar are fairly rare, and are often picked up and corrected if he/she reviews the text. Punctuation is reasonably accurate. Is aware of the generic conventions regarding layout and paragraphing, etc. Can produce clear, smoothly flowing, well-structured business text, showing controlled use of organizational patterns, connectors and cohesive devices. Can make proper use of visual aids such as bar graphs, tables and line charts to complement or reinforce information loaded in words. Can express him or herself confidently, clearly and politely in a formal or informal business register, appropriate to the purposes and audience(s) concerned. Demonstrates awareness of intercultural differences and respect for other cultures. Has a sound understanding of general business topics, such as promotional strategies, conferences, business services, recruitment, etc. Can write a non-routine letter/E-mail, provided that it is related to his/her work or field of expertise. Can write a business report which develops an argument, giving reasons in support of or against a particular point of view and explaining the advantages and disadvantages of various options. Can write minutes as a list of key points during a clearly structured meeting on a familiar subject. Can write memos for a variety of purposes on common business activities, such as explaining a new corporate policy. Can write an effective personal resume, highlighting his/her special skills, strengths, achievements and qualifications to enhance his/her marketability. Can summarize from a passage/passages of business text to produce, for example, a list of structured notes

7.3 Constructing the BE Writing Scale


Table 7.10 (continued)

Lower-intermediate

Has a basic vocabulary repertoire of isolated words and phrases related to general business topics. Can use some simple structures correctly, but still systematically makes basic mistakes—for example, tends to mix up tenses and forget to mark agreement; nevertheless, it is usually clear what he/she is trying to say. Can spell his/her address, occupation and other personal details. Can use the most frequently occurring connectors like and, but, so to link simple sentences. Can establish basic business contact by simply but effectively using common polite expressions. Has preliminary intercultural knowledge, abandoning prejudice against or blind adoption of exotic cultures. Can handle basic social correspondence on everyday business topics. Has a basic understanding of common daily office communications. Can write very simple reports which pass on routine information. Can write a short, comprehensible note to colleagues or known external contacts. Can write minutes on a routine meeting about the most important points, even though some information might be missing. Can compose a memo relaying simple information such as a meeting notice. Can fill out job application forms with short comments on previous experience, abilities and strengths

progression across the levels, this is done only for the purpose of presentation and understanding. Actual second language acquisition does not necessarily occur in a linear fashion within or across proficiency levels (Hulstijn, 2007). For instance, a BE learner at any given point along his or her trajectory of English learning may exhibit some abilities (e.g., business knowledge) at a higher proficiency, while at the same time exhibiting other abilities (e.g., vocabulary range) at a lower proficiency level. Thus, the scale content should be interpreted in light of the fact that while a student may be identified—based on either assessment results or other criteria—as being eligible for a particular proficiency level, the student’s actual abilities may vary by quantity (what language tasks the learner is able to perform) and quality (how well the learner is able to perform the tasks).

7.4 Conclusion

In this chapter, a BE writing scale was developed. As reported in Chap. 6, a pool of 93 descriptors was produced from teacher workshops in the previous stage of the study. To empirically assign the selected descriptors to appropriate proficiency levels, they were incorporated into a questionnaire, and 572 Chinese university students who had attended BE writing courses were invited to evaluate the difficulty level of the proficiency description indicated by each descriptor. The questionnaire data were then analyzed using the Rasch model. After a detailed examination of Rasch analysis results such as the item-person map, separation indices, unidimensionality analysis, differential item functioning (DIF), and fit statistics of each descriptor, 86 descriptors



that demonstrated robust statistical properties were retained for the construction of the BE writing scale. Based on the difficulty values of the descriptors generated by the Rasch analysis, cut-offs were determined at approximately equal intervals and the descriptors were ascribed to three proficiency levels, termed Advanced (indicating that a learner has achieved an advanced level of competence suitable for complex business writing tasks), Intermediate (indicating that a learner is effective enough to handle moderately complex writing tasks pertinent to his or her field of expertise) and Lower Intermediate (indicating that a learner can access the business world by handling daily, routine tasks in familiar contexts). The 86 descriptors were then placed under their respective scale categories (i.e., Vocabulary, Grammar, Orthographical control, Genre format, Cohesion & coherence, Strategic competence, Sociolinguistic competence, Intercultural competence, Business knowledge, Letter/E-mail, Report, Memo, Minutes, Note, Resume, Summary), yielding 16 illustrative scales of BE writing proficiency. The illustrative scales were also examined through a content analysis of all the descriptors included, which demonstrated apparent progressions up the different band levels.

To orient the multi-dimensional scale system to its potential users, “a global scale” was also developed to provide a general overview of the proficiency levels. This was done by selecting descriptors demonstrating robust statistical rigor from the illustrative scales. Efforts were also made to select descriptors addressing the same parameter of BE writing proficiency across the three level bands, so that the content areas included in each proficiency level were consistent, comparable and demonstrated an apparent progression of proficiency.
This proficiency scale, consisting of a global scale and 16 illustrative scales substantiated by 86 empirically calibrated descriptors, will be validated by a group of experts in the next phase of the study.
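The level-assignment step summarized above—placing cut-offs at approximately equal intervals along the Rasch difficulty continuum and ascribing each descriptor to one of the three bands—can be sketched as follows. This is an illustrative sketch only: the descriptor names and logit values are invented, and in the study itself the cut-offs were additionally informed by content analysis of the descriptors.

```python
# Illustrative sketch of the level-assignment step: split the range of Rasch
# difficulty estimates (in logits) into three approximately equal intervals
# and ascribe each descriptor to a band. All values below are invented.

def assign_levels(difficulties,
                  labels=("Lower Intermediate", "Intermediate", "Advanced")):
    lo, hi = min(difficulties.values()), max(difficulties.values())
    step = (hi - lo) / len(labels)
    cutoffs = [lo + step, lo + 2 * step]     # two cut-offs -> three bands
    return {desc: labels[sum(d >= c for c in cutoffs)]
            for desc, d in difficulties.items()}

demo = {"D1": -2.1, "D2": -0.3, "D3": 0.4, "D4": 1.8}
levels = assign_levels(demo)
```

With the invented values above, D1 falls in the lowest band and D4 in the highest; descriptors near a cut-off would, in practice, be assigned with reference to their content as well.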

References

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561–573.
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences. Routledge.
Boone, W. J., Staver, J. R., & Yale, M. S. (2014). Rasch analysis in the human sciences. Springer.
Camilli, G., & Shepard, L. (1994). Methods for identifying biased test items. Sage.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
Eckes, T. (2011). Introduction to many-facet Rasch measurement. Peter Lang.
Fan, J., & Knoch, U. (2019). Fairness in language assessment: What can the Rasch model offer? Papers in Language Testing and Assessment, 8(2), 117–142.
Fan, J., & Bond, T. (2019). Unidimensionality and local independence. In V. Aryadoust & M. Rachelle (Eds.), Quantitative data analysis for language assessment (Volume I): Fundamental techniques (pp. 83–102). Routledge.



He, L., & Chen, D. (2017). Developing common listening ability scales for Chinese learners of English. Language Testing in Asia, 7(4), 1–12.
Henning, G. (1992). Dimensionality and construct validity of language tests. Language Testing, 9(1), 1–11.
Hulstijn, J. H. (2007). The shaky ground beneath the CEFR: Quantitative and qualitative dimensions of language proficiency. The Modern Language Journal, 91(4), 663–667.
Knoch, U. (2011). Rating scales for diagnostic assessment of writing: What should they look like and where should the criteria come from? Assessing Writing, 16, 81–96.
Linacre, J. M. (1989). Many-facet Rasch measurement. MESA Press.
Linacre, J. M. (2012). Winsteps tutorial. Retrieved April 7, 2014, from http://www.winsteps.com/tutorilas.htm
Linacre, J. M. (2015). Facets Rasch measurement computer program [Computer software]. Winsteps.
Linacre, J. M. (2017a). Facets computer program for many-facet Rasch measurement, version 3.80.0. Winsteps.com. Retrieved from http://www.winsteps.com
Linacre, J. M. (2017b). A user’s guide to Winsteps. Retrieved from https://www.winsteps.com/a/Winsteps-Manual.pdf
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174.
McNamara, T. (1991). Test dimensionality: IRT analysis of an ESP listening test. Language Testing, 8(2), 139–159.
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29(4), 555–576.
McNamara, T., Knoch, U., & Fan, J. (2019). Fairness, justice, and language assessment: The role of measurement. Oxford University Press.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
North, B., & Schneider, G. (1998). Scaling descriptors for language proficiency scales. Language Testing, 15(2), 217–263.
North, B. (2002). Developing descriptor scales of language proficiency for the CEF common reference levels. In J. C. Alderson (Ed.), Common European framework of reference for languages: Learning, teaching, assessment: Case studies (pp. 87–105). Council of Europe.
Pae, T. (2004). DIF for examinees with different academic backgrounds. Language Testing, 21(1), 53–73.
Patton, M. Q. (2002). Qualitative research and evaluation methods (3rd ed.). Sage Publications.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Paedagogiske Institut.
Roussos, L., & Stout, W. (1996). A multidimensionality-based DIF analysis paradigm. Applied Psychological Measurement, 20(4), 355–371.
Skehan, P. (1989). Language testing. Part II. Language Teaching, 22(1), 1–13.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). HarperCollins.
Wang, L. F., Chen, Z. M., & Zhang, Z. C. (2011). Developing national curriculum for BA program in Business English of China. The Asian ESP Journal, 7(1), 29–61.
Wang, J. (2020). Application of trans-editing in Chinese English translation of double first-class initiative related news releases. Unpublished master’s thesis, Shanghai International Studies University.

Chapter 8

Validating the Business English Writing Scale

This chapter reports on how the newly developed BE writing scale was validated by a group of experts from the teaching and professional domains. The experts were asked to examine the scale in terms of its descriptor categorization and level assignment through a questionnaire. Next, one-on-one interviews were conducted to explore in-depth issues such as their overall perceptions of the scale as well as areas that were in need of improvement. Modifications were then made after carefully weighing the experts’ comments to further enhance the validity of the BE writing proficiency scale.

8.1 Research Procedures

In this phase of the study, a sequential mixed-methods design was adopted to further examine the validity of the BE writing scale. This approach featured the sequential collection, analysis and comparison of both quantitative and qualitative data (Creswell, 2009). The integration of different types of data is conducive to the development of a comprehensive, elaborated and nuanced understanding of the complex issue under examination (Tashakkori & Teddlie, 2003). Specifically, ten experts, including five experienced university BE writing teachers (teaching experts) and five fully-employed business people (domain experts), were selected to evaluate the validity of the scale. To elicit the experts’ opinions, questionnaires were constructed which consisted of the descriptors forming the draft version of the BE writing scale developed in the previous phase of the study. The experts rated each descriptor along two dimensions: (1) the appropriateness of descriptor categorization, and (2) the difficulty level associated with each descriptor. The experts’ ratings on descriptor categorization were analyzed via SPSS to identify the descriptors that the experts considered improperly categorized. The experts’ ratings on

© Springer Nature Singapore Pte Ltd. 2021 L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_8




descriptor difficulty were analyzed with the many-facet Rasch model. Correspondence between the experts’ ratings of descriptor category and level assignment and those in the draft scale was taken to represent validity evidence for the new scale. To further explore the experts’ perceptions of the scale, follow-up interviews were conducted after their completion of the questionnaire. Interview questions were directed at their overall perceptions of the scale as well as areas they deemed inappropriate or problematic. Revisions were then made to further enhance the validity of the scale.

8.1.1 Research Instruments

8.1.1.1 Questionnaire

Two questionnaires with a parallel design, one for the teaching experts and one for the domain experts, were developed on the basis of the draft version of the BE writing scale, with its 86 descriptors grouped under 16 illustrative scales (see Chap. 7 for details). Each questionnaire contained two parts and was presented in Chinese, the participants’ first language. The first part collected the participants’ background information. The teacher version elicited the respondents’ personal information such as gender, educational background, and teaching experience. The domain experts’ version, on the other hand, collected the respondents’ bio-data as well as the type of company where they worked, their area of expertise, English language qualifications, and daily English use experience. The second part of each questionnaire comprised the descriptors in the BE writing scale, which were assigned to three proficiency levels based on statistical and content analysis (see Chap. 7 for details). The descriptors were presented in a format that facilitated the experts’ ratings of each descriptor along two dimensions: (1) whether the descriptor was properly categorized, and (2) at which of the three levels (Lower Intermediate, Intermediate or Advanced) the descriptor should be placed. As illustrated in a sample section of the second part of the questionnaire (see Table 8.1), the experts were first asked to assess whether the descriptors fit well into their scale categories, using the two categories of “Yes” and “No”. At the end of each table, space was provided for the participants to note down the rationale for their responses if they disagreed with the original categorization of a descriptor.
Second, the participants were asked to rate their perceived difficulty level of each descriptor on a 3-point Likert scale with 1 denoting the Lower-Intermediate level, 2 denoting the Intermediate level, and 3 denoting the Advanced level, consistent with the three proficiency bands in the BE writing scale developed in the previous phase of the study (see Chap. 7 for details).



Table 8.1 A sample section of the questionnaire

Category: Vocabulary

For each descriptor, the experts indicated whether “This descriptor is properly categorized” (Yes = 1 / No = 0) and assigned the descriptor to one of the three levels below (Lower Intermediate = 1, Intermediate = 2, Advanced = 3).

1. Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings
2. Has a good command of a very broad business lexical repertoire, including colloquial and idiomatic expressions
3. Demonstrates satisfactory control over synonyms such as merger, consolidation and acquisition, especially their connotative levels of meaning in the business environment
4. Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when they are used in the non-business context
5. Demonstrates an awareness of current business terminology
6. Has a basic vocabulary repertoire of isolated words and phrases related to general business topics
7. Demonstrates familiarity with common acronyms and abbreviations used in business documents such as HR and SWOT

I do not think Item(s) _______ (please provide the item number) is (are) properly categorized, because (please provide your rationales below): ________________________________________ _______________________________________.


8.1.1.2 Interview Guide

An interview guide was developed to steer the interview questions towards the research purpose and to optimize the limited time available in an interview situation (Patton, 1990). Specifically, the guiding questions were designed to explore (1) the experts’ overall impression of the scale, (2) their rationales for judging whether the descriptors were properly categorized, (3) their perceptions of the difficulty indicated by the descriptors, and (4) their comments on descriptor quality (e.g., wording, length).

8.1.2 Participants

The participants in this phase of the study included five BE teaching experts and five domain experts. The involvement of these two groups was considered crucial in that the assessment of BE, an important branch of LSP (languages for specific purposes), is by its nature interdisciplinary. Accordingly, building assessment tools such as a BE writing scale “requires collaboration between content experts” (in this case, experts from the business domain) and “applied linguists” (in this case, experienced BE teachers), as both have unique understandings of what matters for effective communication in a specific domain and can therefore bring richer insights into the issue under investigation (Elder & McNamara, 2016, p. 154). A purposive sampling method was adopted for expert selection. Specifically, two factors were considered when choosing the teaching experts: (1) their experience in teaching BE writing, and (2) their influence in the field of BE education. As shown in Table 8.2, the five teaching experts (denoted as TE1–TE5 in this study) consisted of three females and two males. They had between 11 and 22 years’ experience in teaching BE writing to both BE majors and non-BE majors in the Chinese tertiary context. Three of them (TE2, TE3 and TE5) had an educational background in both English language and culture and business-related subjects such as International Business Studies, whereas the other two (TE1 and TE4) had master’s degrees in Applied Linguistics and were authors of popular BE textbooks. All of them were involved in the development of the National Curriculum for university BE majors, which serves as the guiding document for BE major programs across the nation. They also frequently attended academic conferences on the teaching of BE in China.
In addition, they all had extensive experience providing language training to company employees who needed to communicate in English in international business settings. The five domain experts were recruited through the assistance of two senior staff members working in the Human Resources department of two multinationals in China. At the time of data collection, they had been working in the China-based business administration offices of prominent multinational companies for over 15 years and were recommended as experienced and competent BE writers in their respective companies. As shown in Table 8.3, these domain experts, including 3 males



Table 8.2 Details of the participating teaching experts

Code | Gender | Academic background | Language training experience | Teaching experience (yrs)
TE1 | Female | BA/English language and culture; MA/Applied linguistics | Language trainer for company employees | 22
TE2 | Male | BA/English language and culture; MA/International Business Studies | As above | 11
TE3 | Female | BA/English language and culture; MA/MBA | As above | 13
TE4 | Female | BA/English language and culture; MA/Applied linguistics | As above | 20
TE5 | Male | BA/English language and culture; MA/Applied linguistics; PhD/Management | As above | 15

Table 8.3 Details of the participating domain experts

Code | Gender | Academic background | Highest English qualification | Field of expertise | Work experience (yrs)
DE1 | Male | MA/Engineering | CET6 | Commodity design | 15
DE2 | Female | MA/Accounting | CET6 | Financial analysis | 20
DE3 | Male | MBA | IELTS (Band 7) | Consulting | 13
DE4 | Female | MA/Marketing | CET6 | Marketing | 10
DE5 | Male | MA/Applied Linguistics | TEM 8/BEC Higher | Procurement | 15

and 2 females, had relatively strong English language proficiency, indicated by their CET6 certificates (approximately equivalent to B2 or C1 on the CEFR), BEC Higher (equivalent to C1 on the CEFR) or IELTS Band 7 (equivalent to C1 on the CEFR). They held MA degrees in business-related fields such as Accounting and Marketing, with their field of expertise ranging from product design to procurement. All participants confirmed that English was the primary means of written communication in their daily work, and that their clients and colleagues included Chinese speakers of English, native English speakers as well as speakers of English from countries such as Germany, France, Singapore and India.



8.1.3 Data Collection

8.1.3.1 Questionnaire Data

The authors distributed the questionnaires to the participants. As scale development was an unfamiliar field for some of the participants, especially the domain experts, the authors contacted all participants in advance and briefly explained the purpose of the study as well as the major concepts involved, such as language proficiency scales. In addition, the authors went through a sample scale category (i.e., Vocabulary) of the questionnaire with each participant to orient them to what they were expected to do during the research. The participants were also encouraged to mark the descriptors or categories that they found problematic while completing the questionnaire. They were assured of the confidentiality of their responses, and were given two weeks to finish the questionnaire.

8.1.3.2 Interview Data

Follow-up interviews were conducted upon the experts’ completion of the questionnaire to further explore their perceptions of the descriptors. The interviews were scheduled so that the participants’ memories of the descriptors were still fresh. The experts were first invited to express their overall impression of the scale. Then, referring both to the experts’ completed questionnaires and to the draft version of the BE writing scale, the authors elicited the participants’ opinions on the descriptors that they considered improperly categorized, as well as on those that they assigned to a proficiency level different from that in the draft scale (see Chap. 7). During this process, the participants were also encouraged to comment on the quality of the descriptors, particularly in relation to areas in need of improvement. The interviews were conducted in Chinese, the participants’ first language, though they were free to use English as desired. The interviews took place either face-to-face in a quiet conference room or via telephone. With the consent of the participants, all interviews were digitally recorded. A typical interview lasted approximately 60 minutes, but some lasted up to 80 minutes. All interviews were transcribed verbatim for subsequent analysis. The direct quotes in this chapter were translated from Chinese into English.



8.1.4 Data Analysis

8.1.4.1 Analysis of the Questionnaire Data

(1) Data of Category Appropriateness

Using SPSS version 17.0, descriptive statistics including frequencies and percentages were computed to summarize the ten experts’ responses regarding the appropriateness of descriptor categorization. These descriptive statistics were mainly used to portray the extent to which the experts’ opinions converged with the descriptor categorization in the draft scale. Space was also provided in the questionnaire for the participants to note down comments on descriptors they considered inappropriately categorized. Since these comments mainly served as reminders for participants to elaborate upon during follow-up interviews, they were analyzed as complementary to the interview data.
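The tally just described can be sketched as below. The Yes/No counts for descriptors 17, 37, 45 and 54 mirror those reported later in Table 8.4; the flagging rule (four or more “No” judgments) is our assumption for illustration, not a criterion stated in the study.

```python
# Sketch of the category-appropriateness tally. The counts for descriptors
# 17, 37, 45 and 54 follow Table 8.4; the ">= 4 No" flagging threshold is an
# assumption made only for this illustration.

def summarize(ratings):
    """Map each descriptor number to Yes/No frequencies and percentage Yes."""
    out = {}
    for item, votes in ratings.items():
        yes = votes.count("Yes")
        out[item] = {"Yes": yes, "No": len(votes) - yes,
                     "pct_yes": 100 * yes / len(votes)}
    return out

ratings = {
    17: ["Yes"] * 5 + ["No"] * 5,
    37: ["Yes"] * 6 + ["No"] * 4,
    45: ["Yes"] * 4 + ["No"] * 6,
    54: ["Yes"] * 3 + ["No"] * 7,
}
summary = summarize(ratings)
flagged = sorted(i for i, s in summary.items() if s["No"] >= 4)
```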

(2) Data of Descriptor Difficulty Level

The data concerning descriptor difficulty were analyzed using FACETS version 3.71.1. As an extension of both the rating scale model and the partial credit model, the many-facet Rasch model (MFRM) is an additive linear model based on a logistic transformation of the observed ratings to a logit scale. In this model, the logit on the interval scale can be viewed as the dependent variable, while the various facets included in the analysis function as independent variables influencing these logits (Myford & Wolfe, 2003). When the analysis is performed, the various facets can be analyzed simultaneously but independently and then be calibrated onto a common logit scale. In the current study, a fully crossed design was chosen for the FACETS analysis; that is, all ten experts rated all the descriptors. According to Myford and Wolfe (2003), although such a fully crossed design is not necessary for FACETS to run the analysis, it makes the analysis more stable and robust, allowing more valid conclusions to be drawn from the results. Specifically, to explore how the participating experts perceived the difficulty levels of the descriptors and whether their backgrounds had any impact on their judgments, a three-facet MFRM was constructed, including Raters (experts), Rater Group (expert group) and Items (descriptors). The Rater facet was specified as negative: the harsher raters were those who gave lower ratings to the descriptors, whereas the more lenient ones ascribed higher ratings. Rater Group was entered into the FACETS model as a dummy facet, whose difficulty was constrained to zero. As such, dummy facets “do not alter the rest of the model, contributing nothing to the estimation of the other measures, but are nonetheless available for further analysis in Facets” (Batty, 2015, p. 12). Bias analysis between rater groups (expert groups) and items (descriptors) was implemented to explore whether the background of the experts (i.e., teaching vs. domain experts) had a significant impact on their perceptions of descriptor difficulty.
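One common way to write the three-facet model just described (following the many-facet formulation in Linacre, 1989; see also Eckes, 2011) is sketched below; the notation is ours, and the dummy group facet is shown anchored at zero as specified above.

```latex
\[
  \ln\!\left(\frac{P_{njgk}}{P_{njg(k-1)}}\right)
  = B_n - C_j - G_g - F_k , \qquad G_g \equiv 0 ,
\]
% B_n : difficulty measure of descriptor n (higher values draw higher ratings)
% C_j : severity of expert j (the negatively specified Rater facet)
% G_g : rater-group dummy facet (teaching vs. domain), anchored at zero
% F_k : threshold of rating category k relative to category k-1
```

The bias analysis between rater groups and descriptors then adds an interaction term for each group-descriptor pair to this baseline model and tests whether it departs significantly from zero.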


8.1.4.2 Analysis of the Interview Data

Guided by the emic principle of analysis (Watson-Gegeo, 1988), according to which theory develops out of the data intrinsic to a given situation, the qualitative data were examined by means of analytic induction (Goetz & LeCompte, 1984) and constant comparison (Miles & Huberman, 1994). Specifically, several preliminary readings were conducted to gain an overview of the data. The transcripts were then analyzed in detail to compile a list of salient and recurring themes or patterns, many of which were common to both expert groups. The list of initial themes for each group of participants was then refined by grouping similar or interrelated themes together. Care was taken not to force the data into categories already established. To ensure intra-rater reliability, the first author went through a second round of categorization and coding from scratch about two weeks later. In addition, a scholar specializing in language assessment was invited to act as an independent coder. To ensure inter-rater reliability, the scholar performed the same coding procedures and compared the themes arising from his analysis with the authors’. In cases of discrepancy, the problematic data sets were revisited and negotiations were carried out until consensus was reached. As an additional check, “member validation” (Richards, 2003) was adopted: the themes that emerged from the qualitative data analysis were summarized and e-mailed to the participants to check whether our interpretations of their perceptions and attitudes were accurate. Apart from one expert who made minor changes to his wording, all participants confirmed that our interpretations of the data reflected their opinions accurately.

8.2 Experts’ Perceptions of Scale Categories

8.2.1 Descriptor Categorization

As described above, the experts were asked to examine whether the descriptors fit well into the categories in the draft scale. A Yes-No binary option was provided: “Yes” indicated agreement with the initial descriptor categorization and “No” disagreement. The findings show that the experts in general supported the original categorization of the descriptors in the scale: of the 86 descriptors included in the 16 descriptive categories of the draft version of the BE writing scale, only four (descriptors 17, 37, 45 and 54) were singled out as potentially problematic. Detailed information on the four problematic descriptors is presented in Table 8.4. During the follow-up interviews, questions were asked to explore in greater depth the experts’ opinions concerning the categorization of these four descriptors. The experts’ comments on the problematic descriptors are summarized and presented below under the headings of their respective categories, namely Orthographic control



Table 8.4 Details of potentially problematic descriptors

Problematic descriptor | Scale category | Experts’ ratings (Yes / No)
Descriptor 17: Punctuation is reasonably accurate | Orthographic control | 5 / 5
Descriptor 37: Appreciates fully the sociolinguistic implications of language used by speakers of other cultures and can react accordingly | Sociolinguistic competence | 6 / 4
Descriptor 45: Is aware of the effect of non-linguistic elements on intercultural business written communication, including time, space, etc. | Intercultural competence | 4 / 6
Descriptor 54: Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships | Business knowledge | 3 / 7

(for descriptor 17), Sociolinguistic competence (for descriptor 37), Intercultural competence (for descriptor 45) and Business knowledge (for descriptor 54). Please note that the italicized parts refer to the wording in the original descriptors. We added some notes in square brackets to facilitate readers’ understanding of the quotes where necessary.

Orthographic Control

Five experts considered descriptor 17, which reads “Punctuation is reasonably accurate”, improperly categorized. DE5, for instance, was skeptical about this categorization and made the following comment:

DE5: “Actually I felt it a bit weird to put punctuation under the category of spelling, so I chose ‘No’ when filling in the questionnaire.”

In contrast to the intuitive judgment provided by DE5, the teaching experts justified their choice by alluding to the distinctive features of Chinese BE learners’ writing, as exemplified by the following two comments:

TE4: “I don’t think punctuation belongs to the category of Orthographical control. It should be put under the category of format. Some students often confuse the Chinese period [a hollow circle in the form of ‘。’] with the English period. Some use the symbol “、” [a Chinese punctuation mark used to connect parallel elements in a sentence] to separate parallel word units instead of the comma… All these mistakes to me are format mistakes. So I think this descriptor should be put under the category of Genre format.”

TE1: “Students’ mistakes with punctuation usually appear in compound sentences and complex sentences. Their problems lie in the use of a comma when a semicolon is the correct choice… such as in sentences beginning with ‘however’. Such

134

8 Validating the Business English Writing Scale

mistakes in essence reflect students’ grammatical problems. So I think to some extent, punctuation should be considered from the perspective of grammatical competence.”

When invited to suggest a way to modify the descriptor category, one teaching expert made the following comment:

TE3: “You might want to change the category heading from Orthographical control to Mechanics…As we can see from many writing textbooks published in America, aspects like spelling, punctuation and capitalization are all discussed under the umbrella term Mechanics.”

In light of the controversy surrounding the categorization of descriptor 17, we heeded TE3’s suggestion and revised the scale category “Orthographical control” as the more inclusive “Mechanics”.

Sociolinguistic Competence

Some domain experts expressed concerns over descriptor 37, “Appreciates fully the sociolinguistic implications of language used by speakers of other cultures and can react accordingly”, in the sub-scale of Sociolinguistic Knowledge. Specifically, all of them except DE5, who used to major in English language studies, claimed that the term “sociolinguistic implications” made little sense to them. Consequently, these domain experts had to rely on other words in the descriptor to infer its difficulty level. The following comments illustrate this problem:

DE1: “I don’t quite understand the term sociolinguistic implications. Maybe it means how one is related to society? Anyway, this descriptor must be quite difficult, because it reads appreciates fully the sociolinguistic implications.”

DE3: “I don’t know what sociolinguistic implications mean.”

All the teaching experts, by contrast, reported no problem with this descriptor, which is quite understandable given their background in Applied Linguistics.
Nevertheless, as the scale was intended for users with various backgrounds, the term “sociolinguistic implications” was deleted, and the descriptor was revised as “Appreciates fully the effects of social factors (e.g., power relations) on language use and can react accordingly” to make it more accessible to non-specialists.

Intercultural Competence

In the sub-scale of Intercultural competence, many teaching experts considered descriptor 45, “Is aware of the effect of non-linguistic elements on intercultural business written communication, including time, space, etc.”, confusing. Two major concerns were raised. First, the teaching experts commented that non-verbal factors, such as gestures and postures, were largely associated with oral communication. Second, it was unclear how time and space could affect business written communication. Below are the comments made by two teaching experts:

TE4: “I don’t think time and space can play a role in written communication. It is really difficult to conceptualize what they mean. Maybe space refers to format? If so, it would be better to delete this descriptor and write a new one about the writing format.”

8.2 Experts’ Perceptions of Scale Categories

135

TE2: “This descriptor talks about the impact of non-verbal factors on business written communication. Yet I feel it is unclear how time and space can affect written communication… Non-verbal factors are mainly about gestures, which have little to do with written communication.”

The domain experts, interestingly, attached great importance to non-verbal communication and offered many examples illustrating its role in business written communication. Two domain experts, for instance, made the following remarks:

DE4: Time is very important. You know, we have a lot of international meetings to attend… The participants might be in Texas, California, Singapore or other countries. It is important for one to be clear about the others’ time zones when they arrange a meeting.

DE3: When we engage in online business negotiations, time arrangement is a very important issue… If possible, we tried to arrange “a conference call” in China’s daytime.

Scrutiny of these examples, however, revealed that the non-verbal elements cited by the domain experts were largely confined to oral communication such as meetings and negotiations. Descriptor 45 was therefore removed from the writing scale on the grounds that it was more relevant to the oral mode of BE communication than to the written mode.

Business Knowledge

In the illustrative scale of Business knowledge, descriptor 54, “Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships”, was identified as potentially problematic. The interview data suggested that this descriptor was hardly relevant to the description of business disciplinary knowledge; rather, it was more concerned with one’s interpersonal skills. Including it might therefore undermine the integrity of the illustrative scale as a whole, as explained in the following remarks:

TE3: This descriptor is too specific.
It is something about marketing. Whether a person knows the sources from which information of prospective clients can be accumulated has little to do with their ability to produce a satisfactory piece of business writing. So, I don’t think this descriptor should belong to this category.

TE2: This descriptor is quite different from the other four in this illustrative scale. It is a bit weird to cluster them together.

Consistent with the qualitative comments cited above, this descriptor was also identified as a misfitting item in the FACETS analysis (see Sect. 8.4 for detailed information). It was therefore removed from the scale.


8.2.2 Category Appropriateness

During the interviews, the experts were also asked about their overall perceptions of the scale, particularly the appropriateness of the sub-categories it includes. In general, the experts commented quite positively on the scale and its illustrative scales in terms of their comprehensiveness and usefulness for BE writing instruction. The following quotes are indicative of such opinions:

TE5: “I think this scale offers a very comprehensive summary of the abilities needed to survive and thrive in BE written communication. It’s very fine-grained, covering all the important aspects that should be addressed in business English writing lessons.”

TE3: “Information provided by the scale can offer rich feedback to students, helping them to know about their strengths and weaknesses.”

DE3: “I think this scale will be very useful for in-service language training. You know, we can tell the trainers what language help we need.”

Besides these positive comments, criticisms were also raised, interestingly all by the domain experts, concerning the illustrative scale of Note. Specifically, they challenged the usefulness of this scale category, arguing that note-writing was an outdated writing activity in an increasingly digitalized workplace where more convenient means of communication, such as E-mails and text messages, were available. Similarly, the participants voiced concerns over the descriptors in this illustrative scale because of their limited relevance to workplace realities, as illustrated by the comments below:

TE4: “Note is an informal genre, but this descriptor [descriptor 70: Can write notes in appropriate language to convey important information of immediate relevance to superiors or new clients, getting across the key points] says using notes to express important information.
This is a bit contradictory and very unlikely to occur in the real world.”

DE5: “If the information is important, I won’t write it on a note, which might be seen by others, and I might not know whether the intended receiver of the message has got the information if no feedback is provided. So this descriptor [descriptor 70] made little sense to me. As to the other one [descriptor 71: Can write simple and explicit notes on daily business activities. For instance, can write a note to express thanks, express apology and condolence to a business associate.], if I need to thank someone, I will use more formal means to express my appreciation instead of making do with a simple note, which appears to be quite insincere.”

According to DE4, even when notes were used to convey information, they would be written in Chinese, as most of her colleagues sitting in the same office were Chinese:

DE4: “I occasionally write notes. For instance, once I was looking for my boss to sign a document, but he was not in his office. I could have sent him a text message, but I was not sure when he would check his cell phone. Therefore, I would write him a note and leave it at a place where he could easily spot it so that when he came back


to the office, he would find the document and sign it…I would use Chinese. My boss in our China office is Chinese. Rarely do you work in the same office as those from other countries.”

In light of the feedback from the domain experts, the illustrative scale of Note, with its three descriptors, was deleted. It is worth noting that the experts’ perceptions of note-writing might reflect the uniqueness of business written communication in the Chinese EFL context. Given that note-writing is commonly included in existing BE scales developed in English-speaking or ESL contexts, such as the ALTE and the CLB (see Chap. 2 for details), more studies are needed to explore its role so as to capture a more nuanced description of real-world BE writing activities.

8.3 Experts’ Perceptions of Descriptor Levels

8.3.1 Inter-Rater Reliability: Rater Consistency and Severity

Inter-rater reliability, an indicator of the degree of consistency and/or agreement across raters, is regarded as an important source of evidence for the validity of a proficiency scale (Kaftandjieva & Takala, 2002). Two sets of values provided by FACETS are relevant to inter-rater reliability. The first set comprises the fit statistics, which provide information on intra- and inter-group consistency. The second set includes the severity measures (in logits) at the individual and group levels. In what follows, we present the analysis results generated by FACETS.

8.3.1.1 Rater Consistency

Tables 8.5 and 8.6 present a summary of the analysis results for the facets of Rater and Rater Group respectively. In these tables, the fit statistic for each facet indicates the degree to which it behaves in the manner predicted by the Rasch model (Weigle, 1998). The infit mean square statistic has an expected value of 1, with higher values indicating more variation than expected (underfitting the model) and lower values indicating less variation than expected (overfitting the model). The closer the value is to 1, the better aligned the ratings are with model expectations. Normally, more emphasis is placed on infit values than on outfit values in identifying misfitting persons or items, and low values (overfits), which indicate too little variation in the observed response pattern, are less of a concern than high ones (underfits) (Bond & Fox, 2015). According to Eckes (2011), infit mean square values between 0.40 and 1.60 are considered acceptable in FACETS analysis. Infit mean square values higher than 1.60 denote ratings that are erratic and significantly inconsistent with the values the model predicts. This is a sign that the rater significantly underfits the


Table 8.5 Rater measurement report

Experts  Measure  S.E.  Infit MnSq  ZStd  PtMea correlation
DE3        1.47   0.25     1.58     3.00        0.70
DE2        0.44   0.24     1.42     2.30        0.76
DE4        0.27   0.24     0.85     0.80        0.83
TE2       −0.03   0.25     0.83     1.00        0.85
TE4       −0.27   0.25     0.96     0.10        0.84
TE5       −0.27   0.25     0.90     0.60        0.83
DE5       −0.45   0.25     0.54     3.30        0.89
TE3       −1.00   0.25     0.90     0.50        0.83
TE1       −1.12   0.25     0.84     0.90        0.84
DE1       −1.44   0.26     0.90     0.50        0.82
Mean      −0.24   0.25     0.97    −0.30        0.82
SD         0.85   0.00     0.30     1.80        0.05

Notes Reliability of separation index = 0.91; fixed (all same) chi-square: 103.1, df: 9; significance: p < 0.001; DE = Domain Expert; TE = Teaching Expert

Table 8.6 Group measurement report

Expert group  Measure  S.E.  Infit MnSq  Infit ZStd  Outfit MnSq  Outfit ZStd  PtMea correlation
TP             0.00*   0.11     0.89        −1.5         0.82         −2.0           0.84
DS             0.00*   0.11     1.06         0.8         1.19          1.6           0.81
Mean           0.00    0.11     0.97        −0.4         1.00         −0.2

Note * A dummy facet has a measure of 0, which does not add to, or subtract from, the other facets in the analysis

model and displays too much variability in his or her ratings, suggesting that he or she does not rate consistently. Similarly, values lower than 0.40 indicate that a rater overfits the model, most likely because of a lack of variability in the ratings. As shown in Table 8.5, the infit mean square values of all the experts fall within the acceptable range (0.54 to 1.58), demonstrating that the rater response patterns are basically consistent with model expectations. This suggests that all the expert ratings were consistent, which is critical to valid calibration of descriptor values. Table 8.6 provides summary statistics for the two expert groups. The infit mean square values for the TE (Teaching Experts) group and the DE (Domain Experts) group were 0.89 and 1.06 respectively, indicating that more variation was found within the DE group than within the TE group. Nevertheless, both values are very close to the ideal value of 1, lending important evidence to intra-group consistency. In addition, the point-biserial values of the two rater groups


were similar (0.84 and 0.81 respectively), demonstrating high inter-group consistency.
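The fit logic described above can be made concrete with a small simulation. The sketch below computes infit and outfit mean squares from model residuals for a dichotomous Rasch model; this is an illustrative approximation only (FACETS fits a many-facet polytomous model), and `rasch_fit_stats` is a hypothetical helper of ours, not part of FACETS.

```python
import numpy as np

def rasch_fit_stats(X, theta, b):
    """Infit/outfit mean squares per rater for a dichotomous Rasch model."""
    # Model-expected score and variance for every rater x descriptor cell
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    W = P * (1.0 - P)              # binomial variance of each response
    R = X - P                      # raw residual
    Z2 = R ** 2 / W                # squared standardized residual
    outfit = Z2.mean(axis=1)                       # unweighted mean of z^2
    infit = (R ** 2).sum(axis=1) / W.sum(axis=1)   # information-weighted mean
    return infit, outfit

# Data simulated from the model itself should fit: mean squares near 1
rng = np.random.default_rng(7)
theta = rng.normal(0.0, 1.0, size=10)   # 10 raters
b = rng.normal(0.0, 1.0, size=86)       # 86 descriptors
P = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
X = (rng.random(P.shape) < P).astype(float)
infit, outfit = rasch_fit_stats(X, theta, b)
```

With data generated from the model, the infit values land inside the 0.40–1.60 band used in this study; an erratic rater would push infit above 1.60, while a rater giving unduly uniform ratings would fall below 0.40.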

8.3.1.2 Rater Severity

The Rater facet in the current study was specified as negative; that is to say, a higher severity measure indicates a more severe rater. As shown in the second column of Table 8.5, the difference in severity between the most severe rater (DE3; logit value: 1.47) and the most lenient rater (DE1; logit value: −1.44) was 2.91 logits. The reliability of the separation index was very high (0.91). Note that this value does not indicate the degree of agreement between raters (the traditional meaning of reliability indices) but the extent to which they differ in their levels of severity. According to Linacre (2015), rater separation reliability indices close to zero suggest that raters exercise similar levels of severity, while indices near 1.0 indicate that raters differ substantially in their levels of severity. In addition, the chi-square value of 103.1 with 9 degrees of freedom (df) was significant (p < 0.001), rejecting the null hypothesis that all raters were equally severe. On the whole, both teaching experts and domain experts rated the difficulty levels of the descriptors consistently, although their judgments of descriptor difficulty differed to some extent. Generally speaking, the teaching experts as a whole perceived the descriptors to be slightly more difficult than the domain experts did, while the latter exhibited larger variation in their ratings. We will further explore these differences through the interview data in Sect. 8.3.2.
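The separation reliability of 0.91 can in fact be reproduced from the severity measures and standard errors in Table 8.5, using the standard Rasch definition (reliability = “true” variance divided by observed variance). The sketch below is illustrative; `separation_reliability` is our own helper and only approximates what FACETS computes internally.

```python
import numpy as np

def separation_reliability(measures, ses):
    """Rasch separation index G and separation reliability R."""
    obs_var = np.var(measures, ddof=1)      # observed variance of the measures
    mse = np.mean(np.square(ses))           # mean square measurement error
    true_var = max(obs_var - mse, 0.0)      # error-corrected ("true") variance
    separation = np.sqrt(true_var / mse)    # G = true SD / RMSE
    reliability = true_var / obs_var        # R = true variance / observed variance
    return separation, reliability

# Rater severity measures and standard errors from Table 8.5
measures = [1.47, 0.44, 0.27, -0.03, -0.27, -0.27, -0.45, -1.00, -1.12, -1.44]
ses = [0.25, 0.24, 0.24, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.26]
sep, rel = separation_reliability(measures, ses)
print(round(rel, 2))  # 0.91, matching the reported reliability of separation
```

The high reliability here reflects that the raters' severity spread (SD 0.85 logits) is large relative to the measurement error of each severity estimate (about 0.25 logits).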

8.3.2 Bias Analysis

While measures of rater severity indicate raters’ overall rating tendencies, bias analysis makes it possible to examine whether the raters are systematically lenient or severe on particular items (Crocker & Algina, 2008). The degree of departure from what might be expected, allowing for normal variation, is expressed as a z-score, with values greater than +2 or smaller than −2 indicating significant bias (McNamara, 1996). To investigate whether the two expert groups differed in their perceptions of the descriptors, a bias analysis between rater groups and items (descriptors) was conducted. The results showed that two interactions reached statistical significance, accounting for only 0.13% of the total count, as demonstrated in Table 8.7. Columns 1 and 2 list the rater group and item numbers. Columns 3 and 4 show what might be expected in the pattern of scores for that particular combination, compared with what was actually observed. Column 5 displays the size of the bias computed in terms of raw scores. Column 6 presents the bias size in logit units “relative to the rater subgroup’s overall severity measure” (Winke et al., 2013, p. 244). The likely


Table 8.7 Rater group × item bias analysis

Rater group  Item  Observed score  Expected score  Obs−Exp average  Bias (logit)  Error  Z-score  Infit MnSq
TP            47        13              10.58            0.48            1.90      0.95    2.01       1.5
DS            47         9              11.39           −0.48           −1.77      0.85   −2.07       1.9

error of this bias is shown in column 7. The last two columns provide the z-score and the infit mean square values. Interestingly, both expert groups displayed bias on the same item, descriptor 47 [Demonstrates awareness of intercultural differences and respect for other cultures. For instance, can make compliments, invitations, refusals, etc. in ways acceptable to people of different cultural origins], with z-scores of 2.01 and −2.07 respectively. Specifically, the expected score from the teaching experts on this descriptor was 10.58, yet the observed total score was 13; the descriptor thus received higher scores than expected. In other words, the teaching experts perceived the descriptor to be easier than expected. The reverse pattern was identified in the domain experts’ ratings of the same descriptor. The expected score from this group was 11.39, whereas the observed score was 9, suggesting that the same descriptor was awarded lower scores by the domain experts. Simply put, the domain experts perceived the descriptor to be more difficult than normally expected.

Interview data shed light on why the two expert groups exhibited unexpected rating patterns on this descriptor. For the teaching experts, language functions such as making compliments, invitations and refusals were usually simplified in pedagogical contexts into a few commonly used formulaic expressions, which did not pose many challenges to students. The domain experts, on the other hand, were exposed to a wide range of understandably more complicated language use in the real world and were thus likely to be more sensitive to the subtlety involved in performing such language functions appropriately. The two excerpts below explain this finding:

TE4: “This [what is described in descriptor 47] is quite elementary. These daily communicative activities usually involve the use of some set expressions, such as how to refuse others.
These expressions are usually quite simple in terms of grammar or vocabulary. Students only have to memorize them and apply them in proper situations.”

DE3: “I put this descriptor at the Advanced level… I think this describes true culture. What we learned from books or saw in movies accounted for at most 30 to 40 percent of the knowledge required to fulfill such language tasks satisfactorily. One has to communicate with people from other cultures in person to know what intercultural communication really involves. This is definitely not easy.”

In addition, as indicated in the excerpt below, one teaching expert (TE5) also referred to the inherent ambiguity of the descriptor, which affected his judgement of its difficulty level.


TE5: “Intercultural differences are quite easy to be mastered as long as students receive formal training in this respect. But I am not sure what ‘other cultures’ refers to. Does it mean all cultures or the culture of the country where one is engaged in the business activity? If we require students to know about all cultures, I will put it at The Intermediate level; if it only refers to certain countries where business transactions are carried out, then I will put it at The Lower-Intermediate level.”

In light of the inconsistent interpretations of descriptor 47 among the experts, this descriptor was deleted from the scale.
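The z-scores in Table 8.7 follow directly from the bias estimates and their standard errors. A minimal sketch (`interaction_z` is our own name for this ratio, not a FACETS function):

```python
def interaction_z(bias_logit, se):
    # z-score for a rater group x descriptor interaction: the bias
    # estimate divided by its standard error; |z| > 2 flags
    # significant bias (McNamara, 1996)
    return bias_logit / se

# Bias and error values for descriptor 47 from Table 8.7
print(round(interaction_z(1.90, 0.95), 2))   # 2.0  (TP group; table reports 2.01)
print(round(interaction_z(-1.77, 0.85), 2))  # -2.08 (DS group; table reports -2.07)
```

The small deviations from the published 2.01 and −2.07 come from rounding in the tabled bias and error values.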

8.4 Scale Level Validation

8.4.1 Item Fit Analysis

Infit and outfit mean-square statistics in FACETS outputs provide useful information on how compatible the data are with the model. The expected infit and outfit values are 1, but the fit statistics can range from 0 to infinity. As mentioned in Chap. 7, acceptable mean-square fit values in this study ranged from 0.40 to 1.60. Fit values greater than 1.60 (misfits) suggest underfit to the model and indicate unmodeled noise or other sources of variance in the data. In the current study, only 5 misfitting items with infit values greater than 1.60 were identified. Fit statistics of these items are presented in Table 8.8.

It is interesting to note that three of the descriptors listed in Table 8.8 (descriptors 47, 54 and 69) had already been identified as problematic in earlier analyses. Specifically, descriptor 47 (Demonstrates awareness of intercultural differences and respect for other cultures. For instance, can make compliments, invitations, refusals, etc. in ways acceptable to people of different cultural origins.) was the item on which both expert groups displayed significant bias, and descriptor 54 (Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships.) violated the coherence of the descriptors in the illustrative scale it belonged to. As to descriptor 69 (Can write notes in appropriate language to convey important information of immediate relevance to superiors or new clients, getting across the key points.), the tasks involved were considered too artificial and irrelevant to the Chinese context. These three descriptors were hence all deleted from the final version of the scale for validity reasons. Interview data were then examined to figure out why the other two descriptors misfit.
The qualitative analysis revealed that descriptor 35 was probably flawed by an inappropriate selection of examples, while descriptor 58 might be too context-sensitive to be interpreted consistently. The comments made by the participants on each descriptor are presented below.

Descriptor 35: Can provide an appropriate and effective logical structure (e.g., heading, topic sentence) to facilitate reader comprehension of the business text.


Table 8.8 A summary of the misfitting items

Descriptor 47 (Sociolinguistic competence; Measure 0.49; Infit MnSq 2.42): Demonstrates awareness of intercultural differences and respect for other cultures. For instance, can make compliments, invitations, refusals, etc. in ways acceptable to people of different cultural origins

Descriptor 69 (Note; Measure 0.49; Infit MnSq 2.00): Can write notes in appropriate language to convey important information of immediate relevance to superiors or new clients, getting across the key points

Descriptor 35 (Strategic competence; Measure 0.12; Infit MnSq 1.78): Can provide an appropriate and effective logical structure (e.g. heading, topic sentence) to facilitate reader comprehension of the business text

Descriptor 58 (Letter & E-mail; Measure 1.21; Infit MnSq 1.66): Can write a(n) business letter/E-mail in an official capacity to an organization, concerning a straightforward business problem or transaction

Descriptor 54 (Business knowledge; Measure 0.12; Infit MnSq 1.66): Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships

Some participants pointed out that one problem with this descriptor was its use of examples. The quotes below, from three participants who gave three different ratings to the descriptor, reflect how the inclusion of the examples resulted in inconsistent evaluations of its difficulty level:

TE2: “When I first saw the descriptor I did not notice your examples. I felt that it was pretty difficult to be able to provide an appropriate and effective logical structure to facilitate reader comprehension of the business text. So I put it at The Advanced level. But a closer look at the examples… you know, heading and topic sentence are comparatively much easier. So I would like to change the descriptor to The Intermediate level.”

TE3: “I chose 3 for this descriptor because students often perform poorly in this respect. Headings are not as easy as many people assume. When students are writing something, they often fail to keep the headings parallel in terms of format. For example, if one starts the heading with an “–ing” structure, he or she has to stick to this structure throughout the text; likewise, if one starts the heading with “to”, he or she should keep using infinitives. The same is true with the use of noun phrases. If I don’t emphasize this point to the students, they will never realize the problem… And it is really hard to find an appropriate heading for paragraphs or sections.”

DE2: “I put this item at The Lower-Intermediate level. It is very simple. I think heading refers to the “subject” of an “E-mail”. When writing an “E-mail”, we are used


to putting the most important information at the beginning, and then add necessary details. This is what we do every day and it’s quite easy.”

The qualitative data revealed that the examples included in the descriptor, originally intended to facilitate reader understanding, turned out to have an unexpected confounding impact on the experts’ perceptions. The descriptor was thus revised as “Can effectively structure the business text so as to facilitate reader comprehension”, with the examples deleted.

Descriptor 58: Can write a(n) letter/E-mail in an official capacity to an organization, concerning a straightforward business problem or transaction.

Descriptor 58 was also given very different ratings by the experts. The following three quotes illustrate the various interpretations held by the participants:

DE3: “The Lower-Intermediate level. This describes what we are doing every day. Every staff member is communicating with others in an official capacity. This is a routine task that everyone is supposed to be good at.”

DE2: “I put it [descriptor 58] at The Intermediate level. This may involve communication between companies, so it may be a bit more difficult than communication within the company, especially when there are conflicts of interest between the two companies. The person will have to undertake important responsibilities in this case… So it is not very easy.”

TE5: “I put this one at The Advanced level… You know, as far as students are concerned, it is very difficult for them to write E-mails in an official capacity… And the descriptor reads concerning a straightforward business problem or transaction. This makes the task no longer a simple writing task; it also involves business negotiation skills.
Students must integrate their negotiation skills with language skills… So I think this is very difficult for students, but for in-service employees, I think this is a very basic requirement.”

These comments show that the domain experts, in general, did not consider this task difficult, as writing E-mails was a common workplace activity that every business employee is supposed to be good at. TE5 stood out in the interviews by demonstrating a more nuanced understanding of the problem. He divided English learners into two groups: university students, who had little real-world BE writing experience, and in-service employees, who had rich working experience but might lack systematic English training. TE5’s opinion revealed that this descriptor was sensitive to the population; that is, for workers with considerable working experience the task might be elementary, whereas for university students lacking practical experience it might be fairly challenging. To close such gaps, the elements in the descriptor that caused inconsistent interpretations among the participants (i.e., “in an official capacity”, “a straightforward business problem or transaction”) were removed. The simplified version of the descriptor adopted in the validated scale reads “Can write a(n) letter/E-mail to an organization concerning straightforward business issues”.


8.4.2 Setting Cut-Offs

With a set of validated descriptors available, the next step was to check to what extent these descriptors could be assigned to proficiency levels corresponding to the level structure of the draft scale. Table 8.9 shows the item measurement report generated by FACETS. The separation index of 2.93 indicated that this pool of items could be separated into about three statistically distinct levels of proficiency (Eckes, 2011). The reliability of the item separation index was 0.90, suggesting that the items were well spread along the difficulty measures and could be reliably separated into distinct difficulty levels. The chi-square value of 690.9 with 85 degrees of freedom was significant (p < 0.001), indicating that at least two items differed in their difficulty levels.

The Wright map (see Fig. 8.1) graphically displays the item difficulty estimates on an interval scale. The left-most column shows the common scale (in logits), with the item numbers to the right of the horizontal line depicting the difficulty estimate of each descriptor. The items are ordered from the most difficult at the top to the least difficult at the bottom and are matched to three difficulty levels in the right-most column. According to Fig. 8.1, the items displayed considerable dispersion (i.e., item discrimination was high), demonstrating that the range of descriptors sufficiently represented three proficiency levels. The level distribution and the corresponding logit values are shown in Table 8.10.

As demonstrated in Table 8.10, there was considerable variation in item difficulty, ranging from −5.70 to 5.28 on the logit scale. Among these items, 24 fell within The Lower-Intermediate level, 39 within The Intermediate level, and 23 within The Advanced level. The smallest logit range was found at The Intermediate level (2.57 logits), which nevertheless included approximately 1.5 times more descriptors than either of the other two levels. This finding was consistent with the structure of the original scale. Further validity evidence lay in the apparent gaps between the level cut-offs (0.74 logits from The Lower-Intermediate level to The Intermediate level, 0.41 logits from The Intermediate level to The Advanced level), suggesting that there did exist a plateau for learners to cross when progressing between levels.

Table 8.9 Summary of item measurement report

        Measure         Infit          Outfit
        Mean    S.E.    MnSq   ZStd    MnSq   ZStd    Separation index  Reliability
        −0.03   0.87    0.97   0.0     0.99   0.0          2.93            0.90

Note Fixed (all same) chi-square = 690.9; df = 85; p < 0.001


Fig. 8.1 Wright map

Table 8.10 Levels and cut-offs based on experts’ ratings

Level/Proficiency code    Cut-off  Logit range     Logit difference  Number of descriptors
Advanced level              2.01   2.01 to 5.28          3.27                 23
Intermediate level         −0.97   −0.97 to 1.60         2.57                 39
Lower-Intermediate level   −1.71   −1.71 to −5.70        3.98                 24


8.4.3 Validated Scale Versus Draft Scale: Level Assignment

The descriptor levels calibrated on the basis of the experts’ ratings were then compared with the level structure of the draft version of the BE writing scale, and the descriptors assigned to different levels were scrutinized before decisions were made to change or retain the original level assignments. As presented in Table 8.11, there were altogether nine cases of discrepancy, accounting for about 10.5% of the 86 descriptors included in the original scale. In other words, 89.5% of the descriptors remained at the same level in the validation phase of the study, indicating a satisfactory degree of level consistency. Moreover, no descriptor moved more than one level up or down. Of the original 27 descriptors at The Advanced level, 23 remained the same, an agreement rate of 85.2%. The Intermediate level originally contained 40 descriptors, five of which were allocated to a different level according to the expert ratings, an agreement rate of 87.5%. For The Lower-Intermediate level, the agreement rate reached 100%, as all of the original 19 descriptors remained at the same level.

According to Cronbach (1990), “the job of validation is partly to support an interpretation, and partly to detect anything wrong with it” (p. 186). For the purpose of further improving the scale, the interview data were examined to explore in greater depth why the experts assigned different levels to the descriptors listed in Table 8.11. Their opinions were carefully weighed before the original levels of the descriptors in question were revised or retained. It is worth noting that three of the nine cases (i.e., descriptors 54, 58 and 70) had already been identified as problematic in previous sections. Therefore, only the remaining six descriptors (i.e., descriptors 2, 16, 17, 53, 56 and 57) were examined through the interviews.
To facilitate reader comprehension, these descriptors are presented and discussed under the headings of their respective scale categories.

Vocabulary

Descriptor 2 in the sub-scale of Vocabulary, "Has a good command of a very broad business lexical repertoire including idiomatic expressions and colloquialisms", which was placed at The Advanced level in the draft scale, was calibrated at The Intermediate level in the validation stage. The domain experts, in particular, ascribed higher ratings to this descriptor, mainly because of the examples included in it. They maintained that idiomatic language had strong socio-cultural connotations and might cause serious intercultural misunderstandings if used improperly. As such, they discouraged the use of idiomatic language in professional settings. The following comments are illustrative of this perspective:

DE1: "Using idiomatic language is more likely to incur mistakes, you know, the kind of social mistakes. So in written communication, I often try to use rather simple expressions to facilitate reader comprehension of the intended messages."

DE4: "When writing E-mails, for example, I seldom use idiomatic expressions and colloquialisms. As we are communicating for business affairs, using idiomatic


Table 8.11 Summary of descriptors assigned to different levels

Descriptor 2 (Vocabulary): Has a good command of a very broad business lexical repertoire including idiomatic expressions and colloquialisms
  Measure: 1.6 | Infit: 1.06 | New level: Intermediate | Original level: Advanced

Descriptor 16 (Mechanics): Demonstrates control of the capitalization rules in business written communication
  Measure: −1.71 | Infit: 1.46 | New level: Lower-Intermediate | Original level: Intermediate

Descriptor 17 (Mechanics): Punctuation is reasonably accurate
  Measure: −2.51 | Infit: 1.16 | New level: Lower-Intermediate | Original level: Intermediate

Descriptor 53 (Business knowledge): Is familiar with corporate running and managing in terms of planning, scheduling, reporting, etc. For instance, can handle payment issues and describe corporate performance results
  Measure: 0.12 | Infit: 0.5 | New level: Intermediate | Original level: Advanced

Descriptor 54 (Business knowledge): Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships
  Measure: 0.12 | Infit: 1.56 | New level: Intermediate | Original level: Advanced

Descriptor 56 (Business knowledge): Has a preliminary understanding of fundamental business concepts such as advertising, branding and product promotion
  Measure: −2.51 | Infit: 1.02 | New level: Lower-Intermediate | Original level: Intermediate

Descriptor 57 (Business knowledge): Has a basic understanding of common daily office communications and conventions
  Measure: −4.39 | Infit: 1.2 | New level: Lower-Intermediate | Original level: Intermediate

Descriptor 58 (Letter & E-mail): Can write a(n) business letter/E-mail in an official capacity to an organization, concerning a straightforward business problem or transaction
  Measure: 1.21 | Infit: 1.56 | New level: Intermediate | Original level: Advanced

Descriptor 70 (Note): Can write notes in appropriate language to convey important information of immediate relevance to superiors or new clients, getting across the key points
  Measure: −2.09 | Infit: 0.77 | New level: Lower-Intermediate | Original level: Intermediate
language would appear to be too informal. So I don't think that the use of idiomatic expressions should be recommended, as they might project too informal an image on the part of the writer, and also the company he represents. Only when the receiver and I are very close to each other, or when I am sending the E-mail to just one person, not 'cc' to others, might idiomatic language be used. But generally speaking, to me, idiomatic language should not appear in business documents."

Teaching experts, however, expressed a more favorable attitude towards this descriptor, arguing that mastery of idiomatic expressions can greatly bolster the effectiveness of communication and should therefore be treated as an important aspect of learning in BE classrooms. The following remark is illustrative of this view:

TE5: "Idiomatic expressions and colloquialisms in my opinion are something basic and essential for students to master. Otherwise, their writing won't be accurate and native-like."

Advocating the use of idiomatic language in BE written communication, some teaching experts, unlike the domain experts, did not perceive idiomatic expressions as difficult learning points. TE2, for instance, made the following observation:


TE2: "When students are introduced to the course of Business English writing, I often require them to recite a lot of idiomatic expressions to demonstrate their professionalism. These expressions are fixed ones used in rather fixed situations, like 'Your earliest reply will be highly appreciated' at the end of an E-mail. So I think they are very easy, and [descriptor 2] should be placed at The Lower-Intermediate level."

Scrutiny of the quotes cited above revealed that the inconsistent interpretations of the descriptor were largely caused by the example concerning idiomatic language use rather than by the main content of the descriptor, which emphasized having "a good command of a very broad business lexical repertoire". The descriptor was thus revised as "Has a good command of a very broad business lexical repertoire" by deleting the original example, and was then assigned to The Advanced level.

Orthographical Control

Two descriptors (descriptors 16 and 17) in the sub-scale of Orthographical control, which were originally placed at The Intermediate level, turned out to be at The Lower-Intermediate level in the validation stage. In fact, the two expert groups differed in their perception of descriptor 16, which reads "Demonstrates control of the capitalization rules in business written communication". Specifically, the teaching experts believed that this descriptor should be placed at The Intermediate level, whereas the domain experts preferred to put it at The Lower-Intermediate level. The teaching experts' reasons are provided below:

TE1: "I put it [descriptor 16] at The Intermediate level. Students make quite a lot of mistakes of this kind in their writing.
Of course, they all know that the first letter of the first word in a sentence should be capitalized, but they are much less familiar with the capitalization rules concerning some technical terms and acronyms."

TE3: "Capitalization rules are not very difficult, but students tend to neglect this aspect. So it [descriptor 16] should at least be put at The Intermediate level. For example, the first letter of the 'heading' and 'sub-heading' of a report should be capitalized, but students often make mistakes in these areas."

Unlike the teaching experts, the domain experts believed that mastery of capitalization rules in English mattered little in their occupational context, which prioritized meaning over form. Probably influenced by such beliefs, most domain experts ascribed lower ratings to this descriptor. DE1, for example, mentioned:

DE1: "I am not very sensitive to the capitalization rules in English and sometimes make mistakes. For instance, when spelling the names of people, organizations or products, I may forget to capitalize some letters. But since this does not affect the communication of meaning, I don't think it is a big issue. So I put this descriptor at The Lower-Intermediate level. It is very easy to master such rules as long as one pays attention."


DE2: "We 'care' little about capitalization rules. My boss, an American, doesn't even capitalize the word 'I'… I only care about capitalization when giving presentations, and it is not difficult at all."

DE5, the only domain expert with an English language educational background, however, emphasized that capitalization rules were of great importance in some fields of work and were not as easy as many people assumed. He made the following comment:

DE5: "My work involves contract writing. Usually, in a contract, some words have special meanings when their first letters are capitalized, but not many people are sensitive to this issue. I have read a lot of contracts drafted by my colleagues who fail to address this aspect. To me, this would project a very unprofessional image of the company, so I often ask them to correct such mistakes… Knowing when to capitalize actually requires rich lexical knowledge, so I put this descriptor at The Intermediate level."

Similar disagreements between the two expert groups were identified over the other descriptor in the sub-scale of Orthographical control, which reads "Punctuation is reasonably accurate". The domain experts believed that the ability described was very simple, or, to be more exact, of little significance in real-world workplace communication, which obviously lowered its difficulty level. A typical comment was:

DE4: "I don't think punctuation is important at all. For business English writing, its content accounts for 80% of the communicative effect, while the other things account for 20%. Things like punctuation are the least important, as their major function is to separate words and phrases. So it [descriptor 17] obviously belongs to The Lower-Intermediate level."

DE3, the most severe rater among all the experts according to the FACETS analysis, was the only exception in the DE group, putting this descriptor at The Intermediate level.
His opinion overlapped significantly with that of the teaching experts, as elaborated below:

DE3: "Punctuation is not as easy as many people assume. To me, differentiating the use of the comma, period and semicolon can be quite challenging."

TE1: "Students have no problem with most of the punctuation marks, but they can't use punctuation accurately. So I put it [descriptor 17] at The Intermediate level. Their problems lie in the use of a comma when the semicolon is the correct choice, such as in sentences beginning with 'however'. They make a lot of mistakes of this kind."

The different opinions and perspectives voiced by the experts were carefully considered in light of the purpose of the research, which was to describe English writing proficiency in the business context. This meant that form was at least as important as meaning, and that language quality should be stressed to promote a more professional image of the writer. A decision was therefore made to keep both descriptors at their original level (i.e., The Intermediate level), in the hope that the close association between accurate punctuation use and one's grammatical competence in English would be highlighted and more instructional help would be provided in this respect.


Business Knowledge

The sub-scale of Business knowledge turned out to be the illustrative scale in which the most unstable descriptors were identified. Specifically, four (descriptors 53, 54, 56 and 57) of the six items in this sub-scale demonstrated level differences during the scale validation phase. Nonetheless, an interesting pattern emerged: the experts placed all four descriptors one level lower than in the original scale, which had been developed on the basis of students' self-assessment data.

Take descriptor 53 (Is familiar with corporate running and managing in terms of planning, scheduling, reporting, etc. For instance, can handle payment issues and describe corporate performance results) as an example. This descriptor was originally placed at The Advanced level, yet many experts, especially the domain experts, considered that it should be placed at The Intermediate level. Below is one explanation:

DE5: "All the tasks described are not difficult at all, and people have various resources available to fulfill these tasks. For instance, if I'm required to describe my company's performance results, I can resort to the quarterly or annual reports produced by the company and write very beautiful reports. Indeed, once the correct resources are found, the tasks will become very easy."

An exception was TE5, who pointed out how different groups of BE learners in the Chinese tertiary context might approach the descriptor differently. He said:

TE5: "It is important to know about who we are describing. If the descriptor is written for Business English majors, I will put it at The Advanced level. If the descriptor is written for non-English major students who study business-related subjects, I would put it at The Intermediate level… Business English majors are essentially English major students. More than 60% of what they learn at university is language-related skills.
For students majoring in Business Management, knowledge of corporate structure and culture is basic and simple. But for Business English majors, a very general understanding of such knowledge usually suffices to help them complete tasks such as business translation, you know, the type which does not involve much disciplinary knowledge."

Resorting to a different rationale, DE4 also expressed uncertainty when rating the difficulty level of the descriptor:

DE4: "If you ask me to describe corporate performance results, I think I can complete the task as demanded. But being able to complete a task and being able to complete it effortlessly make a great difference. And the quality of writing also varies."

Divergent opinions were also voiced with regard to the other descriptors in this category, reflecting the elusive nature of Business knowledge and the fact that descriptor levels might be affected by factors such as learners' educational backgrounds and working experiences. Nevertheless, as the BE writing scale developed in this study was also supposed to serve as a set of learning objectives for BE learners in Chinese universities, these descriptors were all placed one level lower than in the draft scale, in accordance with the experts' ratings. It was hoped that, by doing so, the proficiency described in these descriptors would not be limited to


advanced learners; instead, intermediate learners could also aspire to these higher learning goals. For the other descriptors identified, similar revisions were made after careful consideration of the experts' opinions. The revisions made in the validation phase of the study are summarized in Table 8.12. The final validated version of the scale hence contained 15 illustrative scales substantiated by 80 empirically calibrated descriptors.

8.5 Conclusion

This chapter presents the results of the final phase of the study, centering on scale validation. Drawing on a sample of ten experts, five teaching experts and five domain experts, multiple sources of evidence were collected to examine the validity of the draft version of the scale. Specifically, the ten participants completed questionnaires consisting of the descriptors of the BE writing scale before being interviewed on an individual basis. The major aim was to explore whether the experts considered the descriptors appropriately categorized and whether their perception of the difficulty level associated with each descriptor was consistent with that of the original scale.

In terms of descriptor categorization, analysis of the quantitative and qualitative data revealed that the experts generally considered the categorization of the descriptors appropriate, with the exception of four descriptors (see Table 8.4). These four descriptors were revised in accordance with the experts' suggestions, including changing the category name or the wording of the descriptor.

As to the experts' ratings of descriptor difficulty, the data were analyzed in FACETS version 3.71.1, using the many-facet Rasch model. After examination of inter-rater reliability (rater consistency and severity), bias analysis and item fit, three proficiency bands were created on the basis of the logit values of the items, which were then compared with the descriptors' original level assignment in the BE writing scale. The results show that only nine of the 86 descriptors displayed level change, indicating a satisfactory degree of level consistency. Exploration of the interview data revealed that the experts' perceptions of descriptor level were influenced by factors such as the examples included in the descriptors, and revisions were made accordingly to further enhance the validity of the scale.
The revised version of the BE writing proficiency scale, which consists of 15 descriptive categories substantiated by 80 empirically calibrated descriptors, is presented in the last chapter of the book.
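The banding step described in this chapter (cutting calibrated logit measures into three proficiency bands, with misfitting items flagged) can be sketched as follows. The measures and infit values come from Table 8.11; the cut-points (−1.0 and 2.0 logits) and the infit ceiling of 1.5 are illustrative assumptions chosen to reproduce the pattern in the table, not values reported by the study:

```python
# Sketch: assign descriptors to proficiency bands by Rasch measure
# (logits) and flag potential misfit. Measure/infit pairs are from
# Table 8.11; the band boundaries and the infit ceiling are assumed
# for illustration only.
descriptors = {
    2: (1.60, 1.06), 16: (-1.71, 1.46), 17: (-2.51, 1.16),
    53: (0.12, 0.50), 54: (0.12, 1.56), 56: (-2.51, 1.02),
    57: (-4.39, 1.20), 58: (1.21, 1.56), 70: (-2.09, 0.77),
}

LOWER_CUT, UPPER_CUT = -1.0, 2.0  # assumed band boundaries (logits)
INFIT_MAX = 1.5                   # common rule-of-thumb ceiling

def band(measure):
    """Map a logit measure to one of the three proficiency bands."""
    if measure < LOWER_CUT:
        return "Lower-Intermediate"
    if measure < UPPER_CUT:
        return "Intermediate"
    return "Advanced"

def misfitting(infit):
    """Flag items whose infit mean-square exceeds the ceiling."""
    return infit > INFIT_MAX

for d, (measure, infit) in sorted(descriptors.items()):
    flag = "  <- misfit" if misfitting(infit) else ""
    print(f"Descriptor {d}: {band(measure)}{flag}")
```

Under these assumed settings, descriptors 54 and 58 (infit 1.56) are the ones flagged, matching the items the chapter identifies as misfitting.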


Table 8.12 Revisions made in the validation phase

Problematic area: The sub-category of "Note"
  Revision: Deleted
  Rationale: Identified as an artificial category

Problematic area: The heading of "Orthographical control"
  Revision: Revised as "Mechanics"
  Rationale: To extend the broadness of the category

Problematic area: Descriptor 37 (Appreciates fully the sociolinguistic implications of language used by speakers of other cultures and can react accordingly)
  Revision: Revised as "Appreciates fully the effects of social factors (e.g., power relations) on language use and can react accordingly"
  Rationale: To render the descriptor more accessible to non-specialists

Problematic area: Descriptor 45 (Is aware of the effect of non-linguistic elements on intercultural business written communication, including time, space, etc.)
  Revision: Deleted
  Rationale: Irrelevant to written communication

Problematic area: Descriptor 54 (Demonstrates awareness of the sources from which information of prospective clients can be accumulated to establish new business relationships)
  Revision: Deleted
  Rationale: Identified as an inappropriately categorized and misfitting item

Problematic area: Descriptor 47 (Demonstrates awareness of intercultural differences and respect for other cultures. For instance, can make compliments, invitations, refusals, etc. in ways acceptable to people of different cultural origins)
  Revision: Deleted
  Rationale: Identified as a biased and misfitting item

Problematic area: Descriptor 35 (Can provide an appropriate and effective logical structure (e.g., heading, topic sentence) to facilitate reader comprehension of the business text)
  Revision: Revised as "Can effectively structure the business text so as to facilitate reader comprehension"
  Rationale: To delete inappropriate examples

Problematic area: Descriptor 58 (Can write a(n) letter/E-mail in an official capacity to an organization, concerning a straightforward business problem or transaction)
  Revision: Revised as "Can write a(n) business letter/E-mail to an organization, concerning straightforward business issues"
  Rationale: To enhance consistent interpretation of the item

Problematic area: Descriptor 2 (Has a good command of a very broad business lexical repertoire including idiomatic expressions and colloquialisms)
  Revision: Revised as "Has a good command of a very broad business lexical repertoire"
  Rationale: To delete inappropriate examples


References

Batty, A. O. (2015). A comparison of video- and audio-mediated listening tests with many-facet Rasch modeling and differential distractor functioning. Language Testing, 32(1), 3–20.
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences. Routledge.
Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed methods approaches. Sage.
Crocker, L., & Algina, J. (2008). Introduction to classical & modern test theory (2nd ed.). Cengage Learning.
Cronbach, L. (1990). Essentials of psychological testing. HarperCollins.
Eckes, T. (2011). Introduction to many-facet Rasch measurement. Peter Lang.
Elder, C., & McNamara, T. (2016). The hunt for "indigenous criteria" in assessing communication in the physiotherapy workplace. Language Testing, 33(2), 153–174.
Goetz, J. P., & LeCompte, M. D. (1984). Ethnography and qualitative design in educational research. Academic Press.
Kaftandjieva, F., & Takala, S. (2002). Council of Europe scales of language proficiency: A validation study. In J. C. Alderson (Ed.), Common European framework of reference for languages: Learning, teaching, assessment: Case studies (pp. 106–129). Council of Europe.
Linacre, J. M. (2015). Facets Rasch measurement computer program [Computer software]. Winsteps.
McNamara, T. (1996). Measuring second language performance. Addison Wesley Longman.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: A sourcebook of new methods. Sage.
Myford, C. M., & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4(4), 386–422.
Patton, M. Q. (1990). Qualitative evaluation and research methods. Sage.
Richards, K. (2003). Qualitative inquiry in TESOL. Palgrave Macmillan.
Tashakkori, A., & Teddlie, C. (Eds.). (2003). Handbook of mixed methods in social and behavioral research. Sage.
Watson-Gegeo, K. A. (1988). Ethnography in ESL: Defining the essentials. TESOL Quarterly, 22(4), 575–592.
Weigle, S. (1998). Using FACETS to model rater training effects. Language Testing, 15(2), 263–287.
Winke, P., Gass, S., & Myford, C. (2013). Raters' L2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231–252.

Chapter 9

Conclusions

9.1 A Recap of the Study

The overarching purpose of this study was to develop and validate a scale of BE writing proficiency which (1) bore close relation to the theory-based and activity-based categories used for the description of BE writing proficiency, and (2) consisted of level descriptions calibrated by the Rasch measurement model. To this end, a mixed-methods design integrating intuitive, qualitative and quantitative data collection methods was adopted. Specifically, the study was carried out in four phases.

In the first phase, a descriptive scheme of the scale was established. Guided by the principle that a valid scale of language proficiency should be both theoretically grounded and practically useful (North, 2000), the categories of the descriptive scheme were determined on the basis of (1) the model of Genre Knowledge (Tardy, 2009), and (2) a comprehensive review of the literature on typical BE writing activities. Specifically, the four dimensions of the model of Genre Knowledge were operationalized as nine sub-categories of the descriptive scheme: Vocabulary, Grammar, Orthographical control, Genre format, Cohesion and coherence, Strategic competence, Sociolinguistic competence, Intercultural competence and Business knowledge. The literature review of typical business written genres contributed ten scale categories to the descriptive scheme: Letter, E-mail, Report, Memo, Minutes, Note, Press release, Resume, Summary and Documentary materials. A multi-layer descriptive scheme incorporating 19 descriptive categories was hence generated. Meanwhile, a pool of potentially useful descriptors was collected from sources such as existing language proficiency scales, BE curricular requirements, BE writing textbooks, and BE test handbooks.
In the second phase of the study, two workshops were organized in which 15 experienced university teachers of BE writing examined the relevance of the descriptive scheme and the descriptors to the Chinese tertiary context. In the first workshop, training was provided to orient the teacher participants to the purpose of

© Springer Nature Singapore Pte Ltd. 2021
L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7_9


the study, and to familiarize them with the concepts and functions associated with language proficiency scales. The teachers then discussed whether the categories of the descriptive scheme served as useful parameters for describing BE writing proficiency, and made necessary revisions to the descriptive scheme. The revised descriptive scheme contained 17 scale categories, placed under the headings of Genre Knowledge (Vocabulary, Grammar, Orthographical control, Genre format, Cohesion and coherence, Strategic competence, Sociolinguistic competence, Intercultural competence and Business knowledge) and BE Writing Activities (Letter, E-mail, Report, Memo, Minutes, Note, News release, Resume and Summary). In the second workshop, which spanned three sessions, the teachers examined the quality of the descriptors individually, in small groups, and as a whole. They scrutinized all the descriptors through iterative rounds of discussion and made necessary revisions, yielding 93 high-quality descriptors for use in the next phase of the study.

In the third phase of the study, the descriptive scheme containing 17 descriptive categories and the 93 descriptors yielded by the teacher workshops were used to construct a questionnaire. A total of 572 students who had completed their training in BE writing in Chinese universities were invited to rate their perceived difficulty of the descriptors. The questionnaire data were then calibrated using the Rasch model. After a detailed examination of the Rasch analysis results, including the item-person map, separation indices, unidimensionality analysis, and item fit statistics, 86 descriptors that demonstrated robust measurement properties were differentiated into three proficiency levels according to their logit values.
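The Rasch calibration underlying this phase can be stated compactly. For a rating-scale analysis of the questionnaire data, the log-odds of person n endorsing category k over category k−1 on descriptor i is modeled as follows (a generic statement of the rating scale model in standard notation, not the study's exact specification):

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = \theta_n - \delta_i - \tau_k
```

where θ_n is the person's self-assessed proficiency, δ_i the difficulty of descriptor i (the logit value used for level assignment), and τ_k the threshold of rating category k. Adding a rater severity term α_j to the right-hand side yields the many-facet extension applied to the expert ratings in the validation phase.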
After a content analysis of the descriptors across the proficiency levels, they were put under their respective categories, yielding 16 illustrative scales of BE writing proficiency. In addition, a "global scale" was developed to provide a general overview of the proficiency levels.

In the final phase of the study, the BE writing scale was validated by a group of experts, including five teaching experts and five domain experts. Specifically, the ten participants completed questionnaires consisting of the descriptors of the BE writing scale, followed by one-on-one interviews. This phase aimed to explore (1) whether the experts considered the descriptors appropriately categorized, and (2) whether their perception of the difficulty level of each descriptor was well aligned with that set in the original scale. In terms of descriptor categorization, analysis of the quantitative and qualitative data revealed that the experts generally considered the categorization of the descriptors appropriate, with the exception of only four descriptors, which were revised in accordance with the experts' suggestions. With regard to descriptor level assignment, the experts' ratings of descriptor difficulty were analyzed using the many-facet Rasch model. After examination of inter-rater reliability (rater consistency and severity), bias analysis and item fit, three proficiency bands were created on the basis of the logit values of the descriptors, which were then compared with the descriptors' original level assignment in the BE writing scale. The results show that the proficiency levels of 77 (89.5%) descriptors corresponded with those in the original scale, indicating a satisfactory degree of level consistency. After exploring the interview data,


necessary revisions were made, yielding the validated version of the BE writing proficiency scale.

9.2 Presentation of the Scale

The final version of the scale consists of (1) a descriptive scheme (see Fig. 9.1) encompassing 15 descriptive categories subsumed under the two broad categories of Genre Knowledge and BE Writing Activities, and (2) 80 descriptors of BE writing proficiency spanning three consecutive levels: A (The Lower-Intermediate level), B (The Intermediate level) and C (The Advanced level).

[Fig. 9.1 The descriptive scheme of the validated BE writing scale. The figure maps Genre Knowledge, organized by Formal knowledge, Process knowledge, Rhetorical knowledge and Subject-matter knowledge, onto the sub-categories Vocabulary, Grammar, Mechanics, Genre format, Cohesion & coherence, Strategic competence, Sociolinguistic competence, Intercultural competence and Business knowledge; Typical Writing Activities covers Letter & E-mail, Report, Minutes, Memo, Summary and Resume.]

The category of Genre Knowledge contains nine sub-categories: Vocabulary, Grammar, Mechanics, Genre format, Cohesion and coherence, Strategic competence, Sociolinguistic competence, Intercultural competence and Business knowledge. The category of BE Writing Activities contains six sub-categories: Letter and E-mail, Report, Memo, Minutes, Summary and Resume. Defining BE writing proficiency mainly in terms of what learners can do, the scale is presented as one global scale (see Table 9.1) holistically summarizing BE writing proficiency at three levels, and 15 illustrative scales analytically delineating BE writing proficiency from the perspective of its many constituents (see Tables 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 9.10, 9.11, 9.12, 9.13, 9.14, 9.15 and 9.16).

9.2.1 Presentation of the Global Scale

A global scale (see Table 9.1) is established to provide an overview of the stages through which BE learners are expected to progress. It is made up of the descriptors deemed most effective in reflecting the features of each proficiency band, under the headings of "Genre Knowledge" and "Typical Writing Activities" at each level.

9.2.2 Presentation of the Illustrative Scales

9.2.2.1 Illustrative Scales Under the Category of Genre Knowledge

The illustrative scale of "Vocabulary" (see Table 9.2) profiles the lexical elements that BE learners will need or be required to master across the three proficiency levels. The lexical elements emphasized in the scale include business terminology as well as common BE acronyms and abbreviations. The ability to differentiate subtle differences between synonyms is highlighted in particular at The Advanced level.

The illustrative scale of "Grammar" (see Table 9.3) profiles the learners' grammatical competence, namely their knowledge of and ability to use the grammatical resources of English in conducting business communication. Both grammatical accuracy and syntactic complexity are emphasized in this illustrative scale. At The Advanced level, the learners' ability to make appropriate grammatical choices to suit communicative purposes is stressed, as opposed to a sole emphasis on linguistic complexity, since a highly skillful BE writer is expected to adjust the syntactic complexity of his or her written discourse to contextual factors such as the purpose of writing and the formality of the occasion (Pawlak et al., 2020).


Table 9.1 Global scale of the BE writing scale

Advanced
  Genre knowledge: Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings. Can adopt appropriate strategies to handle messages that are likely to arouse negative feelings in the readers. Appreciates fully the effects of social factors (e.g., power relations) on language use and can react accordingly. Demonstrates comprehensive understanding of the political, economic, historical, geographical, scientific and technological, and cultural characteristics of different nations.
  Typical writing activities: Can write a formal report that is analytical in nature to present a case or give a critical appreciation of proposals. Can summarize information from different sources, reconstructing arguments and accounts in a coherent presentation of the discourse.

Intermediate
  Genre knowledge: Demonstrates an awareness of current business terminology. Can make proper use of visual aids such as bar graphs, tables and line charts to complement or reinforce information loaded in words. Can adjust the level of difficulty and complexity of a business text appropriately to the writing occasion. Has a sound understanding of general business topics, such as promotional strategies, conferences, business services, recruitment, etc.
  Typical writing activities: Can write a(n) non-routine letter/E-mail, provided that it is related to his/her work or field of expertise. Can write minutes as a list of key points during a clearly structured meeting on a familiar subject.

Lower-Intermediate
  Genre knowledge: Has a basic vocabulary repertoire of isolated words and phrases related to general business topics. Can produce business texts by imitating existing samples. Can establish basic business contact by simply but effectively using common polite expressions. Has a basic understanding of common daily office communications.
  Typical writing activities: Can compose a memo relaying simple information such as a meeting notice. Can fill out job application forms with short comments on previous educational and working experience, etc.

9 Conclusions

Table 9.2 Illustrative scale—vocabulary

Advanced
Can use a wide range of business vocabulary fluently and flexibly to convey precise meanings. Has a good command of a very broad business lexical repertoire. Demonstrates satisfactory control over synonyms such as merger, consolidation and acquisition, especially their connotative levels of meaning in the business environment

Intermediate
Demonstrates familiarity with business specialist terms (e.g., offer, bill), whose meanings are very different when they are used in the non-business context. Demonstrates an awareness of current business terminology

Lower intermediate
Has a basic vocabulary repertoire of isolated words and phrases related to general business topics. Demonstrates familiarity with common acronyms and abbreviations used in business documents such as HR and WTO

Table 9.3 Illustrative scale—grammar

Advanced
Demonstrates consistent grammatical control of complex structures. Can employ a full range of grammatical structures with full flexibility and accuracy. For instance, can make skillful grammatical choices to enhance the overall communicative effectiveness of a business text

Intermediate
No grammatical structures are missing from the learner’s repertoire; errors of grammar are fairly rare, and are often picked up and corrected if he/she reviews the text. Communicates with reasonable grammatical accuracy in familiar contexts; apparent Chinese language influence does not lead to misunderstanding

Lower intermediate
Shows control of a few simple grammatical structures and sentence patterns in a learned repertoire. Can use some simple structures correctly, but still systematically makes basic mistakes—for example, tends to mix up tenses and forget to mark agreement

Table 9.4 Illustrative scale—mechanics

Advanced
Spelling is accurate even without access to additional resources such as a dictionary

Intermediate
Spelling is correct, apart from occasional slips of the pen. Demonstrates control of the capitalization rules in business written communication. Demonstrates control of punctuation conventions observed in business English documentation

Lower intermediate
Can write with reasonable phonetic accuracy (but not necessarily fully standard spelling) short words that are in his/her oral vocabulary. Can spell his/her address, occupation and other personal details

The illustrative scale of “Mechanics” (see Table 9.4) involves descriptions of the learners’ ability to produce correct written symbols while engaging in BE writing activities. In addition to common areas that should be attended to such as the correct spelling of words, difficulties that are often encountered by Chinese students due to their Chinese writing habits are also highlighted, such as awareness of the differences between Chinese and English punctuation conventions.

Table 9.5 Illustrative scale—genre format

Advanced
Is aware of the essential and optional components of different business genres and can make appropriate selections in response to a variety of situations. Can adopt an appropriate format when transferring data taken from different genres. For instance, can write a formal report generated from a series of pieces of information written in an informal style

Intermediate
Is aware of the essential components of different business genres. Demonstrates the ability to reproduce or synthesize information from a wide range of visual, tabular and diagrammatic sources in concise written language with appropriate format

Lower intermediate
Is aware of business writing conventions regarding layout and paragraphing

Table 9.6 Illustrative scale—cohesion and coherence

Advanced
Can create coherent and cohesive business text, making full and appropriate use of a variety of organizational patterns and a wide range of cohesive devices. Can produce clear, smoothly flowing, well-structured business text, showing controlled use of organizational patterns, connectors and cohesive devices

Intermediate
Can provide a logical order to the content of the writing and demonstrate clear progression throughout. Can use a variety of cohesive devices efficiently to mark clearly the relationships between ideas

Lower intermediate
Can use a limited number of cohesive devices to link his/her writing into clear, coherent discourse, though there may be some “jumpiness” in a long discourse. Can use the most frequently occurring connectors like and, but, so to link simple sentences

Table 9.7 Illustrative scale—strategic competence

Advanced
Can adopt appropriate strategies to handle messages that are likely to arouse negative feelings in the readers. For instance, can provide additional expository information beyond a simple negative message to help readers better connect with the intended message

Intermediate
Can locate and incorporate available materials to enhance the overall effectiveness of the writing. For instance, can make appropriate use of quotations from an authoritative organization to enhance the effectiveness of the writing without losing its apparent objectivity. Can make proper use of visual aids such as bar graphs, tables and line charts to complement or reinforce information conveyed in words. Can design and present information in a way that highlights the most important messages. For instance, can selectively apply bullets or numbered points to call attention to important points. Can effectively structure the business text so as to facilitate reader comprehension

Lower intermediate
Can produce business texts by imitating existing samples

Table 9.8 Illustrative scale—sociolinguistic competence

Advanced
Appreciates fully the effects of social factors (e.g. power relations) on language use and can react accordingly. Can express him or herself confidently, clearly and politely in a formal or informal business register, appropriate to the purposes and audience(s) concerned

Intermediate
Can adjust the level of difficulty and complexity of business text appropriately to the writing occasion. For instance, can drop all technical jargon when the reader is an outsider to the field. Shows a high level of appreciation of register. For instance, can adopt a conversational tone where the situation calls for informal communication. Is aware of the salient politeness conventions and acts appropriately. For instance, can keep discriminatory or derogatory language (e.g., sexist salutations) out of one’s business communications

Lower intermediate
Can establish basic business contact by simply but effectively using common polite expressions

Table 9.9 Illustrative scale—intercultural competence

Advanced
Demonstrates comprehensive understanding of the political, economic, historical, geographical, scientific and technological, and cultural characteristics of different nations. Is familiar with the intercultural connotations of the symbols, abbreviations and acronyms used in the business environment

Intermediate
Can correctly and flexibly express subjective feelings and attitudes to users of English from different countries without unintentionally amusing or irritating them. Can consciously look out for signs of the most significant differences between the customs, attitudes, values and beliefs prevalent in those cultures and his/her own, to ensure smooth intercultural communication. Can overcome fear and anxiety and cope with cultural conflicts by means of understanding and analyzing cultural differences

Lower intermediate
Has rudimentary knowledge of foreign cultures and customs. Has preliminary intercultural knowledge, avoiding both prejudice against and blind adoption of foreign cultures

Table 9.10 Illustrative scale—business knowledge

Advanced
Has a systematic knowledge of business subjects such as economics, human resources, international business management, international business law, international finance and business ethics. Is familiar with corporate operations and management in terms of planning, scheduling, reporting, etc. For instance, can handle payment issues and describe corporate performance results

Intermediate
Has a sound understanding of general business topics, such as promotional strategies, conferences, business services, recruitment, etc. Has a preliminary understanding of fundamental business concepts such as advertising, branding and product promotion

Lower intermediate
Has a basic understanding of common daily office communications

Table 9.11 Illustrative scale—letter and E-mail

Advanced
Can write letters/E-mails to persuade the receiver to take certain actions. For instance, can provide information about a new product, demonstrating to the customers how well that product will fill a need they truly have

Intermediate
Can write a business letter/E-mail for an organization, concerning straightforward business issues. Can write business letters/E-mails that carry negative messages. For instance, can write a formal letter of complaint about products or services. Can write non-routine letters/E-mails, provided that they are related to his/her work or field of expertise. Can write letters/E-mails to convey messages of a factual nature, such as a letter providing information about a new product

Lower intermediate
Can handle basic social correspondence on everyday business topics. For instance, can write a semi-formal reminder letter of a previous arrangement made with his/her client

Table 9.12 Illustrative scale—report

Advanced
Can write a formal report that is analytical in nature to present a case or give a critical appreciation of proposals. For instance, can write a feasibility report which provides data, analyses and recommendations to win approval of a project. Can write a report that synthesizes a large amount of data and complex information, collected through methods such as questionnaires and interviews. For instance, can coherently present information and state a position on a previously researched topic

Intermediate
Can write a report in response to requests for information. For example, can write a series of progress reports on a regular basis to provide information on the progress of a project. Can write a business report which develops an argument, giving reasons in support of or against a particular point of view and explaining the advantages and disadvantages of various options

Lower intermediate
Can write very simple reports which pass on routine information

Table 9.13 Illustrative scale—minutes

Advanced
Can write detailed minutes during a meeting on topics in his/her field of interest, recording the information so close to the original that the minutes could also be useful to other people

Intermediate
Can write minutes as a list of key points during a clearly structured meeting on a familiar subject. Can summarize and synthesize the main points advanced by different participants during the meeting in the form of minutes and convert the conversational style into a formal narrative style

Lower intermediate
Can write minutes on a routine meeting covering the most important points, even though some information might be missing

Table 9.14 Illustrative scale—memo

Advanced
Can compose fairly long memos to record important events, such as a memo for an international business negotiation

Intermediate
Can write memos for a variety of purposes on routine business situations or activities, such as announcing a new corporate policy

Lower intermediate
Can write a memo relaying simple information (e.g. a meeting notice) in a business context

The illustrative scale of “Genre format” (see Table 9.5) is concerned with the learners’ ability to produce generically appropriate texts in business settings. The focus is placed particularly on their knowledge of the structure or format of BE written genres that are commonly recognizable and typically observed by experienced writers of the business community. To this end, knowledge of the moves associated with typical BE written genres, in addition to the ability to differentiate the essential parts of a genre from less important parts, is emphasized in this category.

The illustrative scale of “Cohesion & coherence” (see Table 9.6) consists of level descriptions of the learners’ ability to produce business texts that are clearly structured and smoothly flowing. The ability to draw on a wide array of cohesive devices such as connectors and transition markers is accentuated in this scale.

The illustrative scale of “Strategic competence” (see Table 9.7) involves level descriptions of the ability to marshal and balance various resources in order to maximize communicative effectiveness when producing a business genre. As the visual presentation of messages matters greatly in business communication, the ability to highlight important messages with nonverbal cues such as bullet points and diagrammatic tools is emphasized. In addition, dealing with negative messages constitutes one of the major challenges that BE writers are likely to encounter; therefore, the ability to handle such information skillfully features prominently at the Advanced level of the scale.

The illustrative scale of “Sociolinguistic competence” (see Table 9.8) describes the learners’ ability to use linguistic codes appropriately in a social context. The use of language in accordance with social rules, be they the norms, values, beliefs or behavior patterns expected by a society, is reflected in the scale. The abilities to adjust one’s language to the purpose of writing, the formality of the occasion and the reader-writer relationship are particularly highlighted.

The illustrative scale of “Intercultural competence” (see Table 9.9) consists of level descriptions of the learners’ intercultural communication knowledge and skills. At the Lower-Intermediate level, writers are expected to demonstrate openness and tolerance towards cultural differences. At the Intermediate level, the ability to cope with cultural conflicts is highlighted. Mastery of a broad array of subjects germane to intercultural communication, such as politics, economics and history, is deemed necessary for advanced writers.

Table 9.15 Illustrative scale—resume

Advanced
Can write an effective personal resume, highlighting his/her special skills, strengths, achievements and qualifications to enhance his/her marketability

Intermediate
Can write a resume presenting factual personal information that is pertinent to the needs of the prospective employer. Can use the right, effective keywords to draw the attention of prospective employers who are relying on digital databases to search for qualified job candidates

Lower intermediate
Can fill out job application forms with short comments on previous educational and working experience, etc.

The illustrative scale of “Business Knowledge” (see Table 9.10) describes learners’ knowledge of the business discipline. Advanced learners should demonstrate a mastery of general knowledge areas such as economics, human resources, international business management, international business law, international finance, and business ethics. For Intermediate learners, a preliminary understanding of fundamental business concepts such as advertising is required. Knowledge of common business practices such as routine office communication is required for Lower-Intermediate learners.

9.2.2.2

Illustrative Scales Under the Category of BE Writing Activities

The illustrative scale of “Letter & E-mail” (see Table 9.11) describes the learners’ ability to draft a wide array of letters and E-mails in the international business context. Advanced learners should be able to write letters/E-mails that perform the function of persuasion. At the Intermediate level, the ability to convey negative messages in letters/E-mails is stressed. Learners at the Lower-Intermediate level should be able to handle basic social correspondence on simple, routine business topics.

The illustrative scale of “Report” (see Table 9.12) consists of level descriptions of the learners’ ability to produce English reports. The key elements involved in producing effective reports, such as evaluating proposals and making valid arguments, are emphasized throughout the levels. The ability to synthesize empirical data collected from various sources is particularly highlighted at the Advanced level.

The illustrative scale of “Minutes” (see Table 9.13) consists of level descriptions of the learners’ ability to produce minutes in the context of business meetings. Learners are expected to record the key information discussed in a meeting as accurately as possible. In addition, the ability to convert the spoken style of meeting discussions into narratives appropriate to minutes is also accentuated in the scale.

The illustrative scale of “Memo” (see Table 9.14) describes the learners’ ability to produce memos on formal and informal occasions in the international business context. The content of the memos ranges from very simple to fairly complicated across the proficiency levels in the illustrative scale. The illustrative scale of “Resume” (see Table 9.15) consists of level descriptions

Table 9.16 Illustrative scale—summary

Advanced
Can summarize information from different sources, reconstructing arguments and accounts in a coherent presentation of the discourse. Can summarize extracts from business news items, interviews or documentaries containing opinions, arguments and discussions

Intermediate
Can summarize from a passage/passages of business text to produce, for example, a list of structured notes. Can summarize a wide range of texts of general commercial nature, commenting on and discussing contrasting points of view and the main themes

Lower intermediate
Can reproduce keywords and phrases from a short text on subjects related to general business topics

about the learners’ ability to produce resumes for job-seeking purposes. At the Lower-Intermediate level, learners can fill in simple forms and offer short comments on previous educational or working experience. At the Intermediate level, learners are expected to present factual information about themselves. At the Advanced level, selective presentation of information about oneself to highlight one’s strengths is required. The illustrative scale of “Summary” (see Table 9.16) consists of level descriptions of the learners’ ability to produce written summaries of business discourses of various kinds. Task difficulty increases with the complexity of the original texts across the three proficiency levels.

9.3 Implications for ESP Scale Development

This study, aiming to develop and validate a scale of BE writing proficiency, is, to the best of our knowledge, among the few attempts to explore the development of proficiency scales in the ESP domain. The theoretical, methodological and practical implications of this study are detailed in this section.

9.3.1 Theoretical Implications: Understanding the BE Writing Construct

According to North (2000), it is essential that the development of language proficiency scales be grounded in theory in order to safeguard their validity. In fact, many high-profile projects aiming to develop language proficiency scales (e.g., the CEFR and the CSE) have resorted to models of communicative competence (e.g., Canale & Swain, 1980) as the basis for profiling language proficiency. It is believed that these theoretical models, due to their generic and
context-independent nature, will render the language scales more generalizable and transferable (Knoch, 2011). In this study, we argue that models of communicative competence are by default oriented to general English proficiency, so their applicability to the description of an ESP construct such as BE writing is questionable. For this reason, the construct of BE writing is conceptualized here from the perspective of genre by adopting the model of Genre Knowledge (Tardy, 2009). The four dimensions in the model are operationalized as nine descriptive categories, namely Vocabulary, Grammar, Genre format, Cohesion and coherence, Mechanics, Strategic competence, Sociolinguistic competence, Intercultural competence, and Business knowledge. It is important to note that these categories encompass both the formal aspects of language use, such as grammar and vocabulary, and the nonlinguistic aspects of language use, such as subject-matter knowledge and intercultural competence.

Traditionally, nonlinguistic factors are seen as confounding the notion of second language proficiency and are usually treated as potential sources of measurement error. Accordingly, criteria for assessing written communication in professional settings are often derived from theories of language competence and have in practice largely drawn on traditional language-focused categories (e.g., Brunfaut, 2014; Douglas, 2001; Jacoby & McNamara, 1999). However, both the quantitative and qualitative findings of this study demonstrate that nonlinguistic factors are not only scalable but also crucial to the conceptualization of BE writing proficiency. This finding challenges the language-focused assessment criteria currently adopted by many BE writing tests and may serve as a springboard for establishing new criteria that are more relevant to what constitutes effective BE writing communication.

9.3.2 Methodological Implications: ESP Scale Development

Adopting a mixed-methods design integrating intuitive, qualitative and quantitative data collection techniques, the study features a dynamic interaction between the scale developer and prospective users and stands out as one of the few empirical endeavors to develop an ESP proficiency scale. Methodologically, the study has three implications for ESP proficiency scale development: (1) the collection and formulation of ESP descriptors, (2) the use of the Rasch measurement model to calibrate descriptor difficulty, and (3) the involvement of domain experts in the refinement of the scale.

To start with, this study provides insights into how descriptors can be collected and developed in constructing ESP proficiency scales. According to North (2000), identifying or drafting appropriate descriptors often poses a formidable challenge to proficiency scale developers. A common practice in new scale construction is hence to derive descriptors from existing scales. This method turned out to be inadequate for the current study because most existing language scales are EGP scales, whose descriptors rarely touch upon ESP proficiency. In addition, most existing
scales have been developed in English-speaking or ESL contexts, which may differ significantly from the context within which Chinese BE learners are likely to operate. In light of these problems, efforts were made to collect as many potentially useful descriptors as possible by exploring sources such as BE textbooks, proficiency statements in language curricula, and recruitment advertisements in China, in addition to existing language scales. To further enhance the validity of the collected descriptors, workshops were held in which experienced teachers of BE writing from Chinese universities worked together to revise and refine the descriptors so as to enhance their relevance to the Chinese tertiary context. This yielded a pool of high-quality descriptors that formed the cornerstone of the BE writing proficiency scale this study aimed to develop.

Second, this study demonstrates the usefulness of the Rasch measurement model in both the scale development and validation processes. Owing to several advantages over traditional data analysis based on Classical Test Theory (CTT), the Rasch model has been increasingly used in the field of language assessment (Fan & Knoch, 2019). A salient advantage of the Rasch model is “specific objectivity” (McNamara et al., 2019, p. 176), which renders item difficulty estimates independent of the person samples included in the analysis. This advantage makes it particularly appealing for calibrating the difficulty levels of the descriptors in language proficiency scales, a process commonly based on data collected from a variety of stakeholder groups (e.g., teachers, learners). In addition to item difficulty calibration, the Rasch model can identify descriptors that underperform statistically and should therefore be either revised or removed from the descriptor bank (North, 2000).
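As a point of reference, the dichotomous Rasch model underpinning such calibration can be written as follows (this is the standard formulation from the measurement literature, not a formula reproduced from the study itself):

```latex
% Probability that person n endorses (or succeeds on) descriptor i
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{e^{\,\theta_n - \delta_i}}{1 + e^{\,\theta_n - \delta_i}}
```

where \(\theta_n\) is the person parameter (here, a respondent's overall ability or severity) and \(\delta_i\) is the item parameter (here, a descriptor's difficulty). Specific objectivity follows directly: the log-odds difference between two descriptors, \((\theta_n - \delta_i) - (\theta_n - \delta_j) = \delta_j - \delta_i\), does not depend on \(\theta_n\), so descriptor difficulties can be compared independently of the particular persons sampled.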
In fact, the usefulness of the Rasch measurement model has been attested in the development of several language proficiency scales such as the CEFR and the CSE, and this study lends further support to its applicability to ESP proficiency scale development.

The third methodological implication of the study relates to the different perspectives of the teaching experts and the domain experts identified in the final, validation phase of the study. The findings show, for instance, that the domain experts were in general more sensitive to technical terms such as “sociolinguistic implications” and to the artificiality of a language task than the teaching experts, who placed more emphasis on the formal aspects of BE writing proficiency. In addition, the domain experts came closer to what McNamara (1996) defined as the strong approach to performance assessment, as they assessed descriptor difficulty in terms of task completion. The teaching experts, on the other hand, veered towards the weaker approach by giving priority to the description of language quality in the descriptors. These findings highlight the need to involve both domain experts and language experts in ESP scale development and validation, as ESP is interdisciplinary in nature. Indeed, in the field of language assessment, such collaboration has been increasingly emphasized, ranging from test design to rating scale development and validation. A host of studies, for instance, have explored the possibility of bringing together insights from workplace and pedagogical contexts for the validation of ESP rating scales (e.g.,
Jacoby & McNamara, 1999; Knoch, 2014). The current study serves as an important addition, as it demonstrates that such collaboration is also crucial to language proficiency scale validation.

9.3.3 Practical Implications: Implementation of the BE Writing Scale

Given the rapid growth of BE education in the tertiary context of China, as evidenced by the massive influx of courses and tests, this proficiency scale can serve as a common point of reference for the elaboration of BE writing syllabi, curricular guidelines and textbooks, and can facilitate discussion of central issues in the teaching and learning of BE writing. Specifically, the reference levels offer a systematic and accurate description of Chinese learners’ BE competence, against which course designers, teachers and materials writers can reflect on their current practices. Meanwhile, the description of expected learning outcomes in positively formulated terms can empower self-regulated learning and make learning progression a more transparent process.

Perhaps the most significant contribution of the scale lies in its ability to address the “Tower of Babel” problem in making sense of BE writing course certificates and test scores. The provision of objective criteria for describing BE writing proficiency will aid test score interpretation and accordingly facilitate mutual recognition of qualifications gained in different learning contexts. Potential users of the scale can thus situate and coordinate their efforts, opening up more employment opportunities and educational cooperation across geographical and institutional divides. In addition, the scale can be used as specifications for BE tests and examinations, and can enable companies and employers to explicate the writing abilities and skills they expect of their prospective or in-service employees.

It should be noted at this juncture that ESP writing sits along a continuum, ranging from English for general business purposes at one end to English for specific business purposes at the other.
The scale developed for this study is somewhat at the weak end of the specificity continuum, as it is designed to facilitate the teaching, learning and assessment of BE writing in the Chinese tertiary context. Considering the perceived gap between the academy and the workplace and the transferability of ESP education (Bhatia, 2004), adjustments to the scale may be desired if it is to be used in workplace settings where factors such as industry types and positions may demand differing descriptions of BE writing proficiency.

9.4 Limitations of the Study

Due to constraints of time and resources, the current study is exploratory and relatively small in scale, especially when compared with high-profile projects to develop language proficiency scales such as the CEFR (Council of Europe, 2001) and the CSE (Alderson, 2017). Though the study was carefully executed, its findings should be interpreted in light of the following two limitations.

The first limitation concerns the scale development phase of the study. Specifically, descriptor values of the draft scale were calibrated solely on the basis of university students’ evaluation of the difficulty level of each descriptor in the scale. Although Rasch analysis of the data yielded satisfactory statistical results, the involvement of other stakeholder groups, such as novice BE writers at work, would have provided a more comprehensive picture of the phenomenon under investigation. In addition, to better understand stakeholders’ perceptions and behaviors, qualitative data collection techniques should have been adopted to complement and explain the quantitative findings. For instance, teachers’ evaluation of test takers’ actual performance using the proficiency descriptors in the scale would contribute to a more in-depth understanding of the research issue.

The second limitation relates to the small sample size in the scale validation phase of the study. Due to financial and practical constraints, only ten teaching and domain experts were invited to participate in the survey. Incorporating the opinions of more stakeholders, with more diverse backgrounds such as language testing or materials writing, would have yielded more insights into the validity of the scale.

9.5 Areas for Future Research

Focusing on the writing skill of BE proficiency, this study has broadened our understanding of ESP scale development in general and BE scale development in particular. To arrive at a more comprehensive, coherent and nuanced description of BE proficiency, more research is warranted.

To start with, developing language proficiency scales is by no means a once-and-for-all endeavor. The new scale needs to be evaluated, revised and allowed to evolve along with the development of language scale construction theories and practices. It is thus particularly important to investigate how the scale might be perceived by stakeholder groups not included in the current study, such as employers, test designers and materials writers. Views and comments concerning the usefulness and appropriateness of the scale should be collected, which can then feed back into later revisions and refinements of the scale. The next logical step is to examine the extent to which the proficiency levels depicted in the scale accord with actual learner proficiency progression. Studies examining the alignment between the scale levels and BE learners’ performance on well-constructed language tests, for instance, are desirable.


Moreover, as this scale focuses solely on BE writing, future research is warranted to address the other skill areas involved in BE communication, such as speaking, reading, and listening, all of which are essential for BE education in the Chinese tertiary context. In addition, the extent to which BE scales developed in the Chinese context can be applied to other contexts is also worth investigating.

9.6 Concluding Remarks

The current study has developed and validated a scale of BE writing proficiency, which consists of a descriptive scheme for analyzing what is involved in BE writing, together with descriptions of BE writing proficiency at three consecutive levels. The triangulation of different methodologies in the study has engendered confidence in the scale and provided encouraging validity evidence for its use. It is hoped that the scale can serve as a common discourse within which central issues regarding the learning, teaching and assessment of BE writing can be fruitfully discussed among its end-users in China and similar EFL contexts.

Grounded in ESP research in general and BE research in particular, the study represents one of the few attempts to develop and validate a theoretically informed, data-driven and statistically calibrated BE writing proficiency scale for Chinese EFL learners. Although the study focuses only on the BE writing skill, it is illuminating in that it showcases how an ESP proficiency scale can be developed and validated, a topic rarely discussed in the literature on ESP assessment. It would also be interesting to examine the extent to which the procedures of scale development and validation reported in this study are applicable to other discipline- or occupation-specific domains. After all, as the scope of ESP programs continues to expand, it becomes even more crucial to undertake rigorous research on theory- and data-driven ESP proficiency scales as well as their applications to ESP teaching, learning, and assessment.

References

Alderson, J. C. (2017). Foreword to the special issue "The Common European Framework of Reference for Languages (CEFR) for English Language Assessment in China" of Language Testing in Asia. Language Testing in Asia.
Bhatia, V. K. (2004). Worlds of written discourse: A genre-based view. Continuum International.
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Lawrence Erlbaum Associates.
Brunfaut, T. (2014). Language for specific purposes: Current and future issues. Language Assessment Quarterly, 11(2), 216–225.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.


Douglas, D. (2001). Language for specific purposes assessment criteria: Where do they come from? Language Testing, 18(2), 171–185.
Fan, J., & Knoch, U. (2019). Fairness in language assessment: What can the Rasch model offer? Papers in Language Testing and Assessment, 8(2), 117–142.
Jacoby, S., & McNamara, T. (1999). Locating competence. English for Specific Purposes, 18(3), 213–241.
Knoch, U. (2011). Rating scales for diagnostic assessment of writing: What should they look like and where should the criteria come from? Assessing Writing, 16(2), 81–96.
Knoch, U. (2014). Using subject specialists to validate an ESP rating scale: The case of the International Civil Aviation Organization (ICAO) rating scale. English for Specific Purposes, 33, 77–86.
McNamara, T. (1996). Measuring second language performance. Addison Wesley Longman.
McNamara, T., Knoch, U., & Fan, J. (2019). Fairness, justice, and language assessment: The role of measurement. Oxford University Press.
North, B. (2000). The development of a common framework scale of language proficiency. Peter Lang.
Pawlak, M., Michalik, U., Sznicer, I., & Stwora, A. (2020). Exploring business language and culture. Springer.
Tardy, C. M. (2009). Building genre knowledge. Parlor Press.

Appendix

Sources for Descriptor Collection

Proficiency Scales

Foreign Service Institute Absolute Proficiency Ratings
Interagency Language Roundtable Language Skill Level Descriptions
International Second Language Proficiency Ratings (ISLPR)
American Council on the Teaching of Foreign Languages Proficiency Guidelines
The ALTE Framework
The British National Language Standards
The Eurocentres Scale of Language Proficiency
English Speaking Union Scale
Common European Framework of Reference
Canadian Language Benchmarks
International English Language Testing System (IELTS): Band Descriptors for the Writing Test
Hong Kong Workplace English Benchmarks

BE Websites

BESIG: www.besig.org
BBC Business: www.bbc.co.uk/business
Business Spotlight: www.business-spotlight.de
Mike Nelson's Business English Lexis Site: http://user.utu.fi/micnel/business_english_lexis_sites.htm
Delta Intercultural Academy: www.dialogin.com

© Springer Nature Singapore Pte Ltd. 2021 L. Wang and J. Fan, Working Towards a Proficiency Scale of Business English Writing: A Mixed-Methods Approach, https://doi.org/10.1007/978-981-16-5449-7


Businessballs: www.businessballs.com
How Stuff Works: www.howstuffworks.com
Macmillan Business English: www.businessenglishonline.net
Business English Pod: https://www.businessenglishpod.com/category/business-english/
British Council: www.britishcouncil.org/professionals-business-and-work.htm
Business English UK: www.businessenglishuk.org.uk
Market Leader: www.market-leader.net/index.html

BE Tests

English for Business (LCCIEB)
English for Commerce 1–3 (LCCIEB)
BEC: Preliminary, Vantage, Higher (Cambridge ESOL)
BULATS (Cambridge ESOL)
International Certificate in Financial English (City & Guilds)
English for Business Communications (City & Guilds)
TOEIC (ETS)
Test for Business English Majors (The Examination Centre for Business Majors in China)
English Test for International Communication (ETIC) (Beijing Foreign Studies University)

Curricular Requirements

The National Curriculum for BA Program in Business English of China
College English Curriculum Requirements of China
Vocational English Curriculum Requirements of China


BE Textbooks

Barker, A. (2006). Improve your communication skills (Creating success series). Kogan Page.
Chen, Z., & Chen, J. (2010). Business English writing. Higher Education Press.
Donna, S. (2000). Teach business English (Cambridge handbooks for language teachers). Cambridge University Press.
Frendo, E. (2005). How to teach business English (How to). Longman.
Fu, M. (2000). Contemporary business English writing. Beijing Institute of Technology Press.
Goddard, R. J. (2007). Teaching English for international business. Authors Online.
He, G. (2017). Writing in business English. Shanghai Education Press.
Hu, Y., & Che, L. Business English writing. Foreign Language Teaching and Research Press.
Jones, A., & Alexander, R. (2000). International business English teacher's book: A course in communication skills (Teachers' guide). Cambridge University Press.
Li, H. (2019). Practical business English writing course. China Renmin University Press.
Shi, D., & Cai, W. (2008). Business English. Beijing Institute of Technology Press.
Xian, X. (2009). An English course in business writing. Higher Education Press.
Xiong, K., & Chen, Y. (2003). Business English writing. China Renmin University Press.
Yang, L. Business English writing. China Renmin University Press.
Zhang, C., & Yu, M. Business writing. Higher Education Press.

Recruitment Websites

www.51job.com
www.zhaopin.com
