Transforming Assessment in Education: The Hidden World of Language Games 303126990X, 9783031269905

This book transforms our current understanding of assessment practice in different educational settings and cultures. D


English Pages 227 [228] Year 2023

Table of contents :
Foreword
Rediscovering the Foundations of Assessment
Reference
Acknowledgements
Introduction
The Theory and Science of Assessment
Assessment as Language Games, Noticing and Tacit Evaluative Knowledge
Assessment and Evaluation
The Argument of This Book: Bridging Student Assessment with Critical Realism and Theoretical Insights
The Act of Assessment
Structure of the Book
References
Contents
List of Figures
List of Tables
Chapter 1: How Might Critical Realism Extend Our Understanding of Assessment?
1.1 The Origins of Critical Realism
1.2 Key Concepts in Critical Realism
1.2.1 The Concept of the Dialectic
1.2.2 The Social Cube and the Structure–Agency Distinction
1.3 Methodology
1.4 Summary
References
Chapter 2: New Forms of Society: New Forms of Assessment
2.1 Assessment from Ancient Times
2.2 Modern Society
2.3 The Meaning of Competence in the Knowledge Society
2.4 Assessment Practices and the Mode of Extension in the Knowledge Society
2.5 Inclusion, Exclusion and the Diploma Disease
2.6 Closing Comments
References
Chapter 3: Assessment for Learning: Motorway or Dead End for Improved Learning Outcomes?
3.1 Origins
3.2 Defining Assessment for Learning
3.3 Assessment for Learning as a Motorway for Improved Learning Outcomes
3.4 Theories of Assessment for Learning
3.5 Feedback
3.6 Closing Comments
References
Chapter 4: Motivation, Learning and Assessment
4.1 Theories of Motivation and Learning
4.2 Enlarging the Index of Motivation
4.3 Grading Individuals
4.4 The Power and Responsibility of the Teacher Who Grades Individuals
4.5 Grading Group Work/Projects
4.6 Closing Comments
References
Chapter 5: Assessment as Connoisseurship
5.1 Creativity, Elitism and Democracy
5.2 Definition of Creativity
5.3 The Emergence of Creativity in an Educational Setting
5.4 Assessing Creativity
5.5 Assessment as Connoisseurship
5.6 Closing Comments
References
Chapter 6: Challenging the Culture of Formative Assessment: A Critical Appreciation of the Work of Royce Sadler
6.1 Understanding the Debate about Assessment of, for and as Learning
6.2 The Inspiration of Polanyi and Wittgenstein
6.3 Assessment Capital in a Knowledge Society
6.4 Assessment Judgements
6.5 Closing Comments
References
Chapter 7: Moving Assessment in New Directions
7.1 What Is Assessment?
7.2 What Can We Learn About the Relationship Between Society and Assessment Practices?
7.3 Why Is Motivation So Important in Learning and Assessment?
7.4 How We Value and Assess Ourselves and Others and with What Language
7.5 Transforming Assessment Practices: Challenges and Opportunities
References
Glossary
References


Author / Uploaded
Stephen Roderick Dobson
Fuad Arif Fudiyartanto


The Enabling Power of Assessment 10 Series Editor: Claire Wyatt-Smith

Stephen Roderick Dobson Fuad Arif Fudiyartanto

Transforming Assessment in Education The Hidden World of Language Games

The Enabling Power of Assessment Volume 10

Series Editor Claire Wyatt-Smith, Faculty of Education and Arts, Australian Catholic University Brisbane, QLD, Australia

This series heralds the idea that new times call for new and different thinking about assessment and learning, the identities of teachers and students, and what is involved in using and creating new knowledge. Its scope is consistent with a view of assessment as inherently connected with cultural, social practices and contexts. Assessment is a shared enterprise where teachers and students come together to not only develop knowledge and skills, but also to use and create knowledge and identities. Working from this position, the series confronts some of the major educational assessment issues of our times.


Stephen Roderick Dobson School of Education and the Arts Central Queensland University Rockhampton, QLD, Australia

Fuad Arif Fudiyartanto UIN Sunan Kalijaga Yogyakarta Yogyakarta, Indonesia

ISSN 2198-2643
ISSN 2198-2651 (electronic)
The Enabling Power of Assessment
ISBN 978-3-031-26990-5
ISBN 978-3-031-26991-2 (eBook)
https://doi.org/10.1007/978-3-031-26991-2

© Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Dedicated to Dekat, Abimanyu, Johannes, Ida, Kerensa, Ocean and Sonja – the next generation.

Foreword

Rediscovering the Foundations of Assessment

Assessment can be understood as an act of communication, as in a teacher assigning a grade to communicate something related to the student’s knowledge and skills. It is in this sense a language game, as Wittgenstein would have put it. Many handbooks and textbooks provide instructions on how to design tests and exams and share grades, explaining in detail how this communication should take place and what might be its direct repercussions on the teaching and learning processes both before and after the so-called assessment act. Few, if any, reflect on the philosophical meaning of the entire practice and on its possible implications at the individual and systemic levels. This book is centred on the pivotal idea that assessment can be a form of educational research in itself, based on the paradigm of critical realism, and shaped through the direct experience of teachers and students. The authors reflect on the philosophical and theoretical foundations of assessment and its practice and, in so doing, reshape its common definition as the opportunity to act upon available information. Rich in examples and references to relevant epistemological roots, with a particular emphasis on the work of Wittgenstein, this work provides a unified theory of the evaluative knowledge-building process, including a systematic review of its basic principles, shifting from the ontological perspective to the cultural connotations produced in different social contexts. Critical realism provides the platform to offer the reader a meta-theory connecting meaningful practices with generative mechanisms, and thus to underpin assessment acts and how they are perceived and intended by all agents: examiners, examinees, test developers, curriculum writers, policy makers and so on.
It moves from the distinction between the ontology of assessment, that is, what is being evaluated and thus known through assessment, and the epistemology of assessment, that is, the conditions in which this knowledge is accumulated and used. Even if assessment can be regarded as an act of communication, sometimes it can be difficult to describe an achievement level or a student’s performance and to clearly articulate the reasons why a judgement is as it is. To this end, the authors also consider the possibility of a sort of duality in assessment between explicit and implicit judgements, analysing the emerging meanings that arise from the tacit expression of a result and how those could be turned into an explicit and intentional act of meaning. Moreover, a part of the tacit knowledge that assessment can generate is due to the complexity of identifying and delimiting its object. In this sense, evaluating can be seen as a form of inferential reasoning that constantly needs to prove its validity, namely its relevance with respect to its object, and its accountability, to demonstrate the actual capability to describe latent constructs in a consistent, meaningful manner. Consequently, it cannot be reduced merely to a diagnostic function, useful to redirect educational design, nor to a way to (im)prove teaching effectiveness. Sadler’s well-rehearsed understanding of evaluative knowledge is reshaped by the authors’ tripartite definition of assessment and its purposes:

• Assessment of learning, or summative assessment, is only centred on what is observable.
• Assessment for learning attempts instead to move towards the tacit knowledge without being able to completely resolve it.
• Assessment as learning, by contrast, allows one to achieve the common tacit knowledge by means of imitation.

In the well-known definition by Wiliam (2010), assessment represents a bridge between learning and teaching. This implies the need to clarify key concepts and the foundational methods followed to shape the interpretation of the empirical world, in its possible regularities. This can be considered a research process, so that assessment and research can represent the two halves of a coherent unit under the commonalities provided by the critical realism paradigm. Nevertheless, a fundamental difference between assessment and research has to be taken into account.
Not all educational research approaches have a predetermined idea of the outcome that should preferably be achieved, whereas every assessment has a model beneath, often unexpressed, with a multi-layered structure, made of knowledge, skills, attitudes and values. In understanding the model that guides a specific assessment procedure, it is possible to grasp its transformative agency, moving individuals from what they are to what they are supposed to be. However, this shift is not trivial, and it has consequences both at the individual and at the social level. When assessment calls for rational control and accountability, it creates rankings, quantifying the distance between what is and what should be. In this respect, this book describes in an insightful way the implications of assessment for the bildung (formation) of the individual, in terms of self-esteem and motivation, two fundamental components of learning, and at a broader and systemic level, for example considering the implications of large-scale assessment studies on national curricula and teaching practices. Simply put, assessment creates people. Finally, the book casts a light on the concept of assessment habitus, which is usually tacitly gained from the social practice in institutions and includes shared values or dispositions descending from the cultural context considered. To this end, it provides a path across centuries, reporting the assessment procedures from the ancient Greeks up to the most recent trends in a global perspective, and offering direct examples at different levels from Australia, Norway, New Zealand and many other places. The assessment habitus inspired by Bourdieu’s work on social, economic and cultural capital comes to mind and Chap. 6 elaborates briefly upon what is an innovative concept. Assessment capital signifies a specific subtype of cultural capital, possessed both by teachers and students, consisting in the knowledge and skills needed to perform high-quality assessment acts and the capacity to carry out a self-assessment of their own teaching and learning. Again, this set of knowledge and skills can be unspoken and implicit, whereas it is exactly in the opportunity of reflecting on learning in a systematic and explicit manner that we are offered a chance to progress towards a deeper understanding. In the chapters that follow, the reader will find the elaboration of a platform to move us in this direction and, in so doing, to gain a greater understanding of assessment as language games with endlessly different cultural twists and permutations.

Gabriella Agrusti
Lumsa University
Lumsa, Italy

Reference

Wiliam, D. (2010). The role of formative assessment in effective learning environments. In H. Dumont, D. Istance, & F. Benavides (Eds.), The nature of learning: Using research to inspire practice (pp. 135–159). OECD Publishing.

Acknowledgements

We owe our gratitude to the untiring support of our families. We also thank the members of the academic and professional community who have provided useful comments and advice: Professors Stobart, Agrusti, Dawson, Steinsholt, Johnston, along with Associate Professors Pak Zuhdi, Svoen, Stahl, Higgins, Gran, Engh, Hartberg, Caramelo, Pinto, Damiani, Combra, Pak Al-Makin, and Ibu Suwarsih Madya. For editorial assistance our thanks go to Kate Leeson and Pam Berry. Last, but not least, we would also like to thank all students and teachers who have attended our courses in assessment, where many of these ideas have been explored and enriched. All translations from Bahasa to English and from Norwegian to English are our own. The origin of this book dates to conversations with Professor Erling Lars Dale, one of Norway’s most prolific and influential educators for several decades. His support led to the decision to weave in critical realism and he remains the inspirational voice in this book.


Introduction

The formation of one’s identity is what remains after we have forgotten everything we have learnt (Ellen Key, as cited in Steinsholt and Dobson 2011, p. 7).

It is not uncommon for teachers across early childhood, primary or secondary schools, or lecturers in vocational, higher education and lifelong learning to suggest that they cannot define quality but know when they encounter it in the performance of a student. An even greater silence might be encountered if they are asked to offer an account and theoretical justification of what they understand to be going on as they make such an assessment. This involves making explicit what is normally considered tacit. Put differently, it is the attempt to theorise assessment and include all its elements, both the tacit experience and the explicit aspects of referencing marking rubrics and the movement between the two. The very mention of the term assessment theory will wrinkle the forehead of many experienced teachers, and even some researchers and policy makers in the field of assessment. Sadler reflected upon this challenge of using assessment criteria to account for and justify one’s assessment practices:

A work judged as ‘brilliant’ overall may not rate as outstanding on each criterion. This would be necessary, logically and arithmetically, for the work to be assigned the top grade. Conversely, another work that comes out well on each criterion may be judged as only mediocre overall (2008, p. 165).

The global practice and theoretical accounts of student assessment pose what seems to be an ever-increasing number of potentially intractable, wicked challenges. They cover a number of assessment topics and challenges: how to use IQ and national standardised tests even though they may not directly tap domain-specific curriculum knowledge and skills; the meaning of international PISA test results for national assessment policy and classroom practice; a real-time understanding with data of how the ethnic, gender and class background of students can influence performance in the short, medium and longer term; how assessment for learning can be practised in relation to, but without merely becoming assessment of, learning; and the distortion and loss of distinct national or local cultures of assessment under the influence of transnational practices of assessment (Alarcón & Lawn, 2018). The list is much longer and has of course to consider what we have learnt from the experience of COVID-19 for learners around the world. How can we engage and motivate students for shorter or longer periods of time when they peer or squint at the teacher and their school friends through a screen? What are the memories of learning and even assessment? What of that impossibly difficult question: how can we measure the learning loss if students have not learnt it in the first place?

It is customary to think of assessment as a high-stress, high-stakes activity where students either qualify or fail, carrying with them the mark of their performance. It can easily become part of their identity and sense of self-esteem. However, it is equally important to understand that much assessment is of a low-stakes quality, such as the regular end of the week spelling test or the teacher’s comment mid lesson, ‘you are all making good progress.’ For the teacher the classroom can be a site of ‘continuous evaluation’ (Bernstein, 1971, p. 36), such that assessment pervades the whole educational context. There is also a third form of educational assessment, one that is more spontaneous and unplanned, responding to the unforeseen. It encompasses the need to act on ‘moments of contingency’ (Black & Wiliam, 2009) and make snapshot assessments.¹ In a snapshot, the eye of the beholder, the teacher, can assume a position of all-powerful dominance, noticing the students’ bodies, utterances and performances, and judging in a split second whether they meet a more or less explicitly defined threshold of competency. Syed (2010, p. 43) has appropriately called this ‘chunking’ perceptual information into manageable units in order to be in a position to make an assessment.

In this book we are interested in taking a step back and reflecting upon the philosophical and theoretical foundations of assessment and its practice. In our opinion there are and have been relatively few philosophers of assessment.
Perhaps this is because the field has been dominated by the needs and experiences of practitioners in the classroom or the needs of those designing and implementing tests and exams to meet curriculum imperatives. The philosophers of assessment whom we have been inspired by are Royce Sadler in Australia, Andrew Davis in England and Samuel Messick in the USA. There are of course others, but Sadler helps ask those difficult and yet fundamental questions about assessment and the role of tacit evaluative knowledge; Davis, with an interest in analytical philosophy, offers an opportunity to delimit and conceptualise what we call the ‘act of assessment’; and the work of Messick remains central in the theorisation of assessment, specifically in how the quality of assessment can be determined by utilising assessment principles, such as an understanding of assessment validity. From the outset we use the term philosopher of assessment to include those who reflect not merely upon theories that encompass what is going on before, during and after assessment acts (e.g. theories of motivation and theories of technological support and measurement), but those who additionally direct attention toward the theoretical and conceptual foundations of assessment and the conditions required to provide sound scientific knowledge and evidence of assessment practice and the decision-making process it requires. These concerns betray an interest in a meta-reflection on assessment. Firstly, we are interested in how a greater theoretical understanding of assessment practices and their accompanying assessment acts can be achieved by drawing upon a limited number of what we would call the most often noted assessment principles: validity, reliability, fairness (framed in terms of justice and equity), transparency and accountability.² It is also possible to make the case that sustainability and/or manageability are assessment principles. Secondly, we focus on how theoretical concepts derived from critical realism and theorists such as Wittgenstein might be mobilised and bridged with assessment concepts (e.g. assessment for learning, the assessor as connoisseur or the understanding of tacit evaluative knowledge) to provide scientific knowledge (epistemology) of assessment practices and the structures and mechanisms that bring about their existence (ontology). All assessment is based on acting upon available information about the students and entering into an inferential process (Bennett, 2011). We cannot always know with certainty what is going on inside the head of a student.³ It is also a challenge to arrive at and conclude that our assessment judgements are valid and reliable across contexts, especially with regard to students with special needs and access requirements or coming from different socio-economic and gender backgrounds. Moreover, the moment of judgement is crucial, and it is never culturally free. Following this point, it is surely the case that a) assessment acts are steeped in cultures that are marked by traditions and thus contain an in-built inertia with respect to change and innovation, and b) understanding them requires insight into the role played by tacit and not merely explicitly voiced agreements by participants.
In this book we will directly address this challenge in Chap. 6 when we consider how to bridge tacitly understood assessment and attempt to make it explicit. Put differently, not only is it important to understand the manner in which assessment practices are marked by questions of validity, reliability and fairness, but participants need assessment to aspire to and meet standards of transparency and accountability. Assessment and its acts are part of a process. In our view, a test, to take an example, is never valid, reliable or transparent from the outset in a once and for all manner. Only in the course of the assessment act, when stakeholders such as students and assessors are involved, does it become possible to assign degrees of validity, reliability and transparency. Put differently, such a line of argumentation considers assessment to be a social process marked by its openness with regard to the actions of stakeholders. This is the case despite the institutional, professional and individual baggage and experiences participants bring to acts of assessment.

¹ The term used in Dobson’s co-authored Norwegian language book Feedback i skolen [Feedback in the school] (Hartberg et al. 2012) is ‘blikk for øyeblikk’ (awareness of the moment).

² Sometimes manageability and sustainability are referenced as assessment principles, especially in the wake of the UNESCO (2015) Sustainable Development Goals.

³ Of course, if the assessed students ‘talk aloud’ as they complete the assessment, as in for example an apprentice explaining how and why they change the brakes on a car, we do know this.

Among assessment practitioners, there are other working principles that complete the conceptual toolkit. The first is fairness and its accompanying norms of equity and justice. Students are quick to compare the manner in which two teachers in the same subject, let us say English literature, might grade their classes differently. One might be regarded as too strict and the other as an easy mark for a higher grade. This raises the issue of distributive justice, comparing assessors, and also procedural justice (Broadfoot, 2005) in the sense that the assessors might potentially take into account different forms of evidence. The last mentioned raises the issue of equity: should assessors seek, through their assessment practices, to redress socio-economic and cultural imbalances between students? Not all students begin on the same set of starting blocks; some might suffer from dyslexia, or questions in an exam or test might disadvantage ethnic minorities who misread or even refuse culturally embedded cues in questions. Assessment with equity in mind can thus seek to take account of diverse needs, rather than treating all students as the same. This is sometimes known as the desire for accessible assessment (Round Table on Information Access for People with Print Disabilities, 2019). Another example of equity is the Māori entry pathway into a New Zealand medical school. In this case, students only need to reach the minimum grade threshold and demonstrate family lineage (whakapapa) to be accepted into the second-year program. For students who are not eligible for the pathway, meeting the minimum grade threshold will not necessarily guarantee admission if study spaces are limited. The last remaining assessment principle typically considered is accountability. It is related to assessment targets set by schools and authorities to meet standards of achievement for students or assessment costs per student. After the announcement of new targets, a period of time will follow when schools re-adjust and ‘play the game’, so to speak, of finding ways of meeting these targets.
For example, they might offer coaching to students who are just below a critical national curriculum boundary (reflecting teachers re-allocating their time, putting a higher priority on test/exam preparation and less on teaching, what is commonly known as ‘teaching to the test’), cheat (e.g. the headmaster who changed students’ answers: McCrea 2012), inflate results (e.g. students may be gaining better test scores because they are better at the tests, rather than possessing deeper knowledge of the particular subject), or give rise to the Lake Wobegon effect (whereby all the children are above average: Changing Minds 2021). Stobart (2008) talks of intelligent accountability to denote the desire to return the control of accountability to schools and teachers. He cites school self-evaluation as an example of this where negotiated collaboration on assessment takes place and can involve different stakeholders, such as teachers, school leaders, parent groups and even students (Sjøbakken & Dobson, 2013). It is not always a total success: in New Zealand the Tomorrow’s Schools model introduced in 1989 was regarded as a ‘lost decade’ by Wylie (2012) and many local school boards were not sufficiently empowered. On the contrary, they lacked the required accounting and leadership skills to manage their own strengthened decision-making responsibilities. Fast forwarding to 2020, the New Zealand Education Review Office (ERO) has sought to introduce what appears to be a process of accountability more in line with Stobart’s intelligent accountability:

Under our new Operating Model, ERO will shift from event-based external reviews to supporting each school in a process of continuous improvement. This more differentiated approach will use a developmental evaluation that reflects individual schools’ context, culture and needs. It aims to strengthen the capability of all schools through embedding a continuous improvement approach, strengthening schools’ own engagement with and accountability to whānau. ERO will become an evaluation partner alongside each school, to support every school to be a great school and every child a success (ERO, 2020, p. 2).⁴

It is fitting that the term aromatawai is about noticing, a term that echoes Sadler’s interest in what those assessing or self-assessing choose to weight in the assessment act. Noticing is a central concept in the later work of Wittgenstein (1967) on language games and so too in this book where we identify family resemblances in and between the many terms used in the social practices of assessment (forms of life), such as assessment for, of and as learning.

The Theory and Science of Assessment In one respect, using and reflecting upon assessment principles constitutes a key component in assessment theory and the science of assessment. They provide insight into how a particular form of assessment is practised, and they also provide a guideline for evaluating assessment quality at a meta-level, for example how valid it is as well as the process of validation. There are also other ways of theorising assessment practices. For example, if the focus is instead on evaluating learning either during or after a period of learning, then assessment theory can quickly become associated with general theories of learning, motivation and identity formation (bildung). This said, student assessment literature is notable for its focus upon journal article outputs and the paucity of book-length accounts that seek to provide a deep-ploughing reflection on the philosophical, and by this we mean theoretical, foundations of assessment and its legitimacy as a science. There are similarly few innovative, theoretically inspired contributions. Hattie’s (2009) ambitious and ground-breaking Visible Knowing: A Synthesis of over 800 Meta-analyses Relating to Achievement is to be admired for its extensive research and its coverage of the importance of feedback, but it nevertheless breaks limited new theoretical ground. His theory of student learning, at least in that book, is somewhat traditional, articulating the move from superficial learning (recall of facts or ideas) to deep meta-cognitive learning, which is relational (organising knowledge in cognitive structure and patterns) and elaborative (finding rules that hold for all cases). A book that meets our goal of an innovative and deep-reaching theoretical engagement with assessment is Marshall’s (2018) Shaping the University of the Future: Using Technology to Catalyse Change in University Learning and Teaching.

4 https://ero.govt.nz/sites/default/files/2021-04/New%20schools%20Operating%20Model%20Your%20Go-to-Guide.pdf (accessed 4.12.22).


The almost 600 pages of detailed exposition propose that universities undertake continual ‘sense-making’ exercises (read assessment activities) to develop the identity of the institution and the staff and students who work or study within its brick or virtual walls. Broadly speaking, the books that exist on assessment can be grouped into a limited number of generic classes. Firstly, there are those of the type ‘What to do to improve my own performance in assessment activities as a student or teacher.’ They are rarely based upon a mix of theories or a critical discussion of theoretical traditions in assessment. Typical examples of this are Marzano’s Formative Assessment and Standards-Based Grading (2010) with an evidence-based approach, Broadfoot’s An Introduction to Assessment (2007) and Pollack’s Feedback: The Hinge That Joins Teaching and Learning (2012). The second class of books is more popular in format, written to draw the general reader into the world of assessment, not just practising teachers and assessment professionals. In this class of book, we find Stobart’s Testing Times: The Uses and Abuses of Assessment (2008). This book covers everything from learning styles to school accountability practices and IQ debates, in addition to the assessment for learning debate. While this book draws upon a number of theoretical traditions, the breadth of the approach and the captivating, yet popular style limit the space allocated for detailed theoretical elaboration on particular topics or practices. The third class of books is more academic in style, and the planned readership is not always clearly defined: sometimes these books target assessment researchers, policy makers or assessment providers, and sometimes teachers in primary or secondary education or, alternatively, trainee teachers and their teachers in higher education. They are often anthologies, where contributions cover particular aspects of student assessment practice or policy in selected countries. 
The contributions in such anthologies rarely have assessment theory as the prime focus or motive. An exception in this respect are some of the contributions in the edited collection by Wyatt-Smith and Cumming, Educational Assessment in the 21st Century: Connecting Theory and Practice (2009). The book by Marcus and Borsboom (2013) entitled Frontiers of Test Validity: Measurement, Causation and Meaning is also exceptional in that it moves into the realm of testing and psychometry and yet at the same time adopts a highly theoretical and philosophical stance.

What then of those who still contend that the science of assessment is justifiably to be found in academic journal articles or in the work of examining boards and commercial and state-funded providers of tests and other assessment-related resources for the teacher and student? We would argue that individual journal articles tend to focus on single theories, rather than upon combining or balancing several theoretical perspectives in order to provide a meta-theoretical reflection. To take an example, a reader of a special issue on assessment for learning in an assessment journal5 will find individual articles exploring single theoretical approaches: one on professional learning, one on socio-cultural, situated learning and so on. Such an approach

5 Assessment in Education: Principles, Policy & Practice, 18(4).


risks a science of assessment that fragments its object, assessment for learning. As a number of commentators have pointed out, assessment for learning has to date failed to provide a unified theory (Bennett, 2011; Stobart, 2008) and remains in many senses ‘theoretically confused’ (Baird, 2011). What of the science of assessment as found in the work of examining boards and other test providers? A case in point is the American Scholastic Aptitude Test (SAT) reasoning test. It dates from 1901 and is today administered on behalf of the College Board by the Educational Testing Service and Pearson Educational Measurement (Isaacs, 2007). The test is used as part of the selection process of universities in the USA and in the allocation of semi-finalist positions for National Merit Scholarships. In brief it entails the following:

three sections, ‘Critical Reading,’ ‘Mathematics’, and ‘Writing,’ each scored on a 200–800 point scale. The 171 questions are nearly all multiple-choice; the exam now includes one brief essay, and ten math questions require students to ‘grid in’ the answer. By design, the test is ‘speeded’ which means that many test takers are unable to finish all the questions … The additional SAT Subject Tests (SAT II), formerly ‘Achievement Tests,’ are one-hour subject exams, entirely in a multiple-choice format6 (FairTest, 2007).

The SAT attempts to predict in scientific fashion future first year college grades. Its scientific merit is rooted in the extent to which it establishes a causal link to future academic performance and the aim is to score well in terms of predictive validity. And yet, despite its ambitions, the SAT consistently underpredicts the performance of women in college and overpredicts the performance of males. Ethnic minorities who do not have English as their first language are at a disadvantage in the speeded test. There is a wider lesson here, and it is not just to do with what standardised tests such as SAT predict or fail to predict; it is to do with how the tests are used to select and at the same time exclude specific groups from further study (Zwick, 2002). In 2001, the president of the University of California proposed not using the SAT to select students because it was unfair, especially to ethnic minorities, and did not reflect what students had actually learnt in high school. Despite closer curriculum alignment of the SAT in 2016, the University of California has remained sceptical and announced in 2020 that it would step away from using the SAT for admission decisions (Hubler, 2020).

Assessment as Language Games, Noticing and Tacit Evaluative Knowledge

A key proposition in this book is that selected socially oriented linguistic philosophers and theorists can usefully be drawn upon to advance our understanding of assessment mechanisms and structures. In particular, we are thinking of the

6 It includes questions attempting to measure reading comprehension, vocabulary, basic writing techniques, algebra, geometry, statistics and probability. These questions are curriculum independent.


inspiration provided by the work of Wittgenstein. We will discuss this in more detail in the following paragraphs. Ellen Key, the famous Swedish educational activist of the early 1900s, noted that ‘the formation of one’s identity is what remains after we have forgotten everything we have learnt’ (as cited in Steinsholt & Dobson, 2011, p. 7). In other words, what remains over time is the tacit awareness that learning is a social activity and, in our context of the field of assessment, it is the tacit knowing of self-referential (and/or other-directed) assessment. Simply put, it is the not always fully conscious self-awareness that assessment is taking place or an understanding of the how, why and where of assessment. Accordingly, the tacit awareness that learning is a social activity and the tacit knowing of assessment as self-referential become part of our identity; and yet we remain scarcely aware of it. It is inculcated in our personal sense of who and what it means to be assessed or in the reverse to assess another. It is our argument in this book that the work of Wittgenstein in particular can further our understanding of the tacit and explicit character of the mechanisms and structures at work in assessment. In this connection we will present some of his concepts, such as language games. We might understand assessment in whatever form it is practised as one of the many possible language games of assessment, as played out by the student and the teacher, for example the language game of feedback, exams, the regular test, even the mock interview or law school moot (Berger & Wild, 2017a, b) and so on. What is a language game? For Wittgenstein (1953/1967), and much later for Lyotard and Thébaud (1985), a language game is connected to a form of life, or in our terminology to a social practice, and thus is the source of generative mechanisms and structures. 
The term language game does not mean that language is used in a childish, playful manner with no consequence for reality. Language games are both played and used in real situations with real consequences and contain socially shared terms and norms of usage; at times in dispute and at times agreed. A key theme in this book is the relationship between tacit evaluative knowledge and how and to what extent it can become explicit. In other words, the extent to which we are able to apprehend, interpret, understand and talk of the language games of assessment and also grasp the mechanisms and structures that create and support their existence in social settings. Wittgenstein understood that playing a language game means to obey the rules without necessarily being able to understand or explain them as they are played. Consider his reflection:

And hence also ‘obeying a rule’ is a practice. And to think one is obeying a rule is not to obey a rule. Hence it is not possible to obey a rule ‘privately’: otherwise thinking one was obeying a rule would be the same thing as obeying it (Wittgenstein, 1953/1967, § 202).

In the moment of reflection after the assessment, we no longer obey its generative mechanisms and structures. Thus, a student might take a multiple-choice test without understanding how it is constructed, with what goals and so forth. Or put differently, in riding a bicycle we concentrate on the direction and the movement, while balance has with time become tacit and taken for granted. We might not be able to talk of balance, but we are aware of it. And if we try to talk of balance, we are unable


to talk of direction and movement. In each case, something of the rules disappears into the shadows. To rephrase it, we notice something as salient and something else is sacrificed. The practice of noticing an aspect, Wittgenstein’s (1953/1967, § xi) phrase, and making it salient is central in assessment, as in the adage, ‘Do we value what we can measure or measure what we value?’ This awareness approaches what Polanyi (Polanyi & Sen, 2009) called the tacit, a term that Wittgenstein never used to our knowledge, but which was used by Sadler.7 The tacit is an evaluative knowledge that cannot be articulated simply or adequately by linguistic means (Johnson, 2016). The tacit is understood as the sedimentation of what we feel and corporeally experience; it is in short, our habits of social practice and mind practice. If the assessment stops ‘working’ in its normal rule-like manner, as when the assessment scores add up to 101% or when the assessment papers have been leaked online to all participants the night before the exam, we might have cause to reflect upon its generative mechanisms and structures. That is, our tacit evaluative skills are suddenly called into action to give an account. Many are familiar with the following: when the car breaks down or there is a power cut, we begin to reflect on the taken for granted and wonder how the car engine works or electricity is produced. However, even in these cases we might not be able to adequately recover and communicate in a linguistic sense all the reasons that can offer an account of what is going on. We may not be able to apportion responsibility (read accountability), achieve a consensual understanding, or bring to light and make explicit all the generative mechanisms or structures of the resultant situation or the so-called ‘normal’ rule-like practice of an assessment act and its accompanying form of social practice. As Wittgenstein noted, to think of the rules is to cease obeying them. 
But the mere thinking of them is no guarantee that a linguistic explanation can be adequately voiced and agreed. However, this should not make us despondent. Even if tacit evaluative knowledge is hard to grasp and communicate linguistically, we might find linguistic approximations or proxies, such as assessment constructs, or bodily expressions to show and demonstrate the knowledge in a concrete, practical manner. As we shall suggest later in this book (Chap. 5), assessment as a form of connoisseurship can offer rich verbal and visual metaphors that draw attention to distinctive features communicating what is normally held to be tacit and not communicated. We cannot, and will not in this book, give up the project of searching and seeking to communicate the generative mechanisms and structures in assessment practices. Without this effort, assessment practices will remain simply obeyed and taken for granted, a set of tacitly agreed rules.

7 Heidegger used a different conceptual approach to Wittgenstein and Sadler. He understood this as an ontological, pre-rational experience of a culture as authentic, meaningful, intimate and familiar (Heidegger 1927/1962, § 14-24).


Assessment and Evaluation

A good example of language games is the use of the terms assessment and evaluation. Are they actually two different language games applying to different practices with different meanings or can they be used interchangeably and belong to the same language game? Put in Wittgenstein’s terminology, do they share a family resemblance and belong to the same group of language games, or are they distinctly different language games sharing no likeness or overlaps?

I can think of no better expression to characterise these similarities than “family resemblances”; for the various resemblances between members of a family: build, features, colour of eyes, gait, temperament, etc. etc. overlap and criss-cross in the same way. – And I shall say: “games” form a family. For instance the kinds of number form a family in the same way. Why do we call something a “number”? Well, perhaps because it has a direct relationship with several things that have hitherto been called number; and this can be said to give it an indirect relationship to other things we call the same name. And we extend our concept of number as in spinning a thread we twist fibre on fibre. And the strength of the thread does not reside in the fact that some one fibre runs through its whole length, but in the overlapping of many fibres. (Wittgenstein, 1953/1967, § 67)

With this distinction in mind, let us consider the following social practices, or in Wittgenstein’s terminology forms of life, and how assessment and evaluation have been understood.8 Sadler (1989) in his well-known text on formative assessment uses the term evaluation to denote it since in his view it suits the American setting. This suggests a certain interchangeability and resemblance, despite the difference in the social context. Later in his writing Sadler seems to abandon the term evaluation and uses the term formative, more or less exclusively. Later in this book we dedicate a chapter to the work of Sadler. The Teaching + Learning Lab, which offers support to MIT educators, staff and administrators, upholds a clear difference between the terms assessment and evaluation and thus seeks to consider them as separate and unrelated language games. As the Lab writes:

You may often hear the terms assessment and evaluation used interchangeably. However, they are different processes. Assessment aims to enable learners to adjust their approach or study habits so that they can enhance their learning. Examples of assessment include implementing “mud cards,” polling students to gauge understanding during the class, and assigning reflection papers. In contrast, evaluation is a process that uses a variety of quantitative or qualitative techniques to analyze program, pedagogical, or course outcomes to determine whether they have been met. Assessment is a diagnostic tool focused on the learning of individual students, whereas evaluation determines the extent to which a program or pedagogy achieves predetermined goals or outcomes.9

8 Wittgenstein (1953/1967), §19, §23, §241.
9 https://tll.mit.edu/research-evaluation/assessment-vs-evaluation/ (accessed 28.11.22).


This statement suggests assessment possesses more of a formative character, while evaluation is predominantly summative. The challenge with such an attempt to separate assessment and evaluation into distinct language games is that much assessment practice and conceptualisation has a summative goal. A third example is from the OECD (2013, p. 59), which in Synergies for better learning: An international perspective on evaluation and assessment seeks to offer a theoretical framework that clearly delimits the parameters of assessment and evaluation:

The evaluation and assessment framework consists of the co-ordinated arrangements for evaluation and assessment which ultimately seek to improve student outcomes within a school system. The framework typically contains various components as student assessment, teacher appraisal, school evaluation, school leader appraisal and education system evaluation, and includes the articulation between the components and their coherent alignment to student learning objectives. This framework differentiates between the terms assessment, appraisal and evaluation:

• The term assessment is used to refer to judgements on individual student progress and achievement of learning goals. It covers classroom-based assessments as well as large-scale, external assessments and examinations.
• The term appraisal is used to refer to judgements on the performance of school level professionals, e.g. teachers, school leaders.
• The term evaluation is used to refer to judgements on the effectiveness of schools, school systems, policies and programmes.

Of note is the manner in which assessment refers to a language game denoting social practice on the level of the individual and evaluation to a language game denoting social practice on the over-individual unit of analysis. The OECD introduces the term appraisal, which seems in this context to belong to the family resemblances of the evaluation language game when the focus is upon school level professionals such as school leaders, and to those of assessment if the focus is upon teachers. Simply put, appraisal is a bridge seeking to join the two distinct social practices of school leader (evaluation) and teacher (assessment) in a shared language game. In this book we are most closely aligned to the view that assessment and evaluation are distinct language games. For us, evaluation is found in the language games of school leaders and those interested in system level analyses. Of course, school leaders are interested in assessment of individual students. In such cases we would argue that they partake of a different language game, namely the assessment language game, which can encompass family resemblances such as the test, exam, viva, written assignment and so on within the classroom practice of their teachers. We do not follow the introduction of appraisal to cover both, since in our view appraisal in assessment is distinct and different to appraisal in evaluation. The focus, and hence the social practices as forms of life, remain different. Our view does not, of course, deny that in some social practices (forms of life), such as in the Lab identified at MIT, the distinction between the language games of assessment and evaluation is difficult to maintain, or conversely that, as in Sadler’s paper early in his career, the terms and social practices are used interchangeably with negligible difference.


The Argument of This Book: Bridging Student Assessment with Critical Realism and Theoretical Insights

Do we need a book-length study that seeks to understand student assessment with the resources of critical realism, along with theories informed by Wittgenstein’s linguistic turn with noticing and language games, Messick’s understanding of validity, and Sadler’s understanding of the tacit? The proposition of this book is that the practice and theory of assessment have exhibited a tendency to shy away from sustained theoretical work traversing the items in Fig. 1 that summarises the conceptual architecture of this book. More specifically, we argue in this book that the following are required:

(a) A greater interest in theories that understand what is going on before, during and after assessment acts. In this book, such an understanding is facilitated by drawing on the theory of critical realism to understand generative mechanisms and structures in general and of assessment in particular. This incorporates among other things the cultural, socio-economic, technological and political preconditions and consequences of an assessment, rather than simply a focus on the actual assessment act. It involves understanding assessment as a chain of connected acts from the development of the assessment to its implementation, interpreting the results and taking appropriate action, or failing to do so, as the consequences become evident. It entails in this connection understanding the role assessment plays in bildung (identity formation), making or breaking a person.

Fig. 1 Bridging student assessment acts with critical realism and social, linguistic and assessment theories. (Figure elements: Assessment act; Theories of assessment (e.g. tacit evaluative knowledge, validity); Critical realism (generative mechanisms and structures); Social and linguistic theories (language games, noticing, illocutionary acts, forms of life))


(b) A greater reflection upon the theoretical foundations of assessment and the conditions required to provide sound, scientific knowledge of assessment practices. An example of this is the need to understand how assessment principles can generate theoretically informed discussion and scientific insights about what might be a valid assessment act. Understanding the role of tacit evaluative knowledge is another example. What counts as scientific is undoubtedly important, and in the final chapter we look at the challenges and opportunities when different world views underpinning what counts as science (e.g. Māori science) are integrated with what are considered mainstream assessment practices supported by an alternative view of science.

(c) A greater use of theoretical concepts from other domains, such as language games, noticing, illocutionary acts and capital to name a few. They are generally not considered a source of mainstream theories of assessment, especially given that measurement rarely plays a central role in their conceptual armoury and they rarely, if at all, reference the more traditional conceptual tools of assessment, such as the assessment principles. In this book, they offer important insights into the generative mechanisms and structures creating and supporting cultures and practices of assessment in different societies and social settings.

It is possible to find both philosophers of assessment and researchers who have explored in particular points (a) and (b). The works of Sadler and Messick are good examples. The case can also be made for (c) and its relevance to assessment. The social theory of Bourdieu (1996, p. 102) is pertinent. He notes that the successful graduate who attains (elite) status also confirms the ‘function of consecration’ found in the system of assessment. Another example is the linguistic philosopher Wittgenstein, who proposed the concepts of noticing and language games, among many others. 
But there are fewer examples of the wider discussion for which point (b) provides only the springboard. For example, what kind of scientific knowledge of assessment practice, beyond that rooted in assessment principles, is sound and desirable? In this respect, existing theoretical insights, even though they may be fragmented in terms of scope, might usefully be bridged with theoretical resources from critical realism. Simply put, what counts as science and what might this entail? In the next chapter, an introduction to critical realism will be provided for those not already conversant with its key concepts. Examples bridging student assessment with critical realism will also be presented. At this juncture, a few cursory points will be made. Critical realism argues for distinguishing between the objects of knowledge (ontology) and the conditions for knowledge about the objects (epistemology). Broadly speaking, point (a) above has as its goal the objects of knowledge (realm of ontology), understood as assessment acts and stakeholders’ involvement in such acts (including before, during and after the assessment acts); the goal of point (b) above is the conditions of knowledge about the objects (realm of epistemology); and point (c) covers theorists who, with careful work, can in our opinion provide inspiration to traverse the concerns of ontology (a) and epistemology (b).


Critical realism moves beyond the already posed postulate that there is a need for greater theoretical reflection by providing more clearly defined content for theory and detailing how it is to achieve the status of scientific knowledge. Accordingly, theories must satisfy certain conditions:

• They must provide knowledge of generative mechanisms and structures of, in our case, assessment practices in the classroom (or other learning organisations), and also include structures and mechanisms outside of the classroom that impact upon the classroom performance of students and their teachers.
• The generative mechanisms and structures identified will operate in causal fashion on and between different levels: transactions with material and intra-psychological processes, social interaction and the social structures of institutions. They reach behind the surface appearance and pattern of events, such that reality is seen to possess ontological depth.
• The theories must explore the domains of the empirical (assessment acts as observed and experienced), the real (generative mechanisms and structures) and the actual events (beyond empirical observation, but where mechanisms and structures can be actualised and operate).

The theoretical constructs of the empirical, real and actual, and the critical realist meaning attributed to them, are elaborated below. Put simply, critical realism offers the opportunity to work with a meta-theory that can develop connections with practice, such as assessment acts, and at the same time draw upon other theories of a social and linguistic character. It seeks to redirect our focus, such that existing social, linguistic and assessment theories can be recast and bridged with the concepts of critical realism. 
The goal is to create a greater understanding of generative mechanisms and structures (the real) that support and govern how assessment acts (the empirically experienced) are played out and realised, even if beyond observation (the actual) by different stakeholders, such as teachers, students, school principals, parents, researchers and policy makers.

As a precursor to understanding bridging, we will first provide an example to illustrate the meaning of the real, empirical and actual, terms derived from the tradition of critical realism. In Fig. 2, we represent the assessment journey of a student. Through different generative mechanisms and structures (the realm of the real), we consider the specific organisation of assessment from private or public high schools and onwards, in which the student experiences specific forms of assessment (the empirical realm). The mechanisms and structures at work may not be visible but are working (in the realm of the actual or, simply understood, are actualised), for example teaching to the test, which not all are aware of in every moment, but it is happening.

Moving on to the concept of bridging, a second example is offered. In an essay by Dobson promoting returning the activity of translating from an ivory tower pursuit reserved for qualified professional translators to the work of the pedagogue in the classroom, the argument was made that the ‘pedagogue as translator can provide a bridge for the combination of translation theory with educational practice (translation pedagogy) in the classroom’ (Dobson, 2012, p. 273). The mediator between these elements in this context was the pedagogue himself/herself and their actions,

High school → University entrance or portfolio assessment or interview → Course (feedback and single grade); Minors and majors (feedback, grades, transition to higher levels); Program (final GPA) → Graduate attributes; Employability skills; Lifelong learning

Fig. 2 The assessment journey of students from high school to university and then graduation. (The figure was co-developed by Sue Walbran, Edward Schofield and Stephen Dobson in 2020 in an unpublished discussion paper on assessment frameworks for Victoria University of Wellington in New Zealand)

as guided by the dual and simultaneous focal points of inter-linguistic activity (translation theory) and teaching students how to translate (translation pedagogy). Translation pedagogy understood in the specific sense of the praxis of teaching translation offers, and more strongly implies, the corresponding idea that any pedagogue can (metaphorically speaking) be considered a translator interested in teaching their students how to engage in making meaning in their respective subjects and that this requires student acts of translation from the given communicated knowledge to the attainment of personally crafted meaning. The point is that the pedagogue holds the responsibility for bridging and joining these components (translation pedagogy and translation theory). In a similar sense, and moving away from the pedagogue as translator example to the practice of assessment, the bridging of social, linguistic and assessment theories with critical realism requires a mediator, and this is supplied by the assessment acts of stakeholders, such as students but not limited to them, within and also outside the classroom. Put differently, the conceptual resources of these different theories can be combined and bridged, each playing a part, to understand the manner in which the assessment acts of the stakeholders emerge and are shaped to receive both a form and a content. Critical realism provides a focal point for generative mechanisms and structures, while social and linguistic theories provide additional insight. Assessment theory for its part considers assessment principles, tacit evaluative knowledge and other assessment constructs (e.g. assessment for, of and as learning), along with general theories of learning, motivation and the formation of identity.


It must however be underlined that, in making the stakeholder the mediator, the goal is not to re-incarnate a form of anthropic thinking with the actor as the origin and focal point. As we have argued, the focus on generative mechanisms and structures represents an attempt to tone down the centrality of the actor, even though the actor subjected to them is capable of modifying their impact. Secondly, this reducing of the focus on the actor is reinforced with a focus upon the assessment act, which in accordance with its supporting structures and goals seeks to play or influence the actor rather than the actor simply playing or influencing the assessment act.

The Act of Assessment

The concept of the assessment act as an action requires further definition and clarification before we proceed (Dobson & Nes, 2009). It must be noted that it is connected with what happens before the assessment act, as well as what happens after the assessment act. In the former the family, community or learning environment can play a role in the student’s performance in the assessment act to follow, and in the latter an assessment act can have an impact upon a student’s entry to continued education, a professionally accredited body or a particular occupation. A student’s sense of identity can be enhanced by the result of the assessment or alternatively diminished. The period before the assessment act also includes the development of the particular assessment by the test developer or teaching practitioner. They also might be included in some aspect of its consequences. Black and Wiliam (1998, p. 2) once defined student assessment in terms that might further an understanding of the assessment act: ‘“Assessment” refers to all those activities undertaken by teachers, and by their students in assessing themselves, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged.’ Even if the classroom can be a site of ‘continuous evaluation’ (Bernstein, 1971, p. 36), this does not necessarily mean that all of the teacher’s activity is assessment related, which might be one rather liberal interpretation of this definition. Black and Wiliam’s later definition does not totally allay such a reading:

Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited (2009, p. 12).

A tighter delimitation will meet our conceptual needs. Haugstveit et al. defined the assessment act in the classroom as follows:

With the concept assessment act we understand an action that encompasses assessment, and has a linguistic expression, written or verbal. Assessment acts can arise in situations that are intended to be assessment situations and in situations where assessment is not specifically in focus, but assessment nevertheless occurs. Here are some examples of assessment acts: teacher/student written/verbal comments to student work; teacher feedback to student/class; teacher presents/discusses/agrees on assessment criteria with the students (2006, p. 118).

This is a narrower definition of an assessment act, with the advantage in research and allied terms that the unit of observation and analysis is the same, namely a clearly delimited assessment act. Moreover, by including an explicit or implicit, read tacit, understanding that assessment is taking place, the authors make it possible to anticipate how didactic goals rooted in an understanding of the curriculum can be formulated and expressed clearly and explicitly with the intention of directly planning learning activity and assessment acts. Alternatively, didactic goals framed in an understanding of the curriculum might be formulated in a more open, less direct fashion, simply providing guidelines and a foundation for changing and developing learning activity and assessment acts. In the latter, assessment-related activity might still take place, but greater interpretative work to identify it is required by participants.

This definition can nevertheless be refined in a number of ways. Firstly, we can extend the definition to include the assessment interaction between students, for example, students talking about their grades or setting their learning goals in a collective fashion. Admittedly, Haugstveit et al. (2006) provide the assessment act example of student groups presenting the product of a period of group work for other members of the class, but their focus is predominantly upon the individual, goal-related focus of different assessment acts, and it is necessary to widen this. The conception of an assessment act should be widened to include three interrelated didactic levels:

• Individual student (self-development of the student)
• Goals achieved or desired (a focus on mastering the taught curriculum material)
• Group processes and products

Our second refinement of their concept of the assessment act seeks to widen its scope in order to incorporate corporeal forms of assessment, such as tone of voice, and use of the hands and body as assessments are made. This is of great importance in the creative and performing arts part of the curriculum, along with most if not all vocational education. Including such factors makes it possible to surmise a more diffuse entity, namely the assessment atmosphere or mood in a classroom or practice environment. Heidegger (1927/1962, p. 176), the well-known existential philosopher, called such atmospheres Stimmung, or in English how a person is existentially attuned to the state in which they find themselves (Befindlichkeit). This attunement incorporates a state of mind and: ‘Existentially, a state of mind implies a disclosive submission to the world, out of which we can encounter something that matters to us’ (1927/1962, p. 176). Thus, a teacher’s hard and abrupt tone in communicating feedback to a student in the course of teaching to the whole class will make all the students aware that assessments are far from open to debate and negotiation. A softer tone might indicate the possibility of dialogue about the feedback in a more relaxed and open, inclusive manner. In existential terms, the tone of voice influences the mood in a classroom and provides insight into how the assessment act is lived and experienced
as a way of Being being.10 The term ‘Being being’ means simply ‘being’ or entity (e.g. the group taking a quiz) is experiencing itself in different ways of ‘Being’ (e.g. concentrating, laughing or nervous). Those inspired by assessment taxonomies might attempt to operationalise the corporeal components of the assessment act in terms of attitudes (Krathwohl et al., 1964) or expressive experiences (Eisner, 1985). The former refers to the affective domain and the connection of emotions to a wider sphere of values (e.g. the value a person attaches to something), interests (e.g. taking an active part and showing enthusiasm) and awareness (e.g. showing a willingness to listen). The latter highlights that the student’s formative experience of the assessment act as it takes place is central, rather than some clearly defined summative learning or assessment outcome from the experience. Lastly, the scope of the assessment act can also be widened to include an understanding of its linguistic component. Wittgenstein provides the impetus towards understanding language in use, what he called ordinary language philosophy. Here we are thinking specifically of some of his followers, namely Austin (1976) and Searle (1969), and what they called the speech act. They talked of utterances that can exploit an illocutionary potential, such as a shared understanding that a phrase refers to a command or promise or a prediction, without necessarily agreeing on its locutionary content (i.e. sense and meaning). Accordingly, the illocutionary potential of an assessment act can refer to the shared understanding embedded in a shared world of conventions (Austin, 1976, pp. 105, 109) held by student and teacher. Assessment is thus at stake in a speech/corporeal action, and this might be the case even when assessment is not explicitly in focus; it might be tacitly known or felt. Searle also identified perlocutionary statements,11 which can cause hearers to do things. 
In the context of an assessment act, this would be an assessment by a teacher or fellow student that causes a student to change their future learning actions, motivation or assessment performance. With these points in mind, and the understanding that an assessment act can also include the curriculum goal-setting phase as the students become familiar with how they are to be assessed and also the final point when an assessment result is verbalised/textualised, we define an assessment act as follows:

An action by student(s), teacher(s) or other professionals in an educational setting with written, verbal or corporeal components that may or may not have as its stated goal an assessment of a student, individually or collectively, but with assessment as the outcome. The assessment act also includes assessment-related activity, such as the formulation and setting of learning goals, which can be supported by curriculum resources. It can also include an illocutionary and perlocutionary potential.

10 Heidegger calls the foreground Being, das Sein, to describe our way of Being being, where being is the ontic that-which-is of entities, das Seiende. We have a number of choices in how we will live the ontic of Being-there, Dasein, and these choices found ways of Being, with ontological valued meanings and experiences of authenticity.

11 In the words of Wardhaugh (1992, p. 287): ‘If I say “I bet you a dollar he’ll win” and I say “On”, your illocutionary act of offering a bet has led to my perlocutionary act of accepting it. The perlocutionary force of your words is to get me to bet, and you have succeeded.’

The assessment act bridges individual and social focus in four senses: firstly through the manner in which, even if the focus is upon the individual student, it can entail the individual student relating to others, such as the teacher (e.g. offering advice/feedback) or fellow students (e.g. in group work). Secondly, the use of written, verbal, corporeal or other forms of communication in the course of the assessment act always entails an orientation from the person concerned, either student or teacher or examiner, toward another. Thirdly, participants recognise the existence of the assessment act on the basis of the conventions they share (illocutionary potential) and this positions the individual in a social, shared context which may result in further assessment-related (perlocutionary) acts. Fourthly, elaborating further, the bridging of the individual with the social through the mediation made possible by the assessment act necessitates the impression in the mind or body of the individual student being offered submissively to the social world of others (teachers, fellow students, examiners) and conventions (as stated in the curriculum and the norms associated with its interpretation by different stakeholders); and in the process a physical trace or mark is left (e.g. exam papers and exam marks). Skogvoll and Dobson (2011) have likened this to an assessment process and experience of bildung or identity formation for those concerned as the inntrykk (impression – of for example, the quiz question as it looks to a student’s knowledge and skills) is transformed into an uttrykk (expression – of for example, the formulation of an answer in the mind or whilst silently mumbling to oneself) to leave an avtrykk (mark or trace – of for example, writing the answer on an iPad or computer). Assessment acts can, as noted, enhance or diminish a person’s sense of identity, becoming part of who you are, your Being being.

In Chap. 6 of this book, the bridging of the individual and the social gains yet another, final layer of argument and conception with the inspiration of Bourdieu. He understood the individual as possessing an embodied habitus of dispositions, and accompanying this is what some have suggested is an institutional habitus, with the collective dispositions of the institution understood as norms, rules and values connected with learning and assessment practices. The assessment act in this sense is simultaneously related to the accumulated assessment habitus of both the individual and the institution. Importantly, the assessment act is thus connected implicitly and explicitly with accompanying practices that simultaneously express themselves individually and socially.

Structure of the Book

Chapter 1 presents critical realism as a meta-theoretical framework for the whole book. Generative mechanisms and structures are key concepts derived from the scholarship of critical realism and are used extensively in the book. They are shorthand for a wider set of concepts and theories in this tradition and signal an interest in what drives and supports the practices of assessment in different educational
situations. Readers desiring a deeper exploration of the scholarship of critical realism are directed to key texts. Chapters 2, 3, 4, and 5 identify a number of key debates in the current practice and understanding of assessment. The approach is to pose a question and use the chapter to explore it with examples and theoretical reflections. Thus, Chap. 2 considers the relationship between changing forms of society and how this gives rise to different generative mechanisms and structures of assessment and accompanying assessment acts. Chapter 3 asks whether the hype around assessment for learning (called formative assessment in some educational ecosystems) as a high-speed motorway to enhance student learning is justified. Chapter 4 revisits the fundamental question so important to educators: is there a holy trinity of motivation, learning and assessment, where each is reliant on the other and cannot be fully understood on its own. Chapter 5 asks simply whether the practice of assessment would benefit from the understanding that it is connected with being a connoisseur. This is not to discount the importance of assessment acts being technically robust and paying due attention to assessment principles. The goal is to introduce another understanding of assessment which is already well-known to those who teach subjects within the creative and performing arts. In Chap. 6, we present a critical appreciation of the seminal work of Royce Sadler in the field of assessment. It offers a useful foundation for introducing a new understanding of assessment acts and practices organised around a) assessment as a language game, and b) assessment habitus and assessment capital. The inspiration is taken from Wittgenstein and Bourdieu, respectively. Chapter 7 weaves together some of the threads of previous chapters. 
Additionally, it asks what happens when assessment practice in an educational setting cannot simply assume that all are equally conversant or supportive of the same world view. Simply put, what happens when different world views of assessment are introduced into the discussion of assessment practices. The example introduced involves considering how Māori language games and forms of life connected with aromatawai, which means ‘to take notice of’ or ‘pay attention to’ and ‘to examine closely’, can be bridged with traditional, let us say more hegemonic, western language games of assessment that include assessing in a valid, reliable, authentic, equitable and informative manner.

References

Alarcón, C., & Lawn, M. (Eds.). (2018). Assessment cultures: Historical perspectives. Berlin: Peter Lang.
Austin, J. (1976). How to do things with words. Oxford University Press.
Baird, J. (2011). Editorial. Does learning happen inside the black box? Assessment in Education: Principles, Policy and Practice, 18(4), 343–345.
Bennett, R. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy and Practice, 18(1), 5–25.
Berger, D., & Wild, C. (2017a). Enhancing student performance and employability through the use of authentic assessment techniques in extra and co-curricular activities (ECCAs). Law Teacher, 51(4), 428–439.
Berger, D., & Wild, C. (2017b). ‘Forgotten lore’: Can the Socratic method of teaching be used to reduce the attainment gap of black, Asian and minority ethnic students? Higher Education Review, 49(2), 29–55.
Bernstein, B. (1971). Pedagogy, symbolic control and identity. Taylor and Francis.
Biesta, G. (2009). Good education in an age of measurement: On the need to reconnect with the purpose in education. Educational Assessment, Evaluation and Accountability, 21, 33–46.
Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Nelson.
Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5–31.
Bourdieu, P. (1996). The state nobility: Elite schools in the field of power. Polity.
Broadfoot, P. (2005). Dark alleys and blind bends: Testing the language of learning. Journal of Language Testing, 122(2), 123–141.
Broadfoot, P. (2007). An introduction to assessment. Continuum.
Changing Minds. (2021). Lake Wobegon effect. http://changingminds.org/explanations/theories/lake_woebegon.htm. Accessed October 21, 2021.
Dobson, S. (2012). The pedagogue as translator in the classroom. Journal of Philosophy of Education, 12(2), 271–286.
Dobson, S., & Nes, K. (2009). Kan vurderingshandling være ‹the missing link› i elevvurderingsteori? [Can the assessment act be the missing link in student assessment?]. In T. Nordahl & S. Dobson (Eds.), Skolen og elevens forutsetninger: Om tilpasset opplæringer i pedagogisk praksi og forskning [The school and the student’s background: Adapted education in educational practice and research] (pp. 93–110). Oplandske Forlag.
Education Review Office. (2020). New operating model. Your Go-to-guide. https://www.ERO.GOVT.NZ
Eisner, E. (1985). The art of educational evaluation: A personal view. Falmer Press.
FairTest. (2007). The SAT: Questions and answers. http://www.fairtest.org/facts/satfact.htm. Accessed June 1, 2012.
Hartberg, E., Dobson, S., & Gran, L. (2012). Feedback i skolen [Feedback in the school]. Gyldendal.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
Haugstveit, T. B., Sjølie, G., & Øygarden, B. (2006). Vurdering som profesjonsfaglig kompetanse [Assessment as professional competence] (Report No. 5). Hedmark University College.
Heidegger, M. (1962). Being and time. Blackwell. (Original work published 1927)
Hubler, S. (2020, May 23). Why is the SAT falling out of favor? New York Times. https://www.nytimes.com/2020/05/23/us/SAT-ACT-abolish-debate-california.html. Accessed October 21, 2021.
Isaacs, T. (2007). SAT og opptak til universitetene i USA [SAT and admissions to universities in USA]. Norsk Pedagogisk Tiddskrift, 91(2), 165–177.
Johnson, S. (2016). Tacit Knowledge: An Assessment of Michael Polanyi’s Epistemology, M.Div. thesis. Regent University.
Krathwohl, D. R., Bloom, B. S., & Masia, B. B. (1964). Taxonomy of educational objectives: The classification of educational goals. Handbook II: The affective domain. David McKay.
Lyotard, J.-F., & Thébaud, J.-L. (1985). Just gaming. University of Minnesota Press.
Marcus, K. A., & Borsboom, D. (2013). Frontiers of test validity: Measurement, causation and meaning. Routledge.
Marshall, S. J. (2018). Shaping the university of the future: Using technology to catalyse change in university learning and teaching. Springer.
Marzano, R. (2010). Formative assessment and standards-based grading. Marzano Research Laboratory.
McCrea, N. (2012, September 18). Former Orono principal ordered test answers changed, report says. Bangor Daily News. https://bangordailynews.com/2012/09/18/news/bangor/formerorono-principal-ordered-test-answers-changed-report-says/. Accessed October 21, 2021.
OECD. (2013). Synergies for better learning: An international perspective on evaluation and assessment. OECD.
Polanyi, M., & Sen, A. (2009). The tacit dimension. University of Chicago Press.
Pollack, J. (2012). Feedback: The hinge that joins teaching and learning. Corwin.
Round Table on Information Access for People with Print Disabilities. (2019). Guidelines for Accessible Assessment. Sydney: The Round Table. http://printdisability.org/guidelines/guidelines-for-accessible-assessment-2019/. Accessed August 8, 2021.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Sadler, R. (2008). Indeterminacy in the use of preset criteria for assessment and grading. Assessment and Evaluation in Higher Education, 34(2), 159–179.
Syed, M. (2010). Bounce: How champions are made. Fourth Estate.
Searle, J. (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press.
Sjøbakken, O., & Dobson, S. (2013). School self-evaluation in the longer time scale: Experiences from a small Scandinavian state. In M. Lai (Ed.), A developmental and negotiated approach to school and curriculum evaluation (pp. 213–232). Emerald.
Skogvoll, V., & Dobson, S. (2011). Connoisseuren – Med blikket for øyeblikket: Et kroppsfenomenologisk essay om det profesjonelle [Connoisseur – Awareness of the moment: A corporeal phenomenological essay about professionalism]. In Ø. Haaland, S. Dobson, & G. Haugsbakk (Eds.), Pedagogikk for en ny tid [Education for a new time] (pp. 161–173). Oplandske.
Steinsholt, K., & Dobson, S. (Eds.). (2011). Dannelse: Utsikt over en ullendt pedagogisk landskap Bildung [Introduction to an opaque educational landscape]. Akademika.
Stobart, G. (2008). Testing times: The uses and abuses of assessment. Routledge.
UNESCO. (2015). Sustainable Development Goals. https://en.unesco.org/sustainabledevelopmentgoals. Accessed October 21, 2021.
Wardhaugh, R. (1992). An introduction to sociolinguistics. Basil Blackwell.
Wittgenstein, L. (1967). Philosophical investigations (3rd ed., trans. G. E. M. Anscombe). Macmillan. (Original work published 1953).
Wyatt-Smith, C., & Cumming, J. (2009). Educational assessment in the 21st century: Connecting theory and practice. Springer.
Wylie, C. (2012). Vital connections: Why we need more than self-managing schools. New Zealand Council for Educational Research.
Zwick, R. (2002). Fair game: The use of standardized admissions tests in higher education. Routledge Falmer.

Contents

1 How Might Critical Realism Extend Our Understanding of Assessment?
1.1 The Origins of Critical Realism
1.2 Key Concepts in Critical Realism
1.2.1 The Concept of the Dialectic
1.2.2 The Social Cube and the Structure–Agency Distinction
1.3 Methodology
1.4 Summary
References

2 New Forms of Society: New Forms of Assessment
2.1 Assessment from Ancient Times
2.2 Modern Society
2.3 The Meaning of Competence in the Knowledge Society
2.4 Assessment Practices and the Mode of Extension in the Knowledge Society
2.5 Inclusion, Exclusion and the Diploma Disease
2.6 Closing Comments
References

3 Assessment for Learning: Motorway or Dead End for Improved Learning Outcomes?
3.1 Origins
3.2 Defining Assessment for Learning
3.3 Assessment for Learning as a Motorway for Improved Learning Outcomes
3.4 Theories of Assessment for Learning
3.5 Feedback
3.6 Closing Comments
References

4 Motivation, Learning and Assessment
4.1 Theories of Motivation and Learning
4.2 Enlarging the Index of Motivation
4.3 Grading Individuals
4.4 The Power and Responsibility of the Teacher Who Grades Individuals
4.5 Grading Group Work/Projects
4.6 Closing Comments
References

5 Assessment as Connoisseurship
5.1 Creativity, Elitism and Democracy
5.2 Definition of Creativity
5.3 The Emergence of Creativity in an Educational Setting
5.4 Assessing Creativity
5.5 Assessment as Connoisseurship
5.6 Closing Comments
References

6 Challenging the Culture of Formative Assessment: A Critical Appreciation of the Work of Royce Sadler
6.1 Understanding the Debate about Assessment of, for and as Learning
6.2 The Inspiration of Polanyi and Wittgenstein
6.3 Assessment Capital in a Knowledge Society
6.4 Assessment Judgements
6.5 Closing Comments
References

7 Moving Assessment in New Directions
7.1 What Is Assessment?
7.2 What Can We Learn About the Relationship Between Society and Assessment Practices?
7.3 Why Is Motivation So Important in Learning and Assessment?
7.4 How We Value and Assess Ourselves and Others and with What Language
7.5 Transforming Assessment Practices: Challenges and Opportunities
References

Glossary

List of Figures

Fig. 2.1 The diploma disease. (Source: Dore, 1976, p. 141)
Fig. 3.1 The KLT Theory of Action. (Cited in Bennett (2010, p. 9), who had permission to use this figure from the Educational Testing Service)
Fig. 3.2 Assessment for learning organised around ‘big idea’ teaching
Fig. 4.1 Self, assessment practices, and motivation and learning
Fig. 5.1 Dobson’s sketch of some teaching design ideas on the back of an envelope
Fig. 5.2 Responses to an assignment. (Source: Kingore, 2004)
Fig. 5.3 Creativity as learning processes connected with meaning making, identity enhancement and function
Fig. 5.4 Image of a jellybean
Fig. 5.5 Arrowtown students’ creative solution for night skiing. (Source: Ministry of Education, 2020)
Fig. 7.1 Challenges to assessment practices
Fig. 7.2 Assessment judgements interrelated with other components of teaching and learning
Fig. 7.3 The kete of assessment language games
Fig. 7.4 The five principles of assessment and aromatawai

List of Tables

Table 1.1 A genealogy of the different modes of temporal and spatial extension of the viva
Table 1.2 The key concepts of critical realism
Table 3.1 Assessment acts evident in assessment for learning
Table 3.2 Developing potential for learning
Table 3.3 Theoretical concepts derived from different assessment for learning/formative assessment approaches
Table 3.4 Proposed mechanisms for beneficial and detrimental effects of praise
Table 4.1 A three-dimensional taxonomy of achievement emotions
Table 4.2 Different items used to assess mastery and performance goal orientation
Table 4.3 An enlarged index of motivation
Table 5.1 Differences between bright and gifted learners
Table 5.2 Contrasting the learning of the high achiever, gifted learner and creative learner
Table 6.1 Sadler’s understanding of evaluative knowledge and skills

Chapter 1

How Might Critical Realism Extend Our Understanding of Assessment?

The candidates for the exams are the yearly lambs, some of whom have to be sacrificed at the examination table, in order to maintain the socially necessary common knowledge (Kvale, 1990, p. 120).

Abstract

Critical realism is an emergent school of thought. It has evolved through a number of phases since the 1970s, gaining followers and also critics along the way. It has supported new research in a widening number of disciplines, including organisational studies, religion, special education needs and environmental studies, to mention a few. This chapter presents the main concepts of critical realism and how they can provide insights into the practice and theory of assessment. Complexity will be kept to a minimum. Readers interested in a more detailed account are directed to key texts in the scholarship of critical realism. To support the presentation of and argument for critical realism an example from the practice of assessment will be introduced at the beginning of the chapter. In the first instance the concepts of critical realism will not be evident in this example. As the chapter progresses and concepts are presented, reference will once again be made to the example. In this manner, the example gains new layers of meaning and illustrates at the same time how critical realism can make a contribution. A second example will also be presented on a single occasion in this chapter. This example traces the genealogy of the viva.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_1

Two in-service teachers enrolled in a course in a taught master’s program in student assessment developed an action-based research project in their two classrooms. In the course the two students had six months to complete the task and were allowed to freely choose the topic. Both worked in Norwegian primary schools and decided to let their fourth grade (9-year-old) and seventh grade (12-year-old) students develop their own tests in selected subjects, such as mathematics, natural sciences and history. The key point, and this must be underlined, is that each classroom
student, either alone or working with co-students, was to develop a test to demonstrate their own particular level of competence. This means that a stronger student might develop a more difficult test and the reverse would be the case for a weaker student; the same applies to a group of stronger or weaker students working together. It must be noted at the outset that these teachers were questioning the view that all students should learn exactly the same knowledge to the same level of achievement. They were, however, hoping that the students would take the challenge seriously and not develop tests that were far too easy, that is, choosing to remain within their comfort zone. The tests in selected subjects were not high stakes. On the contrary, they were low stakes at the end of a four-week period of teaching and study. To begin with, parents sent SMS messages to the teachers and asked why the students were to design their own tests; surely this was the job of the teachers? Were they intending to abandon their responsibility as teachers? They replied that the Norwegian Ministry of Education’s national assessment rules and regulations actually obligated teachers to make greater efforts to involve students in their own assessment. But the regulations did not stipulate in what manner this was to take place. It was up to the teachers. To begin with the students needed a lot of clarification and assistance. They were accustomed to teachers acting as the gatekeepers and assessors of student competence. It was a new experience for the students not only to learn the subject matter, but to develop a test, and additionally to prepare to administer and mark the test. Teachers assured the students that they would provide ample support throughout the process. On the first trial, the more able students developed their own tests with limited assistance, thus demonstrating a higher level of independence from the teacher. 
This number increased in subsequent trials to include students who were generally considered to be less able and self-confident. Similar, though not identical, projects are not unprecedented. In one study in Portugal, 25 primary school teachers were instructed about how to increase student self-assessment over a period of 20 weeks in mathematics (Fernandes & Fontana, 1996). Standardised tests before and after the project showed that the students involved improved their performance in the chosen subject compared with a control group. The two teachers in the project outlined above undertook a classroom survey and periodically interviewed a sample of their students about their experience of this form of learning and assessment. The students across a scale of competence said they were more motivated to learn and liked the fact that they could adapt the assessment test to match their individual level of learning. Were some of these students sacrificial lambs on the examination or assessment table? Admittedly, not all the students could devise the necessary self-assessment test with minimal teacher assistance. This might suggest that they were sacrificed for the benefit of the stronger students in this particular activity. However, the weaker students might also have improved their learning, but to a lesser degree than the stronger students. Moreover, as noted, all the weaker students might have felt a greater motivation to learn and might have taken greater responsibility for their own learning. Pertinent concepts from assessment theory in this example are motivation, learning and self-assessment, as well as the enhancement of self-esteem. Concepts from critical realism are now introduced in order to provide a new layer of understanding as we move from ‘knowledge of manifest phenomena to knowledge of the structures that generate them’ (Bhaskar, 1998, p. 13). In other words, we will seek to provide an even deeper understanding of what was happening in the example above.

1.1 The Origins of Critical Realism

Bhaskar, who has consistently been one of the ground-breaking writers on critical realism, did not originally use the term. To begin with, he used the term ‘transcendental realism’ in A realist theory of science (Bhaskar, 1975), and later extended his argument to include the social sciences with his term ‘critical naturalism’ in The possibility of naturalism (Bhaskar, 1978). The key idea of his ontology is that reality is stratified; there are multiple realities or domains: the empirical, the actual and the real (see Elder-Vass, 2004). He argues that the real world comprises structures that are independent of our existence. This world is hidden beneath the observable actual world, in which observable phenomena are created by interactions among the structures in the real world. Finally, human experiences create an empirical world in which these multiple structures of the real world are lived and interpreted by individuals interacting among them (Singh, 2018). In Bhaskar’s later work, and in the period of his life when he published the book to which we will return and focus upon, Dialectic: The pulse of freedom (Bhaskar, 1993), he steps away from the view that structures are independent of the individual. In our book, it is our intention to retain the concepts of the world of the real, actual and empirical and continue to use them alongside the terms generative mechanisms and structures. The dialectical interdependence so important in Bhaskar’s later work can be noted in the work of central theorists, such as Bourdieu, who uses the term reflexivity (Bourdieu, 1998; see also 2013, 2015).
Distancing himself from naive realism, which identifies a set of observed empirical regularities and constant conjunctions of such empirical regularities for scientific investigation of knowledge or reality, Bhaskar sought with transcendental realism to direct attention to the realm of the real, where internal mechanisms and structures can be, but are not always, actualised to produce and generate specific effects and events. In other words, a move is made to look beneath causal mechanisms at the level of observed experience to the level of generative mechanisms and structures. They are transcendental in the sense of transcending the given of the observed and sensed and exist as conditions. Moreover, a mechanism might still exist even if it is not visible, because it could be activated but not perceived, not activated at all, or potentially counter-balanced by other mechanisms. In other words, this is the realm of what he came to call the actual. Empiricists/positivists would by contrast say that the causal mechanism was not in existence if it was not functioning and evident. With his term ‘critical naturalism’ Bhaskar argued that the model held for the social world and not merely the natural sciences. But the social is characterised by a relationship between agency and structure, such that actors are able to modify the
structures in which they are both embedded and embodied. As a corollary, actors are also able to modify their knowledge in an ongoing critical process. Put differently, Bhaskar and his fellow philosophers were interested in understanding social phenomena, such that a description of actors’ intentions was supplemented with a focus on causal explanation at the level of generative mechanisms and structures. The term critical realism married transcendental realism with critical naturalism, where the term critical referred to the view that knowledge was fallible, emergent and subject to a process of continual revision. This critical character can be likened, though without becoming synonymous, with the work of German critical theorists such as Adorno in his book Negative dialectics (1966/1973) and more recently in the work of Honneth on social recognition inspired by a re-reading of Mead and Hegel (Fraser & Honneth, 2003; Honneth, 1996). In the context of student assessment, it is not enough to merely observe and assert thereafter that there is for example a simple causal relationship between devising one’s own test and increased motivation to learn, as might be thought to be the case in the example above. It is necessary to look at the manner in which the two teachers in one of the trials in the project period divided the class into groups according to their level of already existing competence. When this was done the weaker students felt more motivated and less under the objectifying (and shameful) gaze of the stronger students (Foucault, 1984). In other words, identifying an underlying mechanism and structure, based upon streaming according to level of competence, had some explanatory and causal value. 
Similarly, if it is admitted, as we suggested in the introduction, that assessment is an inferential process based on information obtained about the student, it becomes imperative that the assessor, in this context the two teachers, along with the students themselves, make an effort to examine whether the knowledge gained through tests provides valid insight into the acquired student competence. This means that the generative mechanisms and structures delimiting and also supporting the practice of the tests must be explored. For example, were the tests all devised in a written format, favouring those with stronger basic written skills? In the jargon of assessment, were the tests measuring too little of the desired construct, or measuring another construct altogether, that is, construct under-representation as opposed to construct irrelevance (Messick, 1989)? A construct (Dobson, 2008) might for example be construed to contain one or more of the following: domain-specific knowledge, cognitive reasoning, communicative ability, and emotional and existential constructs (e.g. courage and stamina). A new moment in the development of critical realism came with the publication of Bhaskar’s Dialectic: The pulse of freedom in 1993. Some of the themes had already been announced in preliminary fashion in Scientific realism and human emancipation (Bhaskar, 1986), such as the conception of social being as a social cube with different angles and planes and the view that an explanatory model need not remain at the level of the descriptive ‘is’, but can provide scientific knowledge as a form of ideological critique of the ‘ought’ or normative character. Of more technical interest in the later text is Bhaskar’s exploration and modification of Hegel’s dialectic. To non-identity, absence and totality Bhaskar adds a fourth
moment, namely transformative agency. From 2000 onwards, Bhaskar’s work has taken what commentators have called a more spiritual turn as he searches for the ever fashionable, but also transhistorical, ground of eudaemonia (wellbeing, care, love, solidarity and flourishing) (Bhaskar, 2000). For critics, it is a form of idealism separated from historically lived experience (Creaven, 2009). It is possible to trace a line of development from transcendental realism and critical naturalism to critical realism, and thereafter to dialectical critical realism and most recently to a form of critical realism informed by spiritual values and unconditional love of all. Instead of arguing that each new moment marks a break, there are stronger reasons to assert that critical realism has become more refined and at the same time it has shown its relevance to a widening number of fields of practice. In an attempt to avoid confusing the reader by using variously the terms transcendental realism, critical naturalism and dialectical critical realism, we will refer simply to critical realism and according to the particular critical realist concepts appropriated the reader will be able to locate key texts and moments from the body of critical realism. We feel most drawn, for reasons stated below, to the dialectical critical moment revealed in Dialectic: The pulse of freedom. There are three reasons: firstly, in an ontology of the practice of assessment a dialectical approach pinpoints and highlights the processual character of assessment, covering the run-up to, during and aftermath of an assessment act. Secondly, stakeholders in assessment, such as the school principal, teachers, students, parents, policy makers and assessment researchers/consultants, interact in and through a number of social and material relations on different levels. This is conceptualised in Bhaskar’s book as social being, understood as a social cube with causal mechanisms and structures joining the stakeholders. 
We will return to the concept of the social cube. Thirdly, assessment as a form of epistemology seeks to understand not only where the student is in their current learning and how they have reached their present state of learning and achievement, but also where they are going (Hartberg et al., 2012). The latter is connected with feedback that propels the student onwards toward their goal, and such feedback clearly has a component of transforming agency. It is directed toward forms of self-realisation and self-determination through continued study and can be summed up by the term bildung. In Dialectic: The pulse of freedom such concerns are reflected in passages, at times normative in character, expressing the desire for self-determination as an expression of freedom, and explaining how knowledge of constraints and barriers to self-development and learning can be a valuable springboard to action. In the example of the students learning to make their own tests, the two teachers again and again emphasised that for their students the scales were tipped away from a focus upon assessment of learning and test performances, and more in the direction of assessment for learning in terms of knowledge of constraints on learning and knowledge of the opportunities and efforts required to make progress. Thus, the students achieved a more self-transparent understanding of where they were in their own learning and how to move forwards and overcome the constraints. The teachers went as far as to assert that, as students developed a continuous understanding of where they were throughout the four-week period of study, the need for a final test at the end of the period began to disappear. This is a radical and clearly emancipatory thought in the practice of assessment. It suggests that, somewhat paradoxically, as the students decide the content of the test, they do not actually need it other than to demonstrate once again what they already know. They have achieved a higher level of self-determination. In what follows, we will present a selection of key concepts in the writing of critical realists.

1.2 Key Concepts in Critical Realism

Critical realism, as noted in the introduction, posits a distinction between the objects of knowledge (ontology) and the conditions for knowledge about the objects (epistemology). Why is it necessary? The answer is simple: ‘any theory of knowledge presupposes an ontology in the sense of an account of what the world must be like for knowledge, under the descriptions given it by the theory’ (Bhaskar, 1993, p. 205; see also 1998, p. 8). To take an example, if our assessment theory is about peer assessment and it identifies as a key component interaction between students working in groups, then the ontological standpoint presupposes that interaction takes place in order that the theory, as epistemology, can both ‘pick it up’ and explain the existence of precisely these things.1 It is often the case that such an ontology of the world is glossed over and tacitly acknowledged in scientific accounts. For critical realists an understanding and clarification of the ontological realm occupies centre stage, that is, they ask: ‘what properties do societies and people possess that might make them possible objects for knowledge?’ (Bhaskar, 1998, p. 13). This marks a distance from the more common epistemology of knowledge question: How is knowledge possible? Elaborating ontology is therefore taken to be a preliminary step, which at the same time directs attention to questions of epistemology in the next instance. In seeking to understand the generative mechanisms and structures giving rise to and supporting assessment practices the world is accorded ontological depth. This is to move from the world of appearances and events, what critical realists call the ‘empirical’ and observable, to the level of the ‘real’. But not all mechanisms will be actualised and become ‘actual’ and some may be actualised but remain unobserved.
Some generative mechanisms, as noted previously, are not actualised, as they are ‘annulled out’ by the action of other mechanisms. The two teachers wrote a report of their experiences. In the report they did not only present their experiences in terms of what they observed (the ‘empirical’ of connected events); they sought to reveal mechanisms, such as student motivation over time (located in the realm of the ‘real’) and the fact that the students were not always motivated (the ‘actual’ of motivation was not always activated). The tripartite set of concepts, the empirical, real and actual, occupies a special role in the critical realist account. The ontological–epistemological distinction makes it possible to conceptualise and posit a relatively enduring form of being in the realm of ontology, the intransitive, as opposed to a historically specific and modifiable knowledge, the transitive world of knowing. Bhaskar introduces these terms in the following manner:

If the objects of our knowledge exist and act independently of the knowledge of which they are the objects, it is equally the case that such knowledge as we actually possess always consists in historically specific social forms. Thus to think our way clearly in the philosophy of science we need to constitute a transitive dimension or philosophical sociology to complement the intransitive dimension or philosophical ontology already established. A moment’s reflection will show that, unless one does so, any attempt to establish the irreducibility of knowable being to thought must end in failure (1998, p. 11).

1 Foucault (1985) makes a similar point. In his account psychoanalysts interpret the statements of the ‘mad’ as evidence of the unconscious, of desire and of sickness. They do not interpret them as statements of hidden truths about the world on its way to the day of judgement, as was the custom in the Middle Ages. In other words, psychoanalytical theory makes assumptions about the world, revealing at the same time an ontology founded upon the unconscious, desire and sickness, and what fails to conform to this ontology and the accompanying theory of knowledge will remain invisible or unheard, even if voiced.

In his early work the transitive appears to be limited more to scientific knowledge, which is reasoned and causal in character. In Dialectic: The pulse of freedom the transitive knowledge dimension is widened: ‘we are concerned with the truth, ground, reason or purpose of things, not propositions … the concept of the transitive dimension should be metacritically extended to incorporate the whole material and cultural infra-/intra-/superstructure of society’ (Bhaskar, 1993, p. 218). The mere mention of the term superstructure echoes the ghost of Marxism without explicitly evoking such a heritage. With such a cursory reading it would be easy to attribute this concept to Marxism. It would, however, be a mistake to regard critical realism as simply a form of neo-Marxism. It can indeed be applied to understand cycles of capital, exploitation and alienation, but its reach is wider and not restricted to such a view of the world. Failure to maintain the distinction between the transitive world of knowing (epistemological) and the intransitive world of being (ontological), by conflating or reducing the one to the other, can lead to a number of fallacies with respect to the knowledge produced. The epistemic fallacy occurs when mere appearances, as events, are taken to be evidence of causal knowledge. The anthropic fallacy of anthropocentrism is evident in the view that all knowledge somehow originates or is mediated through the subject. This connects with the ontic fallacy in the sense of the determination of being by being, to use Heideggerian terminology. This is the failure to acknowledge the manner in which something is consciously lived (e.g. I am happy or sad). There is also the linguistic fallacy, typical of some strains of postmodernism and discourse analysis, where the intransitive is collapsed, the world becomes discourse and a reality outside of discourse is in its most extreme form discounted. 
An illustration of one of these fallacies is to be found in the project undertaken by the two teachers. In one of the trials, fourth graders expressed joy on being informed that they were to make their own test in natural science. One student
expressed it in the following terms: ‘at last we are going to do it ourselves’. It would be an anthropic fallacy to draw the conclusion from this simple statement that the students, through their own efforts alone, could create the test, in this case formulating five questions below self-chosen dinosaurs. Firstly, they needed assistance and tutoring from the teacher, and secondly, the teacher was following points specified in the national curriculum for this subject and year grade (to learn about dinosaurs and other extinct animals). Accordingly, it would overestimate the role of the students if other generative mechanisms and structures located in the teacher’s interpretation of the curriculum and her tutoring were ignored. To summarise so far, the conceptual presentation underlines the importance of revealing the ontological depth (Singh, 2018) of the world and of assessment in particular. In Dialectic: The pulse of freedom, Bhaskar presents a reasoned account of the ontological realm of being and delimits the kind of epistemological knowledge required to understand the social objects of this realm more deeply. In other words, he provides an account of what the world must be like; and in the context of assessment this means we must ask how the concepts he introduces under the guise of a dialectic approach, such as non-identity, absence, totality and transformative agency, along with the empirical, real and actual mechanisms and structures, can assist in understanding the theory and practice of assessment in the world. Put simply, how might the concepts of critical realism enrich and inform our understanding of assessment concepts and practice, resulting in a greater understanding of the depth of the reality of assessment behind its appearance? 
Toward the end of the chapter we will apply these concepts to an interpretation of the example of the students making their own tests, and thereby deepen our understanding of the theory and practice of assessment; the ontological depth is enriched. The example gains a new reading and a new layer of meaning when it is described and explained using concepts from critical realism. But first, more detail is required on the dialectical approach.

1.2.1 The Concept of the Dialectic

In Dialectic: The pulse of freedom the ontological depth of the world is elaborated by developing three lines of argumentation, such that the world is understood a) longitudinally, b) laterally, and c) in terms of its scalar character. The longitudinal denotes the manner in which the dialectic unfolds through non-identity, absence, totality and transformative agency. The argument is technical and we will endeavour to keep terminology to a minimum in what follows. Non-identity refers to keeping elements separate; thus the transitive and intransitive realms are not reducible and rely upon detachment, what Bhaskar calls referential detachment: ‘the ontological dislocation of referent from the act of reference’ (Bhaskar, 1993, p. 212). He means by this that discourse (act of reference) must be about something other than itself (referent). Non-identity is also found in the view of communication as the referent (the object in the world, e.g. a dog) separated from the signifier (e.g. the word dog) and the signified (e.g. the concept of an animal with a tail and four legs). It is evident in the three-fold distinction between the empirical, real and actual. As Bhaskar put it:

(α) they are categorically distinct and ontologically irreducible; (β) they are normally disjoint or out of phase with one another; (γ) the activity necessary to align them for epistemic purposes normally involves practical and conceptual distanciation, typically dependent on the past and the exterior; and (δ) they may possess radically different properties (e.g. in fetishism, mediatization or virtualization they may invert, or otherwise occlude, the properties they purport to describe) (Bhaskar, 1993, p. 237).

In an assessment sense the concept of non-identity relates to the fact that what is observed (empirical regularities and patterns) in a classroom, for example students in the fourth grade devising test questions for their chosen topic on dogs, has causal mechanisms and structures that lie at a different level (real) and are non-identical (e.g. the dog has four legs and is more stable than a human with two legs). In addition, these mechanisms and structures are not always actualised (the actual) because other generative mechanisms might counter them (e.g. the dog is curled up and is not walking and the stability is not evident at that moment). The second moment of the longitudinal dialectic is absence and it constitutes a motivating or propelling force. It can refer to a contradiction between a present and an aspired-for future. In an assessment sense it might be a gap or a lack identified in a student’s learning and development. What then occurs is the move to ‘absent the absence’, to make up or remove the absence. The following Goldilocks metaphor is illustrative:

If the gap is perceived as too large by a student, the goal may be unattainable, resulting in a sense of failure and discouragement on the part of the student. Similarly, if the gap is perceived as too ‘small’, closing it might not be worth any individual effort. Hence, to borrow from Goldilocks, formative assessment is a process that needs to identify the ‘just right gap’ (Sadler, 1989, p. 130).

However, it is important to note that absence thinking, or a focus on lacks and gaps, more commonly known as deficit thinking, must not lead to the view that assessment has simply a ‘diagnose and redress’ goal.2 As Bennett (2011) has pointed out, one strain of assessment for learning has viewed it as a diagnostic tool focusing only upon improving learning outcomes. The diagnostic view sees learning and assessment in purely instrumental terms; however, much learning and assessment occurs in a non-instrumental, unplanned sense. This is integral to the other view of assessment for learning, where the process itself is the key and feedback in the classroom occupies a dominant role. The experience of learning can be as important as the outcome, and the formative can be as important as the summative. Absence exists in both the diagnostic and the process views as a lack of something that needs to be bridged: in the former diagnostic approach it is the result of a planned learning process, while in the process variant it has more of an unplanned character. The last mentioned denotes assessment based upon moments of contingency where the learner’s skills might grow suddenly and learning is achieved. The third moment in the dialectic, after the moments of non-identity and absence, is the identification of a totality. It brings together components that might otherwise not be in the same sphere of interaction. For example, teacher, student and school can constitute a totality, each with their own potentially differing interests. In dialectical thinking the point is to bring about a shift in perspective, such that at one moment it is the perspective of the teacher considering the interests of the whole class that is in focus, in the next that of the student and their individual learning, and later still that of the school, which represents a non-anthropic view. The totality constitutes a system of regularities and institutional patterns, but it is at the same time open and flexible, developing in potentially new, unforeseen directions. If the tripartite system (school–student–teacher) above includes a new state assessment regulation or curriculum goal, the totality is enlarged to accommodate a new component and a new shift in perspective is both required and possible. The last moment in the dialectic represents an enlargement of the traditional Hegelian dialectic by incorporating transformative agency. We have already mentioned this above. It might involve the student becoming aware in the previous moments of the dialectic that a change is required. For example, the student might realise that greater efforts at self-study or the adoption of new learning strategies might increase independence from the teacher and enhance self-determination. It might be the student making his or her own test.

2 In contra-distinction to this there is of course the view that appreciative assessment, like appreciative pedagogy, looks not to what is missing but to what already exists, in order to build upon it and to acknowledge it in feedback (Yballe & O’Connor, 2000). From a dialectical perspective it can also be a position that propels transformation forwards, just as the gap position does, but in this context through the need for ‘more appreciation’.
Transformative agency is therefore connected with an awareness of the potential entailed (or lack of potential because of constraints and barriers) in the move from what is to what might be or ought to be, and with embarking on acts to bring about the transformation. This last moment of the dialectic opens up the possibility of offering a moral or normative direction to agency. The direction might be on an individual level, but it might also originate at a national level, for example in a state’s desire to improve national ratings on PISA tests by introducing new forms of standardised testing in numeracy and literacy. Bhaskar appears to devote much greater detail to the elaboration of the longitudinal understanding of ontology than to the discussion of the lateral dimension of ontology. But the reader must not misread this difference, as the two dimensions are of equal importance. The lateral refers to ontology’s time and spatial component, woven together as spacetime. In our reading of the lateral it provides a ‘body’ to the ontology in the sense of how the dialectic is lived existentially, corporeally and materially. A philosopher such as Merleau-Ponty would call this the flesh of the body connecting participants in what he called a chiasm. It denotes the shared inter-corporeal space of bodies, signifier (e.g. words) and history, where the result is the corporeal experience of touched-touching, seen-seeing, speaking-spoken and so on (cited in Dobson, 2004). The well-known sociologist Bourdieu would consider what is lived existentially, corporeally and materially as embodiment in time and space, as what is carried or lived in a person’s habitus. We will depart for a moment from exemplification with the case of the students making their own tests and illustrate how the temporal and spatial assume a ‘mode of extension’ (Spinoza, 1989) in a different form of assessment practice: the viva, and how it has developed genealogically through history. The following quotations describe how many have existentially lived the viva and its intense emotional import:

The description of his viva will bring vivid recollections of similar tortures to many minds. (Athenæum literary magazine, 19 December 1891)

It is worst when the candidate appeals to feelings to compensate for a lack of knowledge. It is difficult to differentiate between crying, nerves and knowledge … but I manage usually to see what lies behind. (Professor in philosophy, 50 years old, 25 years’ experience of examining vivas)

We said, it will turn out all right and he cried after he received his final grade … we almost cried when he began to cry … he had not thought he would make it … when they are full of emotion I too am full of emotion. (Senior Lecturer in education, 45 years old, 5 years’ experience of examining vivas) (Dobson, 2017, p. 44)

The very word viva, meaning ‘by the living voice’ in Latin, is deceptive. It has to be used with the understanding that it incorporates the meanings and social practices of the earlier word disputation, in which trial and exchange of knowledge took place orally. The table below summarises the extension of the viva since the time of the ancient Greeks to the present day (Table 1.1).

Table 1.1 A genealogy of the different modes of temporal and spatial extension of the viva

Period | Characteristics of the viva | Socio-cultural, political and economic mode of extension
Classical Greek | In some Greek schools of thought: Dialectical forms of argumentation (syllogism) seeking certain knowledge | Reserved for philosophers, sons of the wealthy. Sophists rising in importance
Greco-Roman | Rhetorical use of the viva through clever and convincing argument | Skill in disputations is taught and practised in politics, law and public arenas
End of Roman era | Demise of the viva: Less open use of the persuasive speaker | Rise of the culture of Christianity. The viva is considered a source of political unrest
Middle Ages | Return of the viva, e.g. logic disputations in public | In formal, institutionalised educational settings
Modern | Viva faces criticisms of low level of standardisation and low level of transparency, but it measures a verbal reasoning and communicative competency not easily measured by other forms of assessment (e.g. written assessment). It can probe why and how something is understood and not merely what | With industrialisation, increased number of students, costly to administer compared to written examinations; but viva valued for its socio-cultural rite of passage function

The history of the viva reveals varied modes of temporal and spatial extension. To begin with, its supporting structure was wealthy Greek males who sought truth through syllogistic argumentation. We find evidence of this in the Topics by Aristotle: ‘a method by which we shall be able to reason syllogistically from generally accepted opinions about any problems brought forward, and shall ourselves, when under examination avoid self-contradiction’ (Aristotle, 1958, I:100, a18–21). In Greco-Roman times it gained a rhetorical goal and was found in law and politics. With Christianity public disputations were believed to be a threat and the viva disappeared from public arenas. With the rise of the pastoral the once public questioning of belief and accompanying penitence of one’s sins became privatised. When the viva returned it recovered the educational goal it once had, this time institutionalised in the university of the Middle Ages. Wilbrink (1997, p. 35) contends: ‘What really was innovative and characteristic of the universities, as new institutions, was the examination by a committee of masters.’ He discounts the view that the Chinese were the inspiration for university examinations in Europe. Their written examinations did not particularly resemble their European counterparts. Another more probable explanation is that the idea came from the Muslim world. Already in the eleventh century the disputation was an important instrument in the development of Muslim law. In modern times, marked by the rise of industrialisation and mass society, the viva faces renewed criticism; it is costly compared to written standardised forms of assessment. It has gradually disappeared in many institutions of higher education, except in subjects such as law, medicine, foreign languages and some professional qualifications at undergraduate level and in defence of doctoral theses. This decidedly Anglo-Saxon narrative remains blind to the use of the viva in other cultures. Notwithstanding this criticism, in the present context it does illustrate the manner in which a form of assessment gains a lateral extension in time and space; at times institutionalised and public, at other times privatised and apparently gone to ground.
The narrative of the Anglo-Saxon viva also moves the account toward the last way in which the ontological understanding of the world is deepened, namely, in terms of the scalar which refers to social being, more specifically the four-planar social being and what Bhaskar calls the social cube.

1.2.2 The Social Cube and the Structure–Agency Distinction

When critical realists conceptualise the social it reminds us of an interdisciplinary project where insights from a number of disciplines are incorporated into what appears to be a cross-disciplinary standpoint (Huutoniemi, 2016). The reader encounters a view of social being that makes it possible to appropriate and integrate concepts from the fields of sociology, political science, psychology, linguistics, cultural studies, philosophy, economics and so on. Such an interdisciplinary perspective has a precedent in the work of Elias (1986). He used the term figuration to denote the meeting point of these disciplines and how this was necessary in order to respond to the ontological complexity of the world, based in his opinion upon processes, structures and inner- and outer-determined personality dispositions. Critical realists rarely refer to Elias. What both positions share is a desire to theorise and conceptually understand the manner in which social relations of the world become sedimented and yet never totally solidified, remaining open to transformation through emergent, generative mechanisms and structures. Different perspectives and disciplines are brought together in the manner of a cubist painting by Picasso. The viewer of the painting and the painter seek to create a more holistic representation of the world paying respects to all traditions. We have already cited the importance of bridging in the introduction, which is another way of conceptualising the same point, namely how to bring different mechanisms and structures together in the lateral extension in time and space. In Dialectic: The pulse of freedom (Bhaskar, 1993), social being, in line with his earlier works, is understood as a meeting place for a layered, or as Bhaskar would call it, a laminated structure and a laminated self.3 Put differently, the social has two generative or causal points, that of structure and that of agency, where each has its own properties and generative mechanisms. It is not a kind of dialectic between the two, as might be found in the work of social theorists such as Giddens (1984) with his theory of structuration. Instead, a dualism is proposed, such that structures afford agency and agency in turn includes the intentions and interpretations of actors that can reflexively modify social reality and its structures. Such a standpoint with an in-built dynamic avoids the ‘twin errors of reification and voluntarism’ (Bhaskar, 1993, p. 258). How is such a dualist version of the social further conceptualised? Bhaskar proposes that the social should be understood as a social cube composed of four interacting planes, and in so doing he moves the conceptual approach beyond the simple dualism of structure and agency. As he puts it:

Four dialectically interdependent planes constitute social life, which together I will refer to as four-planar social being, or sometimes human nature. These four planes are (a) of material transactions with nature; (b) of inter-personal intra- or interaction; (c) of social relations; and (d) of intra-subjectivity. Important discriminations must be made at each level, thus at (c) we can differentiate power (including hegemonic/counter hegemonic), discursive and normative relations (to which they correspond at [b] power, communicative and moral relations) (Bhaskar, 1993, p. 153).

By way of illustration, here are assessment practice examples from each of the planes, each with their own generative mechanisms and structures: (a) material transactions with nature, where nature can be unformed matter but also formed objects (e.g. the architecture and planning of exam venues or the printing of assessment criteria), (b) social interaction between agents (e.g. judgement-making patterns among examiners that impact upon assessment practices), (c) social relations influenced by social structures through power, discourse and norms (e.g. the manner in which examining boards as part of the structure of examining monitor complaints and influence the examining of examiners), and (d) the stratification of embodied personalities of agents. In the last mentioned we are thinking of how (a), (b) and (c) above are influenced by the self-esteem, motivation and values of the assessors and assessed. This is the sphere of the self capable of expressive communication and stratified or laminated by consciousness, unconsciousness and affective forces. The self thus conceived is the dispositional identity of the subject with shifting causal powers (conscious, unconscious and affect). It is important to note at this point there will be occasions in this book when the self will be understood to refer to something more than a conscious, unconscious, affectual and communicative entity. It will be understood additionally as an existential entity formed over time and through experiences; what we term bildung (Steinsholt & Dobson, 2011). We have already hinted at this in the definition of the assessment act in the introduction, where corporeal components such as a teacher’s hard and abrupt tone in communicating feedback provide existential insight into how the teacher’s sense of self is lived. To re-quote the words of Heidegger (1927/1962, p. 176): ‘Existentially, a state of mind implies a disclosive submission to the world, out of which we can encounter something that matters to us.’ This approaches a key point in this book, namely the tacit or taken for granted side of assessment practices and how it is carried by the self. It can be summed up in the phrase ‘I cannot tell you what is excellent, but I know when I see it.’ Sometimes the words and language of assessment are not available or are insufficient to talk of all that supports our assessment judgements. The final words of Wittgenstein’s (2001) Tractatus Logico-Philosophicus are similarly appropriate, ‘whereof we cannot speak, thereof we must remain silent’ (7). In sum, when it cannot be said it might be shown and recognised as quality.

3 Bhaskar is inspired on this point by Collier (1989).

1.3 Methodology

To recap, critical realism argues for a distinction between the objects of knowledge (ontology) and the conditions for knowledge about the objects (epistemology). This has a number of implications. Given a particular theory of the world, let us say of feedback in the classroom, what kind of events will be noticed and ‘picked up’ by the theory and by implication what will be left unnoticed? It is not enough to observe events and thereby to assert a causal line of connection between events (empiricism). What is required is the exposing of generative mechanisms and structures that support the existence of the events, in our context assessment acts undertaken by different stakeholders. Another way of putting this is to say that the first question, before epistemology, is one of ontology, and entails ‘an account of what the world must be like for knowledge, under the descriptions given it by the theory’ (Bhaskar, 1993, p. 205; see also 1998, p. 8). By answering this question, the move is made toward revealing the generative mechanisms and structures and the realm of epistemology is entered, identifying what kind of knowledge is necessary and desirable and also why. Thus, in the example of the two teachers producing knowledge on the basis of letting the students develop tests in certain subjects, it is necessary to consider what kind of students are involved, including their family backgrounds and past learning experiences. It also entails asking how the tests are developed, in which subjects, the connection of the tests to the curriculum, the resources available, the amount of tutoring, whether they worked in groups and so on. In clarifying these things, the ontology of the world is described (‘what the world must be like’). But something else is being accomplished. The foundation is created for the teachers writing their own report to explore beneath observations of student activity in order to identify generative mechanisms and structures accounting for the tests produced by the students. The conditions for knowledge are being established and clarified; this is to enter the realm of epistemology, whereby knowledge is context sensitive and emergent, about these specific students. It includes theories of learning, motivation and identity formation, and it also includes asking, using assessment principles, if the tests are valid, transparent, reliable, fair and accountable. In clarifying ontology and epistemological conditions, and in looking for generative mechanisms and structures, critical realists have developed their own methodology, what they call retroduction. It entails asking, as already stated, ‘what properties do societies and people possess that might make them possible objects for knowledge’ (Bhaskar, 1998, p. 13) instead of the more common epistemology of knowledge question: How is knowledge possible? What then is retroduction? We have already begun to provide the answer. The two teachers formulated the following research question for their action research project: Can student participation in the learning process and in the devising of tests influence student motivation for learning? In order to answer this question they adopted implicitly, rather than explicitly, a critical realist retroduction approach, and were asking the following: ‘Why do we have data that suggest X exists?’ X in this case might be a certain pattern of observed assessment acts whereby the students indicate that they are more motivated to learn. 
Why does the data suggest it has P1 and P2 qualities? P1 might be the learning performance of the stronger students, and P2 of less strong students, indicating a generative mechanism located in the academic strength of the different students. Can we reform our assumptions about X in order to make good assertions about X, P1 and P2 that are still consistent with our experience?4 The answer to these questions and to their formulated research questions required revealing the generating mechanisms and structures supporting the development of these student tests. It also entailed the two teachers repeatedly reviewing their own knowledge of motivation theories and testing and admitting their need to revise and update this knowledge. In sum, the two teachers writing the report were asking about ontology and epistemology because they were asking: How do we know that the students were motivated by devising their own tests? We arrive at the point that retroduction is searching for the generative mechanisms and structures operating to bring about the phenomenon, in this context motivating students through the assessment act of making their own tests. Danermark et al. provide a succinct definition:

Retroduction is about advancing from one thing (empirical observation of events) and arriving at something different (a conceptualization of transfactual conditions). The core of retroduction is transcendental argumentation, as it is called in philosophy. By this argumentation one seeks to clarify the basic pre-requisites or conditions for social relationships, people’s actions, reasoning and knowledge (Danermark et al., 2002, p. 96).

4 These questions are inspired by Olsen (2007).

Paraphrasing, clarifying methodological conditions means revealing generative mechanisms and structures. An example is found in research on the viva, in which it is possible to ask how the viva is talked into being such that the participants ‘pull it off’ (Dobson, 2017). This interest could entail searching for the micro-conversational related conditions that have to be in place for participants to feel at ease. Everything from the welcoming ‘Please sit down and take out your dissertation for reference during the viva’, to making the assessment act transparent to the candidate at the outset: ‘You can say a little about your thesis and then we will ask questions that will broadly speaking follow the dissertation from the first page to the last’ (Dobson, 2008, p. 21). In the case of the two teachers undertaking an action research project, a similar search for the generative mechanisms and structures was set in motion as they trialled different ways of letting the students devise their own tests and noted the difference in how stronger and less strong students performed. Retroduction is not therefore completely deductive, where a predetermined theory is simply tested. The context always guarantees that something unexpected might become apparent and require a revision of the theory, for example a mechanism might suddenly appear or cease to appear as a causal factor. Neither is it completely inductive, because some hunch or known cause might already be in place directing vision; for example, one known cause might make itself evident, but not others. In retroduction inference is required to postulate and also find generative mechanisms and structures capable of producing the event. Several theories may be required to pinpoint and reveal the generative mechanisms and structures. 
This makes it possible to practise a cross-disciplinary approach where different disciplines might potentially make a theoretical contribution depicting and understanding the different planes of the four-planar social being as the assessment act is expressed laterally in space and time, and corporeally. Danermark et al. (2002, p. 103) make the argument that Mead (on conditions of symbolic interaction through self, me and the generalised other), Marx (the logic of capital, exploitation and alienation), Goffman (life on stage, off-stage, framed and lived through the playing of roles), Habermas (the necessary conditions for reaching understanding through communication, termed universal pragmatics), Bourdieu (the meeting of structure and action in the lived life of a person’s habitus) and Giddens (the theory of structuration) are important (critical) theorists because in their respective theories they explore the conditions, and generative mechanisms and structures, that make specific activities possible. We would also add a few more theorists who might potentially contribute to an understanding of assessment acts. The early Wittgenstein (1921/2001) attempted to delimit what we can and cannot say (and do) with language, and this echoes our attempt to delimit what is and is not an assessment act (see introduction). The later Wittgenstein (1953/1967) modified his approach, becoming more interested in the language games we use to organise our different activities, such that using a particular language game (e.g. asking a question to elicit feedback) is to partake of a particular form of life (e.g. adopting assessment for learning). This can inspire an understanding of assessment in general and different forms and practices of assessment in particular (e.g. assessment for learning) as so many different language games, each with their own forms of life and enacted grammars of what is permissible and acceptable in undertaking an assessment act. Even Foucault, not an especially welcome figure among critical realists (Al-Amoudi, 2007), can be seen as an advocate of retroduction, despite the fact that a reading of his work can risk the linguistic fallacy of reducing an ontological understanding of the world to a series of discourses. Foucault’s (1972, 1984, 1988) retroduction rests upon a Nietzschean (1886/1973, foreword) stance, asking not what is the truth of a phenomenon, but how is truth constructed, under what conditions and in whose interests. A central concern in this respect for Foucault’s ontology is knowledge and its connection with power: the will to power as the will to knowledge.

1.4 Summary

In this chapter a number of critical realist concepts have been rehearsed. They are interrelated in the sense that using one of the concepts might imply reference to the others. Another way of putting this is to say that a fertile reading of the concepts of critical realism sees them as a connected chain of concepts, where the links between the concepts are as important as the concepts themselves. In terminology inspired by Wittgenstein we might say that they share a family resemblance:

I can think of no better expression to characterize these similarities than “family resemblances”; for the various resemblances between members of a family: build, features, colour of eyes, gait, temperament, etc. etc. overlap and criss-cross in the same way. – And I shall say: “games” form a family. For instance the kinds of number form a family in the same way. Why do we call something a “number”? Well, perhaps because it has a direct relationship with several things that have hitherto been called number; and this can be said to give it an indirect relationship to other things we call the same name. And we extend our concept of number as in spinning a thread we twist fibre on fibre. And the strength of the thread does not reside in the fact that some one fibre runs through its whole length, but in the overlapping of many fibres. (Wittgenstein, 1953/1967, § 67).

However, in this book it is important to keep in mind that assessment is the central point of interest. So, our focus is the manner in which critical realism can extend and illuminate our understanding of assessment and its social practice. The table below summarises these concepts and our understanding of them in this book (Table 1.2). By way of a summary we will re-visit the example of the two teachers undertaking an action research project whereby they followed and assisted students in making their own tests.

Table 1.2 The key concepts of critical realism

Key concepts from critical realism | Meaning
Ontology | Intransitive realm of being
Epistemology | Transitive realm of knowledge
Empirical (world) | Experience and observation of regularities
Real (world) | Generative mechanisms and structures as causal
Actual (world) | The generative mechanisms and structures of the real are actualised, but can remain unobserved
Dialectic | Dialectic has four longitudinal moments: Non-identity, motivating absence, totality and transformative agency
Mode of extension | The lateral: The manner in which the temporal and spatial is expressed, also including the existential and corporeal of lived experience
Structure–agency (not as a dialectic, but as a duality, each with its own mechanism and structures) | Structures afford agency through power, communication and normativity, giving rise to relatively enduring relations. Agency by contrast entails intentions and interpretations of structural influences, constraints and opportunities
The social cube, laminated explanation, four-planar social being | The four planes: Material transactions with nature, social interaction, social structures and stratified embodied personalities
Retroduction | Transcending the empirical observation of facts to reveal generative mechanisms and structures

On this occasion we will make use of some of the concepts from the table above. The object of knowledge (ontology) is the students in the fourth and seventh grades, and the two teachers, as part of a master’s course in assessment, are writing an action research report (epistemology). They made numerous observations during the project period lasting several months, identifying patterns and regularities (empirical), but they transcended this (retroduction) to study generative mechanisms and structures (real and actualised) accounting for their causal existence. This entailed developing a deeper and more detailed understanding of theories of motivation and the manner in which the students worked in groups closer to their competence as they devised their tests. The students experienced different levels of motivation to learn, connected with how as individuals (agency) they differed internally in their motivation to close their respective competence gaps (absence) to achieve greater self-determination and independence from the teacher (transformative agency). The students were members of a social cube composed of material interactions with the curriculum (e.g. devising questions about dinosaurs), social interaction with fellow students and teachers, within and contributing to the reproduction of the social structure of the school as an institution with rules and a culture. Lastly, the focus on motivation indicated an interest in the stratified personality of the students where other components such as cognitive reasoning and emotional strength and maturity played a role. In a sentence, the example of the students making their own tests can reveal a deeper interest in what the world must be like, in terms of the conditions, read generative mechanisms and structures, that make it possible for students to devise their own tests and in the process become more strongly motivated to learn. In the following chapters, while retaining an interest in all the critical realist concepts presented hitherto, we will place a primary emphasis on elaborating the generative mechanisms and structures supporting assessment practices. The reason for this is twofold. Firstly, the conceptual chain of critical realism can at times appear overwhelming with too many concepts that are too tightly woven. This is especially the case if certain concepts, such as generative mechanisms and structures, and accompanying links in the chain are not highlighted for closer inspection. Secondly, since the goal of this book is to bridge critical realism with assessment theory, too active a use of the complete chain of critical realist concepts might make it difficult to see what exactly is being bridged with assessment theory. The last mentioned, assessment theory, is understood in this book as assessment principles and general theories of learning, motivation and identity formation.

References

Adorno, T. (1973). Negative dialectics (trans. E. B. Ashton). New York: Seabury Press. (Original work published 1966).
Al-Amoudi, I. (2007). Redrawing Foucault’s social ontology. Organization, 14, 543–563.
Aristotle. (1958). Topica et sophistici. Clarendon Press.
Bennett, R. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy and Practice, 18(1), 5–25.
Bhaskar, R. (1975). A realist theory of science. Harvester Press.
Bhaskar, R. (1978). The possibility of naturalism: A philosophical critique of the contemporary human sciences. Routledge.
Bhaskar, R. (1986). Scientific realism and human emancipation. Verso.
Bhaskar, R. (1993). Dialectic: The pulse of freedom. Verso.
Bhaskar, R. (1998). The possibility of naturalism: A philosophical critique of the contemporary human sciences (3rd ed.). Routledge.
Bhaskar, R. (2000). From east to west: Odyssey of a soul. Routledge.
Bourdieu, P. (1998). Practical reason: On the theory of action. Stanford University Press.
Collier, A. (1989). Scientific realism and socialist thought. Pluta.
Creaven, S. (2009). Against the spiritual turn: Marxism, realism, and critical theory. Routledge.
Danermark, B., Ekström, M., Jakobsen, L., & Karlsson, J. (2002). Explaining society: Critical realism in the social sciences. Routledge.
Dobson, S. (2004). Cultures of exile and the experience of refugeeness. Peter Lang.
Dobson, S. (2008). Theorising the viva – A qualitative approach. Assessment and Evaluation in Higher Education, 33(3), 277–289.
Dobson, S. (2017). Assessing the viva in higher education: Chasing moments of truth. Springer International.
Elder-Vass, D. (2004, August). Re-examining Bhaskar’s three ontological domains: The lessons from emergence [presentation]. In IACR conference.
Elias, N. (1986). Introduction. In N. Elias & E. Dunning (Eds.), Quest for excitement: Sport and leisure in the civilizing process (pp. 3–43). Basil Blackwell.
Farrugia, D. (2013). The reflexive subject: Toward a theory of reflexivity as practical intelligibility. Current Sociology, 61(3), 283–300.
Farrugia, D. (2015). Addressing the problem of reflexivity in theories of reflexive modernisation: Subjectivity and structural contradiction. Journal of Sociology, 51(4), 872–886.
Fernandes, M., & Fontana, D. (1996). Changes in control beliefs in Portuguese primary school students as a consequence of the employment of self-assessment strategies. British Journal of Educational Psychology, 66(3), 301–313.
Foucault, M. (1972). The archaeology of knowledge (trans. A. M. Sheridan Smith). Tavistock.
Foucault, M. (1984). The history of sexuality: An introduction. Penguin Books.
Foucault, M. (1985). Madness and civilization. A history of insanity in the age of reason. Tavistock.
Foucault, M. (1988). Technologies of the self. In L. Martin, H. Gutman, & P. Hutton (Eds.), Technologies of the self: A seminar with Michel Foucault (pp. 16–49). University of Massachusetts Press.
Fraser, N., & Honneth, A. (2003). Redistribution or recognition? A political-philosophical exchange. Verso.
Giddens, A. (1984). The constitution of society: Outline of the theory of structuration. Polity Press.
Hartberg, E., Dobson, S., & Gran, L. (2012). Feedback i skolen [feedback in the school]. Gyldendal.
Heidegger, M. (1962). Being and time. Blackwell. (Original work published 1927).
Honneth, A. (1996). The struggle for recognition: The moral grammar of social conflicts. MIT Press.
Huutoniemi, K. (2016). Interdisciplinarity as academic accountability: Prospects for quality control across disciplinary boundaries. Social Epistemology, 30(2), 163–185.
Kvale, S. (1990). Evaluation and the decentralisation of knowledge. In M. Granheim, M. Kogan, & U. Lundgren (Eds.), Evaluation as policymaking (pp. 119–140). Jessica Kingsley.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan.
Nietzsche, F. (1973). Beyond good and evil (trans. R. J. Hollingdale). London: Penguin Books. (Original work published 1886).
Olsen, W. (2007). Critical explorations in methodology. Methodological Innovations Online, 2(2), 1–5.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Singh, S. (2018). Anchoring depth ontology to epistemological strategies of field theory: Exploring the possibility for developing a core for sociological analysis. Journal of Critical Realism, 17(5), 429–448.
Spinoza, B. (1989). The ethics. Dent and Sons.
Steinsholt, K., & Dobson, S. (Eds.). (2011). Dannelse: Utsikt over en ullendt pedagogisk landskap [Bildung Introduction to an opaque educational landscape]. Akademika.
Yballe, L. D., & O’Connor, D. J. (2000). Appreciative pedagogy: Constructing positive models for learning. Journal of Management Education, 24, 474–483.
Wilbrink, B. (1997). Assessment in historical perspective. Studies in Educational Evaluation, 23(1), 31–48.
Wittgenstein, L. (1967). Philosophical investigations (trans. G. E. M. Anscombe). Oxford: Basil Blackwell. (Original work published 1953).
Wittgenstein, L. (2001). Tractatus logico-philosophicus (trans. D. Pears & B. McGuinness). London: Routledge. (Original work published 1921).

Chapter 2

New Forms of Society: New Forms of Assessment

In very truth, we have become an ‘assessment society’, as wedded to our belief in the power of numbers, grades, targets and league tables to deliver quality and accountability, equality and defensibility as we are to modernism itself (Broadfoot & Black, 2004, p. 19).

It is difficult being the executioner (Teacher interview; Dobson, 2008, p. 280).

Abstract

In this chapter the argument is made that changing forms of society rely upon specific generative mechanisms and structures that support and make possible forms of assessment and accompanying assessment acts. The emergence of the contemporary knowledge society occupies a central place. It is exemplified through the following: the OECD’s proposal of key competencies, just-in-time assessment, the debate on the ‘diploma disease’ and the global trend toward criteria referencing as a way of ensuring the attainment of learning objectives in curricula. Even though some have looked to the promise of blockchain and micro-credentials/nano-credentials/massive online open courses, the knowledge society has not been matched by the inclusion of all students. Exclusionary assessment practices reminiscent of industrial society in its heyday are still present. In the terms of critical realism, the existence of assessment mechanisms and structures can result in not all students gaining qualifications, accompanying skills and cultural capital that can be converted later into other uses and forms of value such as desired employment. Students as a consequence might not experience a stratified self characterised by greater control over their futures and enhanced transformative agency.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_2

The writers of the first quotation above highlight a global trend; namely the penetration of modernist assumptions about assessment grounded upon calls for rational control, efficiency and cost-effective accountability. As another commentator (Scott, 1998, p. 34) has put it, ‘tests are among the primary mechanisms that we use to “norm” ourselves and locate ourselves within societal hierarchies’. In other words, assessment is used to rank, select and to reinforce rather than challenge the meritocracy. Assessment culture is a central part of contemporary society. Yet assessment culture has also shifted from what Addey (2018) calls ‘philosophical doubt’ toward ‘statistical certainty’, exemplified by the widespread implementation of OECD’s PISA (Programme for International Student Assessment) and IEA’s TIMSS (Trends in International Mathematics and Science Study) across the globe. Writers inspired by Foucault (1984, 1988) would go further and assert that assessment pervades every molecular nook and cranny of society, and in so doing represents one more technology of the self, governing our socialisation, our bildung (Dobson, 2013), moving through the capillaries of our bodies on a micro or even nano-scale. Through an objectifying gaze upon others in everyday life we also undertake assessment acts and make judgements using different categories (taller, heavier, more or less immature, foolhardy, admirable and so on). In this sense we all possess some experience of being both judge and executioner and making and passing judgements. Some are also responsible for a more institutionalised form of the objectifying gaze supported by data, theories and science. They make assessments and offer different forms of diagnosis: the teacher in schools, health and care personnel in hospitals, and other workers in different organisational settings. As a corollary, we have all had the experience of being on the receiving end of the objectifying gaze of others, and thus felt sentenced and in some cases executed, lacking a path of flight or escape. 
The main argument in this chapter is that new forms of society, and most recently the emergence of the knowledge society understood in broad terms, support the move toward criterion referencing, key competencies, just-in-time assessment and an intensification in the importance of credentialism, most notably forms of micro-credentials. Of crucial importance to our argument inspired by critical realism is that these forms of assessment practice are intimately connected with and made possible by cultural, socio-economic, technological and political mechanisms and structures found in the knowledge society. These in turn impact on the different levels of the four-planar social being (material transactions with nature, social interaction, social structures and stratified embodied personalities). By undertaking retroduction the goal is to reveal the manner in which these generative mechanisms and structures afford or alternatively constrict the transformative agency of different stakeholders such as students and teachers as they undertake acts of assessment. Some might go as far as to say, and it is worthy of further debate, that any particular assessment act will advantage some and disadvantage others. What is important in such an argument is how this might be acknowledged, legitimised (i.e. normalised) and taken for granted in different cultures in different ways. Bourdieu devoted a whole book, Distinction: A social critique of the judgement of taste (Bourdieu, 1984), to the manner in which mechanisms and structures in France create and value different symbolic and cultural practices. For example, going to an art gallery can be perceived and judged to be a form of taste more highly valued, elitist and exclusionary than, say, going to a soccer match. To appreciate and assess these different kinds of cultural practices specific learning activities are required and these are aligned to a person’s or group’s cultural and socio-economic background.

2.1 Assessment from Ancient Times

The ancient Greeks did not have formal examinations in the sense that we think of them, but they did practise the view that certain segments in society should be entitled to cultivate different types of knowledge. Aristotle’s (Aristotle, 1981) famous tripartite epistemology of knowledge is a case in point, where episteme (reasoned knowledge, what today we would call scientific knowledge) was reserved for those schooled in philosophy (wealthy men, and not women or slaves), techne (instrumental practice-based knowledge) was exemplified by artisans, and lastly phronesis was a form of knowledge practised by rulers and those making political judgements. The extent to which Aristotle institutionalised training in dialectics (διαλεκτική – the art of arguing) in the pursuit of episteme in the Peripatos, his own school for pupils, is impossible to determine (Grayeff, 1974). Nevertheless, in the Peripatos he followed Plato’s own precedent in the Parmenides of stating antithetical views. The goal was to keep things open through a never-completed questioning. Formalised examinations, such as the Imperial Examination, were found in China from 506 CE in the Sui Dynasty, and used as a means of recruitment to the civil service. A small vignette illustrates the importance of these examinations in some cultures. I (Dobson) completed my Magistergrad degree in sociology in the late 1980s at Oslo University, Norway. The degree itself no longer exists, having been replaced by the master’s degree. It required a research-based dissertation, a public lecture on a given topic with one week’s notice, two ten-hour written examinations (with a day in between them) and a viva. My chosen topic was Vietnamese refugees’ experience of resettlement in Norwegian towns. The empirical research looked into the cultural history of the Vietnamese.
I was inspired in particular by a book entitled Vietnamese tradition on trial 1920–45 (Marr, 1984) in which I learnt that Vietnam, like its Chinese neighbours, followed the thousand-year-old tradition of demanding exams at county, province and palace level for those wishing to rise in the ranks of the state bureaucracy. It was a part of the cultural structure of Vietnamese society, embracing Confucian values of respect to elders and occupying a counterpoint to the knowledge promoted and imposed by the French colonial system in the period after 1885. European culture by this time had already had hundreds of years of university-administered examinations. Moreover, the Chinese emphasis on the written examination was countered by the role of oral examinations in European universities, prior to the subsequent rise of written examinations from the eighteenth century. We know for example that the oral disputation was a compulsory daily activity among students attending Oxford and Padua in the fourteenth century (Perreiah, 1984). Such daily disputations had among other things a didactic (teaching) goal and we find here an early example of assessment as learning. That is, being able to talk of knowledge orally was seen as a skill worth acquiring for later use in activities after graduation. Nordkvelle (2003) is surely correct to note the importance of new didactic advances based upon the innovations of Hugo St Victor (1097–1141) and Thomas Aquinas (1225–1274). They brought a greater emphasis on monological as opposed to dialogical forms of teaching: one teacher to many listening students. But Nordkvelle (2003, p. 320) also draws attention to wider changes in societal mechanisms and structures supporting forms of assessment, in this case oral assessment: ‘It was not until the Italian humanists “rediscovered” the dialogue in the 15th century that it regained its previous status, and this time of a fundamental rhetorical ground.’ Politeness and amicable conversation were considered essential skills in an expanding international economy, and the art of rhetoric was an essential skill in this respect. Lost classical works such as Quintilian’s Institutio Oratoria and Cicero’s De Oratore were hunted down. In other words, the oral forms of assessment extended in time and space within the university must also be connected with the needs of the wider, emerging international economy. This would in turn support industrialisation. It is important to note however that, despite the university examination system, the meritocracy had yet to determine recruitment to forms of employment. Personal connections still occupied a dominant position; whom you knew and your family lineage were more important than what you knew.

2.2 Modern Society

With modern industrial society, the advent of mass education created the need to break with pre-modern ascribed and informal qualifications for occupational positions in society. Broadfoot (1996, pp. 67, 86) argues that this is connected with a) the fact that ‘instrumental and social functions are now organised on the basis of the individual’, b) society becoming more rational, and c) societies’ contradictory desires to increase social integration, while at the same time maintaining inequality and exclusionary practices. With this socio-economic background the love affair with the written examination began and the oral assessment of the Middle Ages gradually lost its position as the preferred form of assessment practice. In industrial society there was a need for a more skilled workforce, but it is important to remember that not all were to be qualified. As a consequence, the assessment system that took shape was specifically designed both to select and to de-select and refuse, and in some cases to exclude the majority. In industrial society, while many aspire to complete high school with the grades to guarantee entrance to higher education institutions or other forms of further training, many are not successful. Many argue that the dominant mechanisms and structures of society have been transformed. The new ontology in which we now live has been given different names. Giddens (1991) has called it late modernity, indicating that yet another twist in capitalism has taken place, with shorter production and consumption cycles. Qvortrup (2003) calls it the emergence of the hypercomplex society with a greater number of differentiated sub-systems (e.g. the judicial, the family, the school and so on), each with their own regime of knowledge, and codes and rituals of assessment. Common to these theories is the belief that knowledge and information are more and more important, a point reinforced by the role of the internet and electronic communication.
Castells (1996) touched upon this in terms of the mode of extension in time and space. He talked of the space of flows, specifically the new industrial space and new service economy organised around non-specified local and global information-generating units, and how it seeks to supersede the space of places, based upon the ‘marking of places, the preservation of symbols of recognition, the expression of collective memory in actual practices of communication’ (Castells, 1996, pp. 350–51). With COVID, digital connectivity, and with it all thoughts of those who have access and those who are excluded, follows up Castells’s point (Mercer, 2010). For some years we have been witnessing what some have called the arrival of the knowledge society and an accompanying set of ontological experiences identifying how organisations learn (Senge, 1990). We use the term knowledge society in preference to the ones above, without intending to discount their views. On the contrary, we incorporate their views into our emphasis upon knowledge and communication. By knowledge society we do not mean the inevitable demise of industrial society. Rather, knowledge is assuming a more crucial role in the dynamics of how industries function and develop in global, national and local societies. A well-known spokesperson for the emergence of the knowledge society is the educationalist Hargreaves. In one of his books, Teaching in the knowledge society: Education in the age of insecurity (Hargreaves, 2003), he offers the following definition: ‘In truth a knowledge society is really a learning society. I argue that knowledge societies process information and knowledge in ways that maximize learning, stimulate ingenuity and invention, and develop the capacity to initiate and cope with change’ (Hargreaves, 2003, p. 3). As a corollary, the emergence of the knowledge society has led to changes in curriculum and assessment.
As Looney and Klenowski have formulated it: At the turn of the [21st] century there was a move away from assessing knowledge and products to assessing skills, understandings and processes. Also, rather than assessment occurring at the end of a course through external means, assessment was taking place throughout the course. A greater variety of methods and evidence was sought to demonstrate learning instead of relying only on written methods, and this was accompanied by a shift from norm referencing to criterion referencing, with less reliance on pass or fail summative assessments and more attention paid to identifying strengths and weaknesses formatively and recording positive achievement (Looney & Klenowski, 2008, p. 181).

Simply put, the knowledge society has become increasingly interested in the move from predominantly assessing learning (summative) to a greater emphasis on assessment for learning (formative). Accompanying this trend, assessment researchers have shown a renewed interest in forms of performance-based testing (Messick, 1994). Of course, to call it a renewed interest is perhaps a misnomer, since those interested in vocational assessment have always been interested in assessing the process of work undertaken by apprentices, as much as the product and the competency standard achieved. Similarly, the assessment of sport, music, art and other aesthetic subjects has always shown an interest in performance or authentic assessment and the manner in which students have reached their results. As Eisner (1999, p. 660) has put it, performance assessment ‘affords us, in principle, an opportunity to develop ways of revealing the distinctive features of individual students’ as opposed to mass, standardised forms of assessment where conformity remains the dominant logic. The emphasis is thus upon the view that an assessed performance can more precisely replicate the reality of the process in a ‘real’ situation, and hence enhance its validity.1 At issue here is the disconnect between the labour market or real situation and the school-based context (Ewing, 2017). As an example, consider the Wolf Report (Wolf, 2011) in England, which criticised vocational education for offering an impoverished general education in which 16–19 year-olds on vocational courses no longer study English and mathematics, even if they did not pass them in national examinations at 16. It also objects to ‘a diet of low-level vocational qualifications, most of which have little or no labour market value’ (p. 7). Research from the Netherlands suggests that the key challenges for apprenticeships are the quality of workplace learning (content, guidance, assessment) and the quality of the connection between workplace and school learning (Onstenk & Blokhuis, 2007). The knowledge society’s focus upon criterion referencing (e.g. meeting a minimum criterion, standard or threshold) as opposed to norm referencing (using a normal distribution of all student performance to determine the number receiving a C or an A for example) reflects not only the move to assess skills and processes, but also the increasing role played by generative mechanisms and structures designed to enhance accountability for the teacher and the school leaders in an ongoing sense, rather than through a once-a-term or annual assessment of students. Moreover, even though criterion referencing is a global trend, its implementation and use are complicated and variable. Accordingly, on the social structure level of the social cube it is evident that institutional arrangements vary between countries, with the body designing national curricula not necessarily communicating well with the national agency responsible for devising its assessment.
This can be compounded by the following situation identified in some Latin American countries, to take an example: Almost always, the curricular matrix for developing assessment instruments is devised by means of a ‘specifications table’ that outlines how test items should be drawn up so as to cover a given number of priority goals – although they do not account for all of the official curriculum or its complexity. Naturally, the negative outcome of this situation is a weak framework for interpreting results, since there is no professional or societal agreement as to what the students in the system should know and be able to do. Added to this is the problem that content that can be measured easily is often accorded priority, even if it is not necessarily the most important content as perceived by system stakeholders (Ferrer, 2006, p. 20).

Not only is there, therefore, a disconnect between the labour market and school-based assessment contexts in a knowledge society, as found for example in vocational subjects, but there can also be little correlation between what students should know and be able to do, the curriculum and the assessment instruments. Countries have tried to combat this. Colombia is a case in point, where different state bodies working together have defined a significant set of complex learning skills in several areas of the curriculum covered by the state examination. These skills, which facilitate educational assessment at distinct and clearly defined achievement levels, offer a more robust framework for interpreting results than that provided by assessments designed on the basis of a typical specifications table (Ferrer, 2006, p. 21).

1 Greater validity can come at the cost of reliability, where those assessing in the workplace have their own interests and agenda, as evidenced in England in the 1990s in vocational programs of study: ‘There is an inherent tension here. The person who is in theory best able to judge a candidate’s performance may be the least well placed to do so effectively. The assessment does not take place in a vacuum, but within a social context … the experience of professional groups which have practiced forms of competence- and workplace-based assessment for some years suggests that, in this situation, “objective”, standardised judgements are difficult to obtain …’ (Wolf, as cited in Rychen, 2003, p. 181).

Note the emphasis on skills and competence; this is something to which we shall return shortly. It is also questionable whether norm referencing can be so easily relinquished in favour of criterion referencing. On the social level of the social cube, generative mechanisms and structures support social interaction whereby we are apt to compare students in the staffroom with phrases such as, ‘This class has more demanding students than last year. How is it going with your students this year?’ Similarly, in the classroom perceptual information can also be collected and interpreted normatively: student B is more alert than normal, student C is looking out of the window more than usual. Davis has expressed this in the following terms: Criteria for the application of the term ‘difficult’ tend to shift erratically and confusingly … Content difficulty has to be appraised by means of direct inspection of the material, perhaps by those deemed to have appropriate professional expertise and experience. Such a process is open to variation in judgements between those doing the inspecting, however conscientiously this is carried out. It is quite natural for such opinions to be tainted by thoughts about the likely percentage of students who will succeed on the test items in question. So norm-referencing furtively resumes its influence (Davis, 1998, p. 8).

In a later chapter forms of criteria and the connection with taxonomy thinking will be discussed.

2.3 The Meaning of Competence in the Knowledge Society

Advocates of the knowledge society have typically emphasised a number of key skills that can be found across domain-specific subjects. Some examples are illustrative. Firstly, the National Council for Curriculum and Assessment (2003, p. 20) in Ireland drew upon the following list of key skills: ‘learning to learn; information processing; communication; personal effectiveness; critical thinking; working with others’. Secondly, an example connected to curriculum is taken from Education Queensland in Australia: A-grade students in Grade 9 should possess the following generic skills: Extract information from prose, diagrams, maps and symbolic text; clarify it and transform it to display meaning in multiple media. Discern patterns and relationships in verbal, pictorial and symbolic text (alone and in combination); make significant decisions and judgements [and] operationalise these into accurate representation and products (2005, as cited in Looney & Klenowski, 2008, p. 188).

In the above example from Australia it might be argued that the key skill construct in terms of assessment and the knowledge society is the processing and transformation of ideas and information, but it is also connected to domain content in the sense that these generic skills are expressed in and through specific subjects. It is necessary at this point to pause and reflect upon what a concept of competence might be like in a knowledge society. It is a concept that is widely used in assessment terms, possessing a usage and conceptualisation that is wider than and subsumes the concept of skills. As we shall argue, it covers more than the already cited definition by Hargreaves (2003, p. 3), ‘the capacity to initiate and cope with change’. The question in critical realist terms can be formulated as follows: What properties of knowledge societies make the concept of competence possible in its different forms? Put slightly differently, what generative mechanisms and structures lie behind conceptions of competence and how do they impact upon different levels of the social cube? We note at the outset that no single conception of competence has complete hegemony to the exclusion of others. To understand the popularity of the term competence in recent years let us consider the OECD project Definition and Selection of Competencies (DeSeCo) and the Programme for International Student Assessment (PISA) first conducted in 2000. The last-mentioned assessment sought to compare student knowledge and skills in reading, mathematics, science and problem solving. The focus is on the mastery of processes, understanding concepts, and the ability to function in different situations in each domain, as opposed to the possession of specific knowledge (OECD Centre for Educational Research and Innovation, 2008, p. 23). However, there was always an understanding that student success in later life depended on possessing a wide set of competencies. 
In late 1997, experts (sociologists, assessment specialists, philosophers, anthropologists, psychologists, economists, historians, statisticians and educators) and stakeholders (policy makers, policy analysts, trade unions, employers and national and international institutions) were brought together to try to reach consensus on the kind of competence necessary for a successful life and a well-functioning society. It will always be a point of contention whether such an overarching view of competence can be formulated or whether cultural, economic, technological and political specificities make it impossible to reach such a transversal or context-overarching formulation.2 For six years the experts and stakeholders participated in symposia, reviewed research on competence, clarified concepts of competence and submitted reports on country perspectives. They published a final report in 2003 entitled Key competencies for a successful life and a well-functioning society (Rychen & Salganik, 2003). The basic thesis of the DeSeCo project was that subject-based knowledge and skills are not enough. A wider competence is required to meet the challenges of a shifting, complex world. Among other things the competence to communicate effectively is important and this strongly echoes the concerns of the knowledge society. In more detail, DeSeCo’s understanding of the ontological depth of the world is rooted in a number of generative mechanisms and structures: Globalisation and modernisation are creating an increasingly diverse and interconnected world. To make sense of and function well in this world, individuals need for example to master changing technologies and to make sense of large amounts of available information. They also face collective challenges as societies – such as balancing economic growth with environmental sustainability, and prosperity with social equity (OECD, 2005, p. 4).

2 ‘It is therefore possible to define key skills independently of culture, age, gender, status, professional activity, etc., but such abstracted descriptions of the components of competence fail to engage with the extent to which competence is enmeshed in context, and the competence of individuals is conditioned by the “personal history” of the contexts in which they have performed’ (Oates, 2003, p. 186).

The DeSeCo group argued that a) societies are becoming more complex and rely upon social interactions with diverse groups, b) technologies are changing rapidly with one-off mastery of less use than adaptability, and lastly c) globalisation brings groups and individuals into closer, interdependent contact with each other. We would argue these mechanisms and structures suggest the need for a stratified self with competencies rooted in cognitive skills and reflective thought processes, such as thinking about thinking, creative abilities and taking a critical stance. Additionally, the stratified self needs to be able to act autonomously. This constitutes one of the three categories of key competencies proposed by the DeSeCo group and entails the ability to act confidently with an understanding of the big picture3; the ability to form and conduct life plans and personal projects; and the ability to assert rights, interests, limits and needs. Echoing the social interaction level of the social cube, the group proposed a second category of key competencies revolving around interacting in heterogeneous groups, including the ability to relate well to others, the ability to cooperate, and the ability to manage and resolve conflicts. The third and final category of key competencies they proposed is close to the material transaction with nature aspect of the social cube, especially if by transaction we include the mediating function of socio-cultural and physical tools between material, nature and humans. The group defines the third category of key competence as using tools interactively. This refers to the ability to use language, symbols and text interactively, the ability to use knowledge and information interactively, and the ability to use technology interactively. 
The group envisages the interactive use of technology in a number of ways that reflect the increasing importance of the knowledge society, for example: Information and communication technology has the potential to transform the way people work together (by reducing the importance of location), access information (by making vast amounts of information sources instantly available) and interact with others (by facilitating relationships and networks of people from around the world on a regular basis) (OECD, 2005, p. 11).

3 This is also defined as ‘the ability to analyse situations, systems, and relationships, including power relationships. Generally speaking, this is a system orientation that allows people to construct a coherent line of action, have an idea of the “game” and the roles they are playing, recognise patterns and understand the larger picture, evaluate actions with respect to shared norms or with regard to a social order’ (Rychen, 2003, p. 113).

The DeSeCo group argued that these three competencies (acting autonomously, interacting in heterogeneous groups and using tools interactively), in combination in different contexts, must be nurtured at school and also throughout life as part of lifelong learning. It must be noted that their view supports and also to some extent implies an instrumental view of the development of competencies. Yet not all can be planned, and self-development as Bildung also happens outside of the classroom, through shared life experiences with peers and different age groups as they interact in the city, in rural areas or in the virtual medium of the space of flows (Castells, 1996) understood to be the internet (Dobson, 2018). The old adage holds some truth: to go to the university of life can be as valuable as attending the university of bricks and mortar and schools. It is this broader understanding, including non-school-related activities, that arguably lies behind the OECD Programme for the International Assessment of Adult Competencies (PIAAC). Competence as a form of literacy is defined by the latter as ‘the interest, attitude, and ability of individuals to appropriately use socio-cultural tools, including digital technology and communication tools, to access, manage, integrate and evaluate information, construct new knowledge, and communicate with others in order to participate effectively in society’ (Schleicher, 2008, p. 630). The argument we are making in these pages is that competence is wider than specific skills and knowledge connected to curricular plans and specific subject/disciplinary knowledge. In arguing that something more is at stake the discussion is moved to an understanding of transversal competence as something to do with self-development (Bildung), which can be unplanned, as well as taking part in planned, goal-directed nurturing contexts.
Moreover, it is intimately related to the knowledge society with a focus upon information technologies, communication and interaction that can suit any time, any place and just-in-time needs. As Gal has aptly put it, what is required is the development of transversal skills, or what she calls transferable competency, including tasks that:
• involve ill-structured problems similar to real-life tasks,
• contain text-based messages conveying various quantitative and statistical arguments or requiring critical interpretation of texts,
• present statistical information of the kinds normally encountered in the media or in workplace and civic action contexts,
• demand the kinds of coping behaviors that adults are called upon to demonstrate in real life, including the use of technology for accessing, sifting through and organising quantitative information (Gil, 2012, p. 12).
This line of argument also accords with the debate on twenty-first-century skills (Suto, 2013), which contains many of the perspectives proposed by DeSeCo. There is no single agreed definition of twenty-first-century skills (Silva, 2009), and it may well be that flexible definitions are necessary and valid according to the situated context in which they are learnt formally and informally, and practised in education, the workplace and other arenas. Undeterred, groups such as the Assessment and Teaching of 21st Century Skills project (Griffin & Care, 2015) based in Melbourne have attempted a definition, proposing four main mechanisms and structures, and across these ten twenty-first-century skills: ways of thinking (creativity and innovation, critical thinking/problem solving/decision making, learning to learn), ways of working (communication, collaboration), tools for working (information literacy, ICT literacy: Lemke, 2003) and living in the world (citizenship, life and career, personal and social responsibility including cultural awareness). Other possible twenty-first-century skills are articulacy, intelligence (emotional, multiple, cognitive), multi-lingualism, and even narrative understanding (Dobson, 2005). In many senses some of these skills are not new; for example, problem solving has existed across all cultures. What is new is that the construct of twenty-first-century skills appears to be somewhat elastic and multifaceted, supported by a range of mechanisms and structures resonating with the times in which we live. What does this mean for assessment practices? Firstly, we would argue that there is an underlying skill that seeks to bridge many of these DeSeCo and twenty-first-century skills: it is the central importance of written and oral literacy and the assessment practices associated with them. Secondly, and a point we would like to develop more fully, if competence is about using technology (including ICT), possessing autonomy, interacting with diverse groups (including collaboration), and a key to all of this entails using cognitive thought processes, reflection and problem solving that is cross-curricular and not aligned to specific subjects, then we are approaching a view of assessment as learning (Dobson et al., 2009). This is something that is conceptually different to assessment for and assessment of learning. How so? Assessment for learning seeks to integrate the assessment act with learning activity and not assign each to a specific period in the teaching of a subject. Feedback in the course of a lesson provides an example.
Assessment of learning could be an assessment act that comes at the end of a period of teaching. It might be an exam at the end of a course of study. Assessment as learning is where the assessment supplies a form of meta-cognitive reflective insight into learning. We will use an example as an illustration. The eldest son of one of the authors decided to major in sports at high school. The father was somewhat prejudiced and thought nothing would come of him in future life. The son said his father did not know anything. On completing high school he had to take one year of national service (at this time compulsory for males in Norway). He decided at the same time to apply for an officer’s commission. If successful this would mean a minimum of two years of service in the military, rather than the compulsory one year. With 240 others he was called in to take the selection tests. This took place over two weeks and the assessment was continuous; sometimes in the middle of the night. The recruits were tested in leadership, teamwork, problem-solving ability, endurance, fitness, literacy and so on. He gained his commission, largely due to his physical fitness and team training during his high school education. To conclude the story, he left the army after his two-year commission because he knew he was due for a posting abroad in a conflict zone. The point of the story is that, even though assessment of learning was evident in the final outcome of the selection period, the son was gradually accumulating and also demonstrating his competence in different forms of assessment during the course of the two weeks. He was learning and developing his meta-cognitive and self-reflective knowledge and skills in assessment. The knowledge and skills were cross-curricular, encompassing autonomy, mixing with peers and using different tools interactively. A more classroom-oriented example of assessment for learning that demonstrates these key competencies is peer assessment where the students exchange roles, first being the assessor and then the assessed. In the course of these assessment acts they develop their key competencies and motivation for learning in addition to expressing their knowledge of the subject, which is the focus of the activity (Nazzal, 2011). The properties of the knowledge society, with global generative mechanisms and structures, provide a platform for an overarching set of competencies that are cross-curricular and entail meta-cognitive, self-reflective skills. A corresponding form of assessment is additionally nurtured; one that represents something more than assessment of and for learning, namely assessment as learning. In the next section, we shall consider new twists in the development of the knowledge society and how they support and make possible new forms of assessment practices.

2.4 Assessment Practices and the Mode of Extension in the Knowledge Society

It can be argued that the knowledge society and assessment practices connected with the learning society, such as performance assessment and criterion referencing, receive a new twist through innovations made possible by the internet. The OECD, in a report entitled Innovating to learn, learning to innovate (2008), cites four innovation ‘pumps’ that are important in education. The first three are the appropriation of research knowledge, teachers pooling their knowledge, and a modular structures pump whereby teachers and schools work in a more connected manner between taught modules. The last force of innovation is defined as follows:

The ‘information and communication technologies’ pump: there is a powerful potential for digital technologies to facilitate the transformation of education, but its use in schools remains underdeveloped, partly because the main modus operandi of school administration and instruction are resistant to change (OECD Centre for Educational Research and Innovation, 2008, p. 28).

With this in mind it can be noted that a new mode of extension is actualised with ‘just-in-time’ digital assessment. The very term seems well-suited to the ever-increasing pace of society and resonates with COVID-19 events as teachers have worked to develop digital teaching skills and resources at a rapid rate (Dobson & Scofield, 2020). Just-in-time digital assessment underlines that assessment becomes more individualised when the student can decide when and where they wish to make their electronic submissions for assessment. This is typical of distance learning where the student might be located in a different country to the teacher and assessor. A term belonging to the same family is ‘tailored-to-fit assessment’.

Just-in-time assessment can also refer to the appropriation of e-assessment tools prior to the teaching of a face-to-face classroom session. This was the point of Novak et al. (1999) who pioneered just-in-time teaching whereby students submitted electronic answers prior to the week’s classroom teaching. The teacher could use the submissions as assessment information to fine-tune the coming lesson. Since the late 1990s, there have been further innovations which mean that e-assessment can also be used to provide feedback in the course of the actual lesson. In this version of just-in-time assessment the students are asked in the course of the lesson to respond to selected teacher questions using digital feedback consoles (audience response systems)5 that transmit signals to the teacher’s computer in the classroom (Krumsvik & Ludvigsen, 2012). The teacher can then act upon the information to adjust the course of the ongoing lesson. Of course, there is always a danger that students might provide the right answer for the wrong reason. Krumsvik and Ludvigsen (2012, p. 44) argue that, even if this is the case, students became more motivated on learning of their errors, promising to ‘study more’, ‘study more thoroughly’, ‘look at it in more detail’ and ‘concentrate harder’. Theorising just-in-time assessment practices begins with the view that learning is no longer clearly demarcated as a synchronic activity in a shared time and place, such as the classroom. The internet is used as a mode of extension whereby the assessor and assessed can be more greatly separated in time and place (e.g. distance learning) and hence a-synchronically connected to each other. Alternatively, they can interact more closely in the classroom (e.g. using digital feedback consoles), such that the synchronic aspect is intensified rather than toned down, as in distance learning. In this classroom-oriented form of just-in-time assessment there is often a perceived lack, as defined by the following comment:

Although professors are considered to be content experts, most are not technology experts. This technology gap poses a problem for professors who want to provide quality feedback to their 21st century students who expect their teachers to be using the latest technology (Fish and Lumadue, as cited in Krumsvik & Ludvigsen, 2012, p. 39).

This lack constitutes a force propelling innovation, the use of e-assessment to promote feedback and the move to absence the absence (i.e. the lack). Technology is not formative in itself, but the manner in which it is used defines and delimits how it can support generative mechanisms and structures that are formative. Thus, teachers hope to make feedback from their students more visible so that student learning becomes more transparent and the feedback can be acted upon. As we recall from the introduction, transparency constitutes one of several possible assessment principles. Another component in theorising the generative mechanisms and structures of just-in-time assessment is understanding how it increasingly relies on a different form of pedagogy. Siemens (2005) in a seminal piece highlighted connectivism. If knowledge is distributed widely in different networks, some conceptual (carried in our heads) and some external in books or on the internet, how might it best be taught or acquired?

The pipe is more important than the content within the pipe … Our ability to learn what we need for tomorrow is more important than what we know today. A real challenge for any learning theory is to actuate known knowledge at the point of application. When knowledge, however, is needed, but not known, the ability to plug into sources to meet the requirements becomes a vital skill. As knowledge continues to grow and evolve, access to what is needed is more important than what the learner currently possesses (Siemens, 2005).

5 Audience response systems are also known as feedback clickers.

For Siemens the point is that teaching and learning is now about being able to connect different sources and networks of knowledge, residing in particular places and repositories, many of which are distributed across the internet. We circle back to our earlier point: we will still need knowledge that spurs action. Critical thinking is still vital, but of the character required to select and evaluate the pipe and the contents of the pipe. The role of the teacher and also the assessor changes. As knowledge is increasingly based upon connecting with networks, the physically or virtually present teacher is now only one possible – although undoubtedly valuable – connection and source of judgement and valuation. We are thinking of Google or Siri as supporting and also competing entities. We must also consider the issue of inclusion and equity. What of those who are more disadvantaged than others in joining networks because they lack suitable equipment, skills or a quiet place? During COVID-19 many commented on those who lacked all of these. Underlying the intention of just-in-time assessment, either in the form of distance education as the student decides when and where to make submissions for assessment or in the classroom use of an electronic audience response system, is the desire that assessment will improve the transformative agency of the individual student (Redecker & Johannessen, 2013). However, this presupposes that the student is able to determine when they are ready to make submissions or that they are motivated to participate in the use of the electronic console device. Such forms of independence, autonomy and motivation cannot be taken for granted. They must be developed over time. Likewise, teachers must be receptive to ‘moments of contingency’ (Black & Wiliam, 2009) when feedback information is wilfully used to modify teaching, and hence the opportunity for learning. 
An example of adapting to contingency is the trial by Jones and Johnson (2022) at Central Queensland University of flexible due dates for assignments in a humanities program. Results were positive, but the authors also noted that making learning and by implication assessment hyper-flexible for students as they self-pace their studies might not work ‘in disciplines that are more heavily exam-driven (like health and IT).’ Considering technology we are also aware of its constantly changing character. In particular, Artificial Intelligence (AI), Learning Analytics and Big Data are central to the future of all Tertiary Education. They are part of the horizon of innovation upon us, across all subject disciplines, from Early Childhood Education, with examples of smiley face surveys on social skills for kindergarten-aged children, to universities, where student engagement with learning resources is consistently monitored.6 AI captures how the culture of learning for students is changing to include computer adaptive testing and micro-credentialing/short courses/MOOCs as windows to enrolment, deeper personalised learning experiences and efficiencies in university programs. Even though Big Data and Learning Analytics have been around for a considerable time, they are often bogged down in privacy ordinances. Nevertheless, they are central in informing trends and performance in real time feedback as regards educational attitudes and skill-potentials for students and staff alike, not to mention of course in student recruitment and retention activities. They are becoming increasingly important in today’s blended and fully online learning environments and will be central in the next generation of Learning Analytics. This will embrace wearables (new generations of digital watches are so much more than chronographs as we have already learnt) and research into measuring learning in subjects where innovation, creativity, internships, studio design/laboratory work and performance are essential – building upon STEM to STEAM and STEMM. The authors of this book are reminded of the changing and yet constant desire of societies to adjust to the emergence of the learning society as Selwyn et al. (2006) called it. By this they meant the increasing presence and promise of education informed by the internet. The empirical research, undertaken just after the millennium, identified how many families possessed computers in the home but left them to the children, while parents tended to be ‘fiddling around on the computer’ (p. 84). Today computer functionality is as available on the hand-held tablet, mobile telephone or even on the watch. We have moved beyond the fiddling stage, and yet the promise of access to the internet through different machines – hand-held or standing and fixed – has not removed the constant fear that it may not lead to deeper engagement in learning activities. This is something we have learnt in the debates about pupil learning loss during COVID times. As McCallum et al. (2021) have noted:

One of the constant challenges in education is keeping the learner engaged, motivated and connected in a world increasingly filled with distractions. Social media, streaming TV and video games all compete for students’ increasingly fragmented attention. COVID-19 lockdowns only increased the opportunity for those distractions to interfere with learning.

6 Fladberg, K. Norsk skole ber 6-åringer sette surefjes på egen adferd og svare på om læreren liker dem (Norwegian schools request 6 year olds to use smiley faces about their own behaviour and answer whether the teacher likes them). Published in Dagsavisen newspaper 20.01.20 (accessed 29.11.22).

In later chapters we will have cause to further reference the work of McCallum and colleagues.

2.5 Inclusion, Exclusion and the Diploma Disease

Even though the knowledge society is a global phenomenon it would be a mistake to assume all societies are equally developed as knowledge societies. It is also the case that in assessment terms not all have reaped the same benefits from inclusion in knowledge societies. Patterns of exclusion through assessment are persistent. To further explore these points we will revisit a classic assessment debate, the diploma disease, and our argument will be that the knowledge society has intensified something identified by Weber in the early decades of the twentieth century:

When we hear from all sides the demand for an introduction of regular curricula and special examinations, the reason behind it is, of course, not a suddenly awakened ‘thirst for education’ but the desire for restricting the supply of these positions and their monopolization by the owners of educational certificates. Today the ‘examination’ is the universal means of this monopolization, and therefore examinations irresistibly advance (Weber, 1945, pp. 241–242).

Weber was in many senses a forerunner and early theorist of the diploma disease and, while his comments were directed primarily toward employment in professions or the civil service, they would also seem to apply to other sectors of society. The debate about credentialism is therefore nothing new, but it was debated energetically with the publication and subsequent discussion of The diploma disease: Education, qualification and development by Dore in 1976. It is important to be clear what we are arguing: the knowledge society’s interest in different forms of knowledge, skills and more broadly speaking competence must necessarily include the assessment-related question of how to value and validate these things. This requires not simply some form of appeal to assessment principles undertaken by researchers, assessment experts or assessment practitioners. It involves in addition the valuations undertaken by politicians and policy makers in each individual country. The debate and practice of credentialism, expressed vividly in the metaphor ‘the diploma disease’, is important in this respect, since it constitutes a concrete illustration of such a valuation and validation at the national and institutional levels and its impact on students at the individual level, specifically their sense of self-esteem and their feeling that they are included and successful or the opposite. Dore’s basic thesis is that credentialism represents in our terms a set of generative mechanisms and accompanying structures that bridge and connect educational performance with occupational selection. In assessment terms, Dore explored the concept of transfer of knowledge and skills and how this, measured in terms of diplomas, certificates and degrees, confers monetary, opportunity and status benefits upon those who are successful, or the opposite for those excluded from such returns. 
Credentials are a source of ongoing entitlement, supporting generative mechanisms and structures aligned with a meritocratic society. This marks a break with a society where favouritism and nepotism are the primary route to jobs. However, it would be a mistake to regard our line of argument to mean that the knowledge society is another word for the meritocratic society. There are many generative mechanisms and structures that counter the working of a meritocracy, where all have an equal opportunity to gain credentials. For example, the middle classes have sought to give their children advantages in the pursuit of credentials. In such an example, we are thinking of what might be termed parentocracy rather than meritocracy. Parents can invest in after-school tutoring or exam cramming, with the subsequent accrual and mobilisation of cultural capital that can be redeemed in the school setting (Bourdieu, 1986). With respect to the latter an example might be how students from middle-class families are exposed to a domestic way of talking (through open discussion and justification of thoughts) that is replicated in the way teachers talk (Maguire, 2005) or that their parents more actively follow up and support the progress of their sons and daughters in school (Lareau, 1987). Put differently, the culture of the school is familiar, rather than potentially alienating; for the successful it makes possible ‘a function of consecration’ (Bourdieu, 1996, p. 102) and inclusion in what is considered their normal life course and future careers. Consecration derives from the Latin consecrare: to anoint, make holy or devote, in contrast to the normal. Dore’s argument had a global focus in the sense that he considered the level of development of different countries (exemplified in particular by England, Sri Lanka, Japan and Kenya) and what this might mean for their education systems and patterns of recruitment into the labour force:

The later development starts (i.e. the later the point in world history that a country starts on a modernisation drive) the more widely education certificates are used for occupational selection; the faster the rate of qualification inflation and the more examination-oriented schooling becomes at the expense of genuine education (Dore, 1976, p. 72).

On the back cover of the original book the following can be read:

Schools used to be for educating people, for developing minds and characters. Today, as jobs depend more and more on certificates, degrees and diplomas, aims and motives are changing. Schooling has become more and more a ritualised process of qualification-earning (Dore, 1976).

Even though England was an early developer and exempted from the excessive use of credentials, by the 1990s it too was a system marked by a huge growth in qualification earning. In Japan credentialism was not marked by the level of qualification as the entry ticket to employment. Instead, with a standardisation of salary levels across firms by qualification level, it is which institution or school the graduate attended that remains decisive. Universities in Japan are ranked clearly in terms of the entry scores of students; so, too, secondary and primary institutions. Dore was also interested in Cuba, Tanzania and China as societies seeking to decouple themselves from credentialism and the connection between school performance and job recruitment. However, credentialism was also evident in these countries in the years that followed. Cuba’s expansion of primary and secondary education was not matched by job expansion, especially after the demise of the Soviet Union. China is of course renowned for its gao kao national college entrance examination, which echoes the much older Keju state examinations that had existed for hundreds of years until 1905 (Yu & Hoi, 2005). Dore’s argument is interesting in critical realist terms because it demonstrates how the institutional level of the social cube impacts upon the stratified self of the individual. Moreover, although the common denominator is the mechanism of credentialism, it is expressed differently across countries because of their specific cultural, socio-economic, technological and political histories. He summarised his model in the diagram reproduced as Fig. 2.1. In the centre we find the use of certificates for job allocation. With increased competition for a limited number of jobs qualification escalation can result, such that higher qualifications are required for the same job over time. 
With competition for jobs there are also those who are educated, possessing credentials, but remain unemployed and hence excluded from the rewards outlined above. There is a backwash effect on schooling and pedagogy, which becomes exam oriented. Schooling is less oriented toward developing the skills of curiosity and learning as a source of self-fulfilment. This has a roll-on effect for the successful who have performed well in assessment of learning exercises. The result can be a deformed or incomplete stratified self, motivated by an individual ‘dog eat dog’ competition for grades and outdoing peers, rather than by cooperation and team learning/teamwork, and focused on ‘self-advancement by competitive conformity to external rather than internalised authorities’ (Dore, 1980, p. 3). Dore is scathing in his view of societies:

Everywhere, in Britain as in India, in Russia as in Venezuela, schooling is more often qualification-earning schooling than it was in 1920, or even in 1950. And more qualification-earning is mere qualification-earning – ritualistic, tedious, suffused with anxiety and boredom, destructive of curiosity and imagination; in short, anti-educational (Dore, 1976, p. ix).

Fig. 2.1 The diploma disease. (Source: Dore, 1976, p. 141)

In a later publication Dore (1980) revisited his earlier understanding of the mechanisms and supporting structures of credentialism. He notes:

Implicit in the diploma disease model is the assumption that employers use educational certificates primarily as ‘screening’ devices – as measures of general ability (intelligence and powers of application) which indicate a person’s likely ‘trainability’ over a whole range of skills, rather than as indicators of the cognitive and other skills which he has acquired as ‘human capital’ through his schooling (Dore, 1980, p. 4).

Dore elaborates how credentials as measures of general ability are typical of science and liberal arts degrees, and this does not as such apply to credentials typical of vocational subjects and professional qualifications, such as law, nursing and medicine, where cognitive and other practice skills are predominant. Trainability is important for the employer who desires recruits for a long career where they will have to change their job content and develop competence over the years.7

7 In Dore’s (1980, p. 8) words: ‘It was not an accident that in Britain it was the East India Company and the civil service which, whether willingly or not, pioneered the practice of using general education attainments to select recruits. They were the most bureaucratic organisations in contemporary society in the sense that (a) their operations were rule-bound and prescribed by a range of regulations designed to check favouritism, and (b) they took people not just for jobs but for careers, and hence were interested in the lifelong development potential of a man, not in his specific ability to do a particular job over the next six months.’

Dore cites Mexico as a country where vocational training is dominant and Sri Lanka where it is credentials from the sciences or liberal arts that occupy a more dominant position with respect to recruiting.

In Mexico, employers seem to take note only of what a person has studied and to what level – secondary, senior secondary or university – he or she has studied it. Technical and business studies are preferred. General education is rarely sufficient for clerical posts, and, at executive levels, is likely to be interpreted as showing a lack of motivation and orientation toward work (Dore, 1980, p. 4).

In Sri Lanka credentials are used to screen for general ability, such that science subjects are valued above liberal arts, with the former considered the most difficult. Dore also mentions that credentials are consistent with class screening in some countries, whereby employers will regard those with university degrees as belonging to a specific socio-economic class, which demonstrates a ‘higher probability of being congenial and loyal, of “fitting in”, of having a good presence, a good way with customers’ (Dore, 1980, p. 10). Dore’s solution to the diploma disease is schooling that is less credential oriented. It should be more focused on learning that reveals knowledge for its own sake or to awaken curiosity and imagination. It should foster self-fulfilment instead of acquisitive achievement in the pursuit of qualifications. Of course, Dore does not discount learning that is mastery oriented. As Dore puts it:

In the process of qualification, by contrast, the student is concerned not with mastery but with being certified as having mastered. The knowledge that he gains, he gains not for its own sake and not for constant later use in a real life situation – but for the once and for all purpose of reproducing it in an examination (Dore, 1976, p. 8).

Stobart (2008, p. 101) is scathing about self-fulfilment as a learning objective and considers that Dore falls foul of believing that it is some form of inborn, fixed capacity, with the implication that no matter how much effort a particular student invests, the capacity cannot be changed. This fixed capacity perspective is reinforced when Dore advocates aptitude tests (read: intelligence tests) seeking to reveal a student’s underlying ability. He is thinking especially of mathematics and languages because it is more difficult to cram for them. There is nothing wrong with ability testing per se; the problem arises when it is used to make ‘a person’s prior capacity to learn, the cause of learning, which is both independent of schooling and often seen as innate and fixed’ (Stobart, 2008, p. 101). Dore can be criticised on a number of counts. In making the following assertion he seems to overstate the contrasts: ‘If education is learning to do a job, qualification is a matter of learning in order to get a job’ (Dore, 1976, p. 8). It is possible that completing studies and achieving grades can be both learning to do a job for later in time and at the same time positioning oneself to get that job. They do not have to be either/or choices and it is possible to learn transferable skills while striving to attain credentials, especially if the studies involve work-integrated learning or professional qualifications with mandated placements or internships. As Little (1997, p. 9) has noted:
‘Certificates in many areas of professional skill signal competence and protect the public against incompetent and dangerous practices (e.g. in medicine). Examination orientation can introduce students to life-enriching curriculum and pedagogy.’ Moreover, classroom assessment practices can support learning when the two are integrated in forms of assessment for learning. Not all forms of assessment in pursuit of credentials are therefore detrimental to learning. As we have already argued, assessment as learning can also be a valuable goal in the pursuit of competencies that are cross-curricular and connected with the use of reflective thought and acting autonomously, interacting in heterogeneous groups and using tools interactively. Dore might be criticised for using the nation-state as his institutional model. But this might be rebutted by the argument that Dore’s thesis of nation-states organising their own versions of credentialism has gained a global counterpart with the arrival of globally respected qualifications, such as the International Baccalaureate Diploma Programme (IBDP). It must be noted that the IBDP is not actually that old, tracing its origins to the 1960s. But this process of valuation and validation is not merely restricted to practices within the nation-state. Examples of its role can be found in the global field of interactions between states, and students are the beneficiaries in the sense that international credentials support their patterns of movement in the pursuit of learning. Before leaving the topic of the diploma disease and its focus upon credentialism, we want to return to a theorisation of the properties of the knowledge society that make credentialism and its different generative mechanisms and structures possible. 
In the opening account in this section on the diploma disease we adopted a neo-Weberian approach, announcing that the diploma disease might be understood as an expression of individuals and groups desiring to restrict and control a) the supply of qualified entrants to education programs, or b) recruitment to specific jobs in the labour market. In the words of Weber (1945, pp. 241–242), holders of credentials make ‘claims to monopolise socially and economically advantageous positions’. Restricting access to either education or to the monetary or status benefits of employment therefore constitutes one way in which groups by forms of social closure seek to stratify society into groups. Groups in this sense can be different classes, such as the middle classes, or different professional groups, such as doctors or lawyers. Weber was in many senses a precursor of Bourdieu and his work on different forms of capital (social, economic, cultural and symbolic) possessed and acquired by individuals. We will return to Bourdieu’s work in the coming chapters and suggest what we call assessment capital. The middle classes are aware of the opportunities created by obtaining credentials and seek to position themselves favourably in the struggle for the benefits with respect to other classes. Brown (2003) has explored this in terms of positional conflict theory, whereby the middle classes become trapped in the struggle for credential-based opportunities. The opportunity trap is such that ‘few can afford to opt out of the competition’ for schools, universities and jobs. For those from disadvantaged backgrounds the starting blocks are placed further back than for those from the middle classes. Social mobility is far from open and equal for all members of society. Brown sums up positional conflict theory in the following manner:

The opportunity trap today is quantitatively and qualitatively different from the competition for a livelihood in the past. It is the sheer scale of the credential enterprise, most notably the expansion of higher education in countries such as Britain, the Netherlands and the United States. There are far more people still in the competition for professional and managerial work in their early twenties, many of whom would have entered the labour market by their eighteenth birthday in previous decades … It is also qualitatively different in the way opportunities are now experienced. Opportunities have not only become ‘individualised’ given an emphasis on self-reliance and employability, but they have also become ‘disorganised’. Occupational careers are more opaque and contingent. One is encouraged to create one’s own opportunities within institutional structures (education, work and the labour market) that are more insecure and inherently risky (Brown, 2003, p. 36).

A paradox is therefore evident: the more the middle classes and professions as social collectives attempt to secure positional advantages by restricting entry to schools, universities or jobs, the more individuals must demonstrate self-reliance and that they are suitable candidates for either student places or jobs. Put differently, those seeking credentials cannot ultimately hide behind the advantages accrued by their class background. With respect to assessment, credentialism becomes a source of exclusionary practices reinforced by the manner in which different groups seek to secure positional advantages. Credentialism is additionally a way for knowledge societies, and groups and individuals within these societies representing respectively the institutional, interactional and stratified self levels of the social cube, to value and validate the performance of participants in schooling and universities, and screen them for employment in the labour market. Before closing this chapter, it is timely to note that there are also generative mechanisms and structures seeking to counter exclusion and promote inclusion. We are thinking of the spread of the International Baccalaureate Diploma Programme, Cambridge Assessment International Education and the European Commission’s announced plan to create European university degrees. The last mentioned is summed up in a future-oriented universities strategy document:

A joint European degree, to be delivered at national level, would attest learning outcomes achieved as part of transnational cooperation among several institutions, offered for example within European Universities alliances, and based on a common set of criteria. A European degree should be easy to issue, store, share, verify and authenticate, and recognised across the EU. As a first step, the Commission will work towards developing European criteria for the award of a European Degree label. Such a label would be issued as a complementary certificate to the qualification of students graduating from joint programmes delivered in the context of transnational cooperation between several higher education institutions (European Commission, 2022).

These are inclusionary in the sense of transcending nation-states as global credentials; however, in some cases they set standards and levels of performance that can be exclusionary. A good case in this respect is Indonesia, as considered by Dobson and Zhudi:

With a population of 268 million, access to English language curricula has mostly been limited to urban areas and middle-class parents who can afford to pay for private schools. At the turn of this century, all Indonesian districts were mandated to have at least one public school offering a globally recognised curriculum in English to an international standard. But in 2013 this was deemed unconstitutional because equal educational opportunity should exist across all public schools.

Nevertheless, today in Indonesia there are 219 private schools offering at least some part of the curriculum through Cambridge International, and 38 that identify as Muslim private schools. Western international curricula remain influential in setting the standard for what constitutes quality education. In Muslim schools that have adopted globally recognised curricula in English, there is a tendency to over-focus on academic performance. Consequently, the important Muslim value of Tarbiya (Arabic طبِيعة) is downplayed. Encompassing the flourishing of the whole child and the realisation of their potential, Tarbiya is a central pillar in Muslim education. Viewed like this, schooling that concentrates solely on academic performance fails in terms of both culture and faith. It’s unfortunate so many schools view an English-speaking model as the gold standard and overlook their own local or regional wisdoms. We need to remember that encouraging young people to join a privileged English-speaking élite educated in foreign universities is only one of many possible educational options (Dobson & Zhudi, 2020).

In considering more inclusionary generative mechanisms and structures we need to look elsewhere. Take for example two related initiatives that have been brought together, namely blockchain technology and a wide conceptualisation of credentials including not only formal study in higher education institutions, but also micro-credentials, nano-degrees, massive open online courses (MOOCs) and badges (Jirgensons & Kapenieks, 2018). Blockchain technology comes in different forms and is offered by an increasing number of university institutions, but its premise is the same. It refers to digital platforms that document and store formal and informal transversal skills, providing individuals with greater access to their accumulated credentials. Over the years a need has arisen for a repository that eases anytime, anyplace access to these documents and avoids a person being dependent on access through a third party, such as a previous employer or a university. As Haugsbakken and Langseth (2019) have noted, blockchain technologies, supporting initiatives like the Europass Digital Credentials Infrastructure developed by the EU (2020), hold the promise of reducing the higher education bureaucracy amongst providers of credentials. But they also note that the adoption of such technologies is by no means even and that the adage ‘data is the new oil of the digital era’ rings truer for some actors than others, meaning these technologies attract differential investment as a consequence. The overall point though is simple: blockchain technology may well be the inclusionary generative mechanism that disrupts the power that educational institutions have traditionally held to limit the attainment of qualifications and make it a marker of exclusion.
If we are looking more to the learning and assessment process itself then a good example of inclusionary generative mechanisms and structures is provided by the micro-credentials offered by Udacity,8 which seeks to bridge the need for experience to obtain employment and the need for employment to gain experience. Their open credentials in the fields of artificial intelligence and business are built around practical project work, with unlimited opportunities to get it right on the way to obtaining the credential. In this respect it is more like an apprenticeship model where learning through trial and error is not only permitted but encouraged. Thus, the participants might craft a piece of software, much as a novice carpenter would be taught to handcraft a cabinet. Those who complete a credential are then considered qualified to review and assess the project work of the next cohort of students.

8 Started by two instructors at Stanford University (https://www.udacity.com/us, accessed 19 Jan 2021).


As one of the students put it, ‘You’re not taught what to do; you’re taught how to think’ (Bhanot, 2019). Unlimited submissions mean that participants can continue to receive feedback until they have mastered the material. Udacity has made graduates from its courses more suited to and included in the workforce. In the words of Andrew Jackson,9 who brought this example to our attention:

The background was Google execs complaining that students leaving university were not ready for the workforce. The owner of Udacity put the challenge back to Google – in that case work with me to design the courses. That led to a series of partnerships with big IT companies to co-create the material so that the students from Udacity had the right skills when they finished. Udacity throws away and re-creates its courses every 3 years to ensure that the courses remain relevant to the companies they partner with. Google even sponsored 50,000 students to undertake Udacity’s courses. They took a very interesting approach. They sponsored many for the lower level courses, then the best of those at the intermediate and then the best of those at the advanced courses. Then chose the best from the advanced courses for their workforce … a great way to develop and find talent. (Andrew Jackson, personal communication, 15.9.21)

2.6 Closing Comments

In this chapter we have sought to explore how changing forms of society give rise to different forms of assessment. The main focus has been upon the knowledge society and a decidedly macro view of societal structures organised around global and nation-state policies. In critical realist terms, through retroduction we have sought to reveal the different properties of societal forms and how these support and make possible different generative mechanisms and structures of assessment and their accompanying assessment acts. Of course, there is rarely a simple one-to-one match between the form of a society and its assessment practice. It is more often the case that (a) a pre-existing form of assessment gains a renewed or heightened actuality, or (b) different societies, on the basis of varying cultural, socio-economic and political histories, develop their own particular version of a selected form of assessment practice. The latter is most evident in the way that the diploma disease has revealed itself in different ways in different countries throughout the world.

The historical narrative presented in this chapter moved swiftly from ancient times through the Middle Ages to dwell at length on modern society and one aspect of its latest development, namely the knowledge society. The examples of assessment practices in the modern knowledge society have ranged from criterion referencing, key competencies and just-in-time assessment to the diploma disease. We also provided closing comments on the opportunities and challenges presented by blockchain digital credentials. The last mentioned have the potential to disrupt place-based assessment and the accompanying accumulation of credentials in the offices of the university registrar, employers or accrediting professional associations. It may well be that we are witnessing a new twist in Novalis’s famous call to acknowledge and deliver on our restless urge to be at home everywhere.10 Globally recognised credentials announce the arrival of credentials that are not limited by boundaries set by the nation-state, institutions or employers. The question posed is thus: Are we gradually moving toward globally recognised credentials held by individuals and yet sensitive to their local contexts, cultures and experiences?

9 Andrew Jackson works at Victoria University of Wellington in New Zealand.

10 Trieb überall zu Hause zu sein (the wish to be everywhere at home) (Novalis, as cited by Carlyle, 2010).

References

Addey, C. (2018). The assessment culture of international organisations: ‘From philosophical doubt to statistical certainty’ through the appearance and growth of international large-scale assessments. In C. Alarcón & M. Lawn (Eds.), Assessment cultures: Historical perspectives (pp. 379–408). Peter Lang.
Aristotle. (1981). The Nicomachean ethics (trans. J. Thomson). Penguin Books.
Bhanot, R. (2019, October 24). Learn by doing: Why project-based learning is so effective. Udacity. https://blog.udacity.com/2019/10/learn-by-doing-why-project-based-learning-is-so-effective.html. Accessed October 21, 2021.
Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5–31.
Bourdieu, P. (1984). Distinction: A social critique of the judgement of taste. Harvard University Press.
Bourdieu, P. (1986). The forms of capital. In J. G. Richardson (Ed.), Handbook of theory and research for the sociology of education (pp. 241–258). Greenwood Press.
Bourdieu, P. (1996). The state nobility: Elite schools in the field of power. Polity.
Broadfoot, P. (1996). Education, assessment and society. Open University Press.
Broadfoot, P., & Black, P. (2004). Redefining assessment? The first ten years of assessment in education. Assessment in Education, 11(1), 7–26.
Brown, P. (2003). The opportunity trap: Education and employment in a global economy [working paper series paper 32]. School of Social Sciences, University of Cardiff.
Carlyle, T. (2010). Novalis. In H. D. Traill (Ed.), The works of Thomas Carlyle (pp. 1–55). Cambridge University Press.
Castells, M. (1996). The rise of the network society. Blackwell.
Davis, A. (1998). The limits of educational assessment. Blackwell.
Dobson, S. (2005). Narrative competence and the enhancement of literacy: Some theoretical reflections. Journal of Media, Technology and Lifelong Learning, 1(2), 35–50.
Dobson, S. (2008). Theorising the academic viva in higher education: The argument for a qualitative approach. Assessment & Evaluation in Higher Education, 33, 277–288.
Dobson, S. (2013). Dannelse som selvteknologi – Michel Foucaults flerfaglige forståelse [Bildung as self-technology – Foucault’s cross-disciplinary approach]. In I. Straume (Ed.), Danningsfilosofi [the philosophy of Bildung] (pp. 308–320). Gyldendal Academic.
Dobson, S. (2018). Afterword: Reading the book through the lens of ‘bildung’. In S. Nichols & S. Dobson (Eds.), Learning cities: Multimodal explorations and place pedagogies (pp. 235–239). Springer.
Dobson, S., & Scofield, E. (2020, April 8). The rush to online-ness. Newsroom. https://www.newsroom.co.nz/ideasroom/the-rush-to-online-ness. Accessed October 21, 2021.
Dobson, S., & Zhudi, M. (2020, August 10). When English becomes the global language of education we risk losing other – Often better – Ways of learning. The Conversation. https://theconversation.com/when-english-becomes-the-global-language-of-education-we-risk-losing-other-often-better-ways-of-learning-143744. Accessed October 21, 2021.
Dobson, S., Eggen, A., & Smith, K. (2009). Introduction. In S. Dobson, A. Eggen, & K. Smith (Eds.), Vurdering, prinsipper og praksis [assessment, principles and praxis] (pp. 3–10). Gyldendal.
Dore, R. (1976). The diploma disease: Education, qualification and development. George Allen and Unwin.
Dore, R. (1980). The diploma disease revisited. IDS Bulletin, 11(1), 1–12.
Eisner, E. (1999). The uses and limits of performance assessment. Phi Delta Kappan, 80(9), 658–660.
European Commission. (2022). Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the regions. Strasbourg, 18.1.2022 COM(2022) 16 final. https://education.ec.europa.eu/sites/default/files/2022-01/communication-european-strategy-for-universities.pdf. Accessed April 22, 2023.
European Union. (2020). Europass digital credentials: Interoperability. https://europa.eu/europass/en/europass-digital-credentials-interoperability. Accessed October 21, 2021.
Ewing, B. (2017). An exploration of assessment approaches in a vocational and education training courses in Australia. Empirical Research in Vocational Education Training, 9(14), 1–18.
Ferrer, G. (2006). Educational systems in Latin America: Current practice and future challenges. Preal.
Foucault, M. (1984). The history of sexuality: An introduction. Penguin Books.
Foucault, M. (1988). Technologies of the self. In L. Martin, H. Gutman, & P. Hutton (Eds.), Technologies of the self: A seminar with Michel Foucault (pp. 16–49). University of Massachusetts Press.
Giddens, A. (1991). Modernity and self-identity. Stanford University.
Gil, I. (2012). Needed adult numeracy and critical statistical skills: A view from international skill frameworks, and implications for education [Seminar Report]. Center for Research on Educational Testing. http://www.cret.or.jp/files/540b7fad10a034f6d3174743b076c507.pdf. Accessed November 21, 2020.
Grayeff, F. (1974). Aristotle and his school: An inquiry into the history of the Peripatos. Duckworth.
Griffin, P., & Care, E. (2015). Assessment and teaching of 21st century skills. Springer.
Hargreaves, A. (2003). Teaching in the knowledge society: Education in the age of insecurity. Columbia University Press.
Haugsbakken, H., & Langseth, I. (2019). The blockchain challenge for higher education institutions. European Journal of Education, 2(3), 41–46.
Jirgensons, M., & Kapenieks, J. (2018). Blockchain and the future of digital learning credential assessment and management. Journal of Teacher Education for Sustainability, 20(1), 145–156.
Jones, B., & Johnson, A. (2022). We took away due dates for university assignments. Here’s what we found. The Conversation. https://theconversation.com/we-took-away-due-dates-for-university-assignments-heres-what-we-found-193024
Krumsvik, R., & Ludvigsen, K. (2012). Formative e-assessment in plenary lectures. Nordic Journal of Digital Literacy, 7(1), 36–54.
Lareau, A. (1987). Social class differences in family–school relationships: The importance of cultural capital. Sociology of Education, 60(2), 73–85.
Lemke, C. (2003). EnGauge 21st century skills: Literacy in the digital age. North Central Regional Educational Laboratory and Metiri Group.
Little, A. (1997). The diploma disease twenty years on: An introduction. Assessment in Education: Principles, Policy and Practice, 4(1), 5–22.
Looney, A., & Klenowski, V. (2008). Curriculum and assessment for the knowledge society: Interrogating experiences in the Republic of Ireland and Queensland, Australia. Curriculum Journal, 19(3), 177–192.
Maguire, M. (2005). Textures of class in the context of schooling: The perceptions of a ‘class-crossing’ teacher. Sociology, 39(3), 427–443.


Marr, D. (1984). Vietnamese tradition on trial 1920–45. University of California Press.
McCallum, S., Schofield, E., & Dobson, S. (2021, August 2). Gamers know the power of ‘flow’ – What if learners could harness it too? The Conversation. https://theconversation.com/gamers-know-the-power-of-flow-what-if-learners-could-harness-it-too-164943. Accessed October 21, 2021.
Mercer, C. (2010). Culturelinks: Cultural networks and cultural policy in the digital age. In B. Cvjeticanin (Ed.), Networks: The evolving aspects of culture in the twenty-first century. University of Zagreb: Institute for International Relations (IRMO).
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–24.
National Council for Curriculum and Assessment. (2003). Developing senior cycle education: Report on the consultative process. NCCA.
Nazzal, A. (2011). Peer and self-assessment: 20 classroom strategies and other resources to increase student motivation and achievement. South Carolina Middle School Association Journal, 2011, 28–35.
Nordkvelle, Y. (2003). Didactics – From classical rhetoric to kitchen-Latin. Pedagogy, Culture and Society, 11(3), 315–330.
Novak, G., Patterson, E., Gavrin, A., & Christian, W. (1999). Just-in-time teaching: Blending active learning with web technology. Prentice Hall.
Oates, T. (2003). Key skills/key competencies: Avoiding the pitfalls of current initiatives. In D. Rychen, L. Salganik, & M. McLaughlin (Eds.), Contributions to the Second DeSeCo Symposium (pp. 169–200). Swiss Federal Statistical Office.
OECD. (2005). The definition and selection of key competencies. Executive summary. OECD.
OECD Centre for Educational Research and Innovation. (2008). Innovating to learn, learning to innovate. OECD.
Ostenk, J., & Blokhuis, F. (2007). Apprenticeship in the Netherlands: Connecting school and work-based learning. Education + Training, 49(6), 489–499.
Perreiah, A. (1984). Logic examinations in Padua circa 1400. History of Education, 13(2), 85–103.
Qvortrup, L. (2003). The hypercomplex society. Peter Lang.
Redecker, C., & Johannessen, Ø. (2013). Changing assessment – Toward a new assessment paradigm using ICT. European Journal of Education, 48(1), 79–96.
Rychen, D. (2003). A frame of reference for defining and selecting key competencies in an international context. In D. Rychen, L. Salganik, & M. McLaughlin (Eds.), Contributions to the second DeSeCo symposium (pp. 106–116). Swiss Federal Statistical Office.
Rychen, D., & Salganik, L. (Eds.). (2003). Key competencies for a successful life and a well-functioning society. Hogrefe and Huber.
Schleicher, A. (2008). PIAAC: A new strategy for assessing adult competencies. International Review of Education, 54, 627–650.
Scott, T. (1998). Teaching the ideology of assessment. Radical Teacher, 71, 30–37.
Selwyn, N., Gorard, S., & Furlong, J. (2006). Adult learning in the digital age: Information technology and the learning society. Routledge. https://doi.org/10.4324/9780203003039
Senge, P. (1990). The fifth discipline: The art and practice of the learning organization. Currency Doubleday.
Siemens, G. (2005). Connectivism: A learning theory for the digital age. International Journal of Instructional Technology and Distance Learning, 2(1). http://itdl.org/Journal/Jan_05/article01.htm
Silva, E. (2009). Measuring skills for twenty-first-century learning. Phi Delta Kappan, 90(9), 630–634.
Stobart, G. (2008). Testing times: The uses and abuses of assessment. Routledge.
Suto, I. (2013). Twenty-first century skills: Ancient, ubiquitous, enigmatic? Research Matters, 15, 2–8.
Weber, M. (1945). In H. Gerth & C. W. Mills (Eds.), From Max Weber. Routledge.
Wolf, A. (2011). Review of vocational education: The Wolf report. Department of Education.
Yu, L., & Hoi, K. S. (2005). Historical and contemporary exam-driven education fever in China. KEDI Journal of Educational Policy, 2(1), 17–33.

Chapter 3

Assessment for Learning: Motorway or Dead End for Improved Learning Outcomes?

If one expects something of students and they act against these expectations, and one lets it happen, then a situation of tacit acceptance arises. (Dale, 2009, p. 76) (The Norwegian quotation, prior to our translation, reads as follows: ‘Dersom en forventer noe av elevene og elevene handler i motsetning til forventningene og en lar det skje, oppstår ettergivenhet’).

Abstract This chapter asks what kinds of generative mechanisms and structures might be present in the practice of assessment for learning, the term originating in the UK, or formative assessment, the term preferred by many in the USA. The argument is made that it is too simple to regard it as simply a high-speed motorway for the improvement of student learning outcomes. Challenges exist in the implementation of assessment for learning, not least the fact that it can be interpreted in several different ways. One of the topics taken up in this chapter is feedback, which occupies a central position in assessment for learning and in assessment of students in general.

© Springer Nature Switzerland AG 2023. S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_3

An important rationale for assessment for learning is that it demonstrates a care for the student over time, nurturing their development through feedback. In contrast to this we have assessment of learning, where the teacher might undertake a one-time check for the taught segment of the curriculum and then move on to teach a new segment. In assessment of learning it might be asserted that the teacher is abdicating from a longer-term commitment to the learning of the student by directing attention to each test in successive fashion. But abdication is both too strong a word and the wrong word, because a teacher might argue that successive summative assessment of learning tests in the course of teaching a subject demonstrate care for the student because their progress is followed in the longer term and teaching can be adjusted on the basis of results in the tests. When teaching is adjusted the assessment functions in a formative, or assessment for learning, manner. Put differently, the teacher has transformed assessment of learning into assessment for learning. A difference still remains, in that in assessment for learning the teacher not only adjusts teaching on the basis of results from tests, but also provides regular feedback of a formative character during the period of teaching, such that students are able to act upon it. The care is continual and not periodical or in fits and starts after the students have sat a test and wait to receive their results. Irrespective of the differences, both assessment for learning and assessment of learning share an interest in the active involvement of the teacher. The teacher does not retreat into the background, acquiescing (giving tacit acceptance) in the face of opposition or of uninterested students who appear to be unmotivated and unresponsive, a possibility delineated by the well-known Norwegian educationalist Erling Lars Dale, quoted at the beginning of this chapter.

This opening might imply a level of consensus about how to define assessment for learning, about the generative mechanisms (deep-rooted levels of existential care and concern for students1) and structures (feedback process and relationships to the curriculum) that lie behind its working in the classroom and other educational contexts, and also about whether it is a motorway for improved learning outcomes or a dead end. But disagreements simmer not far below the surface and this chapter will explore some of them with the ambition of using insights derived from critical realism to bring greater clarity to some of the issues. It is necessary to explore the domains of the empirical (e.g. how assessment for learning as an assessment act is defined and observed in the classroom or elsewhere), the real (e.g. the generative mechanisms and structures supporting how it works and also how it has come to be defined) and the actual (e.g. beyond empirical observation, where generative mechanisms and structures supporting assessment for learning can be actualised and operate without necessarily becoming visible).

1 Our existential interest in care reveals the manner in which teachers care for students as subjects rather than as objects (care to) (see Conroy & Dobson, 2005).

3.1 Origins

Assessment for learning has been for some years an intellectual export commodity for England and it has had a visible impact on assessment policy and also practice in many countries. In Australia, to take an example, assessment for learning resources have been developed by the Curriculum Corporation, now part of Education Services Australia,2 on behalf of the state governments and are available to all schools. In England assessment for learning has been specifically associated with the group of researchers and academics who formed the Assessment Reform Group in 1998. The group was a follow-up to the Assessment Policy Task Group set up by the British Educational Research Association to measure the likely impact of the Education Reform Act of 1988. Reference to the Assessment Reform Group’s publications can be found across the globe when the topic is assessment policy: Inside the black box (Black & Wiliam, 1998b), originally published in 1998 and funded by a Nuffield Foundation grant, had sold over 50,000 copies worldwide by 2010 (Mansell, 2011).

2 https://www.digitaltechnologieshub.edu.au/teach-and-assess/assessment-overview/planning-forassessment/ (accessed 12.12.22).

Maybe the explanation for the global prevalence of the group’s ideas rests with the simple fact that they have written in English and it is a world language. Such an explanation does however downplay how active the different group members have been in disseminating assessment for learning in international conference papers, invited lectures or when undertaking professional development abroad. It may also be the case that the explanation for the growing popularity rests with timing: a growing sense, shared by many from 1988 to the present day, that assessment of learning can inhibit learning and motivation. The group, by mutual agreement, ceased to exist in 2010. By this time, they had accomplished much; many teachers and education policy makers across the globe were familiar with the term. But for many, assessment of learning still remained the paramount objective3 and some believing they were practising assessment for learning were in fact still practising assessment of learning, that is, the regular insertion of assessment of learning between periods of teaching. Stobart (2008), a prominent member of the group since 1994, has said that the term assessment for learning was chosen instead of formative assessment because the latter was often understood to cover assessment acts which were mini-summative and intended to support the collection of assessment documentation to set the final grade. An example of this is found in an OECD report on evaluation and assessment in Norway:

In Norway, the OECD review team also encountered a view of formative assessment as somehow ‘including’ a range of small summative tests counting toward a final achievement mark. Teachers’ classroom assessments were frequently used to track students’ progress and provide practice for a final summative assessment (e.g. exam, oral exam, teacher-designed test). Similarly, self-assessment was often understood in a framework of self-marking, not reflection on learning. (Nusche et al., 2011, p. 56)

All members of the Assessment Reform Group also regarded the term formative assessment as having to cover too many different meanings, and because of this it had lost its protreptic intent (Broadfoot et al., 1999). In their words:

The term ‘formative’ itself is open to a variety of interpretations and often means no more than that assessment is carried out frequently and is planned at the same time as teaching. Such assessment does not necessarily have all the characteristics just identified as helping learning. It may be formative in helping the teacher to identify areas where more explanation or practice is needed. But for the students, the marks or remarks on their work may tell them about their success or failure but not about how to make progress toward further learning. (p. 7)

To this day there are still many who prefer to use the term formative assessment, rather than assessment for learning. In the USA this seems to be very much the case. We have already commented in the opening to the chapter about how assessment of learning can function in an assessment for learning manner, and at this juncture we would like to add that the main premise of assessment for learning is that learning and assessment should support generative mechanisms and structures that facilitate the interweaving and integration of learning with assessment. They are not to be separated spatially (student work marked in the teacher’s office and handed back later in the classroom) or temporally (students receive feedback only after a part of the curriculum has been taught, for a period that might stretch over several weeks, followed by an assessment act). We will return to definitional questions in due course.

3 The UK debate about moving away from the modular nature of the General Certificate of Education and returning to the end-of-course assessment is a case in point (Department for Education & Gibb, 2011; see also Rodeiro & Nádas, 2012).

From a critical realist perspective, a quick empirical glance at the achievements of the Assessment Reform Group, encompassing among other things its level of global dissemination and policy impact, might give the impression that they coined the term assessment for learning. But it is possible to find earlier examples of assessment for learning as a form of practice, even if the term itself seems to have been first used by others in 1986.4 Scriven (1967) regarded summative evaluation as information assessing an educational program, whereas formative evaluation sought to contribute to the improvement of the program. As noted above, formative assessment has been the preferred term in the USA. Bloom, well known for his thinking on taxonomy in the post-WWII period, showed an additional interest in assessment that might promote learning. In particular, he was interested in the importance of students receiving feedback to correct mistakes in the learning process and threw down a gauntlet to his followers: how might the 2-sigma challenge (students who received one-to-one feedback were able to perform as well as the top 2% of a control group) be achieved for all students?5 In this connection he identified feedback, support from parents and tasks that generated experiences of mastery as crucial (Bloom, 2006). Slavin (1987) has disputed the magnitude of these benefits with respect to the mastery of learning.

Of course, the antecedents to assessment for learning stretch back further in time, and apprenticeships in vocational subjects/occupations have always emphasised the need for formative feedback to further progress and achievement. With respect to professional development courses taught to teachers, we have ourselves experienced on numerous occasions that teachers teaching vocational subjects in upper-secondary school say they have always practised assessment for learning and find nothing revolutionary whatsoever in the decidedly fashionable term.

4 Wiliam (2011) asserts that Harry Black used it as the title of a chapter in a book (Black, 1986).

5 Bloom (1984, p. 4): ‘Most striking were the differences in final achievement measures under the three conditions. Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class (the average tutored student was above 98% of the students in the control class). The average student under mastery learning was about one standard deviation above the average of the control class (the average mastery learning student was above 84% of the students in the control class).’
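The percentages in the passage quoted from Bloom (1984) can be checked with a short calculation. The sketch below is ours rather than Bloom’s: it assumes that scores in the control class are approximately normally distributed (an assumption the quotation does not state), and the function name `share_of_control_below` is a hypothetical helper introduced only for illustration. It converts a shift of one or two control-class standard deviations into the share of control students scoring below the average tutored or mastery-learning student.

```python
from statistics import NormalDist


def share_of_control_below(sigmas: float) -> float:
    """Share of a control class scoring below a student who sits `sigmas`
    control-class standard deviations above the control-class mean.

    Assumption (not from the source): control scores are normally distributed,
    so the share is the standard normal cumulative distribution at `sigmas`.
    """
    return NormalDist(mu=0.0, sigma=1.0).cdf(sigmas)


# Average student under one-to-one tutoring: about two sigma above the control mean.
tutoring = share_of_control_below(2.0)
# Average student under mastery learning: about one sigma above the control mean.
mastery = share_of_control_below(1.0)

print(f"Tutoring (2 sigma): above {tutoring:.1%} of the control class")
print(f"Mastery learning (1 sigma): above {mastery:.1%} of the control class")
```

Under the normality assumption the two shifts come out at roughly 97.7% and 84.1%, which correspond to the rounded figures of 98% and 84% in the quotation.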

3.2 Defining Assessment for Learning


Part of the challenge in seeking to define assessment for learning is that some, such as those not following the lead of the Assessment Reform Group, still hold a preference for the term formative assessment. Sadler is a case in point, as we shall discuss in a later chapter. Yet, in common with the Assessment Reform Group, they hold similar intentions and visions of the generative mechanisms involved: ensuring that assessment informs learning and learning informs assessment. It is not therefore a case of a linguistic fallacy, whereby the term or discourse is considered the reality and the distinction between the reality and the discourse is dissolved. The reality of what they seek to achieve in the classroom is not lost. But having said this, as we shall see, even if there is agreement on the goal, there is a level of disagreement among followers of assessment for learning/formative assessment with respect to the most desirable and preferred forms of practice.

Let us look at some definitions: ‘Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there’ (Assessment Reform Group, 2002, p. 2). This broad definition by the Assessment Reform Group entails a gap-closing activity after gathering and interpreting evidence. We discussed the idea of a gap or a lack in Chap. 1 when we presented an overview of critical realism. As we stated in that chapter, the concept of a lack constitutes the second moment in the longitudinal dialectic seeking to explore and account for the ontological depth of reality. The lack is a propelling force between a present and an aspired-for state.
We added in that chapter that it is important that absence thinking, and the very culture and mentality of looking at what is missing, should not lead to the view that assessment has simply a ‘diagnose and redress’ goal. As Bennett (2011) has pointed out and as we shall see shortly, one strain of assessment for learning has viewed it solely as a diagnostic tool focusing only upon how to improve learning outcomes, and in so doing neglecting a focus upon the process on the way toward the learning outcome. Bennett adds a fourth action to the Assessment Reform Group’s definition (seeking, interpreting and acting upon interpretations), namely planning and designing the assessment act. It is absent from the definition above, but to be fair it underpins the Assessment Reform Group’s (2002) definition in the cited Assessment for learning: 10 principles. Black and Wiliam propose a definition that similarly focuses upon the action:

Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited. (2009, p. 7)

Bennett, in his critical and yet at the same time appreciative review of formative assessment, makes the point that two forms of practice can be identified. Firstly, among test publishers there is the view that formative assessment produces test scores of diagnostic value, typically in the time scale of the instructional unit or an interim period after a component of the curriculum has been taught. Secondly, among educational practitioners and researchers there is the belief that it refers to assessment acts in and between lessons:

Formative assessment is a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to improve students’ achievement of intended instructional outcomes. A common simplification of this position is that as long as the results are used to change instruction, any instrument may be used formatively, regardless of its original intended purpose. (Bennett, 2011, pp. 6–7)

The advantage of Bennett’s pertinent distinction is that it directs attention to how assessment for learning/formative assessment is practised by different groups with different interests. The Assessment Reform Group in one of its publications increases the number of possible assessment acts associated with assessment for learning:

• the provision of effective feedback to pupils;
• the active involvement of pupils in their own learning;
• adjusting teaching to take account of the results of assessment;
• a recognition of the profound influence assessment has on the motivation and self-esteem of pupils, both of which are crucial influences on learning;
• the need for students to be able to assess themselves and understand how to improve. (Broadfoot et al., 1999, pp. 4–5)

We support this as the direction to go. Rather than producing a single, broad catch-all definition that seeks to encompass the numerous ways in which assessment for learning is practised, it is fruitful to direct attention to the assessment acts that might be present. With this in mind we have constructed a table where on one side the mechanisms connected with assessment for learning acts have been inspired by the work of the Norwegian assessment expert Roar Engh (2011, pp. 61–68), while on the other side we identify important assessment for learning acts (Table 3.1).

Table 3.1 Assessment acts evident in assessment for learning
Engh | Dobson and Fudiyartanto
Focus on learning | What is to be assessed (in terms of the learning objectives)?
Social relations in the classroom | How is it to be assessed (choice of assessment tools)?
Curriculum understanding | When is it to be assessed (short, middle or long time span)?
Understanding where the student is in the learning process | How is the assessment act and accompanying data to be interpreted and theorised?
Offering feedback to help the student move forwards |

Engh places a particular emphasis on the role of interpreting the curriculum, with an accompanying interest in how assessment for learning is coloured by the traditions and practices of each subject. On the other hand, we are particularly interested in how the assessment acts related to assessment for learning will change according to when the assessment act takes place. What is suitable and fits the purpose within the short time span of a single lesson (the Norwegian word øyeblikk, which literally means the blink of an eyelid, captures this) may be different to assessment for learning offered when the middle time span covering the instructional unit as a whole is in focus (e.g. in order to complete this history project go and research the local history of place x), or with a longer time span in mind of half a year or longer when things are less certain or clear and more open ended (e.g. you will be in a position to consider whether you will be ready to choose advanced mathematics with more algebra).

Timing also incorporates something of which we have both become increasingly conscious: assessment for learning can cover pacing and leading others. By this we mean knowing when to intervene and direct the assessment act, irrespective of whether it concerns feedback, letting students do a test or some other form of assessment act. Counsellors mirror teachers with respect to offering feedback, and learn about pacing and leading in their training. Here is an example from an interview reported by Meier and Davis:

Client: I am stuck. I just can’t keep putting off studying like this.
Counsellor: I wonder if your cramming has anything to do with your desire to avoid responsibility for failing the course.
Client: Well … my dad thinks I get Cs because I party like he did in college, not because I don’t have the ability. Hmmm …
Counsellor: (silence). (Meier & Davis, 2009, p. 6)

In defining assessment for learning according to its assessment acts, we are moving toward our primary concern and main point. The assessment acts outlined in the table above and in the conception of the Assessment Reform Group constitute empirical instances of regularities that can be observed amongst those practising assessment for learning, that is, they are on the level of the empirical in critical realist terms. But what is additionally required to produce greater clarity in understanding assessment for learning is an elaboration of the generative mechanisms and structures at work producing and supporting these regularities. This constitutes the critical realist contribution to understanding assessment for learning. With this in mind we will proceed in three ways: firstly, in looking at assessment for learning as a motorway for improved learning outcomes, we will explore if there are grounds for regarding the generative mechanisms and structures of assessment for learning as producers of radically improved learning outcomes. Secondly, we will look at theories that might explain what is going on in assessment for learning. And lastly, we will look at one particular mechanism in assessment for learning and its supporting structure. It is something which all commentators, experts and practitioners of assessment for learning regard as important: the practice of feedback as an assessment act.

3.3 Assessment for Learning as a Motorway for Improved Learning Outcomes

We have already seen that Bloom regarded formative assessment as a way of increasing learning outcomes. But there are several others who have voiced similar positions. A research review by the Centre for Evaluation and Monitoring at Durham University (Higgins et al., 2011) assessed a number of initiatives with respect to the effect they would potentially have upon the learning of an average student in a school year: assessment for learning represents 3 months extra learning at a moderate cost for teacher training and half a General Certificate of Secondary Education grade in a particular subject at the end of lower secondary education. But the authors noted that homework (low cost) or one-to-one tutoring (high cost) represent 5 months of extra learning. Effective feedback, presumably an activity found in all versions of assessment for learning, represents 9 months of extra learning.

In a much-cited review of formative assessment research Black and Wiliam (1998a) examined its prevalence as a topic in 76 journals over a 10-year period (1987–1997). They found an effect size of improved student learning between 0.4 and 0.7 standard deviations:

• An effect size of 0.4 would mean that the average student involved in an innovation would record the same achievement as a student in the top 35% of those not so involved.
• An effect size gain of 0.7 in the recent international comparative studies in mathematics would have raised the score of a nation in the middle of the pack of 41 countries (e.g., the U.S.) to one of the top five. (Black & Wiliam, 1998b, p. 142)
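The percentages attached to these effect sizes follow from treating scores in the comparison group as normally distributed, an assumption we make here for illustration rather than one stated in the sources. A minimal Python sketch under that assumption (the function name is our own, hypothetical one) converts an effect size in standard deviation units into the share of the comparison group scoring below the average student who received the intervention. On this reading an effect size of 0.4 places that student above roughly 66% of the comparison group, that is, in about the top 35%, while the one and two standard deviations reported by Bloom correspond to roughly 84% and 98%.

```python
from statistics import NormalDist


def percentile_of_average_treated_student(effect_size: float) -> float:
    """Percentage of the comparison group scoring below the average treated student.

    Assumption (ours, for illustration): comparison-group scores are normally
    distributed and the effect size is expressed in comparison-group standard
    deviations.
    """
    # The standard normal CDF gives the proportion of the comparison group
    # that lies below a point `effect_size` standard deviations above its mean.
    return 100 * NormalDist().cdf(effect_size)


if __name__ == "__main__":
    # 0.4 and 0.7 are the bounds reported by Black and Wiliam;
    # 1.0 and 2.0 are the mastery learning and tutoring figures from Bloom.
    for d in (0.4, 0.7, 1.0, 2.0):
        below = percentile_of_average_treated_student(d)
        print(f"effect size {d:.1f}: above {below:.1f}% of the comparison group "
              f"(top {100 - below:.1f}%)")
```

The conversion is only as good as the normality assumption; with skewed or bounded score distributions the same effect size would correspond to a different percentile.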

Their paper was a thematic review rather than a meta-analysis and they highlighted research undertaken on feedback, the student perspective including peer assessment and self-perception/self-esteem, mastery learning, teacher questioning, systems for the organisation of teaching and formative assessment. They underlined the importance of ‘some degree of feedback between those taught and the teacher, and this is entailed in the quality of their interactions which is at the heart of pedagogy’ (Black & Wiliam, 1998a, p. 16).

Others have highlighted results from meta-analyses to support assessment for learning as a source of increased learning outcomes.6 The most comprehensive examination to date has been undertaken by Hattie in his globally ambitious meta-analysis of 800 meta-analyses. He summarised several years’ work in the book Visible learning (2009) and in his more teacher-friendly, less scholarly Maximising impact on learning (2011), arguing that 0–20% of a student’s learning outcome can be accounted for by variables found at the level of the school, while 16–60% come from differences between teachers and classes, that is, differences within and between classroom practices, rather than in terms of the school itself. The table below shows a selection of factors that can influence learning outcomes. In his 2009 book he lists over 130 different factors that influence learning outcomes. Everything over 0.4 (standard deviation) is considered to be a significant effect size, and under 0.4 reveals little effect (Table 3.2).

Table 3.2 Developing potential for learning
Influences | Effect sizes
Self-reported grades | 1.44
Feedback | 0.72
Providing formative evaluation to teachers | 0.70
Frequent/effect of tests | 0.46
Teaching test-taking skills | 0.22
Source: Hattie (2007)

6 Shute’s (2007) review of the literature is relevant, but is not as systematic as the work undertaken by Hattie.

We can note that the teaching of test skills scores poorly. This must not be confused with ‘teaching to the test’, which might teach only test content rather than assessment skills, what we call assessment as learning. Self-reported grades refers to students conducting self-assessment prior to submitting assignments or test
answers. For Hattie feedback from students – to themselves (self-assessment) and to teachers – is of prime importance; such that the greatest effects on student learning are evident when students become their own teachers and when teachers learn through student feedback about their own teaching. And yet, somewhat paradoxically, he notes the lack of systematic evidence that feedback from student to teacher is of great importance. Hattie has been positioned in the ‘what works’ (Nelson & O’Beirne, 2014) evidence-based practice movement with its roots in positivism. For this movement the view of what works rests upon systematic statistical analyses of existing research on the effect of educational interventions, such as feedback practices, and the belief that teachers will be willing to change assessment practices armed with results from such analyses. This view contains the assumption that teachers are adept at balancing the probability of the occurrence of a certain result of an assessment act in the classroom or in a test situation with a ‘matter of fact’ (unemotional) judgemental approach to reality. This form of judgemental approach occurs when teachers are both willing and able to view their practice in a distanced, objective manner and let the evidence from systematic research inform practice in specific, predefined directions. This might be an unwarranted assumption, given that emotions and other interests such as loyalty to established forms of assessment practice might generate significant inertia to change, even though the change is supported by so-called systematic meta-analyses of evidence from research. If we introduce the term ‘balance of probability’ we are aware that in a judicial meaning associated with civil cases a person must convince the judge that there is a stronger than even likelihood (i.e. statistically 51%) that the facts they allege are true. 
In a criminal case the standard is beyond reasonable doubt, such that there is closer to total certainty (i.e. statistically 100%). In our context, where assessment practices are the focal point, we are interested in how the supporters of evidence-based practice might posit a statistical measure to guide practitioners. This in our view might approximate the lower (civil) rather than higher (criminal) standard of probability. The problem is that they do not explicitly make this clear other than talking of the probability threshold of the 0.4 standard deviation effect size and what this might mean in a practical sense for an average child. As a result, the teacher is free to weigh a probability-based rationale against a different rationale, supported by experience, loyalty to a form of assessment practice, a gut emotional feeling or
something else. Teachers might also consider the term ‘average child’ is not something they can support. From a moral philosophical perspective the following criticism of the ‘what works’ movement has been voiced: We cannot use any means just because they are ‘effective’. Education is first and foremost a moral practice; assessments and decisions within pedagogy are not just about what is possible (an actual assessment), but also about what is pedagogically desirable (a valuation). To assume that research on ‘what works’ may replace normative professional judgement is simply to make a totally unwarranted jump from is to should. But perhaps worse: Such education with the support of newer and more reliable ‘scientific research’ will be able to more easily dismiss the educational practitioner’s right to act on the basis of their own experience of necessary (un)certainty and hesitation about what may work in situations where events come galloping on a conveyor belt. It will ignore many of the sources that can help to provide pedagogy with nuanced, sensitive, contextually poignant, insightful and often indeterminate experiences. (Steinsholt, 2009, p. 16; translation by Dobson)

From a more methodological point of view Black and Wiliam have been scathing of meta-analyses such as these: Whilst they show some coherence and reinforcement in relation to the learning gains associated with classroom assessment initiatives, the underlying differences between the studies are such that any amalgamations of their results would have little meaning. At one level, these differences are obvious on casual inspection, because each study is associated with a particular pedagogy, with its attendant assumptions about learning: one that in many cases has been constructed as the main element of the innovation under study. There are however deeper differences: even where the research studies appear to be similar in the procedures involved, they differ in the nature of the data which may have been collected – or ignored. (1998a, p. 53).

As regards learning outcomes from assessment for learning, at the opposite extreme to the more quantitative meta-analysis is the view that students experience a qualitative improvement in how they understand and phenomenologically experience their own learning. In many of the in-service teacher courses on assessment undertaken by one of the co-authors (Dobson), teachers regularly tell of students who seem more confident in locating where they are in their own learning after different forms of assessment for learning have been introduced. Assessment has become more transparent for the students and they experience a heightened sense of self-esteem. This is not, it must be understood, an improvement in a systematically researched or documented sense; it is more a change of consciousness where students express a preference for this form of assessment.

In this section we have looked at examples of the view that assessment for learning can lead to improved learning outcomes. Some writers have presented the argument in favour of improvement in terms of quantitative standard deviations. Others have made the argument in negative terms, noting how difficult it is to come to a firm, unambiguous conclusion that assessment for learning can have positive effects. To probe the argument further, what is required is an exploration of the underlying generative mechanisms and structures at work in the practice of assessment for learning. Put differently, a critical realist approach is required that moves below the empirically observed regularities, and this is the topic of the next section.

3.4 Theories of Assessment for Learning


There is little agreement on the theoretical generative mechanisms and structures evident in assessment for learning. Stobart (2008) presents the dispute as two camps, those in favour of a socio-cultural understanding as opposed to a neo-behaviourist culture of test and remediate. He implores researchers to undertake more theoretical work, ostensibly to bring order into disorder. But as we shall argue in due course, it may not be possible or desirable to arrive at a single overarching theory. This is mainly because assessment for learning can be practised in diverse ways with a differing weight placed on different component parts and different theoretical approaches to account for and legitimate what is going on, i.e. different processes and mechanisms giving rise to it.

Another theory can be found in the Educational Testing Service’s (2009) rationale for the Keeping Learning on Track program. Bennett (2010), using the American term formative assessment, argues that this program is founded upon a rudimentary theory of action or, in critical realist terms, a generative mechanism and a set of supporting structures. The theory of action in the first instance entails a big idea: ‘students and teachers using evidence … to adapt teaching and learning to meet immediate learning needs minute-by-minute and day-by-day’ (ETS, 2010). This focus is clearly stated, but has the disadvantage of neglecting assessment acts directed toward the middle (instructional unit) or longer time span (half a year or more). With this theory of action in mind, Bennett suggests the teacher might adopt one of five strategies to direct the learning process: sharing learning expectations (e.g. clarifying and sharing learning intentions and criteria for success), questioning (e.g. generating classroom discussions), feedback (teacher to student), self-assessment (i.e. activating students as owners of their own learning), and peer assessment (e.g.
activating students as instructional resources for one another). Figure 3.1 summarises how the move is made from left to right in the theory, where intended effects are stated as teacher and student outcomes. At the same time moving to one of the five different strategies might mitigate unintended effects. This theory scores well in critical realist terms because it indicates a causal direction with respect to the assessment acts. Furthermore, it can be connected with an interest in socio-cultural theories, where interaction with others supports the learning processes; note in particular the ongoing professional development with colleagues and facilitators proposed at regular intervals. It might be thought that such an approach to assessment for learning settles the issue of providing a set of generative mechanisms and structures to account for what is going on. However, this theory does not engage with motivation (which can be connected with psychological theories) in an explicit manner, it does not highlight the importance of experience (which can be connected with theories of aesthetic experience), it considers itself subject independent (where theories of traditions in different subjects might be relevant) and says nothing of the role of teachers working in teams either in the classroom or outside of the classroom when planning or evaluating teaching and assessment (which draws upon theories of social interaction and moderation). Put simply, the Keeping Learning on Track program and accompanying theory represent only one way in which diverse components in assessment for learning can be theorised and practised.

Fig. 3.1 The KLT Theory of Action. (Cited in Bennett (2010, p. 9), who had permission to use this figure from the Educational Testing Service)

Dobson et al. (2012) propose a theoretical synthesis of the practice of assessment for learning. The goal was to understand how different generative mechanisms in assessment for learning and accompanying structures might co-exist and each play their own role without any one of them assuming an absolute position of dominance. The theoretical synthesis envisaged has much in common with Elias’s (1986) theory of ‘figuration’, where different theoretical and practice-based components are brought into close proximity with each other. For Elias inner generative mechanisms and structures (e.g. psychological, psychoanalytical, aesthetic) mix with outer generative mechanisms and structures (e.g. economic, political, sociological, historical, educational and cultural). Dobson et al. chose not to use the dichotomy between inner and outer mechanisms and structures. Nevertheless, it is possible to read such a dichotomy as an underlying theme in Table 3.3, with some theories possessing an inner and outer relationship, such as mastery being connected with both inner and outer conceptualisations of motivation.

Table 3.3 Theoretical concepts derived from different assessment for learning/formative assessment approaches
Assessment for learning can highlight … | Mechanisms and supporting structures | Examples
Mastery | Motivation; Formative and summative; Praise and feedback | Bloom (1984); Scriven (1967); Henderlong and Lepper (2002)
System | Initiation-response-evaluation; Information gap; Self-regulated learning; Communication and dealing with complexity | Wiener (1948); Ramaprasad (1983); Black and Wiliam (2009); Wilson (2012)
Experience | Assessment acts; Moments of contingency; Connoisseur; Dialogue | Skogvoll and Dobson (2011); Black and Wiliam (2009); Eisner (1998); Pryor and Crossouard (2008)
Socio-cultural | Interaction and scaffolding; Shared community of practice in use of assessment artefacts; Cognitive acceleration | Pryor and Crossouard (2008); Lave and Wenger (1991); Dobson and Haaland (1993); Venville and Oliver (2015)
Assessment for learning in subjects | Didactic traditions in different curriculum subjects organised around ‘big ideas’ | Dobson and Engh (2010); Bennett (2011)

Mastery as a perspective in assessment for learning can be connected with the need to enhance rather than deflate motivation. It emphasises that the student must be given the opportunity not only to complete the task, but to retry on the basis of feedback if mastery was not accomplished. The student moving up Bloom’s (1984)
knowledge taxonomy would be an example of this. Scriven’s (1967) understanding of program evaluation in summative and formative terms betrays an interest in the supporting structure around the assessment act. Henderlong and Lepper (2002) note that praise may not always lead to enhanced motivation, especially if the student sees it as lacking in sincerity. The systematic understanding of assessment for learning views a connected group of assessment acts from a bird’s eye or meta-perspective, whereby the supporting structure is in view. At the same time, it looks at the assessment acts as mechanisms at work within the system. From the system-wide level, the view might be that the classroom as a whole is the ‘playground’ for assessment acts that collectively reinforce the effect of assessment for learning or alternatively frustrate its realisation. For example, some students might persist in retaining only an interest in grades and summative assessment. Wiener’s cybernetic approach can be likened to such a conception, where feedback constitutes an element that might influence the further development of the classroom system.


Within the system, communication mechanisms are usually highlighted as a way of bringing order to and decreasing the conceptual complexity of the interactive processes evident between participants. Black and Wiliam, for example, recall the initiation-response-evaluation cycle of the teacher. The cycle begins with a question and then eliciting information to lead to and support adjustments in teaching:

A formative interaction is one in which an interactive situation influences cognition, i.e., it is an interaction between external stimulus and feedback, and internal production by the individual learner. This involves looking at the three aspects, the external, the internal and their interactions. The teacher addresses to the learner a task, perhaps in the form of a question, the learner responds to this, and the teacher then composes a further intervention, in the light of that response. This basic structure has been described as initiation-response-evaluation or I-R-E, but this structure could represent either a genuinely dialogical process, or one in which students are relegated to a supporting role. (2009, p. 9)

Inspired by Luhmann (1995), Wilson (2012) has taken a communicative approach. She asks the reader to consider the teacher’s own reflective understanding of their teaching and the student’s own understanding of their own learning; together they create a system in which assessment for learning is played out between communicating parties. Common to these approaches is the view that gaps exist in the system and they must be redressed not only by understanding the mechanisms in the system, but by also possessing an overview of the system as a whole and the need for self-regulative learning. As Ramaprasad (1983, p. 4) has put it: ‘Feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way.’ There is a danger that the system view of assessment for learning can result in a view of students as objects, ‘to whom things are done’, rather than as active co-communicators (Swaffield, 2011). As an opposite to the system and communicative focus there is a view of assessment for learning grouped around a set of concepts that all share an interest in how assessment is corporeally experienced. One of the concepts that can be mentioned in this connection is the assessment act, which was defined earlier in the book as: ‘An action by student(s) or teacher(s), written, verbally or corporeally, that may or may not have as its stated goal an assessment of a student, individually or collectively; but with assessment as the outcome.’ Of interest is the manner in which the teacher’s nod or the tone of their voice can influence the student’s future-oriented work on the task at hand, in an affirmative or negative sense. In the real time of the classroom moments of contingency (Black & Wiliam, 2009) arise, where the teacher’s intervention has an important effect. 
Skogvoll and Dobson (2011) have used the Norwegian term ‘blikk for øyeblikk’ (awareness of the moment), which captures the significance of the timeliness of a teacher’s formative input. Rowland et al. (2009) have made a similar point in highlighting contingency, along with transformation, foundation and connection, as a quartet of ideas that are central to the work of a teacher in the classroom. Awareness of the moment (blikk for øyeblikk) for Skogvoll and Dobson (2011) is connected not merely with a cognitive awareness that something has to be done at this particular moment, but also with the corporeal sense in which the teacher actually sees and experiences the moment. It has existential import for all involved, both
the student and the teacher. Eisner’s concept of the connoisseur is pertinent in this connection because he directs attention to the manner in which the teacher as assessor is a connoisseur who draws upon their reservoir of corporeal, existential, emotional and cognitive experience to offer the student formative feedback, while the learning is still taking place, and not merely once the task has been completed and the product is in place. The concept of the connoisseur will be dealt with more fully in a later chapter. The experience of assessment for learning can also be connected with the kind of questions asked by the teacher. As Pryor and Crossouard (2008) have noted, if they are convergent (closed), requiring simple memory-based answers, then the opportunity for students to experience and express moments of contingency will not be great. This is in contrast to questions that are divergent (open), such that several different types of answer are possible and a greater degree of student reflection is required.7 Their point is that convergent questioning provides less opportunity for dialogue and chains of formative assessment acts than divergent questioning. The socio-cultural view of assessment for learning is popular because it appeals to the shared interaction that is evident between teacher and student and between peers. Specifically, it draws upon concepts found in what is known as the situated learning or human activity theory conception of action. One of the concepts highlighted is scaffolding structures (Pryor & Crossouard, 2008) that create an environment supportive of assessment for learning, for example students collectively developing a shared understanding of the building blocks of storytelling before they begin to write and assess their own ‘once upon a time …’ stories. A second concept is the zone of proximal development. Many are familiar with the idea that what the student learns today with assistance can be independently accomplished tomorrow. 
This is because the tasks are within their zone of proximal development and do not stretch them too far so they consider giving up. In assessment terms, the zone of proximal development can refer to an assessment mechanism whereby a teacher might assess a student today and realise that tomorrow the student might well manage to accomplish the act and assess themselves on their own because they have learnt from the way the teacher undertook the assessment act. When assessment outcomes are improved and the student becomes more independent the teacher might well witness the need among more able students to compress the curriculum (Renzulli, 2016). To follow up this point, in work with the gifted and talented it is possible to practise cognitive acceleration as students are coached to assess their own progress and cognitively accelerate through the taught curriculum more quickly. A third concept from the socio-cultural tradition emphasises that each classroom constitutes a community of practice with its own culture and norms stipulating how assessment artefacts are to be used to mediate between actors. A consequence might be a backwash effect as assessment criteria are used by students to strategically adjust their learning efforts to match what is specified in the criteria. For Wenger

7 For example, not what is the capital of Norway, but why?


3 Assessment for Learning: Motorway or Dead End for Improved Learning Outcomes?

(2010) the community of practice constitutes a resource to support knowledge creation, accumulation and diffusion. In the context of assessment for learning the classroom community provides an arena not only to steward, curate and care for the domain knowledge and competence of students, but also to provide a home for their identity development in a safe and secure environment.

The fifth perspective emphasises the importance of a subject-based and curriculum-embedded approach to assessment for learning. Bennett (2011) asks if those concerned with researching and practising assessment for learning have all too easily adopted a cross-subject position. In professional development for in-service teachers it does not take long before they ask what assessment for learning means for each particular subject taught. In this respect each subject expert can quickly assert that they possess what might be called their own discrete and delimited set of ‘big ideas’,8 which can be taught and explored using an assessment for learning mechanism organised in a continuous loop of teaching the big idea, student exploration/investigation in a problem-based format, and subsequently feedback between teacher and student (Fig. 3.2).

Let us take some examples to illustrate this, beginning with the assessment for learning tradition of teaching English as a second language. In the 1960s and 1970s discrete-point testing occupied a dominant position in classroom practice (Chvala & Graedler, 2010). The goal was to teach and assess different skills, such as the use of grammar or vocabulary, in separate tests. There was in such cases an objective assessment with only one correct answer and the students accumulated points independently of the teacher’s input and judgement, for example in vocabulary tests or national tests. From the 1980s to the present a different form of classroom practice has assumed a more dominant position (Simensen, 2007).
Fig. 3.2 Assessment for learning organised around ‘big idea’ teaching (a continuous loop: teaching the ‘big idea’, student exploration/investigation, feedback between teacher and student)

8 We take the term ‘big idea’ from the Japanese term 発問 (hatsumon) which means to ask or pose a question. In this context it is connected with a big idea in a subject that the teacher and curriculum explore in a problem-based teaching manner.

Commonly known as communicative language teaching, this practice centres upon classroom teaching and assessment that focus on meaning-oriented language and its use in authentic

contexts. It directs attention toward a more continuous form of assessment and focuses on guiding the development of the students’ communicative competence. Assessment for learning plays a key role in communicative language teaching, in particular the manner in which the pedagogue utilises feedback strategies to assist students in the co-construction of meaning and understanding. Assessment scores are judged on the basis of appropriateness rather than correctness, marking a clear breach with the discrete-point pedagogy of teaching and assessment. Chvala and Graedler support this move with the following argument: feedback on language needs to focus not just on overall accuracy but on errors versus mistakes. An error is a category of inaccuracies which repeat themselves and reflect a systematic problem in the learner’s language. A mistake, on the other hand, is an inaccuracy which occurs randomly and could be the result of a slip-of-the-hand or could occur as the result of the learner being focused on something else at the time. Errors can be subdivided into two types: Language 1 (L1) interference errors and developmental errors. L1 interference errors are those errors which occur as a result of differences between the L1 and English. Developmental errors, on the other hand, are errors which occur mostly consistently as a result of a learner’s move from one stage of mastery to another. (2010, p. 80)

A second example is found in the teaching of natural science. In the European Commission report from 2004 Europe needs more scientists it was noted that when students encounter natural science as a well-established set of facts without a connection to experiments and observations, they have fewer opportunities to form their own interpretations and are less motivated to learn the subject. This is the background for the 2007 European Commission Report Science education NOW: A renewed pedagogy for the future of Europe, which called for a re-orientation of science pedagogy, placing a greater emphasis on the role of scientific inquiry. This is in line with findings of the National Research Council (2007) report from the USA, which found that children master natural science when they learn to understand what scientific explanations/concepts are, how evidence is produced, how natural science knowledge can be revised and modified, and lastly that, just as science is a social activity discussed among scientists in the scientific community, so can students share their ideas and conceptions about science in peer groups during classroom activity. When embedded in curricula, such as the Norwegian Knowledge Promotion Reform curriculum of 2006, these insights lead to activities where the student is encouraged to more actively research and make discoveries. The term used in the Norwegian curriculum is forskerspiren (research germination). The student is encouraged, with the support, guidance and feedback of the teacher, to form hypotheses, undertake experiments, make systematic observations and evaluate critically. Put simply, forskerspiren is to let investigation and research skills grow and germinate. However, as Dobson and Engh (2010) in the introduction to an edited collection have argued, even if each subject has its own ‘big idea’ and this is integrated in an assessment for learning form of practice, the differences must not be exaggerated. 
It can be argued that all subjects can be united on a higher level around a few broadly defined mechanisms of assessment for learning and accompanying structures. For


example, all subjects in adopting assessment for learning support mechanisms whereby students receive different forms of formative feedback from teachers and peers with accompanying dialogue about the feedback. The structure of the curriculum is something all subjects share; it provides learning goals that students desire to master, and these may be adapted to individual learning pathways. With advances in assessment technologies and computerised adaptive testing this is becoming increasingly possible. Theorising the generative mechanisms and structures evident in the practice of assessment for learning reveals a number of different perspectives: mastery, systems, socio-cultural, experiential and subject based. Even though we have presented them sequentially, in the practice of assessment for learning they can occur together, such as when the mechanism of mastery is linked to the mechanism of experience, producing a multifaceted assessment act. The same is the case with the structures supporting these mechanisms, such that the system approach views the classroom as a supporting structure and within it generative mechanisms, such as those just mentioned, mastery and experience, can be active in informing the assessment actions of students and the teacher. With COVID-19 learning experiences we have of course understood the necessity of a broader conception of the classroom informed by distance and digital connectivity. In a previous chapter we drew attention to Siemens’ (2005) seminal paper on how knowledge is distributed widely in different networks, some conceptual – carried in our heads – and some external in books or on the internet. He suggested that we reflect upon questions of where, how and what knowledge and by implication assessment is to be trusted and considered valid and reliable. In what follows, we will explore in more detail one significant generative mechanism and set of supporting structures in assessment for learning: feedback.

3.5 Feedback

In this chapter we have already cited definitions of feedback in which the gap metaphor was invoked: ‘Feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way’ (Ramaprasad, 1983, p. 4). But feedback can have intentions other than removing the gap in a structure framed by diagnosis and redress. It can reveal a concern with the process on the way to the learning objective, with a focus on the experience of the formative assessment act. Together with colleagues Hartberg and Gran, Dobson offers in the introduction to their book Feedback i skolen (Feedback in the school) what at first sight might appear to be a recasting of the gap metaphor definition: ‘All assessment communication between two parties should include the feedback looking backwards at what has been achieved, and also the feedback looking forwards toward future goals’ (Hartberg et al., 2012, p. 12). However, our deeper intention was to give this umbrella-like definition a variable content and form, such that it transcends a focus upon merely the gap-bridging


conception of feedback. Accordingly, we had in mind several different forms of feedback content (advice, praise, feedback about effort and the self), across different temporal spans (short, medium and long) and opening for different participatory roles (self-directed feedback, peer feedback and teacher feedback). Furthermore, feedback in an educational context might, or should in the opinion of the authors, be founded upon two guiding principles. Firstly, it should have the furthering of the student’s learning as a red thread. Secondly, it should be based upon a social relation of recognition and respect between participants. An example of the latter might be as follows: a group of students sit at the back of the classroom. They look out of the window or throw small rolled up paper projectiles at each other. They show no respect toward the teacher who is trying to teach the whole class about calculus. The teacher, on the other hand, holds little respect for these disruptive students. This social relation is the opposite of one based upon mutual recognition and respect. In such a climate it is unlikely that subject-related feedback by the teacher will be well-received, if at all, by these students. The chance that they will act upon the feedback is even less likely. The point is simple. The content of feedback, with the promotion of learning as its goal, requires the establishment of a mutual bond of recognition and respect. This is not a new idea. Rousseau in his famous essay Discourse on the origin and foundations of inequality among men (1754/2011) talked of modern people who had become overly dependent on receiving recognition from others for their actions. The problem was that there was no guarantee of recognition. On the contrary, the possibility existed that the other might equally be jealous or hold feelings of hatred, so that the relationship would be marked less by recognition, and more by the open expression of contempt or resentment. 
Accordingly, a definition of feedback might underline not merely the importance of mechanisms of communication backwards in terms of what has been undertaken or forwards in terms of what might be achieved. This references the manner in which the language game of feedback includes a family of resemblances towards feed-back and feed-forward, just as Bennett (2011) evoked the language game of a diagnostic goal as opposed to a language game more interested in the process. In other words, each language game is associated with a slightly different social practice and accompanying intention. In the previous section of this chapter we too considered how the language game of feedback references different but related theoretical language games generating the social practice, ranging from motivation to big ideas and so on. With mechanisms in mind, in an educational context feedback should also look to mechanisms that can further the learning of students and structures should be in place that support social relations of recognition and respect. We use the term ‘should’ well aware that it holds a certain normative, perlocutionary9 force capable of causing participants to do things (Searle, 1969).

9 Words that produce an effect upon the listener that might persuade, frighten, amuse, or cause a listener to act.


In professions and disciplines other than that of teachers working in classrooms, generative mechanisms of feedback and structures might have a different goal and perlocutionary potential. Let us take two professions by way of example: counselling and nursing. Feedback as an assessment act is central to their practice, and we shall argue that by exploring how it is practised in each context with different language games and forms of life, we gain deeper comparative insight into how it is practised by the teacher working in a classroom. Accordingly, the generative feedback mechanisms and structures might be different and yet hold comparative wisdom for each of these professions and by extension for other professions. A central concern in counselling is knowing when to intervene in an active manner with feedback and when to be silent. It is known as the art of pacing and leading (Meier & Davis, 2009, p. 6). This does not imply an intention of leading the client down a predefined path and manipulating them. It is more the case of knowing when to be silent so the client can leave an imprint on the dialogue and express their own meanings and intentions. Teachers are accustomed to thinking in terms of pacing and leading as they make decisions about how the content of the curriculum is to be taught. The same applies when it comes to feedback to the class, to groups of students or to individuals, where they think of feedback across a wider temporal horizon. This incorporates the short time span of the immediate lesson, the medium time span of the instructional unit, which might stretch over 2–6 weeks and be connected with the use of criteria referenced in the curriculum, and the long time span of half a year or more when parents might be present at scheduled student–parent meetings. The counsellor, by contrast, might consciously focus upon pacing and leading in the short time span of the meeting with the client, especially when future meetings are not scheduled.
The corporeal intensity of the interaction might be noticeable and different to that of the teacher in the short time span. The teacher, of course, knows future opportunities for feedback are guaranteed in the classroom situation. If we look at the nursing profession, something is immediately evident. Feedback is rarely given or sought from a group of 20 or more people, as in a classroom. It is more likely to be organised in a structure that allows feedback to be given to the individual patient and/or their close relative and friends. On the conceptual level, inspired by Foucault and his studies of the birth of the medical clinic, it can be argued that the nurse cultivates a specific bedside manner where feedback rests upon an empirical medical gaze, not reserved just for doctors. It is a medical gaze uncluttered by theories and chimeras: What was fundamentally invisible is suddenly offered to the brightness of the gaze, in a movement of appearance so simple, so immediate that it seems to be the natural consequence of a more highly developed experience. It is as if for the first time for thousands of years, doctors, free at last of theories and chimeras, agreed to approach the object of their experience with the purity of an unprejudiced gaze. (Foucault, 1973, p. 195)

The nurse’s feedback rests in this sense upon what is in effect a corporeal gaze, a way of looking, gaining access to what had previously been obscured. Once again, it is not the case that the teacher makes no use of a corporeal gaze; it is more the case that the teacher’s gaze roams over the classroom, at times stopping to rest on a


particular child, at times moving on. The nurse’s gaze and the teacher’s gaze constitute supporting structures for the offering of feedback. In the former case it is less likely to be with the goal of motivating the receiving part to further learning. In the case of the teacher–student relationship the feedback can have such a goal. Leaving the comparative interest in feedback in counselling and nursing and returning to generative feedback mechanisms and structures in the teacher–student relationship our focus is now upon some selected feedback models. Kluger and DeNisi (1996) note that several models have been proposed where feedback constitutes a central component: cybernetic theory (Annett, 1969; Podsakoff & Farh, 1989), goal-setting theory (Locke & Latham, 1990), multiple-cue probability learning paradigm (Balzer et al., 1989), social cognition theory (Bandura, 1991), and learned helplessness theory (Mikulincer, 1994). Common to the theories identified by these authors is the emphasis on learning and motivation, and Kluger and DeNisi’s theory of feedback similarly reveals such an interest. Their model was developed on the basis of a meta-analysis of 131 papers on feedback interventions. The papers were selected according to strict criteria (studies with groups who received feedback and control groups that did not, more than ten participants and size of impact could be measured) across the period 1905–1995. They found that 41% of 607 effect sizes10 were positive, and yet 38% had a negative effect upon performance. In the case of the former, they included findings that frequent feedback can enhance performance, that the effect is stronger for memory tasks than for procedural tasks, and that computer-mediated feedback is advantageous. In the case of the latter, they included praise, threats to self-esteem and orally delivered feedback, which is construed to be more biased than written feedback. 
The model of feedback intervention they proposed had three elements organised in a hierarchy: task learning, task motivation and meta-tasks including the process related to the self (self-esteem, control and impression management). The model was tested using moderator analyses and it showed that feedback effectiveness decreases as attention moves up the hierarchy closer to the self and away from the task. In the model of feedback proposed by Hartberg et al. (2012) the findings of evidence-based practice are questioned on the grounds that its postulates are thought to be universally valid across contexts. The result is an insensitivity to the context of the classroom and the moments of contingency that unexpectedly arrive. Instead, in that study we drew upon our experience as teachers, on our research on feedback in the classroom (Dobson & Søby, 2012; Eriksen et al., 2010) and on extensive experience over many years of holding professional development sessions for in-service teachers. As we have noted previously, in our model of feedback assessment acts take place in a complex and variable interaction between the feedback content, the suggested temporal span and the participatory roles involved. Let us take an example inspired by that model, the giving of advice and praise, and explore what kinds of generative mechanisms and structures are potentially in

10 Weighted mean (weighted according to sample size) (Kluger & DeNisi, 1996, p. 258).


play. It can be challenging for the teacher to provide good backward-directed mastery descriptions (e.g. praise) and future-oriented advice in various situations. With respect to the latter, consider the following point raised by Benjamin (1973, p. 86) in a famous essay on the function of the storyteller: ‘After all, counsel is less an answer to a question than a proposal concerning the continuation of a story which is just unfolding.’ The last phrase, ‘which is just unfolding’, underscores a point made in the original German text: ‘Rat ist ja minder Antwort auf eine Frage als ein Vorschlag, die Fortsetzung einer (eben sich abrollenden) Geschichte angehend’ (Benjamin, 1961, p. 413). ‘Which is just unfolding’ becomes ‘eben sich abrollenden’, meaning to ‘at any rate roll on’, or ‘at any rate occur’. It is present in the Norwegian translation of this phrase, ‘som i alle fall går sin gang’ (Benjamin, 1991, p. 182), which can refer to how the recipient’s own story, their life as experienced, will continue onwards, even if the feedback advice is not heeded. In other words, what Benjamin seems to say, according to the Norwegian and German versions, and which the English translation glosses as unfolding, is that the advice can be heeded or ignored, but irrespective, a person’s life will proceed onwards. The point we are making is that feedback as advice requires adopting a role that allows the recipient, as Winne and Butler (1994, p. 5740) suggest, to ‘confirm, add to, overwrite, tune, or restructure’ the feedback (see also Gamlem & Smith, 2013). It can also be refused. Timing is relevant in the sense that the advice has an open-ended character; it does not seek to close off the student’s options to do as they wish with the advice if it is revisited at some later date. The future time span can thus be short or stretch into the future.
It can also be related to the medium time span after the instructional unit of 2–6 weeks has been completed and some kind of test has been taken.11 As regards the content, if the feedback is to have a future effect, it must be considered pertinent by the student or it will be refused. In sum, advice weaves together roles, content and time to create a mechanism and structure of feedback that is future oriented and allows the student several options as to how to make use of the feedback.

11 Riggan and Oláh (2012, p. 3) argue that the interim assessment cycle has received relatively little coverage in formative assessment research: ‘Indeed, Herman et al. noted that few studies have examined the ways in which teachers “orchestrate” the range of assessment tools and practices available to them, that is, the way they link, integrate, or sequence them within their instruction and planning.’ See also Perie et al. (2009).

Praise might at first sight seem to be a universally powerful, positive mechanism in assessment for learning, but research has demonstrated that it can have negative consequences for the student. The classic review of research on praise has been published by Henderlong and Lepper (2002) and we will draw upon some of their arguments in what follows. They cite Faber and Mazlish who are sceptical about praise: ‘Children become very uncomfortable with praise that evaluates them. They push it away. Sometimes they’ll deliberately misbehave to prove you wrong’ (as cited in Henderlong & Lepper, 2002, p. 774). In addition, praise can lead to increased pressure to perform, reduce perceived autonomy and result in a lack of desire to take risks in the learning context. Praise

can undermine motivation if it is not based on academic performance, but instead on the student’s ability to play up to the teacher in competition with other students. In sum those sceptical about praise argue that it offers the teacher a subtle mechanism to control the student by making them more dependent (Sæverot, 2011). On the other hand, there are scholars who support a more traditional view: that praise from teachers causes increased intrinsic motivation to learn as students see that they themselves are responsible for their own performance. In the same tradition, we find researchers who say that praise has a positive effect on learning, but it depends on the presence of an external source of motivation. If the praise disappears, the effect on learning also disappears. Put another way, this form of praise is strongly linked to the social relationship between student and teacher. Students may not wish to let down the adult (Henderlong & Lepper, 2002, p. 778). The table below summarises the two opposing views (Table 3.4). The two authors identify five variables that can moderate the effects of praise as either positive or the opposite. Firstly, it must be perceived as sincere. Second, praise identifying ability is more effective for future intrinsic motivation than praise that emphasises effort (perceived cause of performance). Thirdly, students will be more motivated if they believe in their own abilities to act (self-efficacy), rather than remaining dependent on others (lack of autonomy). Fourth, intrinsic motivation is increased if students are given information about their own competence in solving a task, rather than solely a social comparison. Finally, praise that conveys realistic standards and expectations rather than too low or too high standards and

Table 3.4 Proposed mechanisms for beneficial and detrimental effects of praise

Beneficial mechanisms | Detrimental mechanisms
Boosts self-efficacy (Bandura, 1977, 1997) | Leads to inferences of low ability, when given for easy tasks (Meyer, 1992)
Enhances feelings of competence and autonomy (Deci & Ryan, 1985) | Over-justifies performance (Kohn, 1993; Lepper et al., 1973)
Creates positive feelings (Blumenfeld et al., 1982) | Creates pressure and highlights self-consciousness (Baumeister et al., 1990)
Strengthens association between response and positive outcomes (O’Leary & O’Leary, 1977) | Causes perceived locus of causality to shift from internal to external (Deci & Ryan, 1985)
Provides incentive for task engagement (Madsen et al., 1977) | Encourages stable ability attributions and contingent self-worth (Kamins & Dweck, 1999; Mueller & Dweck, 1998)
Encourages adaptive effort attributions (Henderlong, 2000; Mueller & Dweck, 1998) | Produces a purely instrumental focus (Birch et al., 1984)
Provides motivating information about normative excellence (Koestner et al., 1989) | Invites rejection of praise due to insincerity (Kanouse et al., 1981)
Helps children regulate task engagement (Schunk & Zimmerman, 1997) | Encourages invidious social comparison (Kohn, 1986)


expectations has a greater impact on inner motivation. The authors also highlight differences between boys and girls: it was directed almost exclusively toward the intellectual quality of work for boys. For girls, however, in addition to praise for intellectual substance, positive feedback was also directed at matters of form, such as neatness or following instructions. When negative feedback was given, it was directed toward the intellectual quality of work much more for girls than for boys, who were also criticized for messy papers and unruly behavior. (Henderlong & Lepper, 2002, p. 787)

However, Henderlong and Lepper’s argument has some weaknesses. They draw mainly on research from the US, and they are aware that they do not pay particular regard to cultural differences. In the United States praise is directed more toward talent than effort, but in Eastern cultures we see the opposite, where praise is more likely to emphasise effort and perseverance. This point mirrors the work of Dweck (1999) who researched the effect of praise for talent as opposed to effort, and her work has been seen by many to be universally valid across all cultures in a similar way. Those receiving praise for their talent are considered more likely to give up if they performed poorly in a test or task, unlike those praised for effort. We might note that in Norway, as presumably is the case in most countries, the national curriculum constitutes the basis for feedback, such that feedback is related less to effort and talent than to achieved competence in the subject.12 Henderlong and Lepper end their review of research on praise by stating the following: ‘rather than asking whether praise enhances intrinsic motivation, it is far more useful to ask about the conditions under which this is likely to occur’ (Henderlong & Lepper, 2002, p. 791). Re-phrasing in our terms, the conditions of praise refer to the manner in which roles are enacted (e.g. to enhance autonomy and belief in self-autonomy), the content of praise is selected and communicated (e.g. in a sincere manner) and lastly, the timing of praise (e.g. it can be timed to be given in the presence of other students, but it must not give rise to competitive social comparison between student peers). Simply put, the generative mechanisms and structures are just as likely to be culturally and historically set and mutable in different countries and contexts.

12 Historically, physical education was exempted in this respect in Norway because the teacher was allowed to openly consider effort as a foundation for grading and feedback, not simply achieved competence in the subject.

3.6 Closing Comments

This chapter has asked if assessment for learning, the term adopted in England, or its American counterpart formative assessment is a magic bullet to progress learning. As is to be expected there have been contrasting views and evidence has been sought from different sources. For example, drawing upon meta-analysis the ‘what works’ movement has argued that assessment for learning can generate important

learning gains. Others have argued against this from a moral point of view since ‘what works’ risks replacing normative professional judgement about what is desirable based upon accumulated teacher experience. Simply put, this is a debate between the is and the ought or, put differently, just because we have evidence does not immediately mean it should be adopted in every context to direct teaching and assessment acts. A key point in this chapter has been to explore these differences using a different approach, namely drawing upon the resources of critical realism and seeking to identify multiple generative mechanisms and structures supporting the practice of assessment for learning. Therefore we are not arguing that either the ‘what works’ or the ‘morally inclined profession’ approach is more appropriate in understanding whether and how assessment for learning might progress learning. We have proposed a number of different theories that can account for generative mechanisms and structures supporting assessment for learning, ranging over mastery, system, experience, socio-cultural and assessment for learning in subjects. We have also used this chapter to focus upon feedback as an assessment act which in many senses is an overarching generative mechanism and accompanying set of structures to be found in the theories explored above. The importance of understanding feedback reaches beyond the field of education. We used examples from two professions, nursing and counselling, but the points made are arguably equally applicable to most professions. This does raise an interesting point: what happens when we have AI chatbots or avatars that seek to meet our needs for feedback? Avatars are of course nothing new. Consider the following in Norway. High school students across the country take part in the so-called Russ celebration for the whole of their final year (Dobson et al., 2006). 
Lasting a year, it is longer than the comparable schoolies in Australia (Green & Bennett, 2020) or the prom in the USA (Tinson, 2016). Reaching a climax in the final month of school, Russ participants wear coloured overalls according to their subject discipline between May 1 (Labour Day) and May 17 (Constitution Day, commemorating independence from Denmark in 1814). According to tradition, students are meant to wear the overalls for 16 days without them being washed! When these are worn, the shyest of students are able to take on a different identity; let us call it an avatar, offering them the confidence to celebrate and undertake rite-of-passage dares: running across bridges naked, surprising the local city mayor with pranks and so on. In short, feedback in AI forms is likely to change continually in coming years, offering participants the opportunity to take on board, or alternatively offer, feedback with confidence and no inhibitions as they hide behind AI avatars. It will also offer us all the chance to reflect upon the need for mutual relationships of recognition and respect in feedback interactions when we, whether the teacher or a student who is a group member, are never sure if the ‘other’ is a real person, an AI chatbot, an avatar or some hybrid of all of these. Relationships of mutual recognition and respect were identified in the chapter as key aspects supporting and facilitating effective feedback, and they are now challenged precisely by the arrival of AI chatbots or avatars.


References Annett, J. (1969). Feedback and human behaviour. Penguin Books. Assessment Reform Group. (2002). Assessment for learning: 10 principles. Assessment Reform Group. http://www.uni koeln.de/hf/konstrukt/didaktik/benotung/assessment_basis.pdf. Accessed 12 Jan 2021. Balzer, W. K., Doherty, M. E., & O’Connor, R., Jr. (1989). Effects of cognitive feedback on performance. Psychological Bulletin, 106, 410–433. Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191–215. Bandura, A. (1991). Social cognitive theory of self regulation. Organizational Behavior and Human Decision Processes, 50(2), 248–287. Bandura, A. (1997). Self-efficacy: The exercise of control. Freeman. Baumeister, R. F., Hutton, D. G., & Cairns, K. J. (1990). Negative effects of praise on skilled performance. Basic and Applied Social Psychology, 11, 131–148. Benjamin, W. (1961). Illuminationen: Ausgewählte Schriften. Suhrkamp Verlag. Benjamin, W. (1973). The storyteller: Reflections on the works of Nikolai Leskov. In H. Zohn (Ed.), W. Benjamin, Illuminations (pp. 83–109). Fontana. Benjamin, W. (1991). Kunstverket i Reproduskjonsalderen og andre essays [Art in the age of reproduction and other essays]. Pax Forlag. Bennett, R. (2010). Cognitively based assessment of, for, and as learning (CBAL): A preliminary theory of action for summative and formative assessment. Measurement, 8, 70–91. Bennett, R. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy and Practice, 18(1), 5–25. Birch, L. L., Marlin, D. W., & Rotter, J. (1984). Eating as the “means” activity in a contingency: Effects on young children’s food preference. Child Development, 55, 431–439. Black, H. (1986). Assessment for learning. In D. L. Nuttall (Ed.), Assessing educational achievement (pp. 7–18). Falmer Press. Black, P., & Wiliam, D. (1998a). Assessment and classroom learning. 
Assessment in Education: Principles, Policy and Practice, 5(1), 7–74. Black, P., & Wiliam, D. (1998b). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139–148. Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5–31. Bloom, B. S. (1984). The 2-sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4–16. Bloom, S. (2006). Foreword. In T. Guskey (Ed.), Benjamin S. Bloom: Portraits of an educator (pp. iii–xix). Rowman and Littlefield Education. Blumenfeld, P. C., Pintrich, P. R., Meece, J., & Wessels, K. (1982). The formation and role of self perceptions of ability in elementary classrooms. The Elementary School Journal, 82, 401–420. Broadfoot, P. M., Daugherty, R., Gardner, J., Gipps, C. V., Harlen, W., & James, M. (1999). Assessment for learning: Beyond the black box. University of Cambridge School of Education. Chvala, L., & Graedler, A. (2010). Assessment in English. In S. Dobson & R. Engh (Eds.), Vurdering for læring i fag [Assessment for learning in subjects] (pp. 75–89). Høyskoleforlaget. Conroy, S., & Dobson, S. (2005). Mood and narrative entwinement: Some implications for educational practice. Journal of Qualitative Health Research, 15(7), 975–990. Dale, E. L. (2009). Fellesskolens utfording [The challenge for state schooling]. Bedre Skolen, 1, 75–80. Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self determination in human behavior. Plenum Press. Department for Education, & Gibb, N. (2011, December 14). End for GCSE modules and spelling, punctuation and grammar marks restored to exams [Media release]. https://www.gov.uk/government/news/end-for-gcse-modules-and-spelling-punctuation-and-grammar-marks-restored-to-exams. Accessed 11 Jan 2021.
Dobson, S., & Engh, R. (Eds.). (2010). Vurdering for læring i fag [Assessment for learning in subjects]. Høgskole Cappelen Damm. Dobson, S., & Haaland, Ø. (1993). Vygotskian perspectives on ethnicity: From science to narrative. Oppland distriktshøgskole. Dobson, S., & Søby, K. (2012). Vurdering som uttrykk for faglig forståelse og anerkjennelse. En nærstudie av feedback knyttet til elever med svake læreforutsetninger [Assessment as an expression of subject understanding and recognition: A study of feedback to students with weak learning skills]. In T. Nordahl (Ed.), Bedre læring for alle elever [Improved learning for all pupils] (pp. 134–149). Gyldendal. Dobson, S., Brudall, R., & Tobiassen, H. (2006). Courting risk – The attempt to understand youth cultures. Young: Nordic Journal of Youth Studies, 14, 49–59. Dobson, S., Engh, R., Engvik, G., Hartberg, E., Gemlem, S., & Tellefsen, H. (2012). Teoretisk bakgrunnsdokument for arbeid med implementering av vurdering for læring på ungdomstrinnet [Theoretical background paper for work on implementing assessment for learning in the lower secondary school]. Directorate of Education and Training. Dweck, C. (1999). Self-theories: Their role in motivation, personality, and development. Psychology Press. Eisner, E. (1998). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. Merrill. Elias, N. (1986). Introduction. In N. Elias & E. Dunning (Eds.), Quest for excitement: Sport and leisure in the civilizing process (pp. 3–43). Basil Blackwell. Engh, R. (2011). Vurdering for læring i skolen: På vei mot en bærekraftig vurderingskultur [Assessment for learning in the school: Moving toward a sustainable assessment culture]. Høyskoleforlaget. Eriksen, S., Sand, S., Dobson, S., & Nes, N. (2010). Elevvurdering og tilpasset opplæring [Pupil assessment and adapted education].
Hedmark University College. ETS (Educational Testing Service). (2009). Research rationale for the Keeping Learning on Track® program. http://www.ets.org/Media/Campaign/12652/rsc/pdf/KLT-Resource-Rationale.pdf. Accessed 17 Dec 2010. ETS (Educational Testing Service). (2010). About the KLT program. http://www.ets.org/Media/Campaign/12652/about.html. Accessed 17 Dec 2010. European Commission. (2004). Europe needs more scientists. The Commission. European Commission. (2007). Science education NOW: A renewed pedagogy for the future of Europe. The Commission. Foucault, M. (1973). The birth of the clinic: An archaeology of medical perception. Tavistock. Gamlem, S., & Smith, K. (2013). Student perceptions of classroom feedback. Assessment in Education: Principles, Policy and Practice, 20(2), 150–169. Green, B., & Bennett, A. (2020, September 7). No festivals, no schoolies: Young people are missing out on vital rites of passage during COVID. The Conversation. https://theconversation.com/no-festivals-no-schoolies-young-people-are-missing-out-on-vital-rites-of-passage-during-covid-145097. Accessed 23 Jan 2021. Hartberg, E., Dobson, S., & Gran, L. (2012). Feedback i skolen (Feedback in the school). Gyldendal. Hattie, J. (2007, August 28–September 1). Developing potentials for learning: Evidence, assessment, and progress [Plenum lecture]. In Earli 2007: 12th Biennial Conference for Research on Learning and Instruction, Budapest, Hungary. http://www.slideserve.com/isi/developing-potentials-for-learning-evidence-assessment-and-progress. Accessed 9 Sep 2012. Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge. Hattie, J. (2011). Maximizing impact on learning. Routledge. Henderlong, J. (2000). Beneficial and detrimental effects of praise on children’s motivation: Performance versus person feedback. Unpublished doctoral dissertation, Stanford University.


Henderlong, J., & Lepper, M. (2002). The effects of praise on children’s intrinsic motivation: A review and synthesis. Psychological Bulletin, 128(5), 774–795. Higgins, S., Kokotsaki, D., & Coe, R. (2011). Toolkit of strategies to improve learning: Summary for schools spending the student premium. Centre for Evaluation and Monitoring, Durham University. Kamins, M. L., & Dweck, C. S. (1999). Person versus process praise and criticism: Implications for contingent self-worth and coping. Developmental Psychology, 35, 835–847. Kanouse, D. E., Gumpert, P., & Canavan-Gumpert, D. (1981). The semantics of praise. In J. H. Harvey, W. Ickes, & R. F. Kidd (Eds.), New directions in attribution research (Vol. 3, pp. 97–115). Erlbaum. Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119, 254–284. Koestner, R., Zuckerman, M., & Koestner, J. (1989). Attributional focus of praise and children’s intrinsic motivation: The moderating role of gender. Personality and Social Psychology Bulletin, 15, 61–72. Kohn, A. (1986). No contest: The case against competition. Houghton Mifflin. Kohn, A. (1993). Punished by rewards: The trouble with gold stars, incentive plans, A’s, praise, and other bribes. Houghton Mifflin. Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press. Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children’s intrinsic motivation with extrinsic reward: A test of the “overjustification” hypothesis. Journal of Personality and Social Psychology, 28, 129–137. Locke, E. A., & Latham, G. P. (1990). A theory of goal setting and task performance. Prentice Hall. Luhmann, N. (1995). Social systems. Stanford University Press. Madsen, C. H., Becker, W. C., & Thomas, D. R. (1977). Rules, praise, and ignoring: Elements of elementary classroom control. In K. D. 
O’Leary & S. G. O’Leary (Eds.), Classroom management: The successful use of behavior modification (2nd ed., pp. 63–84). Pergamon Press. Mansell, W. (2011). The assessment reform group: 21 years of investigation, argument and influence. Cambridge Assessment Network. Meier, S., & Davis, S. (2009). The elements of counselling. Cengage Learning. Meyer, W.-U. (1992). Paradoxical effects of praise and criticism on perceived ability. In W. Strobe & M. Hewstone (Eds.), European review of social psychology (Vol. 3, pp. 259–283). Wiley. Mikulincer, M. (1994). Human learned helplessness: A coping perspective. Plenum Press. Mueller, C. M., & Dweck, C. S. (1998). Praise for intelligence can undermine children’s motivation and performance. Journal of Personality and Social Psychology, 75, 33–52. National Research Council. (2007). Taking science to school: Learning and teaching science in grades K–8. National Academies Press. Nelson, J., & O’Beirne, C. (2014). Using evidence in the classroom: What works and why? National Foundation for Educational Research. Nusche, D., Earl, L., Maxwell, W., & Shewbridge, C. (2011). OECD reviews of evaluation and assessment in education: Norway. OECD. O’Leary, K. D., & O’Leary, S. G. (1977). Classroom management: The successful use of behavior modification (2nd ed.). Pergamon Press. Perie, M., Marion, S., & Gong, B. (2009). Moving toward a comprehensive assessment system: A framework for considering interim assessments. Educational Measurement: Issues and Practice, 28, 5–13. Podsakoff, P. M., & Farh, J. H. (1989). Effects of feedback sign and credibility on goal setting and task performance. Organizational Behavior and Human Decision Processes, 44, 45–67. Pryor, J., & Crossouard, B. (2008). A socio-cultural theorisation of formative assessment. Oxford Review of Education, 34(1), 1–20. Ramaprasad, A. (1983). On the definition of feedback. Behavioural Science, 28(1), 4–13.


Renzulli, J. (2016). The three-ring conception of giftedness: A developmental model for promoting creative productivity. In S. Reis (Ed.), Reflections on gifted education (pp. 55–86). Prufrock Press. Riggan, M., & Oláh, L. (2012). Locating interim assessments within teachers’ assessment practice. Educational Assessment, 16(1), 1–14. Rodeiro, C., & Nádas, R. (2012). Effects of modularity, certification session and re-sits on examination performance. Assessment in Education: Principles, Policy and Practice, 19(4), 411–430. Rousseau, J.-J. (2011). Discourse on the origin and foundations of inequality among men. Bedford/St Martins. (Original work published 1754). Rowland, T., Turner, F., Thwaites, A., & Huckstep, P. (2009). Developing primary mathematics teaching: Reflecting on practice with the knowledge quartet. Sage. Sæverot, H. (2011). Praising otherwise. Journal of Philosophy of Education, 45(3), 455–473. Schunk, D. H., & Zimmerman, B. J. (1997). Social origins of selfregulatory competence. Educational Psychologist, 32, 195–208. Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M. Scriven (Eds.), Perspectives of curriculum evaluation (pp. 39–83). Rand McNally. Searle, J. (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press. Shute, V. (2007). Focus on formative feedback. Education Testing Service. Siemens, G. (2005). Connectivism: A learning theory for the digital age. International Journal of Instructional Technology and Distance Learning, 2(1). http://itdl.org/Journal/Jan_05/article01.htm. Simensen, A. (2007). Teaching a foreign language: Principles and procedures (2nd ed.). Fagbokforlaget. Skogvoll, V., & Dobson, S. (2011). Connoisseuren – Med blikket for øyeblikket: Et kroppsfenomenologisk essay om det profesjonelle [Connoisseur – Awareness of the moment: A corporeal phenomenological essay about professionality]. In Ø. Haaland, S. Dobson, & G.
Haugsbakk (Eds.), Pedagogikk for en ny tid [Education for a new time] (pp.161–173). Oplandske. Slavin, R. (1987). Mastery learning reconsidered. Review of Educational Research, 57(2), 175–213. Steinsholt, K. (2009). Enhver beslutning må være avsindig: Noe om det ubestemmelige grunnlaget for pedagogiske beslutninger [Any decision must be insane: Something about the indeterminate basis for educational decisions]. In K. Steinsholt & S. Dobson (Eds.), Verden satt ut av spill: Postmoderne pedagogiske perspektiver [The world dislocated: Postmodern perspectives in education] (pp. 11–32). Fabokforlaget. Stobart, G. (2008). Testing times: The uses and abuses of assessment. Routledge. Swaffield, S. (2011). Getting to the heart of authentic assessment for learning. Assessment in Education: Principles, Policy and Practice, 18(4), 433–449. Tinson, J. (2016, June 28). How British teenagers are making American high-school proms their own. The Conversation. https://theconversation.com/how-british-teenagers-are-making- american-high-school-proms-their-own-61527. Accessed 21 Jan 2021. Venville, G., & Oliver, M. (2015). The impact of a cognitive acceleration programme in science on students in an academically selective high school. Thinking Skills and Creativity, 15, 48–60. Wenger, E. (2010). Communities of practice and social learning systems: The career of a concept. In C. Blackmore (Ed.), Social learning systems and communities of practice (pp. 179–198). Springer. Wiener, N. (1948). Cybernetics, or control and communication in the animal and the machine. Wiley. Wiliam, D. (2011). What is assessment for learning? Studies in Educational Evaluation, 37, 3–14. Wilson, D. (2012). Underveisvurdering som profesjonalisering av spesialundervisning. [Formative assessment as the professionalisation of special education]. In R. Hausstätter (Ed.), Inkluderende spesialundervisning [Inclusive special education] (pp. 83–95). Fagbokforlaget. Winne, P. H., & Butler, D. (1994). 
Student cognitive processing and learning. In T. Husen & T. Postlethwaite (Eds.), The international encyclopaedia of education (pp. 5739–5745). Pergamon.

Chapter 4

Motivation, Learning and Assessment

[W]hen students are told they’ll need to know something for a test they are likely to come to view that task (or book or idea) as a chore. (Kohn 1999).

Abstract Student assessment can ‘make’ a student in the sense of enhancing their sense of self-esteem and motivation, or it can do the opposite, crushing their sense of self-esteem and motivation. This chapter explores the generative mechanisms and structures at work in the tripartite relationship between motivation and learning; the self, understood to be stratified by conscious, unconscious and affective forces; and, lastly, social practices of student assessment. The chapter considers grading as the main example, in particular grading from the standpoint of the individual student, the practice of the examiner/teacher and, lastly, the grading of students who have worked in groups. A pertinent question in this respect is: do students stand to gain as much in learning and outcomes from individual as opposed to group grading? The latter is understood to refer to students working on group pieces of work or projects. It might be anticipated that in grading groups the individual student is at the mercy of the motivation and performance of other group members and that this influences the final grade, especially if it is a group grade.

In Norway every child can expect to receive a diploma for achievement in sport. Elaboration is required: in sport before the age of 12 no single competitor or team is allowed to receive a trophy for first, second or third place. Instead, it is common that all participants in a competition will receive the same certificate or medal to mark achievement. For sure, some still win and some still lose, and this counts as evidence of an act of assessment. But the difference between winner and loser is reduced and it is not an assessment criterion of importance in this context. None of the children experience exclusion. On the contrary, all are included, irrespective of their level of skill and performance.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_4


This is very different from one of the authors’ (Dobson’s) memory of playing sport from an early age in England in the 1970s. It felt that I was one of the chosen few because I did not spend much time sitting on the bench in the primary school football team. Some of my peers quickly became accustomed to the role of ‘bench warmer’ and this could find its echo in non-sport activities. They were destined to become spectators in much of their learning; and their experience of assessment and ranking confirmed again and again their designated position.

However, it would be naive to conclude that Norway is solely concerned with inclusionary forms of assessment and England the opposite. Grading practices in all subjects, not allowed by law in Norwegian primary school, ‘kick in’ on entry to lower secondary school and, like their peers in England, students become preoccupied with performance or lack of performance measured in grades.

What assessment practices, specifically grading, might mean for the student is the topic of this chapter. Conceptually, adopting a critical realist stance, we are concerned with the mechanisms and structures that both generate and support the tripartite relationship between motivation and learning, the stratified self and, lastly, student assessment practices. The guiding assumption in what follows is that these mechanisms and structures (the real) are not always observable or active (the actual) in giving rise to assessment acts and student performance. As contended in the chapter on critical realism, the self can be understood to be a dispositional entity laminated or structured by conscious, unconscious and affective forces. These elements of the self in combination give rise to a sense of self characterised by self-esteem, self-belief, sense of inclusion or other forms of self-experience.

Why have we identified this particular tripartite relationship as the guiding vehicle in this chapter? In Chap. 3, while theorising assessment for learning, we identified the importance of motivation, and in this chapter we seek to further explore some of its multifaceted mechanisms and structures and how it is related to learning and assessment. But a discussion of assessment practices and motivation and learning requires an additional focal point around which it can spin. The self as student is one such point, as students constitute themselves as active carriers and at times passive recipients of assessment acts. We might have chosen instead other points identified in previous chapters, such as curriculum and big ideas, a systematic focus where education policy is central, the socio-cultural aspect of learning or the learning experience itself, including moments of contingency and the role of the connoisseur. The connoisseur will be the topic of Chap. 5. With these points in mind, Fig. 4.1 illustrates the relationships that are the focus of this chapter.

4.1 Theories of Motivation and Learning

Some researchers have regarded motivation as the driving engine in learning (Harlen, 2006; Stiggins, 2001). It is not uncommon to view it as an outcome of learning and not merely an input factor. However, it is important to note that not all learning requires motivation; it is possible to learn in an unconscious manner, when working or playing for example, when learning is not necessarily the primary goal (Enerstvedt, 1986).

Fig. 4.1 Self, assessment practices, and motivation and learning (diagram relating three elements: Self, Assessment practices, and Motivation and learning)

How might motivation be defined? One possibility would be to define motivation as ‘the conditions and processes that account for the arousal, direction, magnitude, and maintenance of effort’ (Katzell and Thompson, as cited in Harlen, 2006, p. 61). Other definitions include: ‘motivation is the process whereby goal-directed activity is instigated and sustained’ (Schunk et al., 2010, p. 4) or ‘the enjoyment of school learning characterised by a mastery orientation; curiosity; persistence; task-endogeny; and the learning of challenging, difficult, and novel tasks’ (Gottfried, 1990, p. 525). While these definitions might additionally include reference to learning and non-cognitive perceptive experience, attitudes and beliefs, as well as incorporating explanatory mechanisms and structures, they run the risk of over-emphasising a timeless, context-free understanding of motivation. In accordance with critical realism, knowledge is, however, understood to be historically specific and modifiable; it is transitive and emergent. The definitions of motivation and the associated generative mechanisms and structures have not only changed through time, but have different degrees of relevance according to the context where differing mechanisms and structures may be actualised. For example, motivation for a military recruit in a time of combat might be different to motivation in an after-school activity or in a classroom preparing for final exams. With these points in mind, we will review definitions of motivation to highlight how they have evolved in the last 100 or so years. Put differently, the social construction of the construct of motivation, including theories about generative mechanisms and structures of motivation and its connection with learning, must also be connected with the generative mechanisms and structures supporting its production as epistemology.
In particular, this entails exploring what kinds of questions researchers have posed to suit different contexts and the manner in which they have been answered from different conceptual positions: volition/affectual,1 psycho-dynamic,2 instinctual-genetic,3 behaviourist,4 cognitive5 or socio-cultural positions.6

1 Will can be defined as desire or want, and volition as the act of using the will.
2 For example, some kind of Freudian unconscious drive or a Lacanian desire for the other’s recognition. For the latter, see Lacan (1977).
3 As an example of the instinctual consider Maslow (1943).
4 For example, Skinner (1974) on reinforcements.
5 For example, focusing on intrinsic and extrinsic motivation.
6 ‘Regarding motivational issues, the situative perspective emphasizes ways that social practices are organized to encourage and support engaged participation by members of communities and that are understood by individuals to support the continuing development of their personal identities’ (Greeno and the Middle School Mathematics Through Application Project, as cited in Hickey & Zuiker, 2005, p. 283, emphasis added). Thus, a curriculum goal might function to support engaged participation and hence motivation.

From the early to mid-twentieth century the dominant perspective on motivation and learning focused upon extrinsic generative mechanisms and structures. Skinner’s (1974) work on reinforcers is a case in point. Positive reinforcers utilise rewards, as opposed to negative reinforcers, where the probability of an action is based upon removing negative external stimuli. Good grades from this perspective constitute a reward and a positive reinforcer, while poor grades are a punishment and a negative reinforcer which the student wishes to avoid in the next act of assessment. A drawback of this kind of behaviourist perspective is that the students over time might lose interest in the rewards or cease to be motivated to avoid poor grades. As an alternative to this approach, cognitive behaviour modification proposed that the student should be given the responsibility to monitor and control their own goals, meta-cognitive strategies and rewards. In behaviour modification the key mechanism at work entails the student being given the responsibility for their own reinforcement contingencies, and behaviour is ultimately controlled by the consequences of performance. The drawback of this approach, which seeks to use cognition to change behaviour, is that the students might select the most lenient and least demanding task when given the option. They might even cheat or might not be mature enough to take responsibility for their own motivation.

According to Lai (2011, pp. 32–34), narrative cognitive approaches assumed a more dominant position from the 1960s onwards and asserted that cognition directly affects behaviour, with the structuring role of consequences playing a less critical role. Put simply, reinforcement history might not be as important as future-oriented reinforcement mediated by verbal persuasion (Stipek, 1996).

In the cognitive approach to motivation, where the generative mechanisms and enabling structures are connected with cognitive processes, it is possible to follow the lead of Broussard and Garrison (2004), who pose three questions:

• Can I do this task?
• Do I want to do this task and why?
• What do I have to do to succeed in this task?

The first question has given rise, among other things, to theories of self-efficacy, attribution and self-esteem. According to Bandura (1982, p. 122), a student’s self-efficacy, defined as ‘judgments of how well one can execute courses of action required to deal with prospective situations’, will exert an influence on motivation. The generative mechanism at play is the student’s level of self-belief and how it intrinsically motivates, or if lacking does not motivate, the student in the act of setting goals, persistence and effort. It is not a general, context-independent level of motivation that the student possesses. On the contrary, motivation is connected with specific tasks in specific contexts; hence the term ‘execute courses of action’. Bandura bridges generative mechanisms of self-efficacy with outcome expectations. Four permutations are possible. High self-efficacy and high outcome expectation will lead to students showing cognitive engagement and persistence. High self-efficacy might co-exist with low outcome expectation, as when the student is willing to work hard but believes the teacher will not grade in a fair manner. Those demonstrating low self-efficacy and low outcome expectation will be unwilling to invest energy in the task and will show resignation, apathy and withdrawal as a strategy. Lastly, there are students who have low self-efficacy but believe that the classroom is characterised by high outcome expectations, as when the teacher is willing to give good grades despite mistakes.

Another answer to the question ‘Can I do the task?’ is found in theories of control. These posit that if students are in control of their own successes and failures, they will control their own motivation. Control theory is related to attribution theory, which highlights causes of success and failure in such things as ability, effort, task and luck. Those believing in their attributed ‘ability’ are less likely to increase their motivation on the basis of poor performance, as compared with a student with a high belief in the attribution of ‘effort’.

Self-esteem has also been identified as crucial to whether a person believes they can do the task. It is connected with emotional or affectual views of the self and connects with the critical realist view that the self is stratified by affectual forces. It is not to be confused with self-efficacy, which is to do with competence in tasks. Self-esteem is more a sense of pride in one’s own accomplishments or a positive self-image. It is more diffuse and less connected to particular tasks. To fail at one task may not necessarily impact on self-esteem, since it may be connected to different domains.
Self-esteem is, however, connected with motivation to the extent that the student attributes causes (ability, effort, task and luck) to performance in specific domains, and on the basis of this the student becomes motivated to improve their self-esteem. Self-esteem is also connected with motivation in another way, when it is remembered that it is to do with affect. This directs attention to self-esteem as a sign of emotional wellbeing, recognising that different activities or their outcomes can set in motion different emotional mechanisms such as joy or disappointment. These emotions are capable of motivating in a positive or negative sense. Some researchers (e.g. Pekrun et al., 2007) have proposed a taxonomy of the emotions, as reproduced in Table 4.1. Applying this table to the context of assessment, if the student is contemplating the outcome of an exam they might feel hope if they believe they will be successful and anxiety if they hold the opposite belief. In both cases emotions are activated, but in contrasting ways with respect to motivation. With respect to the activity itself, consider the example of the student who might find the exam enjoyable as the adrenaline pumps. Alternatively, the student might experience the exam as an uninteresting activity and the emotion of boredom is evoked, along with a lack of engagement. While enjoyment activates motivation and draws the student into the activity, boredom does the opposite. Some of these emotions reach beyond merely cognitive generative mechanisms and structures. They impact upon the existential element of the self.

Table 4.1 A three-dimensional taxonomy of achievement emotions

Object focus | Positive (pleasant emotion): activating | Positive (pleasant emotion): deactivating | Negative (unpleasant emotion): activating | Negative (unpleasant emotion): deactivating
Activity focus | Enjoyment | Relaxation | Anger, Frustration | Boredom
Outcome focus | Joy, Hope, Pride, Gratitude | Contentment, Relief | Anxiety, Shame, Anger | Sadness, Disappointment, Hopelessness

Source: (Pekrun et al., 2007, p. 16)

This is one of
the occasions in this book when we have cause to note that existential mechanisms and structures are actualised and the stratified self gains a new level of meaning: the existential. Emotions can offer a ‘disclosive submission to the world, out of which we can encounter something that matters to us’ (Heidegger 1927/1962, p. 176). This means that the emotions of boredom or joy, to mention two randomly chosen emotions, may have more than a fleeting significance.7 Heidegger provides the example of real boredom. It is not reading a book that one finds not totally enthralling. When really bored we are not just bored with a particular thing or activity, rather we drift ‘hither and thither in the abysses of existence’. An example is the student about to drop out of high school not because a single subject is too hard, but because they are bored and fail to be motivated by any of their school subjects or their extra-curricular school pursuits. Heidegger also suggests that the sense of dread, which in our context might refer to the experience of assessment anxiety, can be a general feeling of the uncanny (unheimlich) – not being at home with any particular thing – as the what-is refuses to remain and disappears. As it disappears we become aware of the totality surrounding and including this what-is (Heidegger, 1949, p. 336). In his lecture series of 1929–30 Heidegger described this profound boredom in detail, the langeweile of the long while or long wait, and in so doing the mood of profound boredom joins his other descriptions of emotions, such as anxiety and the uncanny, where all display this sense of homelessness (Heidegger, 1995, p. 5). For him homelessness – inspired by Novalis – is the restless urge to be at home everywhere.8 On the flip side, to be at home everywhere might also mean not to be at home in any particular place; failing to live and engage with matters close at hand. 
The risk is to live where you are not, imagining and dreaming, rather than where you are. One of the authors (Dobson), on commencing employment at a New Zealand university, was called konene by a colleague. This is the Māori word for nomad or drifter. Dobson took this to mean that he was a migrant who would not remain long, as he had previously lived in different countries, such as Zambia, England, Norway and Australia. Despite countering the connotations of nomadic drifter with the view in his own mind that he had lived in Norway for over two decades, he still felt there might be some sneaking kernel of truth that he was existentially a nomadic drifter who sampled rather than remained, and was superficial rather than deep. The colleague who had used this term explained a few years later that it had a positive connotation in a part of the world characterised by deep relationships with and between Pacific islands. Historically there has been a long tradition of seafaring, where konene could be understood to denote the ‘human’ seaweed, bringing nutrients to the land, washed up on the shore and then retreating to the sea to replenish itself. In short, this story is illustrative of a key Heideggerian point and one aligned with all existentialist philosophers: it depends on how experience is lived and valued, and this can change. In our context, assessment might be lived as a burden and source of dread. And even though these existential emotions can be enduring they can change, as when assessment becomes a continuing source of elation and achievement. Heidegger mentions joy as another mood able to reveal a person’s totality of Being as it stretches beyond the what-is of scientific reasoning. And here it is important to note the role played by emotions, called feelings in the following:

Our ‘feelings’, as we call them, are not just the fleeting concomitant of our mental or volitional behaviour, nor are they simply the cause and occasion of such behaviour, nor yet a state that is merely ‘there’ and in which we come to some kind of understanding with ourselves. Yet, at the very moment when our moods thus bring us face to face with what-is-in-totality ... (Heidegger, 1949, p. 334).

7 In the next few paragraphs we closely follow the argument made by Dobson (2004).
8 Trieb überall zu Hause zu sein (the wish to be everywhere at home) (Novalis, as cited in Carlyle, 2010).

What we are arguing is that existential mechanisms and structures can be actualised such that existential experiences become a gateway to motivation, leading us to act in particular ways. As a result, the pertinence of the existential self is made evident. As a corollary, self-esteem and motivation are not merely connected with the affectual part of the stratified self, but also with what must be envisaged as existential components of this stratified self and how they are capable of influencing or in some cases defining how we live the Being of being with a reflective awareness and emotion.

Returning to the presentation of the theories of motivation and learning, the second question that can be posed is, ‘Do I want to do this task and why?’ In answering this question we will make reference to two commonly utilised theories: expectancy value theories and goal achievement theories. In the expectancy value theory of Eccles and Wigfield (2002) expectancy refers to a construct of the probability of success and value refers to a construct of incentive values. The expectancy construct covers much of the ground already covered in the previous question, that is, does the student believe that they will be able to do the assessment task and if so what role will be played by past achievement, task choice and persistence? The authors make reference to Bandura’s self-efficacy theory. Value refers to the values individuals hold about participating actively in the activity. Eccles and Wigfield identify four types of value:
• attainment value: the personal value of doing well in an assessment act and how this supports the student’s competence in various domains or their sense of self, such as masculinity or femininity (or gender preference, in today’s terms)

• enjoyment value of undertaking an assessment act, for example experiencing a sense of total engagement, where self-consciousness is lacking, action is merged with awareness and the attention is focused on a limited field of stimulus (Csikszentmihalyi, 1990)
• utility value when the assessment task may not be of interest in itself, but it carries value for future career or study goals
• cost value, which is ‘conceptualised in terms of the negative aspects of engaging in the task, such as performance anxiety and fear of both failure and success, as well as the amount of effort needed to succeed and the lost opportunities that result from making one choice rather than another’ (2002, p. 120).

The authors found that the expectancy construct in students influences the value construct, so that increased belief in competence in the course of a school term, for example, would increase the value construct. They also found that performance expectancies predicted performance in mathematics and English, while values motivated students to enrol in these subjects, as well as physics. Lastly, they found that young children initially held expectancy of competence in activities separately from values. With time they attached more value to their competence and to their expectancy of competence in activities they can do well. Eccles and Wigfield argue that it is important to note that the values of the individual student constitute a general set of stable beliefs about what is desirable and draw upon societal norms, as well as the individual’s needs and sense of self. They also note that the basic mechanism at work in the expectancy value model is that students are rational in their choices leading to motivation and behaviour. Of course, it is questionable whether students are always so rational in their choices.
For this reason they have expanded their model to include affective memories, cultural stereotypes and identity processes, such as long- and short-term goals and the formation of personal and social identities.

Another approach to the question ‘Do I want to do this task and why?’ explores the relevance of goal orientation theories. Research in this tradition has considered how the student can be motivated by self-mastery or alternatively by performance related to others or in the eyes of others.

Table 4.2 Different items used to assess mastery and performance goal orientation

Researcher | Mastery-type goal and sample items | Performance-type goal and sample items
Dweck | Learning goal: I like problems that I can learn something from | Performance goal: I like problems that are hard enough to show that I’m smart
Ames | Mastery goal: I work hard to learn; Making mistakes is part of learning | Performance goal: I work hard to get high grades
Midgley et al. | Task-focused: I like schoolwork that I’ll learn from, even if I make a lot of mistakes | Performance-approach: I like to show teachers I am smarter than other students
Nicholls | Task orientation: I feel successful when something I learn makes me want to find out more | Ego orientation: I feel successful when I’m the smartest

Source: adapted from (Schunk et al., 2010, p. 185)

Table 4.2 presents different versions of this dichotomy where intrinsic generative mechanisms and structures of motivation
correspond to the mastery orientation and the extrinsic generative mechanisms and structures of motivation correspond to the performance orientation. Mastery goals are associated with perceived ability and the belief that it can be improved through effort, whereas performance goals are associated with external grades or external rewards such as teacher praise (Anderman et al., 2011). The ego orientation is found in students concerned with impression management, where the chosen strategy might be to choose tasks they know they can complete. Task orientation betrays a focus on mastering the task and competence. This simple mastery–performance dichotomy can be enlarged to incorporate a) not merely performance to outcompete peers, but also performance to avoid failure and looking incompetent, and b) social goals, such as a student being motivated to show that they are dependable and socially responsible and to keep commitments, make friends, have fun or seek approval from others. Moreover, the pursuit of social responsibility goals is also seen to be a predictor of higher subject mastery.

The last question that might be posed concerning motivation and learning is: ‘What do I have to do to succeed in this task?’ Two kinds of theories have been proposed in this connection. The first is to do with volition, which considers how motivation might lead to the decision to act and move toward the goal; but it is also volition understood as the will, which supports the execution of the act. Corno’s (1993, p. 16) conception is appropriate: ‘the term “volition” refers to both the strength of will needed to complete a task, and the diligence of pursuit’. Volition is a kind of management and control mechanism where the strength of will integrates discipline, self-direction and resourcefulness. Nietzsche’s term ‘will to power’ sums up the vitalism from which volition draws its energy.
A person might want to experience an increasing feeling of control over their actions – an enhanced feeling of the will to power and control. Nietzsche sees no dichotomy between the actors, be they student or teacher in our context, and power. All actions and actors are imbued with power. It is lived, experienced, relational and not reified.9 This feeling of power might also be an unconscious force, as it is not always experienced in a conscious manner. In this respect Nietzsche was anticipating psychodynamic perspectives. He conceptualised power in the following manner: ‘The feeling of power … the will to grow stronger of every centre of force – not self-preservation, but the will to appropriate, dominate, increase, grow stronger’ (Nietzsche, 1968, pp. 366–367). Volition is sometimes enlisted as one component of self-regulation theories of motivation. This leads to the second theory often cited in connection with the question about what has to be done to succeed in a task. It focuses upon how the student can strengthen their motivation by adopting self-regulatory strategies (e.g. adopting assessment as learning, as described in Chap. 3). These involve meta-cognition and three specific mechanisms: self-observation, self-judgement and self-reaction. Self-observation can lead to improved motivation because the student reflects and acts upon this knowledge. Self-judgement refers to the work required in comparing

9 Simply put, in this conception it is not the case that you own or possess the thing power (reified); ‘you are power.’

current performance to the desired goal and making adjustments to motivational efforts required to reach the goal. Lastly, self-reaction is the behavioural, cognitive and affectual responses to the self-judgement. Self-reaction can improve motivation if the student believes they are making progress. The challenge with self-regulation is that its success rests upon the students being interested and mature enough to undertake self-regulation. They must also develop a sense of self-efficacy (Pintrich & DeGroot, 1990). This may not always be the case, especially among less able students or those who are easily distracted.

The three questions posed above (Can I do this task? Do I want to do this task and why? What do I have to do to succeed in this task?) are directed toward motivation and learning theories concerned with the cognitive approach. It is also necessary to consider how motivation is connected with socio-cultural generative mechanisms and structures. We are thinking in particular of the manner in which significant others, such as peers, teachers or parents, can provide motivation to the individual through, among other things, praise, teaching and communicating expectations. As noted in Chap. 3, praise connected with feedback can also inhibit motivation if students become dependent upon it and become unable to self-activate or self-regulate. Rosenthal and Jacobson (1968) conducted a study in which teachers were told that certain students would be intellectual bloomers on the basis of results from a non-verbal intelligence test. The bloomers were selected at random and bore no relation to actual performance in the test. Within a year, however, those designated bloomers demonstrated greater progress than the control group who had not received this designation. Their findings, commonly known as the Pygmalion effect, lend support to the view that teacher expectations can motivate students.
With respect to generative mechanisms at work, one explanation is that the students identify with and seek to confirm the expectations teachers communicate in the course of teaching. Another is that teachers might offer more feedback time to the bloomers and question them more extensively. A third explanation is to be found in Vygotsky’s (Vygotsky, 1978, p. 86) classic formulation of the zone of proximal development: ‘the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance, or in collaboration with more capable peers’. This zone and the interaction between adult and student, or between peers as they assess and provide feedback to each other, makes an allowance for one party adjusting their feedback and teaching to the needs and expectations of the other, and at the same time stretching the student toward new levels of mastery and learning. The socio-cultural perspective on motivation additionally reaches beyond the relationship between the teacher or parent and the child and between peers. It also includes the communities of practice (Wenger, 1998), for example the classroom or the group piece of work in which the learning is situated (Lave & Chaiklin, 1993). These social collectives are the source of social motives for learning, as expressed through the norms established in and through the shared practices of the group and participants’ motivation to adhere to them. In the terminology of this book the social collectives are inspired by Wittgenstein’s term forms of life denoting where and

how life is lived. They constitute the in-between space of meaning that can be both explicit and more implicit or tacit (Amble, 2022). The latter might be the experience of break time in a school or morning tea. Staff might congregate around the coffee machine at work. In these interactions, we are socially motivated to share talk with each other, chat, gossip, joke, catch up on news and most importantly, demonstrate care for each other. The members of the collectivities can embrace a form of collective efficacy, which is at the same time collectively self-governed to ensure conformity and compliance. Collective self-efficacy is different to individual self-efficacy. It is found when members work together on a shared project and are task interdependent. Generative mechanisms and structures of cooperative learning are actualised. As the term implies, cooperative learning is about learning to cooperate. An important component is the degree to which group members develop and sustain the ‘we belief’ shared by group members. It should permeate the work of members as they work on their individually allotted tasks, and also when they are working together collectively, evaluating the work of the individual as it is incorporated and woven into the final product.10 But as we will argue later in this chapter, not all members necessarily share the same belief in the ‘we’ when working as a group and this can threaten the group’s collective self-efficacy and its assessed performance on a given task. In a classroom setting, when the attempt is made to include all students in a single shared task, there is a greater chance that members will share different beliefs on collective self-efficacy. Only in fleeting moments are they likely to work together and demonstrate task interdependence and collective self-efficacy. By way of contrast, when students work in groups on potentially different projects there is a greater chance that collective self-efficacy beliefs will be established. 
In the latter, the level of task interdependence is likely to be higher and to endure over a longer period of time. In examples such as these (the classroom as a whole setting as opposed to the group) we enter the realm of the difference between the group as a series of individuals who are potentially competitors with each other and the group as a collective entity sharing the same project. Sartre (1991) famously described this as a dialectical process where a group can move in pendulum fashion between a serial group entity (individuals regarding themselves as individuals in competition with each other) to a group as collective (group-in-fusion) and the reverse. For him the key factor in forming the group as a collective is an external threat to the group that unifies them and strengthens their shared sense of purpose and equality. The evidence for the generative mechanisms and structures he identified was not psychological tests or experiments, but an understanding of history and the French Revolution. The revolutionaries were able to collectively act as a group-in-fusion facing a shared external threat – they stormed the Bastille. However, in the

10 The argumentation echoes contemporary and also ancient views and debates on the viability of sensus communis (shared sense) (see Jovchelovitch, 2008).

following years this sense of group we-ness broke down and members of the group began to fear and mistrust each other. In an assessment context the outer threat might materialise in the form of the teacher or examiner who will formatively or summatively judge the work of the group or the class. One definition of collective self-efficacy is the following:

Collective-efficacy reflects the shared beliefs of the group members in their group’s capabilities to mobilize the motivation, cognitive resources, and courses of action needed to produce given levels of attainments on a specific task. Collective-efficacy influences what people choose to do as a group, how much effort they put into the group’s objectives, and their persistence when group efforts fail to produce results (Katz-Navon & Erez, 2005, p. 439).

This definition underlines the fact that the collective has to mobilise its own motivational and other resources, whether threatened from the outside (the teacher or examiner) or within (members have to self-govern and keep to certain time frames in order to reach set goals). When collective self-efficacy works, the group gels into a collective group with a shared sense of purpose, and this supports the achievements of the group. Conversely, if the group fails to gel and remains a serial entity of strong ego-oriented individuals, the spectre of failure or poorer performance looms on the horizon. Another alternative exists, and this is where the level of task interdependence is either high or low, with corresponding levels of collective self-efficacy. In such cases research has shown that high collective self-efficacy groups were more likely to improve their performance following failures (Katz-Navon & Erez, 2005, p. 444). The opposite was the case for low collective self-efficacy groups after a failure. Part of the reason for this difference lies in the fact that groups with high collective self-efficacy are better at pooling their knowledge and expertise, such that they can interpret cues in a similar manner and are able to more easily reach consensus on taking appropriate action in case of failure.

In this section on theories of motivation and learning we have been hesitant to define motivation in a once-and-for-all manner and have sought instead to present a number of different theories of motivation and learning, each positing a different construct with an accompanying defining characteristic of motivation and learning. In the terminology of this book, they represent different language games that are all related (all consider motivation and learning) and give rise to forms of social life. It would be too easy to criticise one theory using another.
As has become evident, there is some overlap where one theoretical perspective draws upon another, as for example in how volition theory becomes associated with self-regulation theory. There are in this sense family resemblances. The theories considered suggest that different generative mechanisms and structures are potentially actualised in educational settings. Here are some examples of generative mechanisms and structures: external reinforcers, volition, self-belief, self-esteem, expectancy values, control, affect, attributions, ego and task focus, and the influence of socio-cultural communities of practice and individual and collective self-efficacy. In a broad sense these theories align themselves according to a weighting either toward intrinsic motivation or extrinsic motivation. However, it is perhaps advisable to envisage a kind of dialectic, where both are required, rather than a

simple either–or. Ryan and Deci (2000) also propose a regulatory type that additionally includes an amotivational category where the student does not feel competent (low self-efficacy and low capacity). Moreover, in the course of writing this book we have had cause to widen our understanding of the stratified self, so that it is not only stratified by affect, consciousness and unconsciousness. We have also highlighted existential structures of experience. In Bhaskar’s conception of the stratified self, as presented in Dialectic: The pulse of freedom (Bhaskar, 1993), he included the capacity of the self to communicate. We have not referred to this in the foregoing discussion in this chapter. On the one hand, we take it as given in the sense that the student is compelled to communicate in the course of the assessment act so that the assessor has evidence to support their judgement and grade. This might be written, verbal or corporeal, as defined in our definition of the assessment act, and might include, for example, smiling to indicate agreement, or tut-tutting or kissing your teeth11 to indicate disdain. On the other hand, the act of communication in the course of the assessment might also be a source of challenges, such as when some students master oral assessment and are more motivated by it than other students. The same might be the case when communication skills differ among team members in a group assignment, some being more confident and some more shy.

11 A Caribbean term indicating disdain as the tongue touches your teeth and you breathe out with a sigh.

4.2 Enlarging the Index of Motivation

Given the disagreement about the generative mechanisms and structures at play in instances of motivation and learning, and the several different constructs proposed, it is perhaps surprising that some researchers have sought to argue for a form of agreement, however fragile, based upon the empirical observation of motivation. Seeking a common denominator of empirical evidence of motivation can be defended if it is remembered that the different empirical sources do not necessarily make us prisoner to a specific definition or to a single theoretical mechanism of motivation and associated set of structures belonging to the real. On the contrary, they provide a multifaceted landscape (i.e. family resemblances) that can be appropriated to support the different theories of motivation and learning as they are actualised in assessment acts. Schunk et al. (2010, pp. 11–13) have proposed such a landscape, calling it an index of motivation. A modified version of the index is shown in Table 4.3. The last two points constitute an enlargement of the index proposed by them, where our goal is to incorporate social sources of motivation and learning.

Table 4.3 An enlarged index of motivation

Index | Relation to motivation
Choice of tasks | Selection of a task under free-choice conditions indicates motivation to perform the task
Effort | High effort, especially on difficult tasks, is indicative of motivation
Persistence | Working for a longer time, especially when one encounters obstacles, is associated with higher motivation
Achievement | Choice, effort and persistence raise task achievement
Socio-cultural conditions | Norms and values embedded in the socio-cultural context give rise to social motives which can motivate learning and performance
Collective self-efficacy | Members unite behind the collective goals, which becomes a motivating force for the individual

Children having choices and selecting activities can be indicative of motivation. However, Schunk et al. (2010) note that not all students have choices and, secondly,

research has shown that the expectancy of a reward may not be enough of a motivation for a student to actually select the activity as opposed to an activity without the expectancy of a reward. The willingness to expend high or persistent effort can also be indicative of motivation. Lastly, choice, effort and persistence can result in achievement; and achievement through these means can be indicative of the view that achievement or its desire strengthens motivation. Thus, seeking to achieve goals can be a source of motivation. We have added two other sources of motivation to extend the index. The first is the role played by socio-cultural conditions that can motivate in an extrinsic fashion, and the second the role played by collective self-efficacy to likewise motivate in an extrinsic fashion. In the former, social motives are brought into play as a mechanism, and in the latter, mechanisms such as agreeing to follow a group’s goals can be a motivating force for the individual student. Evidence of the points in the index can be solicited to support assessment through a number of different methods:
• direct observation of effort, persistence, choice and goal orientation during a test or performance
• examples of performance, such as tests, projects, demonstrating skills
• self-evaluation by students in the form of logs of work completed or thinking aloud in the presence of the teacher or examiner.

Let us pause for a moment and consider one of these methods: thinking aloud. In Dobson’s (2017) research on acts of assessment in the viva he observed examples of two examiners questioning students with the tutor also present. The student had to defend their written work. Sometimes in answering questions the students were able simply to present an answer.
But on other occasions, it was evident that they were actually thinking aloud as they answered and their motivation for making particular choices in their answers or the motivation to persist with or relinquish a line of argumentation was revealed to those present.

However, the think-aloud method and the other forms of self-reporting face some limitations. Cooksey et al. in an unpublished psychometric study in 2007 asked a sample of 20 teachers to think aloud as they evaluated students’ writing. They were interested in judgemental thinking and the motivation of examiners for assessing in the manner in which they did. They searched for in-context cues, such as subjective knowledge of specific candidates, and out-of-context cues, such as stated criteria and standards on grading sheets. The think-aloud approach rests upon the assumption, as they themselves note, that judgement processes can be adequately verbalised and do not remain largely intuitive or a property of embodied expression (e.g. nods, sighs and so on). Adequate verbalisation may not always be achieved. To recall the early Wittgenstein in his Tractatus: ‘What we cannot speak about we must pass over in silence’ (1921/1961, proposition 7).12 In our context this means that the observation of motivation risks remaining partially unverbalised and must be sought by other means. Put differently, the judgemental and motivational mechanisms presumed to be accessed by the think-aloud method may be under-represented. More information is required by supporting methods. Motivational mechanisms other than those identified by these authors might be actualised in the judgemental process. Thus, instead of in-context and out-of-context motivational cues the theory of judgement thresholds might offer a more plausible explanation of motivational generative mechanisms and structures at play in this particular example. Royce Sadler in conversation (personal communication, 21 March 2007) talked of judgemental thresholds. He was thinking of the work of Tversky and Kahneman (1974) who argued that motivation for judgements is based upon a critical factor or an accumulation of factors to a saturation point that, when reached, results in a judgement. Klenowski (2007, p. 
10) in similar vein has talked of identifying the non-discountable pole in examiner judgements. The non-discountable functions as a critical threshold dictating the weighting and direction of the judgement. The observant reader will have noticed that we are gradually introducing more examples that connect motivation to acts of assessment. In the following section we will continue to narrow the focus to acts of educational assessment and in particular to the generative mechanisms and structures at play in the tripartite relationship between grading, motivation and learning and experiences of the stratified self. We will have cause to draw upon the foregoing presentation of motivation, learning and the self as three examples are discussed: grading from the standpoint of the individual student, the grading practice of the examiner/teacher and lastly the grading of students who have worked in groups. With this point of departure, the questions we are going to pose are simply: What generative mechanisms and structures of motivation and learning are brought into play when assessment is based upon grading? And connected with this, how might this impact on the student’s sense of self?

12 ‘Wovon man nicht sprechen kann, darüber muss man schweigen.’


4.3 Grading Individuals

Butler’s (1988) pioneering work on grading and feedback is often cited. She examined the learning outcomes of different forms of feedback with 132 sixth graders in Israel. Students in all classes were given a similar large task and submitted it for assessment after the allotted time period. Researchers and teachers independently graded the work. After two days the students received feedback. Students in four classes were only told the grade on a scale from 40 to 99. In four other classes the students received only comments about what they were good at and how they could improve. In the last four classes students received both such comments and grading. In the next task students who had just received comments previously demonstrated a 30% improvement in performance. Those who had been told the grade in the previous submission showed no average progression. These results are surprising. What is interesting in the study is what happened with the students who had previously received both comments and grading. It turned out that these students revealed the same average trend compared with students who had only been told the grade, that is, no development. It is also worth noting that the same tendencies appeared across pupil levels. Butler explains this by postulating that those who received a high grade saw no point in reading the comments. Those who learnt that they had a low grade did not want to read the comments since their sense of self would be further diminished. The weaker student might experience what Ryan and Deci (2000) identified as a mechanism of amotivation (low self-efficacy and low capacity). A conclusion that can be drawn from this research is that grading does not actually motivate the learning and performance of either weaker or stronger students. The last mentioned are already motivated and the grade does not necessarily change or intensify their motivation to learn and perform.
Kohn has consistently argued that grading can be detrimental to the student’s sense of self. This is explicitly stated in his article ‘From degrading to de-grading’ (Kohn, 1999) where he argues, firstly, that those who do not receive good grades feel degraded and secondly, that the best strategy is to de-grade and teach without the use of grading as a source of motivation. Kohn’s arguments are worth rehearsing and they are informed by his reading of Butler’s work and much of the research we have already highlighted, including intrinsic and extrinsic motivation:
• Grades tend to reduce students’ interest in the learning itself because goal-orientated motivation (being told to learn something for a test or a grade) can work against the motivation to learn (learning can become a chore). Put simply, the reward is not enough.
• Grades tend to reduce students’ preference for challenging tasks. Thus, students seeking a good grade will choose the easiest assignment rather than the one which is most challenging. Students are not so much lazy as strategic in their choices.
• Grades distort the curriculum because they encourage teachers to teach a ‘bunch of facts’, since these are easier to measure and grade.


• Grades waste a lot of time that could be spent on learning. Kohn points to the time a teacher spends marking and all the (mostly unpleasant) conversation about grading with students and their parents.

Kohn’s basic point is that motivation based upon a behaviourist, rewards (grades) approach can take student attention away from what should be in focus, namely learning as a form of mastery-guided motivation. Kohn is of course clear that there is not necessarily an opposition between learning and grading, but there can be, and when there is, the student narrows learning to encompass only what is of use in a possible test. In more recent commentary, Kohn (2019) has argued that grading is ultimately not about the performance of an individual against themselves; it is about success relative to others. Securing entrance to college in a course with limited places would be a good example.

Those inspired by Kohn, such as a teacher by the name of Joe Bower (2010), have de-graded, such that their everyday teaching and use of tests is not based upon grades. Students are given only feedback, much in line with the findings from Butler’s research presented above. When it comes to setting the grade for the end of term, if required by education policy or regulations, these teachers, having given detailed feedback throughout the course period, know the students well and this assists in deciding the appropriate grade. In addition, teachers inspired by Kohn have tended to ask the students to take part in setting the grade (to self-assess) in negotiation with the teacher, anticipating that they would most likely decide upon the same grade as the teacher, and sometimes even a lower grade.
Kohn (2011) puts it like this:

If people find that idea alarming, it’s probably because they realize it creates a more democratic classroom, one in which teachers must create a pedagogy and a curriculum that will truly engage students rather than allow teachers to coerce them into doing whatever they’re told. In fact, negative reactions to this proposal (‘It’s unrealistic!’) point out how grades function as a mechanism for controlling students rather than as a necessary or constructive way to report information about their performance.

So, Butler’s research and the work of Kohn suggest that the grade in itself does not necessarily motivate learning or performance. However, this is only half the argument. In what follows we shall address the more commonly held belief that the presence of the grade acts as a source of extrinsic, goal-oriented motivation. There is evidence to support the view that this form of motivational mechanism is also actualised. Consider the following example.

Dobson was once engaged as an external consultant in a medium-sized high school in Norway with 160 full-time teaching staff, across a spread of vocational and non-vocational programs of study. The staff were undertaking a three-year project to improve classroom management and assessment skills. Staff participating in the project were organised in cross-disciplinary groups and met on a regular monthly basis after teaching for two-hour project sessions. In the final half-year staff had to trial a self-selected way of improving their assessment practices and discuss their findings in their groups. One of the


cross-disciplinary staff groups developed an electronic Questback survey13 for their students in different subjects, with the goal of generating self-reflection among students about their motivation for learning. The students were asked a battery of questions about how they studied, learnt and performed during the last completed period of learning in selected subjects. Two questions in particular drew Dobson’s attention:
• I am motivated because the subject interests me.
• My motivation is first and foremost based on the desire to gain a good grade.
The small number of respondents in each subject (approximately 25–30 per subject) means that the results must be interpreted with caution and cannot be generalised. Some of their key findings were as follows:
• In Norwegian, a compulsory subject, 10% of the students said the subject did not motivate them, 65% were indifferent, 20% were a little motivated, and the remainder did not answer. 50% were motivated by the grade, and 35% partially motivated with 15% not answering.
• In sociology and anthropology, an elective subject, all the students said they were motivated by the subject and at the same time all were motivated by the grade.
• Over 85% of students taking applied mathematics said they were motivated by the grade. The corresponding figure for theoretical mathematics was 94%. These courses were chosen by students after the mandatory level mathematics courses. Yet 50% of the applied mathematics students said the subject itself did not motivate them and 50% were only partially motivated. In theoretical mathematics 33% were motivated by interest, 33% were partially interested and the remainder were not.
The trend across subjects was that students were motivated to learn by the grade. This suggests a clear weighting toward the perceived importance of extrinsic motivation. The teachers were disappointed in some cases that the students were less motivated by the subject itself.
This was perhaps also indicative of a lower level of motivational values (enjoyment, attainment, utility and cost) attached to the subjects (Eccles & Wigfield, 2002).

What the teachers did not investigate, arguably because it was too obvious to them, was the social pressure among students to achieve grades. A Norwegian phrase is pertinent in this respect: ‘karakter jag’, which can be translated as ‘hunting for the better grade’. It has two meanings: firstly, the pressure whereby the individual student feels they must continually self-improve and attain improved grades; secondly, the competition amongst peers, which results in the desire to outshine them with better grades. In the context of what we have discussed in the section on theories of motivation and learning, the social pressure from peers constitutes a set of social norms that structure and direct the motivation of students. Students establish these norms and yet they are supported by parents and teachers in the sense that they want their sons/daughters/students to gain entry to higher education or highly sought-after apprenticeships where the competition for a limited number of student places can be severe, even extreme.

13 Questback is an online survey and feedback software company (https://www.questback.com/)

Another aspect of grading these teachers did not consider, once again because it was arguably self-evident, was the role played by socio-economic background. We will briefly consider research that suggests that socio-economic background does play a role in grade attainment from the perspective of the student. Gender can also play a role. In the following section we shall argue that there are good reasons for believing that teachers make allowances for different background components when grading.

In Sweden grades were gradually abolished in primary school up to grade 7 (14 years of age) between 1969 and 1982. Specifically, local municipalities in this period were allowed to choose to keep grading or remove it. By 1982 no grading was allowed (it was re-introduced in 2012 in the last grade of primary school) and instead parents were able to attend a twice-yearly meeting with teachers to receive formative feedback without grades on the progression of their children. It was felt that grading produced unhealthy competition between students, favoured academically stronger students from middle-class backgrounds, and was not necessary for selection to higher levels of schooling. Sjögren (2010) has examined cohorts born between 1957 and 1974 and register data to explore the consequences for educational achievement (highest level of education in 2004) and earnings (average taxable income 2004–06). Her findings included the following:
• Grading increases the probability that girls and boys from lower family educational backgrounds will complete high school.
• Grading lengthens how many years girls will select to stay at school, but not significantly for boys.
• Sons of middle-class parents who were graded in primary school are less likely to have a university degree and earn less than boys of the same background who were not graded.

Sjögren is clear that she can only talk of consequences and not of mechanisms at play. She does however offer some tentative interpretations of her findings. She concludes that lower-class students and parents might have appreciated the information communicated by grades and been less able to extract the information and meanings communicated in the oral twice-yearly meetings with teachers. Girls of all class backgrounds might underestimate their attainments when grades are not offered, while for boys from middle-class backgrounds grades might lower outcome expectations and motivation (Bandura, 1982). Sjögren’s work suggests a compromise position: grades may be motivating for some groups, but not for all.


4.4 The Power and Responsibility of the Teacher Who Grades Individuals

On the one hand, grading and motivation seem to influence how students perceive and perform in the act of assessment. On the other hand, it is clear that the work of the teacher or examiner can also be a source of motivation or in fact the opposite. This line of thought would seem to be valid if all teachers and examiners assess in exactly the same way and thus score well on inter-rater reliability. However, this is not the case, as we shall argue in this section. If assessment is to fit the purpose and the background of the student, the argument can be made that the examiner/teacher should modify their grading practice when grading students with certain needs. In recent times this has been considered under the policy banner of more accessible assessment to meet these needs. A roundtable was held in Australia in 2019 (Round Table on Information Access for People with Print Disabilities, 2019). In Norway the Education Act for Primary and Secondary Schools has stated this explicitly for many years: ‘education must be adapted to the abilities and prerequisites of the individual student, the apprentice, practical graduate and training candidate’.14 Here is a reflection from the early 2000s from one of the authors (Dobson).

My colleague and I had taught an advanced educational theory course at university. It ran on a weekly basis for part-time students for a whole year. One of the students had severely limited sight and was to all intents blind. He attended every lecture and took part in all the discussions. A friend had read most but not the entire reading list onto tapes. The exam for the course entailed two parts: firstly, a 5000-word essay on a self-chosen topic related to the course, and secondly, a viva where the student had to defend the essay in the presence of the course tutor and an external examiner. The viva was used to adjust the written grade to a final grade.
The external examiner refused point blank to assess the blind student like all the other students. He learnt from the tutor that the student had demonstrated an enduring wilful determination to complete the course. There had been no lapses in the blind student’s motivation and he had actively taken part in class discussions. The student received viva questions that took account only of the literature he had read and what he had written in his exam essay. The student received a grade B. It reflected his performance and no effort was made to compare him with other students who received questions across the whole reading list and not merely in relation to what they had used in their essay.

14 In Norwegian: ‘Opplæringa skal tilpassast evnene og føresetnadene hjå den enkelte eleven, lærlingen, praksisbrevkandidaten og lærekandidaten’: Lov om grunnskolen og den vidaregåande opplæringa (opplæringslova) § 1–3 (https://lovdata.no/lov/1998-07-17-61/§1-3, accessed 30 Jan 2021).

In high-stakes exams, such as those completed by high school students across the globe, it might be argued that it is more difficult to adapt the assessment to cater for the special needs of the student. This is especially the case if many are sitting the exams, where examiners are less likely to meet the candidate in person (as was the case with the blind student in the example above), and if what is required demands more than extra time because of writing difficulties or the like. It is commonly argued that reliability between examiners is desirable (Barnes & Pressey, 1929; Hartog & Rhodes, 1936; Starch & Elliot, 1912) and also validity in the sense of measuring the graded construct so that it is not under-represented and irrelevant alternative constructs are not measured (Messick, 1989). A consistent challenge in grading is the following:

When examiners disagree negotiations begin. These are likely to pull down the grades of candidates who score highly with one examiner and pull up the grades of candidates who score low with one examiner. On the other hand, high examiner agreement increases the chances of positively affecting the decision for those who perform well and negatively affecting the decision for those who perform poorly (Dahl, 2006, p. 20).
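The arithmetic behind Dahl’s observation can be made concrete with a small sketch. The marks below are invented for illustration, and the rule that disagreeing examiners settle on the midpoint of their two marks is an assumption made here; Dahl does not specify how negotiations are resolved.

```python
# Illustrative only: the marks and the midpoint rule are assumptions,
# not data or a procedure taken from Dahl (2006).

def negotiated_mark(first: float, second: float) -> float:
    """Hypothetical negotiation outcome: the two examiners settle on the midpoint."""
    return (first + second) / 2


# Hypothetical marks on a 0-100 scale: (examiner A, examiner B).
candidates = {
    "scores high with one examiner": (90, 70),
    "scores low with one examiner": (30, 50),
    "examiners agree on a high mark": (90, 90),
    "examiners agree on a low mark": (30, 30),
}

for label, (a, b) in candidates.items():
    print(f"{label}: A={a}, B={b}, negotiated={negotiated_mark(a, b)}")
```

Under this assumed rule the candidate marked 90 by one examiner ends on 80 and the candidate marked 30 ends on 40, whereas candidates on whom the examiners agree keep their extreme marks. Any other compromise rule that lands between the two marks would show the same pull toward the middle.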

Training examiners in an attempt to build consensus is a well-known strategy directed toward ensuring greater inter-rater reliability. How might these points connect with the focus on student motivation? What happens in the act of grading is not always veiled in secrecy, since teachers may have examined for examining boards and can tell students what to expect, but there is a lack of transparency in the sense that the students may not find out in detail how they have scored in particular questions. Students might have to wait for the result, which comes in the form of a single grade, and it is not easy for the student to plan or manage their motivation in advance. At best they can attempt to plan their future motivation according to different self re-orientation scenarios. In these scenarios the student can ask: How will I feel (self) and be motivated if I receive a) a good grade, b) an average grade, c) a poor grade, or d) a failed grade in one or more subjects?

In low-stakes assessment situations, where the grading goal is formative and part of an ongoing student and teacher relationship, the student’s outcome expectation can interact with their self-efficacy. Different permutations are possible, as Bandura (1982) suggested. For example, high self-efficacy might co-exist with low outcome expectation, as when the student is willing to work hard but believes the teacher will not grade in a fair manner, or alternatively, a student might have a low self-efficacy but believe the classroom is characterised by high outcome expectations, as when the teacher is willing to give good grades despite mistakes. After the grading situation the teacher’s feedback, whether a grade alone, a grade with comments, or just comments, provides the opportunity to motivate the student toward further learning and performance in future assessment situations.

There is also a third grading variant that combines components of the above low-stakes formative grading and high-stakes summative grading. It provides the option of relating student motivation to the grading practice of the teacher. Let us return to the case of assessment in practice in Norway in the years around 2010:

There are few examination marks on the school leaving certificate (16 years) compared to the number of overall achievement marks. On the school leaving certificate after completing compulsory school there will generally be overall achievement marks in 16 subjects and, in addition, examination marks in two of these subjects (OECD, 2011, p. 72).


The important point in this form of assessment practice at the end of middle school before entry to high school is that the subject teacher sets the overall achievement marks in all subjects. To award the overall assessment mark the subject teacher has to arrange a graded assessment situation toward the end of the course that measures the actual achievement of the student in all the curriculum goals of the specific subject. It is not permissible, according to Norwegian assessment regulations, to hold a number of tests throughout the course period and arrive at the overall achievement based upon an average grade from these tests. Each student will sit one centrally given written exam in mathematics, Norwegian or English and one locally set oral exam in any subject. Students are randomly drawn for the exam subjects and cannot choose the subject.

In setting the overall achievement grade in the subjects the teacher will possess knowledge of the student over time and this might potentially influence the final grade. This is despite the fact that the results from tests during the course should not be formally drawn upon by the teacher as they consider the final grade. Is there evidence that this does in fact take place? Research by Prøitz and Borgen found that:

teachers tell of large individual variations in the practice of setting overall achievement grades. Among other things, they assess the same curriculum components differently, and they hold different perceptions of the function of the overall achievement grade and what the grade expresses (2010, p. 11).

A teacher put it in the following manner in an interview reflecting upon the setting of the overall achievement grade:

Low achievers, those struggling with home factors … in their oral grade I give them some ‘cred’. I know they have a lot going on in their heads and if they can put this to the side I can then give them a bit extra (Norwegian teacher, as cited in Prøitz & Borgen, 2010, p. 70).

The researchers concluded that teachers, and they found many examples, might aspire to grade in a fair and neutral manner, not taking into account background differences, but in practice they graded differentially and took into account different factors with respect to weaker students. It might be posited that students are aware of this kind of practice, irrespective of how correct it is according to Norwegian assessment regulations; and it is to be expected that it feeds into their outcome expectations (Bandura, 1982) and motivation to attain a specific final grade. It might also be an instance of the Pygmalion effect (Rosenthal & Jacobson, 1968), whereby teacher expectations can motivate students.

To summarise, the teacher’s grading practice can influence student motivation most directly in the case of formative assessment acts of grading. In such cases mechanisms of expectancy value are actualised. Teachers’ grading practices can also influence particular student groups in the setting of the overall assessment grade. As we have argued, research shows that this is the case for students who are lower achievers or who have home backgrounds that might be less knowledgeable or supportive of school study. When this occurs the student is more able to achieve success in the assessment situation and experience a heightened sense of self-esteem. It makes it possible to work against what one upper secondary school student appropriately called ‘our social caste system’. By this she meant:


It is as if you have a number on your forehead that indicates which part of the ladder to which you belong. Four in the class might have grade A, some are C’s, others D’s, and one unlucky person is an E. May God forbid if an E takes it upon himself or herself to sit beside a person with an A. There will be a lot of backbiting and gossip (Aftenposten, 2013).

This is the opposite of what we referred to earlier as ‘karakter jag’ (‘hunting for the better grade’). It is not about a set of social norms putting pressure on peers to get the better grade. It is instead a set of social norms motivating them to avoid the worst grade. It is doubtful whether the assessment act of the teacher giving the low achiever some ‘cred’, and so enhancing their self-esteem and motivation to learn in the next instance, will be enough to counter the pressure of social norms found in the social caste system. Put simply, mechanisms of individual grading actualised by the teacher might influence the motivation and learning of the student, but the motivational mechanisms connected with social norms established between students are capable of overpowering the effort of the teacher with respect to the individual student.

4.5 Grading Group Work/Projects

After emigrating from England and spending a year in Norway learning Norwegian in the mid-1980s, Dobson enrolled in a one-year Diploma in Political Science. He was clearly motivated by utility value. But it was not utility in the sense of foreseeing a direct use for the newly acquired domain knowledge. The utility value was more related to the course offering him the opportunity to develop and practise his knowledge of Norwegian. This was something he needed in order to secure future employment in Norway. The four course tutors were political activists and belonged to the same left-wing socialist party.15 They were not interested in the students learning or performing individually. Students were always doing group work. The course had two final pieces of assessed work: an individual essay and a group essay. The lecturers openly preferred the latter and put up with the former. In previous years students had been graded simply pass or fail in both papers. That year was the first in which they were given detailed grades in each. What struck Dobson immediately from the first week of the diploma was that 50% of the final grade would be based upon a group essay. Dobson struggled with the thought that there might be freeloaders in his group, or that he might be regarded as one because his Norwegian skills at that time were relatively poor.

Nowadays it is increasingly the norm that university students can expect at least one course in their degree programs that requires group work and an accompanying form of group assessment. The rationale is usually that this prepares the students for employment where it is now taken for granted that teamwork skills are a requirement.

15 In Norway, there are many political parties, some larger and some smaller, and it is rare that one party can govern without a coalition agreement with other parties to which they are aligned in some respect.

Many years later Dobson gained an opportunity to explore how student motivation and learning and sense of self might be influenced by the social norms generated by a) students working in groups, and b) teachers/examiners assessing the work of the students working in groups. The occasion was a commissioned piece in a festschrift dedicated to his onetime dean (Bjørn Berg) when he still lived and worked in Norway (Dobson, 2011). Some of the findings are worked into what follows, where additionally some international perspectives on group assessment will be introduced.

Assessing group work opens up an immediate challenge, or should we say opportunity. It becomes possible to assess the process and the product, where the former includes how group members have worked together. The latter can be a group oral exam where group processes are also evident. The option to assess both process and product will require a weighting of the two if a single final grade is to be awarded. This weighting needs to be both justified and communicated beforehand to the students. In many senses it is easier to grade just the product, but this misses what is special about group work, namely the processes between group members on the way toward the product.

Berg and Engen (1976) conducted a survey of 169 student teachers in Norway, 112 of whom took the group exam and 57 who sat the more traditional individual written examination. They were interested in two things. The first was cooperation among students. This is a theme highlighted in educational debate about democracy and student participation. It supports the general belief that student teachers should develop expertise in group processes as a fundamental component of their professional teaching activity in the classroom.
The authors anticipated that group work would have a greater impact if it were also associated with group assessment, that is, there is a ‘washback’ effect since students will adapt their motivation and learning if they are to be assessed with a group grade. The authors’ second area of interest was whether the group assessment of the process and product might be a means to develop independent, critical teachers. Their assumption was that group work through group dialogue and discussion offers an arena where knowledge is not only reproduced but also developed by students to resolve problems. The method used by the researchers was a questionnaire. They argued that observation would be too time consuming. Besides, a questionnaire immediately after graduation would provide insight into students’ attitudes and experiences with this form of assessment. They had two focus areas: knowledge of the subject and the social skills that students acquired through the group examination and while working on the group product. They defined social skills to mean the group members’ experience of arguing for their own views in the group, leadership, distributing tasks and collaboration in knowledge construction. They further operationalised social skills as a form of social maturity characterised by management skills, establishing group norms, frequency of communication, and group members feeling accepted and well-liked. They found evidence of these social maturity indicators and also the following:


• The major objections to group examinations among students were that some were able to pass without adequate effort (free riders) and that the group examination placed too many demands on cooperative skills.
• Many highlighted as a strength of group exams that they mirrored later work situations.
• It was the reverse for students who took the written examination, ‘where luck and bad luck can have a huge influence’ and where there were greater opportunities to measure ‘self expression ability and thinking ability’ (Berg & Engen, 1976, p. 98).

In our context, using motivation constructs, the most important finding was that students who took the group exam had greater motivation to master the subject material (goal-oriented motivation), which they also regarded as more relevant (utility value motivation), and they had greater influence on what they were to be assessed on than in traditional individual exams (attribution motivation). There are obvious blind spots in this research, such that the reader learns nothing of the way the examiners might have weighted product and process, or if the students understood how they were assessed. Nevertheless, the findings are thought provoking.

In turning to international research this teacher/examiner perspective on group assessment is heavily in focus, and so too is the process component. Process has often been understood as ‘personal and interpersonal team-working skills such as communication, negotiation, self-initiative, resourcefulness and conflict management’ (Bryan, 2004, p. 53). To a large extent this corresponds with social skills as understood by Berg and Engen. But it is interesting to note that leadership skills are not always mentioned in international research. What is noticeable about recent research (Lejk & Wyvill, 2001; Nordberg, 2008; Sharp, 2006) and practice of group assessment is the desire to measure the effort of individual group members using statistical tools.
This has a roll-on effect as students become highly motivated to demonstrate their own contribution to the group, but are unsure about how motivated they should be to unite behind the group’s goal and strengthen the group’s collective self-efficacy. The individual effort or rather lack of effort was a topic for Berg and Engen (1976), but neither they nor others at that time were keen to measure individuals’ efforts in the group. They observed how the same group grade was awarded to each group member regardless of the efforts they had made toward the shared product. Of continued interest today are the debates about group assessment in higher education concerning whether, and if so how, the efforts of the group members should be documented and given a score or value. Kuisma (2007) has suggested the following: assessment of each other’s efforts, the students’ own log of the process and teacher observation of group processes. The last of these methods is problematic according to Orr (2010, p. 305) because students act differently when the teacher is present. Students try to maximise the behaviours that they believe will have the most impact on the teacher’s view of their own individual character: ‘performers can put on a good act for the observer (comment by a theatre student)’. A log can be a kind of self-presentation document that favours

those who are good at writing and are able to underline their own contribution to the group. Moreover, when students assess each other, they tend to complain about others rather than being self-critical. This might be the case even if they use pre-developed criteria.16 All these methods rely upon the student representation or presentation of the group work process to the teacher/examiner, and such (re)presentation with a focus upon identity is not necessarily a reliable measure of the process. It is too subjective and dependent on the vagaries of individual perception and judgement. Trust is an important motivating mechanism in group work and assessment. It is vitally important that team members trust each other, so they dare to take risks in their thinking and in their group work and not just ‘play it safe’. Berg and Engen found that there was a need for team members to feel accepted. In more recent organisational research we find similar perspectives. For example, von Krogh et al. (2000, p. 49) have promoted the Japanese term ‘ba’ (場) to denote ‘a network of interactions, determined by the care and trust of participants’ and thus a shared meeting place for emerging relationships. The development of creative risk in group performance depends on mutual trust, active empathy, access to help, absence of condemnation and drive. It also depends on the acceptance of what in Norwegian is commonly known as everybody’s right to a tabba kvota (meaning in English, an allowed quota of errors). Let us return to a theme that was missing in Berg and Engen’s (1976) account, namely how teachers actually examine student group work. Orr (2010) recounts the practice of an art and drama department where students were given grades based on a combined assessment of the group process and group product. In a BA in years 1 and 2, there was a weighting toward process rather than product, and in the last year it was the reverse. Why is the process less relevant in year 3; isn’t it just as relevant?
To avoid this problem, we might suggest that two grades could be given, one for the process and one for the product. According to Orr (2010), students tend to believe that it is the quantity of effort that teachers value as essential in group work, while teachers are in fact more concerned about the quality and relevance of students’ contributions to the group. A related challenge is actualised if the group product is to be defended in a joint oral exam: how is each individual’s contribution in the oral to be weighted against the assessment of the group product? In this case it is not therefore a question of assessing the formative social process that has produced the product, but assessing the social process evident in the oral defence of the final product in the summative phase. In research on PowerPoint group presentations undertaken by Dobson (2006), examples were found of examiners who neglected form, while other examiners in the same discipline (tourism studies) examining the same students were willing to grade according to an unspecified combination of form and product.

16 Lejk and Wyvill (2001, pp. 63–64): Criterion 1: motivation/responsibility/time management (indicators: attends meetings regularly and on time, accepts fair share of work and reliably completes by the required time). Criterion 2: adaptability (indicators: wide range of skills, readily accepts changed approach or constructive criticism). Criterion 3: creativity/originality (indicators: problem solver, originates new ideas, initiates team decisions). Criterion 4: communication skills (indicators: proficient at diagramming/documentation/overhead projector or slides, effective in discussions, good listener, able presenter). Criterion 5: general team skills (indicators: positive attitude, encourager, supporter of team decisions, desire for consensus). Criterion 6: technical skills (indicators: provides technical solutions to problems, ability to create designs on own initiative).

Pausing for a moment, we will reflect more upon the grading mechanisms which might be brought into play when it is a question of collective self-efficacy. Put differently, what might examiners/teachers assess, and are students necessarily aware of this? Pazos et al. (2010) suggest assessing group interaction style (in terms of degree of cooperation) and problem-solving style (e.g. whether the group is looking for ‘quick fixes’ rather than digging deeper). Nordberg (2008) directs attention to assessing the manner in which the group has developed through a number of phases: forming (group formation), storming (discussion in the group about how the task should be solved), norming (establishing internal norms of accepted ways to trade and distribute work internally) and performing (the product realised and assessed). Berg and Engen (1976) asked informants about group leadership. They found that there was rarely any elected leadership of the group. Leadership was by function, ‘which alternated spontaneously by task and situation’ (1976, p. 58). But they did not discuss further how other tasks were distributed and whether there were other important roles in the group. Orr (2010, p. 307) comments on the roles of secretary or IT manager in the group: ‘students worried that these quieter roles were not always recognised when marks are allocated’. In the latter case students are motivated to think in utility terms about maximising the final grade. Delving more deeply into the interaction mechanisms at play, we can draw upon the work of Goffman (1969). He is well known for his view that interaction is comparable to life in the theatre where participants can be likened to actors who adopt and play different roles.
The individual acts on different stages, some more public than others, and is motivated to manage their self-identity according to the social norms of acceptable behaviour in the particular activity. In our context, the stage in question is twofold: a less public stage in front of other team members, and a more public stage when the group is observed by their teachers in the work process or when assessed in the group oral, if there is one. As Christensen puts it:

In the group exam students are visible in relation to each other and also at the same time they are instrumental in making other members visible. Group examination thus offers the opportunity to judge the individual student rather than limiting them (2008, p. 23).

It is important to present a good face to non-group members such as teachers and examiners. Internally, within the group, roles can be played in different ways: the cooperative group member, the supportive group member, the lazy group member, the dominating group member and so on. This means that, even if group norms are established within each student group as roles are allocated, there is always a certain leeway in how they are played. The important point here is that members of the group will feel motivated to play their roles and uphold the social norms established in the group. But the social norms cannot be defined in advance and cannot be defined in a once-and-for-all manner. They are open to re-negotiation. In Goffman’s (1974) later work he proposes the term frame analysis, which further describes the mechanisms and structures at play in group work and assessment.
A frame indicates the ‘sense of what is going on’ (p. 10) in the activity and refers to the ‘meanings’ given by participants to the activity under consideration. To use an analogy, the frame is a snapshot that lasts longer than a photographic second and puts spatial, temporal and conceptual boundaries around an activity. It provides a foundation for a critical realist exploration of generative mechanisms and structures. Summarising the discussion, we can say that when it comes to the grading of group work and its connection with self-esteem, motivation and learning, a number of options are possible. Firstly, the teachers/examiners have choices about how they weigh the formative process of the group against the final product in the summative phase, when an oral defence might also be included. Secondly, the option of grading the individual student’s role in the group process and product can exist in opposition to awarding a group grade irrespective of the role of the individual and potential free riders. The individual position can lead to a stronger ego-oriented motivation as opposed to a shared group motivation and a belief in the pertinence of collective self-efficacy. In such a situation the social norms associated with ‘we are all in it as a team’ are favoured less and the dominant social norms exhibit more of the character of ‘I am taking care of myself’. Irrespective of which option occupies the dominant position in assessing group work in different contexts, it is pertinent to consider that both impact upon self-esteem and motivation, such that the self is stratified in an ego or other-directed orientation. Arguably, in all cultures there are embedded ways of understanding this with implications for motivation. For example, in Māori culture a well-known proverb is ‘he waka eke noa’ (we are all in the same canoe without exception) (Herd, 2016), rising and falling with each other in the sea and sharing the same destiny.
The Norwegian proverb is ‘du skal ikke tro du er noe’ (Sandemose, 1933, p. 85), meaning each individual must not think they are anything special. In Indonesia, one of the most well-known education activists and reformers, Ki Hajar Dewantara, wrote about collective effort in group work in Dutch: ‘een voor Allen maar Ook Allen voor Een’. In Bahasa this is ‘Satu untuk Semua, tetapi Semua untuk Satu Juga’, meaning literally, ‘one for all and all for one’, implying that no one is better than the other (Mirnawati, 2012, p. 110). All these sayings from different cultures are indicative of how group assessment can, and potentially should, be undertaken.

4.6 Closing Comments

In this chapter we have sought to understand generative mechanisms and structures that provide deeper understanding of the connection between assessment practices, sense of self, and motivation and learning. In the first half of the chapter, we presented and explored the multidimensional generative mechanisms and structures connected with motivation, moving beyond the traditional discussion of intrinsic and extrinsic motivation to a focus on the role of emotions and the importance of
shared social motivation. The connection with assessment has been founded upon the manner in which motivation might be measured. In the second part of the chapter, we explored the generative mechanisms and structures connected with the grading of the individual, the practice of the teacher/examiner and the grading of the group. As with motivation, we had cause to consider the social aspect of grading. In Chap. 2 we talked of how new forms of society generate new forms of assessment. This has been echoed in this chapter: the fashion for examining in groups and with a simple pass or fail for the whole group was evident amongst some Norwegian lecturers who were a product of the radical 1970s. They sought to establish a match between the form of society and the assessment. By the 1980s, this assessment practice was disappearing as the group received a point or letter grade. The use of group work and the skill required to assess the process and product through such grading has gradually intensified with new psychometric tools. But the generative mechanisms and structures on the macro-scale are now less connected with the collective mentality of the radical 1970s. They are more connected with the perceived need in macro-economic terms to enhance the employability of graduates trained in all aspects of group work, such as leadership, planning, offering feedback to peers and competence in self-assessing the final product. Some might call this the rise of neo-liberalism. It is also important to note that this more group-oriented form of assessment has a longer historical trajectory. It would be a mistake to believe it only appeared in the 1970s or in today’s society. Professions such as the armed forces and the emergency services have throughout history valued and assessed teamwork. There is also a cultural aspect worth considering in terms of generative mechanisms and structures.
Take for example the proverb ‘It takes a village to raise a child.’ It was popularised by Hillary Clinton’s 1996 book It takes a village. The saying has been attributed to African proverbs in the sense of ‘if you are not taught by your mother, you will be taught by the world’. We are accustomed to thinking this is of less relevance in cities. But even in cities there can be strong neighbourhood bonds, just as in rural communities. In both, the upbringing of the child, including offering them feedback, is not limited to teachers or the immediate parents. The wider community and extended group of relatives have a key role to play. A case in point is Aboriginal culture in Australia. It is the longest unbroken living culture in the world and is actually not a single culture, but many interrelated cultures and peoples across an immense continent. Aboriginal pedagogy is founded upon knowledge that is community and place-based, where direct questioning and verbal transmission pedagogy are downplayed. Assessment is driven by the student reflecting and trialling in practice and demonstrating independence on the basis of the lessons learnt (Yunkaporta, 2009). The student is thus called upon and also self-motivated to show resilience and that they can survive in harsh conditions. They undertake significant self-assessment alongside the supportive eye of the teacher, who might be an elder member of the community. Dobson’s colleague and good friend Professor Lester Irabinna Rigney (University of South Australia, 2021), a descendant of the Narungga, Kaurna and Ngarrindjeri peoples of South Australia, has related how in the pedagogy of his people there is always a teacher, a learner and
a third person who checks something has been learnt (Rigney, 2023). Thus understood, knowledge, learning and assessment are an intensely collaborative activity, where the student is less an individual and is from the very first moment immersed in shared acts of learning and assessment. Rigney (2019) also explains that, once something is taught to you, you are in turn obligated to teach it to the next generation when it is your turn. All are thus learners, and nobody is excluded from being a teacher. This chapter might have given the impression that there is a dichotomy between individual and group assessment. It is, however, important to keep in mind that the supporting culture is a key generative mechanism and structure, which circles back to the point of Chap. 2, namely that societal forms influence forms of assessment and, in reverse, forms of assessment influence societal forms. In critical realist terms we are rehearsing once again the interrelationship between the four planes of the social cube. This encompasses material (and place-based) transactions with nature, social interaction between agents, social structures influencing social relations through power, discourse and norms, and the self that is laminated or structured by consciousness, unconsciousness and affective forces. A nod to Wittgenstein is also due: the social practice of assessment we have discussed in this chapter evidences the different language games of assessment that give rise to diverse forms of life, i.e. social practices (Amble, 2022).

References

Aftenposten. (2013, March 22). Vårt sosiale kastesystem [Our social caste system]. Aftenposten. https://www.aftenposten.no/meninger/sid/i/op3qg/vaart-sosiale-kastesystem. Accessed June 30, 2013.
Amble, N. (2022). Nærhet, samvær og samarbeid i arbeiderkollektivet. En postpandemisk refleksjon over betydningen av fysisk nærvær av andre [Closeness, togetherness, and cooperation in the workers collective. A post-pandemic reflection on the importance of the physical presence of others]. Norsk sosiologisk tidsskrift [Norwegian Journal of Sociology], 6(6), 1–9.
Anderman, L., Andrejeweski, C., & Allen, J. (2011). How do teachers support students’ motivation and learning in the classroom? Teachers College Record, 13(5), 969–1003.
Bandura, A. (1982). The self-efficacy mechanism in human agency. American Psychologist, 37(2), 122–147.
Barnes, E., & Pressey, S. (1929, November 23). Educational research and statistics: The reliability and validity of oral examinations. School and Society, 719–723.
Berg, B., & Engen, T. (1976). Gruppeeksamen: En sammenliknende undersøkelse av to alternative eksamensformer [Group exam: A comparative survey of two alternative forms of exam] (Report no. 3). Hamar Lærerskole Pedagogisk Høgskole.
Bhaskar, R. (1993). Dialectic: The pulse of freedom. Verso.
Bower, J. (2010). Grading without grading. For the Love of Learning. http://joe-bower.blogspot.com/2010/07/grading-without-grading.html. Accessed January 28, 2021.
Broussard, S., & Garrison, M. (2004). The relationship between classroom motivation and academic achievement in elementary school-aged children. Family and Consumer Sciences Research Journal, 33(2), 106–120.
Bryan, C. (2004). Assessing the creative work of groups. In D. Mills & K. Littleton (Eds.), Collaborative creativity: Contemporary perspectives (pp. 52–64). Free Association Press.
Butler, R. (1988). Enhancing and undermining intrinsic motivation: The effects of task-involving and ego-involving evaluation of interest and performance. British Journal of Educational Psychology, 58, 1–14.
Carlyle, T. (2010). Novalis. In H. D. Traill (Ed.), The works of Thomas Carlyle (pp. 1–55). Cambridge University Press.
Christensen, G. (2008). Eksamen som en kamp om positioner [Exam as the struggle for positions]. Unge Pædagoger, 5, 17–23.
Clinton, H. (1996). It takes a village. Simon and Schuster.
Corno, L. (1993). The best-laid plans: Modern conceptions of volition and educational research. Educational Researcher, 22, 14–22.
Csikszentmihalyi, M. (1990). Flow: The psychology of optimal experience. Harper and Row.
Dahl, T. I. (2006). When precedence sets a bad example for reform: Conceptions and reliability of a questionable high stakes assessment practice in Norwegian universities. Assessment in Education, 13(1), 5–27.
Dobson, S. (2004). Cultures of exile and the experience of refugeeness. Peter Lang.
Dobson, S. (2006). The assessment of PowerPoint presentations – Attempting the impossible. Journal of Assessment and Evaluation in Higher Education, 31(1), 109–119.
Dobson, S. (2011). Gruppeeksamen i et vurderingsteoretisk perspektiv [Group examinations in an assessment theoretical perspective]. In R. Pettersen (Ed.), Om alt skal bli som før, må alle ting forandres: Festskrift til Bjørn Berg [If all is to be as before, then everything must be changed: Festskrift for Bjørn Berg] (pp. 179–198). Sebu.
Dobson, S. (2017). Assessing the viva in higher education: Chasing moments of truth. Springer.
Eccles, J., & Wigfield, A. (2002). Motivational values, beliefs and goals. Annual Review of Psychology, 53, 1091–1132.
Enerstvedt, R. (1986). Hva er læring? [What is learning?]. Falken.
Goffman, E. (1969). The presentation of self in everyday life. Penguin Dissertations.
Goffman, E. (1974). Frame analysis: An essay on the organization of experience. Northwestern University Press.
Gottfried, A. (1990). Academic intrinsic motivation in young elementary school children. Journal of Educational Psychology, 82(3), 525–538.
Harlen, W. (2006). The role of assessment in developing motivation for learning. In J. Gardner (Ed.), Assessment and learning (pp. 103–118). Sage.
Hartog, P., & Rhodes, E. (1936). The marks of examiners. Macmillan and Co.
Heidegger, M. (1949). Existence and being. Gateway Editions.
Heidegger, M. (1962). Being and time. Blackwell. (Original work published 1927).
Heidegger, M. (1995). The fundamental concepts of metaphysics. Indiana University Press.
Herd, R. (2016). He waka eke noa. Knowledge in Indigenous Networks. https://indigenousknowledgenetwork.net/2016/06/10/he-waka-eke-noa/. Accessed October 21, 2021.
Hickey, D., & Zuiker, S. (2005). Engaged participation: A sociocultural model of motivation with implications for educational assessment. Educational Assessment, 10(3), 277–305.
Jovchelovitch, S. (2008). The rehabilitation of common sense: Social representations, science and cognitive polyphasia. Journal for the Theory of Social Behaviour, 38(4), 431–448.
Katz-Navon, T., & Erez, M. (2005). When collective and self-efficacy affect team performance: The role of task interdependence. Small Group Research, 36(4), 437–465.
Klenowski, V. (2007). Evaluation of the consensus-based standards validation process. Department of Education.
Kohn, A. (1999, March). From degrading to de-grading. https://www.alfiekohn.org/article/degrading-de-grading/
Kohn, A. (2011, November). The case against grades. Educational Leadership. https://www.alfiekohn.org/article/case-grades/
Kohn, A. (2019, June 15). Why can everyone get A’s? New York Times. https://www.nytimes.com/2019/06/15/opinion/sunday/schools-testing-ranking.html
Kuisma, R. (2007). Portfolio assessment of an undergraduate group project. Assessment and Evaluation in Higher Education, 32(5), 557–569.
Lacan, J. (1977). Écrits: A selection. W. W. Norton and.
Lai, E. (2011). Motivation: A literature review. Pearson. https://images.pearsonassessments.com/images/tmrs/motivation_review_final.pdf. Accessed January 25, 2021.
Lave, J., & Chaiklin, S. (Eds.). (1993). Understanding practice: Perspectives on activity and context. Cambridge University Press.
Lejk, M., & Wyvill, M. (2001). Peer assessment of contributions to a group project: A comparison of holistic and category-based approaches. Assessment and Evaluation in Higher Education, 26(1), 61–72.
Maslow, A. (1943). A theory of human motivation. Psychological Review, 50, 370–396.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan.
Mirnawati. (2012). Kumpulan Pahlawan Indonesia Terlengkap (Cet. 1). CIF.
Nietzsche, F. (1968). The will to power. Vintage Books.
Nordberg, D. (2008). Group projects: More learning? Less fair? A conundrum in assessing postgraduate business education. Assessment and Evaluation in Higher Education, 33(5), 481–492.
OECD. (2011). OECD review on evaluation and assessment frameworks for improving school outcomes: Country background report for Norway. OECD.
Orr, S. (2010). Collaborating or fighting for the marks? Students’ experiences of group work assessment in the creative arts. Assessment and Evaluation in Higher Education, 35(3), 301–313.
Pazos, P., Micari, M., & Light, G. (2010). Developing an instrument to characterise peer-led groups in collaborative learning environments: Assessing problem-solving approach and group interaction. Assessment and Evaluation in Higher Education, 35(2), 191–208.
Pekrun, R., Frenzel, A., & Goetz, T. (2007). The control-value theory of achievement emotions: An integrative approach to emotions in education. In P. Schutz & R. Pekrun (Eds.), Emotion in education (pp. 13–36). Academic.
Pintrich, P., & DeGroot, E. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82, 33–40.
Prøitz, T., & Borgen, J. (2010). Rettferdig standpunktvurdering – Det (u)muliges kunst? [Setting a fair overall achievement grade – The art of the (im)possible]. Norwegian Institute of Innovation, Research and Education.
Rigney, I. (2019). Early childhood education in the region now known as South Australia prior to European settlement/Irabinna Rigney interviewed by Victoria Whitington [video]. University of South Australia. https://unisa.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=7e3d60fc-2999-4300-ab46-aa700078ef36. Accessed October 21, 2021.
Rigney, L.-I. (2023). Global perspectives and new challenges in culturally responsive pedagogies. Super-diversity and teaching practice. Routledge.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom (Vol. 3, pp. 16–20). Holt, Rinehart and Winston.
Round Table on Information Access for People with Print Disabilities. (2019). Guidelines for accessible assessment. The Round Table. http://printdisability.org/guidelines/guidelines-for-accessible-assessment-2019/. Accessed August 8, 2021.
Ryan, R., & Deci, E. (2000). Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 25, 54–67.
Sandemose, A. (1933). En flyktning krysser sitt spor: fortelling om en morders barndom. Tiden.
Sartre, J.-P. (1991). Critique of dialectical reason. Volume 1: Theory of practical ensembles. Verso.
Schunk, D., Pintrich, P., & Meece, J. (2010). Motivation in education: Theory, research, and applications. Pearson Education International.
Sharp, S. (2006). Deriving individual student marks from a tutor’s assessment of group work. Assessment and Evaluation in Higher Education, 31(3), 329–343.
Sjögren, A. (2010). Grading children: Evidence of long-run consequences of school grades from a nationwide reform (Working paper no. 7). Uppsala: Institute of Labour Market Policy Evaluation.
Skinner, B. (1974). About behaviourism. Knopf.
Starch, D., & Elliot, E. (1912). Reliability of grading work in mathematics. School Review, 21, 254–259.
Stiggins, R. (2001). Student-involved classroom assessment (3rd ed.). Prentice-Hall.
Stipek, D. (1996). Motivation and instruction. In D. Berliner & R. Calfee (Eds.), Handbook of educational psychology (pp. 85–113). Macmillan.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.
University of South Australia. (2021). Professor Lester Rigney. https://people.unisa.edu.au/lester.rigney. Accessed October 21, 2021.
Von Krogh, G., Ichijo, K., & Nonaka, I. (2000). Enabling knowledge creation: How to unlock the mystery of tacit knowledge and release the power of innovation. Oxford University Press.
Vygotsky, L. S. (1978). Mind in society: Development of higher psychological processes. Harvard University Press.
Wenger, E. (1998). Communities of practice: Learning as a social system. Systems Thinker. https://thesystemsthinker.com/communities-of-practice-learning-as-a-social-system/. Accessed October 21, 2021.
Wittgenstein, L. (1961). Tractatus logico-philosophicus (trans. D. Pears & B. McGuinness). New York: Humanities Press. (Original work published 1921).
Yunkaporta, T. (2009). Aboriginal pedagogies at the cultural interface [EdD thesis]. James Cook University.

Chapter 5

Assessment as Connoisseurship

I have always been interested in empowering children, releasing their creative potential. But first I had to measure that potential. (Torrance, as cited in Hébert et al., 2002, p. 13)

Abstract In this chapter we ask a simple question: How can creativity be assessed? This entails exploring generative mechanisms and structures supporting creativity, its definition and emergence, and also how it might be assessed. Central to this argument will be the point that tests of creativity are not enough on their own; it is necessary to understand assessment as a form of connoisseurship. As examples we will draw upon the taught subjects of media studies, art and music. Additionally, we will have cause to reflect upon the Future Problem Solving Program as originally developed by Torrance, along with the debate on the gifted and the talented, and how it might give rise to exclusionary practices and forms of elitism.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_5

Is creativity a spark, a sudden idea or hunch, or an adrenaline rush? If it happens, how should we visualise it so it is not lost? Are we able to assess the spark? Should we even try to assess this or wait until it has further crystallised in a visualisation? Consider an example: you are a lecturer knowing the learning objectives and seeking to design a course or the even larger program to which the course belongs. Finding an old envelope, you sketch on the back the spark of an idea, maybe more than one spark, identifying how to design teaching, including the points of assessment. Figure 5.1 shows key programs (to the top left) moving from year 1 to year 3, with some courses that are more advanced (the teeth of the key). In accredited professional programs in, say, teacher training, knowledge introduced in year 1 is revisited at more advanced levels through practice on placements. This is the spiral design (top right). Dobson’s preference is for the treehouse design (in the middle of the figure toward the top) inspired by the twisting architecture of Frank Gehry.1 It emphasises capstones at different levels with the final one like the tree that reaches above others in the jungle, revealing a 360-degree vista. This represents the graduate viewing their journey to the top and now seeing beyond to the post-university world. Simply put, knowledge and skills can be built more horizontally or more vertically. There is a variant not included in this sketch, where the program might be designed so courses from around the globe accumulate toward a degree with multiple credentials, some of which are micro-credentials undertaken online. The assessment points dotted across the different designs on the envelope denote assessment acts that are further informed by sets of criteria, rubrics and other points specifying guidance. But what happens when the lecturer encounters a student who thinks outside the box? What kinds of language games of assessment and forms of life are at play?

Fig. 5.1 Dobson’s sketch of some teaching design ideas on the back of an envelope

Let us take another example: the subject is media studies where the expression of creativity is a desired learning objective along with objectives such as technical proficiency. The specific student task in one part of the course was to make a complete design suite communicating and marketing a message.2 The students attending Norwegian Upper Secondary School on a Media and Communication program had a few weeks to complete the task and were to work alone. The specified top grade had the following rubric: deliver the design suite in a digital and print format; include a motto, vision, goal; design a logo with four stages of simplification; utilise at least four colours and at least two fonts; and produce a brochure. The lower grades used rubrics with fewer criteria containing these elements. All the students were also to submit a two-page self-assessment defending their choices.
What happens when a student fails to meet the criteria for the top grade, but the student’s design suite overall is striking and clearly communicates something special? Put simply, the student scores well on creativity, but more poorly on the technical, skill-based aspects of the task. Would it be fair to give the student the top grade, when they have perhaps deliberately not paid the necessary attention to developing their skill set and meeting the task parameters?

1 ‘In fact, the UTS building was originally scratched onto a napkin with a pen over lunch when he had the simple idea of a treehouse’ (New Atlas, 2015).
2 The example is inspired by the second-year upper secondary school program in media studies and communication in Norway, where the national curriculum goals are specified in the Knowledge Reform National Curriculum from 2006. The teacher in this example further specified the design production goals in terms of communicating and marketing a message.

Let us recall the opening to the book where we noted that some teachers believe they cannot define quality, but they know it when they encounter it in the performance of a student. We also cited Sadler’s astute comment, which we reproduce here:

A work judged as ‘brilliant’ overall may not rate as outstanding on each criterion. This would be necessary, logically and arithmetically, for the work to be assigned the top grade. Conversely, another work that comes out well on each criterion may be judged as only mediocre overall. (2008, p. 165)

These points can be paraphrased in the context of this chapter: teachers might know creativity when they see it, but find it difficult to define it a priori according to its appearance and they may lack ‘ready-at-hand’ assessment criteria to support such an assessment. To say a student has that special ‘X factor’ is to mystify creativity and to retreat from the challenge of both defining and assessing it. There have, however, been numerous attempts to do so throughout history. This chapter sets as its goal understanding some of the generative mechanisms and structures that support (a) varied definitions, (b) the emergence of creativity in an educational context, and lastly, (c) how it might be assessed and by what language games of assessment. Creativity differs according to the context, such that the demands on creativity for the student doing technical drawing will differ from the kind of creativity required of the student doing a piece of creative writing or the young student scientist trying to create an experiment to solve a problem. This suggests that the construct of creativity might be multifaceted and more complex than the simple dichotomy between those who are considered creative and those who are not. However, suggesting that the construct of creativity can be variable rather than constant might invite a form of linguistic fallacy whereby the creativity construct and its definition determine the reality, or more specifically what is seen and measured and what is not. In Chap. 1 we drew a distinction between the transitive world of knowing (epistemological) and the intransitive world of being (ontological). We warned against the linguistic fallacy whereby the intransitive is collapsed and the world becomes discourse and a reality existing outside of discourse is discounted.
The challenge in this chapter on creativity is that when everything seems to rotate around the definition, reality recedes and efforts to formulate the definition to be used occupy centre stage. Thus, how creativity emerges over time and also how it is to be assessed become of secondary importance and the linguistic definition is all-powerful. In this chapter, we resist this temptation, indicative of the linguistic fallacy, by arguing that a focus on the generative mechanisms and structures making possible definitions of creativity and how creativity emerges and is measured in the classroom means that, while the definition is still important, it is balanced by concerns with forces that make all of these things possible. Put in critical realist terms, we are concerned with the role of the real, where generative mechanisms and structures are actualised and give rise to empirical instances of creativity that open themselves to the possibility of assessment by their very existence.

To organise the discussion, we will begin by introducing two main arguments, one elitist and one the opposite, popular and connected with democratic views on education. In the former we find the view that creativity should be reserved for the gifted and that generative mechanisms and structures should be put in place to ensure that this is the case. The well-rehearsed work of Gagné (2004) might be considered an elitist perspective based on the transformation of the gifts of the few into talents through self-management, chance and environmental factors such as parental support, teachers and talent development facilities (e.g. a tennis talent academy for those considered exceptionally promising). The democratic view is diametrically opposite, namely that creativity is something of which all are capable. It is immediately evident that the conception, definition, and generative mechanisms and structures of creativity are different in each of these cases. Put differently, the social construction of the construct is connected with the mechanisms and structures supporting its production. At the outset we will delimit the focus to creativity and its assessment in individuals. We are not addressing the issues of assessing the creativity of societies or groups (Villalba, 2008). PISA surveys could not be used for this purpose because the questions in PISA tests are convergent (one right answer to each question), rather than divergent, and in assessment of creativity divergence is considered desirable (more than one correct answer to a question).

5.1 Creativity, Elitism and Democracy

There is a long line of thinkers who reserve creative acts for the few. The European narrative begins with the ancient Greeks, where poets were able through imagination to conceive and communicate something new. Artists, on the other hand, merely imitated and were subject to rules. Rome permitted the view that artists also created. The term creativity did not emerge until the last Latin poet and theoretician Sarbiewski (1595–1640) wrote that the poet ‘creates anew’ (‘de novo creat’) (Tatarkiewicz, 1980, p. 248). As an example of the view that creativity belongs to the few, more specifically those practising forms of artistic expression, we will refer to Nietzsche. In five not very well attended public lectures in the early 1870s he formulated a line of argumentation that he would return to in one of his later texts, The twilight of the idols (1889/1990). For Nietzsche the problem with German education was that it was not organised in such a manner as to cultivate the gifted and to let them develop their genius qualities. Part of the problem lay in the way teachers responded to the excellence of gifted students: ‘their individuality is reproved and rejected by the teacher in favour of an unoriginal decent average’ (Nietzsche, 1909, p. 24). Nietzsche was of the opinion that the nineteenth-century school system in Germany with its focus upon the masses was to blame; even more so at upper secondary gymnasium level and in higher education. It failed to develop German culture, and most particularly its language, writing and understanding of the Greek classics. Two quotations exemplify his position:

What the ‘higher schools’ of Germany in fact achieve, is a brutal breaking-in with the aim of making, in the least possible time, numberless young men fit to be utilized, utilized to the full and used up, in the state service. ‘Higher education’ and numberless – that is a contradiction to start with. All higher education belongs to the exceptions alone: one must be privileged to have a right to so high a privilege. Great and fine things can never be common property: pulchrum est paucorum hominum [beauty is for the few]. – What is the cause of the decline of German culture? That ‘higher education’ is no longer a privilege – the democratism of ‘culture’ made ‘universal’ and common. (Nietzsche, 1889/1990, p. 74)

The education of the masses cannot, therefore, be our aim, but rather the education of a few picked men for great and lasting works. We well know that a just posterity judges the collective intellectual state of a time only by those few great and lonely figures of the period, and gives its decision in accordance with the manner in which they are recognized, encouraged, and honoured. (Nietzsche, 1909, p. 16)

For Nietzsche, creativity is only found in the gifted and the talented who are nurtured so that they can accomplish great works. Creativity is not connected with the rights of all. His position is therefore in line with the view that generative mechanisms and structures (our terms) must be put in place in the school system to cultivate a gifted elite; and from this elite an even smaller group will come the creative geniuses of any one culture. Education should therefore exclude rather than include. What of today? Cultivating geniuses sounds outdated and old-fashioned. However, the idea of cultivating the gifted and talented is still with us (Gagné, 2004) and not merely with reference to the creation of culture through writing and language, which was Nietzsche’s primary interest. The question is whether those in favour of such a cultivation of the gifted and the talented, and its connection with elitism, also believe that the gifted and the talented are the ones destined to be creative to the exclusion of others. An answer to this might be found in the voluntary associations found today in most countries. They seek to secure the interests of gifted and talented children in schools or outside through extra-curricular activities. It is important to note that many of these associations are strongly supported, if not run, by the parents of the gifted and the talented. The associations often offer advice on how to support gifted children and how the respective education laws and institutions of the country may or may not provide special provisions. The associations repeat what appears to be a kind of set pattern on their webpages: a number of presumed myths and truths considered typical of children who are gifted and talented are stated, along with sets of resources to assist parents. Below are two myths followed by two truths:

All gifted children are high achievers; they don’t have to work hard for exam success.
Gifted children are self-directed; they know where they are heading.
Some gifted children are ‘mappers’ (sequential learners), while others are ‘leapers’ (spatial learners). Leapers may not know how they got a ‘right answer.’ Mappers may get lost in the steps leading to the right answer.
Gifted students may be so far ahead of their chronological age mates that they know more than half the curriculum before the school year begins! Their boredom can result in low achievement and grades. (Dutchtown Elementary, n.d.)

One of the sets of generative mechanisms and structures supporting a belief that creative children are destined to achieve elite status separate from and excluding others relies upon the concept of the gifted and the talented and the role of ‘motivated’ parents supporting their children, including through associations for gifted children. The parents desire to influence where their children are heading, even if the children themselves are not always self-directed and conscious of where they are going. It is also important to be clear that we are not saying these gifted and talented children or their parents are denying the needs of all to be included; their spokespeople often couch their views in the argument that all children, whatever their special needs – in this case being gifted and talented – should have a right to have their special needs met in what otherwise might appear to be a mass somewhat differentiated school system and curriculum. We will return to the question of creativity and the gifted and the talented later in the chapter.

In diametric opposition to an elitist conception of creativity, and by extension the view that a group of the gifted and their talents should be separately nurtured, is the view of creativity as a democratic right, something belonging to and achievable by all. Support for this view can be found in different quarters. For example, the UK National Advisory Committee on Creative and Cultural Education (1999, p. 30) favours a definition of creativity ‘which recognises the potential for creative achievement in all fields of human activity; and the capacity for such achievements in the many and not the few’. Dobson’s one-time tutor for his Magistergrad Degree in Sociology, Professor Regi Enerstvedt, inspired by Vygotsky, Luria and Leont’ev, was fond of reformulating Sartre in tutorials and in his writing. Sartre might have said humans possess one ultimate existential freedom: to acknowledge and act upon choices (good faith) or to refuse to acknowledge and take choices (bad faith).
Enerstvedt, on the other hand, argued that humans are marked by an even more fundamental existential truth, that of undertaking acts of creation. In his words: ‘Sartre says that human freedom has only one limit, namely freedom itself – humans are forced to be free; I say that humans face no limits to their choices, except one: they are always forced to be creative’ (Enerstvedt, 1982, p. 19). It forms one of the main threads in his untranslated magnum opus Mennesket som virksomhet (Humans as active) (Enerstvedt, 1982, p. 19) where to be robbed of the ability to create is indicative of an experience of alienation. In Enerstvedt’s view humans interact with others in subject–subject relations mediated by different material and idea-based artefacts, such that creativity is reliant and dependent upon interactions with others.3 In the elite view and in much of the research on creativity it is the individual who is the starting and final point. Simply put, in the democratic conception of creativity by contrast it is the group and social relations that are the starting point. This is not to say that the elite view totally lacks a social relations perspective and the democratic view totally lacks an individual perspective. It is instead to assert that their point of origin and primary focus remains different. An example of the democratic conception of creativity can be found in team sports such as football, as opposed to an individual sport such as chess. A famous Norwegian football trainer, Nils Arne Eggen (Eggen & Nyrønning, 2003), once formulated creativity in something like the following terms: by all means work on your own area of expertise, where you are strongest and most creative, but never forget to make the others play better. It has passed into common folklore in Norway. Unpacking the saying, in football it might be thought that the main goal is for the player to focus on their own individual skills, but a more important goal lies elsewhere; it is to improve the skills of teammates and, when players do this, the reverse also occurs: all improve as a team and as individuals. In other words, in making them (the others) play more creatively the whole team will notice the results. An additional example of the democratic perspective is found in the desire to assign a single shared grade for a group piece of project work. But, even in such assessment acts, the elite perspective raises its head when there is a desire to assess and grade each group member’s contribution to the group, both in terms of the work process toward the product and in the final product itself (Nordberg, 2008; Lejk & Wyvill, 2001; Sharp, 2006). The main point to keep in mind as we develop our argument is that creativity can have a predominant focus and starting point in characteristics of the stratified self (e.g. the elitist view with its focus on the gifted individual) or alternatively the point of origin of creativity can be located in the social relations level of the social cube and by extension in its institutional levels (the democratic view of creativity). At the level of material interactions, creativity is connected with the manner in which curriculum or other teaching resources are activated and made relevant. Differing institutional mechanisms and structures in different countries will generate and support different combinations and practices of creativity as elite (reserved for the few) or democratic (a right for all).

3 Enerstvedt drew his philosophical inspiration from the early writings of Marx.

5.2 Definition of Creativity

In this section we will follow the line of argumentation introduced above, exploring in more detail the differing mechanisms that give rise to creativity. A definition that anticipates the concern with assessment is as follows:

the persons capable of creative production correspond to a distinctive type of high ability or specific talent. When we refer to creativity, we have in view a complex feature in which there are involved cognitive and non-cognitive [e.g. family, peers, teachers, the unforeseen] variables, but they are both essential for the development of creative thinking. This fact determines us to reflect on the necessity to elaborate some instruments which would allow the identification of the creative subjects, as the results of intelligence or efficiency tests are irrelevant in this. (Fernandez and Peralta, as cited in Cosmovici, 2006, p. 64)

With an emphasis on mechanisms by which students learn as the defining characteristic, Szabos (1989) proposes a distinction between bright learners and the gifted (Table 5.1). In this dichotomy the creative student is defined as the gifted student who is able to move beyond an accurate, technical reproduction. Of note is that the creative learner prefers to interact with those who are older and more adult.

Table 5.1 Differences between bright and gifted learners
Bright learners | Gifted learners
Knows the answers | Asks the questions
Is interested | Is highly curious
Has good ideas | Has wild, silly ideas
6–8 repetitions | 1–2 repetitions for mastery
Understands ideas | Constructs abstractions
Enjoys peers | Prefers adults
Copies accurately | Creates a new design
Absorbs information | Manipulates information
Technician | Inventor
Is pleased with learning | Is highly self-critical
Enjoys straightforward, sequential presentation | Thrives on complexity
Source: Szabos (1989)

Kingore (2004) has argued that this dichotomy is too simplistic. The gifted are credited with wild, silly ideas, but in reality this is a characteristic of creative thinkers and not all gifted students exhibit this character trait. Secondly, Szabos suggests that bright students enjoy straightforward, sequential presentations, when in reality gifted students may also like these kinds of presentations. Thirdly, it is asserted that bright students like the company of peers and the gifted prefer adults. This might imply that the gifted have such poor social skills that they can only communicate with adults, who in their turn might anticipate and accommodate this. Kingore argues instead that it may be more the case that the gifted seek idea-mates rather than age-mates. They might well seek the company of peers if they understand shared ideas. Kingore proposes a tripartite set of categories, whereby creative learners are defined in and through a contrast with significant others, in this case high achievers and the gifted. Put differently, it might appear that the teacher or researcher requires others to define the gifted learner, rather than defining them in themselves. Table 5.2 reproduces 10 of Kingore’s categories, where the last category is named creative thinker. For some readers, this might suggest overly cognitive associations and a neglect of non-cognitive variables. Our proposal to Kingore would be to call this category creative learner, which invites and captures a wider, more inclusive construct, including learning through all the bodily senses. Kingore underlines that a student in a particular context might exhibit learning traits from one, two or all three categories. They are not, therefore, mutually exclusive. Figure 5.2 shows a practical classroom example of these three categories in action. We can see from Kingore’s proposal that creativity is greatest in the creative learner, but the gifted still demonstrate originality, while the high achievers are the most pragmatic and strategic, and seek to remain ‘on task’.
Where the high achiever completes work on time, the creative learner may appear to be marked by idea overload, requiring feedback from the teacher to assist in ‘clearing the forest’ in order to find a way out and move toward the goal. Alternatively, they may reach the goal, but in an unconventional manner. Thinking outside the box they might risk not scoring points according to the set of predetermined assessment criteria, as we suggested at the beginning of the chapter.

Table 5.2 Contrasting the learning of the high achiever, gifted learner and creative learner
High achiever | Gifted learner | Creative thinker
Remembers the answers when asked | Poses unforeseen questions | Sees exceptions
Works hard to achieve | Knows without working hard | Plays with ideas and concepts
Performs at the top of the group | Is beyond the group | Is in own group
Needs 6 to 8 repetitions to master | Needs 1 to 3 repetitions to master | Questions the need for mastery
Is attentive, highly alert and observant | Is selectively mentally engaged; anticipates and relates observations | Daydreams; may seem off task; is intuitive
Learns with ease | Already knows | Questions: What if …
Is a technician with expertise in a field | Is an expert who abstracts beyond the field | Is an inventor and idea generator
Is accurate and complete | Is original and continually developing | Is original and continually developing
Enjoys the company of age peers | Prefers the company of intellectual peers | Prefers the company of creative peers but often works alone
Is pleased with own learning | Is self-critical | Is never finished with possibilities
Gets A’s | May not be motivated by grades | May not be motivated by grades
Source: Kingore (2004, p. 47)

Fig. 5.2 Responses to an assignment. High achiever: ‘What do YOU really want?’; gifted learner: ‘What I would like to do is...’; creative learner: ‘What about...’ (Source: Kingore, 2004)

At this point we wish to summarise our approach to defining creativity in the more elitist, exclusionary sense. This also entails transcending the Gagné (2004) inspired distinction between the gifted and talented. Accordingly, we have gradually introduced a distinction between the gifted and the creative learner, such that they can no longer be seen to be identical. We have also moved beyond a simple one or two sentence definition of creativity and sought to reveal the mechanisms by which students are creative learners. Our argument is that the student who is creative in the elitist sense may or may not actualise and hence reveal a particular learning trait in a specific context. Put differently, we may have defined creative traits as particular learning mechanisms (e.g. playing with ideas and concepts) that are causal and result in the phenomenon (being creative), and by this moved from the empirical to the level of the real, but the student may not reveal this trait in all contexts. This gives rise to obvious challenges in the context of assessment, since the creative person may not respond to the assessment act, failing to produce enough or any of the desired information for the assessor to arrive at a judgement. Creativity may be missed, if for example IQ tests are used, as they are not particularly well-suited to revealing creativity that is not already anticipated in the test questions and pre-agreed correct answers. An answer to this challenge comes with a focus on matching the assessment to the structures supporting the emergence of creative learning mechanisms in different educational contexts.
This might refer to the manner in which the teacher organises classroom activities, such that a more continuous form of assessment becomes possible, for example assessment for learning over a longer period of time to chart the emergence of creativity. It might also entail drawing upon and developing the student’s skills in self-assessment, for example assessment as learning, such that the student can self-assess or peer assess creative acts. We will return to these issues later in this chapter after considering how creativity can be defined in non-elitist, democratic terms.

As we have seen above, creativity tends to be connected in some way to providing novel solutions. In research not restricted to the elitist perspective on creativity a typical definition of creativity underlines precisely this: ‘the power of the imagination to break away from perceptual set so as to restructure or structure anew ideas, thoughts, and feelings into novel and associative bonds’ (Khatena & Torrance, 1973, p. 28). In MacKinnon’s (1978) classic study of architects it was found that creative people scored highly on inventiveness, tackling ambiguity and lack of closure. The trait of novelty was uppermost, along with independence, enthusiasm, determination and the will to work hard. This desire for the novel was understood as indicative of divergent thinking,4 which is thinking that explores new ideas, options and opportunities, and might be seen in embracing risk taking. To be creative seems to be understood as thinking outside of the box, along with a number of accompanying learning mechanisms. However, there is a danger, as we have already indicated, that creativity, understood as something novel, unusual and scoring high on originality might be reserved for an elite who are deemed to have the potential to become creative or to be already creative.

4 The concept of divergent thinking (seeing many solutions to a problem) was proposed by Guilford, an IQ and creativity researcher. He identified four characteristics: ‘fluency (the ability to produce great number of ideas or problem solutions in a short period of time); flexibility (the ability to simultaneously propose a variety of approaches to a specific problem); originality (the ability to produce new, original ideas); elaboration (the ability to systematize and organize the details of an idea in a head and carry it out)’ (New World Encyclopedia, 2018).

So, how might a democratic view of creativity be defined to include all? Our argument is founded upon three mechanisms of learning we consider important in this respect. They are selected because they mirror our interest in the processes by which learning takes place and take account of the four-planar social cube (of material transactions with nature, social interaction between agents, social structures influencing social relations through power, discourse and norms, and the self laminated or structured by consciousness, unconsciousness and affective forces).

The first learning mechanism is the action of making meaning. In an essay entitled ‘The pedagogue as translator in the classroom’ Dobson (2012) was interested in the manner in which students learn through acts of translation. Knowledge is not just communicated by the teacher in a neutral, mirror-like manner. A teacher will talk and demonstrate in their own particular manner, to ‘leave his or her mark upon the original and mark their difference’ (p. 277). Similarly, the student does not simply ‘bank’ the knowledge in a safe deposit box as if it were share certificates in a company. As they learn they must translate the knowledge, giving it a personal mark, making it mean something for them. This making of meaning is a creative act. As Kiraly has put it: ‘All input from the environment, including a teacher’s utterances, will have to be interpreted, weighed and balanced against each learner’s prior knowledge’ (2003, p. 10).
To make meaning is thus a creative act of learning whereby the knowledge is translated to mean something for the student; it gains a personal mark of significance. This experience is open to all, from the child reading a sentence for the first time to the university undergraduate who, after many years of study in primary and secondary schooling, finally manages to understand the maths equation.

The second learning mechanism is about the function of the creative act and how the function directs the creative learning. It can be instrumental in the sense of seeking to resolve a problem. Guilford highlighted this point:

problem solving and creative thinking are closely related. The very definitions of these two activities show logical connections. Creative thinking produces novel outcomes, and problem solving involves producing a new response to a new situation, which is a novel outcome. (1977, p. 161)

But, once again, problem solving as a creative act can be open to all as students seek to make meaning. To be clear, the outcome is novel for the person concerned and does not have to be novel in the sense of the problem never having been resolved before. It is also important to underline that the creative act can also have a non-instrumental function. Put differently, it need not be adaptive to reality. This is reminiscent of creativity as a form of play, where the goal lies within the activity and not outside it, as in instrumental activity. It can occur when the student experiences activity that simply flows (Csikszentmihalyi, 1990) without boundaries in time or space and creativity seems to be present without effort. When the creative act is desired and part of an instrumental act (e.g. problem solving), the outcome is in focus. If on the other hand, the process is in focus in the creative act, then a prior goal to provide direction may not exist and is not considered necessary. This sounds counter-intuitive, but what we are seeking to communicate is that the process can be novel and creative, even when the product or goal is not.

The third learning mechanism at play is to do with creativity as an identity-enhancing experience, such that the stratified self is enriched. It can be likened to a feeling of surprise as a student senses that they have ‘overcome themselves’ (Dobson et al., 2006), reaching at the same time a new self-awareness and sense of mastery. This can also be a shared experience; it is not necessarily restricted to the individual student. Moreover, it is something that all students can potentially experience. Overcoming the self requires that the student is willing to cultivate what Goffman (1967, p. 152) called chanciness and at the same time relax the hold normally kept on maintaining rather than developing self-identity: ‘Chanciness … [the] individual must ensure he is in a position (or forced into one) to let go of his hold and control on the situation, to make in Schelling’s sense a commitment. No commitment, no chance taking.’ A good Nordic example would be strapping on cross-country skis for the first time and skiing through the forest on a frosty morning. By the end a few stumbles may have occurred, but you have picked yourself up and carried on. Your sense of identity and mastery is enriched. You took a chance. Cross-country skiing does not have the same entry level requirements, often economic, as downhill skiing: the ski-boots, skis and poles are less expensive; a ski-lift card is not required, as the track can be in a snowy forest where the summer walking path between trees is now covered by snow; and lastly, no special lessons are required before starting. Figure 5.3 illustrates the three mechanisms and the learning processes with which they are connected.

Fig. 5.3 Creativity as learning processes connected with meaning making, identity enhancement and function. Figure elements: meaning making (e.g. translating and making personally significant); function (goal directed vs process); identity (e.g. ‘self-overcoming’, ‘chanciness’, ‘mastery’)

The mechanisms of this tripartite understanding of creativity, including all students, are to some extent mirrored in Rhodes’ (1961, p. 306) conception of creativity. Although, having said this, she gives a slightly different content to the person (identity in our conception), she has the added point of context, or press as she calls it, and she lacks meaning making as a mechanism of creativity. Her multifaceted conception5 identifies four components, commonly known as the 4 P’s: • person (personality characteristics or traits of creative people) • process (elements of motivation, perception, learning, thinking and communicating), which constitutes the non-instrumental conception of function in our theory • product (ideas translated into tangible forms), constituting the instrumental understanding of function in our theory • press (the relationship between human beings and their environment). The importance of context, understood as press above, is however present in our conception on three counts: firstly, the functional mechanism of process and product takes place in the space of the social cube where interaction and socially constructed institutions are present. Secondly, for the stratified self, experiencing new self- awareness need not be a solitary endeavour. It can be a shared experience that reflects the social context of the creative act. In broad terms the interaction between person and press, or between identity and the social function in our conception, entails ‘recognizing that creative behavior is influenced by motivational as well as situational factors’ (Rhodes, 1961, p. 306). Lastly, identity includes what is commonly called biographical events. This refers to things that happen or are experienced during one’s life that can result in creative acts. These happenings and experiences always take place within a context and not in a vacuum. 
Thus, when Davis (1998) asserts that participating in theatre or having an imaginary friend are predictive of later creativity, these activities are anchored in contexts that support their emergence. In a democratic sense, it might be formulated as ‘life comes in the way’, and a person shows their true character and integrity by the manner in which they (re)solve the challenges thrown at them. Summarising our argument so far in this chapter, creativity can be defined in terms of mechanisms of learning that seek to make it exclusive and generative of elitist positions in society. According to this conception a number of sub-categories are possible, each with their own mechanisms of learning and associated language games of assessment and forms of life (concepts introduced in Chap. 6 which offers a critical appreciation of the work of Sadler on assessment): high achievers, gifted learners and creative learners. Genius, Nietzsche’s favoured concept, is less fashionable today in some circles. Alternatively, creativity can be defined according to learning mechanisms that are more inclusive, where meaning making, function, and a sense of personal and shared identity give rise to mechanisms of learning.

5 Another multifaceted conception is given by Treffinger et al. (2002) who identify four interdependent components: generating ideas (e.g. divergent thinking or creative thinking abilities and metaphorical thinking or making new connections); delving deeper into ideas (e.g. analysing, synthesising, reorganising or redefining, evaluating, seeing relationships, desiring to resolve ambiguity or bringing order to disorder); openness and courage to explore ideas (includes some personality traits relating to one’s interests, experiences, attitudes and self-confidence, along with sensitivity, aesthetic sensitivity, curiosity, sense of humour, playfulness, fantasy and imagination, risk taking, tolerance for ambiguity); lastly, listening to one’s ‘inner voice’ (e.g. traits that involve a personal understanding of who you are, a vision of where you want to go, and a commitment to do whatever it takes to get there). The characteristics for the category called listening to one’s ‘inner voice’ also include awareness of creativeness, persistence or perseverance, self-direction, introspection and work ethic.

5.3 The Emergence of Creativity in an Educational Setting

If creativity is considered to be inherited, a form of natural talent, then there is little point in education to support its emergence. Even Nietzsche (1909) believed that the gifted student needs to be first educated in self-discipline and the basic skills of the subject before they should be allowed the freedom to develop their creative skills and knowledge. In this section we will explore how creativity emerges in students, with a particular focus upon examples of structures which must be in place over a longer period of time. Syed (2010) in a book entitled Bounce: How champions are made charts and at the same time synthesises ideas on how champions in sport are made. He joins a long list of popular writers looking for the clue to success, creativity and winning. Other examples are Coyle’s book The talent code: Greatness isn’t born, it’s grown (2009) and Colvin’s Talent is overrated: What really separates world-class performers from everybody else (2008). Syed reflects upon research about how young aspiring musicians realise their talent. Why, asks Syed, is it the case that from a group of highly motivated and skilled young musicians chosen in their early teens to attend prestigious academies, only a handful become successful in cultivating their creative skills? He identifies three important ingredients. Firstly, many hours of practice are required. The view popularised by Ericsson et al. (1993) and Gladwell (2008) about 10,000 hours of practice in a chosen skill springs to mind. Jimi Hendrix told of his endless hours of practice as a young teenager; David Beckham spent hours on the football field practising, long after his peers had gone home. Structures have to be in place that support such practice, including materials such as equipment, available time making it possible to pursue the interest and, most importantly, the practice must be goal-directed toward improvement.
It is not simply the kind of practice associated with attending a training studio several times a week. In critical realist terms this is the material aspect of the social cube, encompassing material interactions with nature. The second element that must be present is a tutor or coach who can advise and teach. In the context of creativity in the classroom, the teacher fulfils this role. The structure of the relationship of the student to the tutor/coach is one of a mentee toward their mentor. It incorporates the social interaction level of the social cube and becomes institutionalised in social roles on the next level of the social cube. Torrance (1981), in his longitudinal studies of creativity, found that the presence of

a mentor was a good predictor of creativity among students, especially with respect to the quality of the acts of creativity. More can be said about mentors as we move to the third significant structure that must be present to inculcate and support the creativity of students. In addition to tutoring and 10,000 h of goal-directed, deliberate practice, Syed argues that the mentor must set challenges that move the aspiring student to continually strive to break new ground. Put differently, the student must desire to push themselves at the upper end of what they can, reaching beyond the comfort zone. The stratified self belonging to critical realism’s fourth component of the social cube must therefore be motivated. Vygotsky’s (1986) zone of proximal development is relevant in this respect, whereby the teacher as mentor helps the student reach a new level today, which tomorrow they will reach without assistance. Epstein (2019) in his bestseller Range: How generalists triumph in a specialised world contends in opposition to Syed that it is not specialists we should focus upon, but the opposite, who are more often than not neglected, namely, the generalists who sample much before deciding on a specialism for deliberate practice. From sport he highlights how the dedication of golfer Tiger Woods from an early age must be balanced against the manner in which the tennis player Roger Federer sampled many sports before circling in on tennis. I (Dobson) recall that my father worked in a copper mine in Zambia in the 1960s. To become general manager of the whole mine he recounted that it first required working in all the departments. It was a prerequisite for building up knowledge and experience in all aspects of the mine’s activities including production, HR, safety, finance, engineering, maintenance and so on. The function of creativity in such a conception of the mentor is understood in instrumental terms. 
But, as we argued above, there are also non-instrumental conceptions of creativity, and so too with respect to mentoring. Here we are thinking of what happens when the student is in the presence of an expert musician or artist as they demonstrate their skills. It is not a one-off or infrequent presence, for example at a concert they are holding or a master class. We have in mind a longer time scale, perhaps even acting as the apprentice or assistant to the mentor over several months or even years. The role of the student to the teacher in the classroom can also be one of an apprentice to a master over many years. The following is another example: once I (Dobson) came across a colleague who was studying a business doctorate in Spain while himself living abroad. I was told in a manner reminiscent of a figure in Kafka’s The trial (1956) that he would soon have enough professors on his side so his dissertation could be passed. We have never investigated the truth of the colleague’s account and what it might mean for the transparency, not to say the least the validity, of a doctorate assessed on the basis of support from a group of professors. But it can be understood in other terms, as an expression of a doctoral student who has completed their apprenticeship with the professors and the time has arrived for a rite of passage as they cross to the community of scientists. The mentor does not so much impart a distinct set of skills, even though they may be learnt. They rather model a way of approaching a new, unforeseen

challenge. This way of seeing constitutes a form of tacit knowledge. It is not easily made explicit and verbalised, as is the case with the instrumental function of the mentor. This tacit tradition is described in the well-known work of the Dreyfus brothers (1980), as they detail the move from beginner to expert. What once had to be learnt consciously becomes automated and part of the body, almost instinctive or a reflex. Understanding Syed’s point in the context of creativity, what is required are structures that support sustained practice over a long period, mentoring and the setting of rising challenges at the upper end of a student’s level of developing competence. In adding that non-instrumental components are important, we are moving toward our next argument, namely that tacit components also have to be both learnt and mobilised. They have to be incubated and it is possible to envisage a classroom pedagogy that has precisely this as its goal: warming up an array of skills, some tacit and some explicit, all coming together later in the moment of creative activity. Torrance proposed the incubation model of teaching creativity along with the Future Problem Solving Program. The model he developed in 1966 was applicable to the gifted and talented, as well as to other students. In this sense it was both elitist and democratic in ambitions and intent. Torrance’s interest in the field of creativity can be traced to his work many years earlier as a young teacher with only a 2-year provisional teaching certificate. In one particular year he had faced difficulties in teaching two of his teenage students. They would later become successful professionals, one as Secretary of Labour in President Ford’s Cabinet. But in the school system what he sensed were emerging creative skills that were ignored and considered problematic. The school lacked a structure capable of offering suitable support. The incubation model was Torrance’s proposal for providing such a structure. 
He developed guidelines for teachers where the student was to be warmed up prior to creative thinking by undertaking activities connected with heightening anticipation (e.g. create the desire to know; heighten anticipation and expectation; get attention; arouse curiosity; tickle the imagination; and give purpose and motivation), followed by activities designed to deepen expectations (e.g. dig deeper below the surface of the topic to discover what is hidden and unexplored; look twice at a problem in search of new associations; and pick out crucial information and discard the unneeded). One of the authors, Dobson, is reminded of his father who took early retirement with a slump in the granite stone-quarrying business after all the major motorways had been built in England. He began painting, adopting the ‘painting by numbers’ technique (Seelye, 2019). To begin with it was to pass the time. After a while and through much patience and practice he became an accomplished painter and transcended the need for ‘painting by numbers’ templates. He was able to sell his watercolour seascapes in local galleries to tourists. Before starting every day, he would loosely sketch anything that came to mind. He would always crumple up the attempt and throw it away; sometimes without even looking at it. This was his period of incubation, warming up for the creative ‘aha’ to take place. Thereafter, it is necessary to extend the process of creativity, fleshing out the details and learning new aspects of the ‘aha’. This is the ‘keeping it going’ stage. Torrance and Safter (1990) suggest activities such as ‘having a ball’ in

the sense of making sure the activity continues to be enjoyable and full of humour, ‘singing in one’s own key’ (letting students explore how the newly gained wisdom can be made fruitful for their own personal challenges in life) and ‘shaking hands with tomorrow’ (students must use the creative ‘aha’ to propose a solution to a future problem). The incubation model is nothing new. In 1926 Wallas already understood creativity as an emergent process characterised by four stages: making preparations, whereby the problem was detected and data was gathered; incubation as the student steps back for a moment or longer period of time; illumination when a new idea or solution emerges, sometimes in a surprising or unexpected manner; and lastly, verification as the new idea or solution is examined or tested. What characterises the views of Torrance and Wallas is goal-directedness in the sense that the student is directed toward a problem that requires resolution. In our view, what Torrance and Wallas are striving towards, but do not always realise, is the following: the student is not so much directed to the problem, but the problem draws them in and controls what they are called upon to notice. Educationalists are familiar with the former strategy, which is often called problem- based learning (Hmelo-Silver, 2004). It does tend to begin with a pre-given problem, rather than the problem emerging. Students either alone or in groups are set the task of discovering and exploring different solutions. It is worth keeping in mind the advice of Blaimaire and Peterson (2004, p. 152): ‘engagement with prior knowledge, and linking that to purposes and values, can fuel the imaginative grappling with problems that can become creative’. Torrance developed the Future Problem Solving Program (FPSP) in the 1970s to assist all students, both the so-called gifted and regular students. 
The goal was for them to learn how to think more creatively and not merely to learn what they think they have to think. His intention was to marry the idea of creative thinking with a future orientation. Central to the program is the view of a problem requiring resolution. Torrance’s own definition of creativity is pertinent:

Creativity is a process of becoming sensitive to problems, deficiencies, gaps in knowledge, missing elements, disharmonies, and so on; identifying the difficulty; searching for solutions, making guesses, or formulating hypotheses about the deficiencies; testing and retesting these hypotheses and possibly modifying and retesting them; and finally communicating the results. (1974, p. 8)

His original program has grown and been adopted by many states in the USA, as well as in many other countries.6 Schools taking part are able to focus upon the four problem topics proposed by the program for that year. Students working in groups dig deep into present day issues or dilemmas connected with the problem and adopt a future-oriented perspective. Six stages are followed: (a) identify challenges in the future scene, (b) determine the underlying problem, (c) produce solution ideas to the underlying problem, (d) generate and select criteria to evaluate the solution ideas, (e) evaluate the solution ideas to determine the best action plan, and (f) develop a plan of action. The program also caters for those wishing to apply the six-stage process to a pressing community problem, or those wishing to write a futuristic scenario on one of the annual FPSP topics. With the program’s expansion over the years both in the USA and abroad concerns have been raised about the training of coaches and the feedback received by participants (Treffinger et al., 2012).

6 To name some: Australia, Canada, Great Britain, Hong Kong, Japan, Korea, Malaysia, New Zealand, Portugal, Russia, Singapore, Turkey and India.

The incubation model and the Future Problem Solving Program, today known as the Future Problem Solving Program International (FPSPI), are two examples of structures that can be put in place to support the emergence of creativity. It is, however, important to note that what disappears in these approaches is the non-instrumental view of creativity introduced earlier in the chapter: creativity as a form of play where the activity itself is in the centre, rather than achieving some goal beyond the activity. When truly lost to play it takes over; we are totally engrossed and the experience of play is effortless. Dobson’s friend the Norwegian educational philosopher Kjetil Steinsholt (inspired by Gadamer 1960/1989) talks of how the structure of ‘play plays you’ rather than ‘you playing play’. In Kjetil’s own words along with a co-author:

“It is the play that takes precedence. It is the master. The game has priority in relation to the consciousness of the players. It has its own active life and draws whoever plays with it into a surprising and exciting play world. Play is therefore to a lesser extent something we do. Rather, it is something that is done to us. We are talking about an event that we are caught up in. In all play, the player is played.” (Steinsholt & Ness, 2016: 163)

However, one caveat is worth noting: even though we might be enthusiastic about the opportunities for creativity in play, we must also be aware that some play is extremely rule-governed and the opportunity to modify the rules of play can be limited. In play-inspired situations not everything can be voiced or must necessarily be voiced in a clearly defined and delimited problem-based sense. Not everything can be forced into a predefined box, waiting for a set of assessment criteria to be developed, piloted and used. This returns us to the point raised at the beginning of the chapter. When the student thinks outside of the box, when they are creatively thinking beyond the grade and the parameters of the set task, in the sense of the creative learner as defined by Kingore who seems less interested in the grade and demonstrating mastery, something happens. Assessment becomes less straightforward. After a meandering journey, we are now in a position to approach the goal of this chapter: exploring the assessment of creativity.

5.4 Assessing Creativity

We have hesitated to adopt a single definition of creativity universally applicable to all groups, which seeks ‘to recognise or identify creative characteristics or abilities among people or to understand their creative strengths and potentials’ (Treffinger et al., 2002, p. xi). Instead, we have adopted a critical realist stance to identify the generative mechanisms and structures that make it possible for different kinds of

creativity to exist. Put differently, we have sought to explore the processes that make creativity identifiable on the personal and institutional level. With this in mind we have argued for two broad mechanisms and accompanying structures: one resulting in creativity as elitist and exclusionary, and the other the opposite, namely creativity as achievable and a possibility for everybody. When assessing creativity two issues are at stake and highlighted by many. On the one hand, assessors must select those considered to already possess or demonstrate the potential to be creative. On the other hand, they must look at and make an assessment of the creative process and its product. We consider both these issues are equally relevant to the elite and democratic conceptions of creativity and we will explore each of these in turn. So-called expert-informed opinions can be used to select creative students, but often this is undertaken by teacher nomination (Lassig, 2009). However, as Kim has noted: teachers are apt to identify students who are achievers and teacher pleasers as gifted rather than creative students who may be disruptive or unconventional. Even worse, energetic and unconventional students can be seen as having Attention Deficit Hyperactivity Disorder (ADHD) by their teachers. As a result of scholastic expectations and the needs of creatively gifted children, the potential of creatively gifted students may be overlooked by teachers who view them as ‘trouble-makers’ rather than successful young scholars. (2006, p. 9)

Put simply and to recall Kingore’s (2004) distinction, there may be a tendency that mechanisms and structures of teacher selection favour high achievers achieving the highest grades, as opposed to those who might be gifted and creative learners. What of using IQ tests? In the 1960s the use of IQ testing was popular; those scoring more than 130 were considered to be gifted. But when creativity refers to a broad set of skills the IQ test can be too narrow with its measures of verbal and non- verbal reasoning skills. Moreover, research suggests that a minimum IQ exists as a threshold, in the region of 120, for scientific inventions or elaborations (Getzels & Csikszentmihalyi, 1976). But beyond this the relationship disappears: those with higher IQ scores are not necessarily those who are most creative. Renzulli (1978) makes a similar point in his view that IQ is an insufficient measure of giftedness; above average ability, task commitment and creativity are important. To avoid missing students by relying solely on IQ scores, and to reduce the bias of teacher nominations, an additional selection instrument might be used. A number of tests for creativity exist. Here are some examples, more or less randomly selected that consider the following topics: Test of Creative Potential (Hoepfner & Hemenwaz, 1973) and Exercise in Divergent Thinking (Williams, 1980). We have chosen to consider Torrance’s test on the grounds that it is well-known and has been widely used. He developed what has come to be known as the Torrance Test of Creative Thinking (Torrance, 1966). The origins of the test can be traced to 1958 when Torrance and colleagues were engaged in developing a longitudinal test of various kinds of giftedness among children. He chose to focus on measuring creative skills. Many have regarded him primarily as a psychometrician, but his interest lay in nurturing creativity, as evidenced in his incubation model. 
He thus regarded tests as a means to measure creativity, not as an end per se. He proposed criteria to select appropriate activities for the test participants. Each activity was to be:

Fig. 5.4 Image of a jellybean

(a) a natural, everyday process;
(b) suitable for all ages and levels of education from kindergarten to graduate;
(c) easy enough for the young or those with learning difficulties to make a response, yet challenging enough for the most skilled;
(d) unbiased with respect to gender and race, and open-ended enough to permit responses from those with different experiential backgrounds; and lastly,
(e) fun. (Torrance, 1966, p. 10)

Points (c) and (e) are important because they indicate that the tests are to be inclusive for different kinds of students and, additionally, not necessarily a source of anxiety. With colleagues he developed two sets of activities, one verbal and the other figural. The former consists of five types of activities: ask-and-guess, product improvement, unusual uses, unusual questions, and just suppose. For each the participant was shown a picture and asked to respond in writing. The latter figural activities, of 10 minutes each, consisted of picture construction with a jellybean or pear as stimulus; using incomplete figures to make a picture; and repeated figures, specifically lines or circles to be used to form a picture (Torrance, 1974, 2001). Respondents are encouraged to give unusual and at the same time detailed responses and above all to make meaning. This last-mentioned point echoes our earlier point about making meaning as a creative act (Fig. 5.4).

Torrance explored different ways of scoring the tests. To begin with he was inspired by Guilford’s (1956) view that creativity demonstrates divergent thinking in four directions: (a) fluency – including the number of relevant responses, (b) flexibility – the number of different categories/types of responses, (c) originality – the number of unusual yet relevant ideas, and (d) elaboration in terms of the details used to develop an answer. In the 1980s, he streamlined the scoring system (Ball & Torrance, 1984) of the figural test to five norm-referenced scores and 13 criterion-referenced scores7 and the verbal test was simplified to fluency, flexibility and originality to enhance inter-rater reliability. The validity of the Torrance test has been assessed on longitudinal data originally collected by Torrance in 1958.
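As a brief aside, the four scoring directions attributed to Guilford above can be made concrete in a minimal Python sketch. It assumes that a human rater has already judged each response for relevance and unusualness, assigned it a category and counted its elaborating details. The `Response` fields, the `score_responses` function and the sample answers are hypothetical illustrations, not the published scoring procedure of the Torrance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One answer to an open-ended task, annotated by a human rater (hypothetical fields)."""
    text: str
    category: str   # rater-assigned type of response
    relevant: bool  # rater judges the answer relevant to the task
    unusual: bool   # rater judges the answer unusual
    details: int    # rater's count of elaborating details


def score_responses(responses):
    """Tally the four directions; only relevant responses are counted."""
    relevant = [r for r in responses if r.relevant]
    return {
        "fluency": len(relevant),                             # number of relevant responses
        "flexibility": len({r.category for r in relevant}),   # number of distinct categories
        "originality": sum(1 for r in relevant if r.unusual), # unusual yet relevant ideas
        "elaboration": sum(r.details for r in relevant),      # assumed: summed detail counts
    }


# Invented answers to an 'unusual uses' style task (illustrative data only).
sample = [
    Response("doorstop", "holding things", True, False, 0),
    Response("paperweight", "holding things", True, False, 1),
    Response("grind it into red pigment for painting", "art material", True, True, 2),
    Response("it is heavy", "none", False, False, 0),
]

print(score_responses(sample))
```

A tally of this kind only adds up what the rater has already decided; the judgements of relevance, category and unusualness remain qualitative.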
A study of the original participants 40 years later showed that the Torrance test was a better predictor of creativity than an IQ test (the Stanford-Binet Intelligence Scale, Wechsler Intelligence Scale for Children, or California Test of Mental Maturity) (Plucker, 1999). Some commentators, however, have criticised the Torrance test for over-relying on the fluency factor (Kim, 2006). Torrance warned against viewing the test results as a static measure of a student’s creativity. They were to be used to focus on the nurturing of creativity. Secondly, even though the tests have been predominantly used to select the gifted and talented, resulting in what we have described as a process of exclusion and the cultivation of an elite, Torrance envisaged a different goal. The tests were to make it possible to individualise and adjust teaching according to the needs and strengths of the individual student. Thirdly, it might be argued that there is limited justification in regarding creativity as composed of a number of traits, such as fluency, originality, elaboration and flexibility (Hassan, 1986). Against this, it can be countered that if a person scores well on the other traits they will also score well on originality (Torrance & Safter, 1999). Lastly, Torrance was also aware of the fact that, just because a student scored well on the test, it did not guarantee that they would act in a creative manner. The test only indicated a heightened possibility of creativity. With this we reach an important limitation of creativity tests, and arguably of many other forms of assessment. The student might not exhibit creativity outside of the test or, if they do, it might not be to a level in accordance with their test score. In other words, a question mark is placed over the validity of such tests (Starko, 2005, pp. 423–425).

7 The norm-referenced scores measure fluency, originality and elaboration, with the addition of abstractness of titles, as a verbal measure on the figural tests, and resistance to premature closure, as a gestalt measure of a person’s ability to stay open and tolerate ambiguity long enough to come up with a creative response. Flexibility scoring was eliminated because it correlated very highly with fluency. Norms have been developed in the USA and for a number of other selected countries by local test users (e.g. France, Turkey, Taiwan). The 13 criteria measure creative strength and sought to capture manifestations of creativity missed by the norm-referenced scores: emotional expressiveness, storytelling articulateness, movement or action, expressiveness of titles, synthesis of incomplete figures, synthesis of lines or circles, unusual visualisation, internal visualisation, extending or breaking boundaries, humour, richness of imagery, colourfulness of imagery, and fantasy.

With this in mind we want to propose an understanding of assessment as a form of connoisseurship. This entails moving the focus to an assessment of the process and product of creativity as it actually takes place, in what might be an educational or classroom setting. It is not in the experimental setting of a test environment.
It might be charged that seeking to assess the process and product of the student’s creativity is an even more authentic form of assessment than the Torrance Test of Creative Thinking. Where the topic is art, drama, dance, music and media studies it is not unusual to come across assessors who practise assessment as a form of connoisseurship. This is in line with the view that these subjects are traditionally where creativity is considered most evident and where assessment therefore requires connoisseurship. However, we would assert that the view of assessment as a form of connoisseurship should not be limited to aesthetic subject areas and is relevant to other subjects as well. As we have argued above, a student in a subject such as natural science or a vocational subject such as carpentry can be regarded as creative when they solve a set problem: they have created a meaning that holds personal significance for them and they feel a sense of mastery. In an action research project in New Zealand with science teachers and university lecturers the goal was to explore structures that might support exactly this kind of creative expression amongst 8–11 year olds planning to resolve a number of science problems (Te Ture et al., 2006). An example of one of the problems is: Design a fair test that would help you to find out which of three types of material used in shirts (denim, sweatshirt material and wool) is best for keeping something warm. They concluded that five small measures might foster creativity: clarifying the purpose of the investigation, suggesting ways that students might be creative, displaying the range of the manipulating variable equipment,

modelling exemplary investigative plans and providing opportunities for students to collaborate and negotiate during investigative planning. Another example, also from New Zealand, concerns 150 students in Arrowtown Primary School who worked on a deep learning project for a term (Ministry of Education, 2020). Deep learning has been launched by Fullan et al. (2018) as a way of engaging students more deeply with learning across different subjects, as they build a series of competencies known as the 6 C’s (citizenship, character, collaboration, creativity, critical thinking and communication). The students sought to find a creative solution with a local ski-field operator who wished to offer night skiing to the local community. The solution they proposed entailed light strips attached to the skiers (Fig. 5.5).

Fig. 5.5 Arrowtown students’ creative solution for night skiing. (Source: Ministry of Education, 2020)

In what follows we will present and argue for connoisseurship that yields an understanding of assessment of creativity. When we assess creativity we would like the student to be working on rich assessment tasks8 that generate a broad spectrum of information to assist in forming the basis for the assessment judgement. Consider an example to illustrate from the sporting world. The Williams sisters are playing tennis against each other in a final. Serena begins to wind up for her serve, bending her body in a graceful manner, her arm stretching to the sky after the ball she has thrown up vertically. As she does this Venus knows where the ball is going. It is not that they are sisters and have spent countless hours practising against each other. It is rather that she makes an assessment based upon reading her sister’s body

language, knowledge of the previous shot, whether it is a grass or clay court, whether her sister is tired, whether it is the first or second serve and where Serena is looking, knowing full well that she may be disguising her look. The expression ‘chunking’ describes the manner in which all this information is grouped together for the assessor to sift through and assign a value in a split second. Syed (2010) restricts the term ‘chunking’ to perceptual information, but arguably it is more than perceptual information; it includes history and experience. When a teacher enters the classroom they begin with the same thing: collecting different forms of perceptual information and drawing upon their history and experience of teaching the students in the particular class: A is alert today, B seems to have mislaid his books, C is looking drowsy, possibly gaming too much in the night.

8 ‘Certificate assessment tasks are “rich” if they provide assessment information across a range of course outcomes within one task, optimising students’ expression of their learning’ (Plummer, 1999, p. 15).

5.5 Assessment as Connoisseurship

To assess in the manner of a connoisseur is to act in much the same manner. It is to collect a wealth of information about the student’s way of working on a task (process) or the product. Sadler’s distinction between holistic and analytic assessment is useful, where holistic assessment acts are akin to acting in the manner of a connoisseur. He suggested, in a well-known essay in the assessment literature (Sadler, 1989), that when examiners make qualitative judgements on the finished product or process of a student, two forms of judgement are possible. The judgements are either analytical (working with pre-specified criteria and searching for evidence of them in the work of the candidate) or configurational (with a holistic assessment first, and then substantiating it by referring to criteria which may or may not overlap with each other).[9] With configurational judgements, the criteria exist as a tacit pool of potential criteria, from which examiners select according to their relevance to the case at hand. In other words, they become manifest when useful. Another way of putting this is to say that the assessment acts of the connoisseur draw upon a reservoir of experience, including tacit experiences, that is made explicit when required to inform the judgement and provide a rationale. In later work, clearly inspired by Wittgenstein, Sadler has called this a form of ‘noticing’, to which students do not always have access. This may be because they do not always notice what the teacher expects or because the teacher has not shared these expectations. Sadler formulates this ‘noticing’ in the following manner: ‘These particular “noticings” are associated with criteria (often drawn from an undefined pool of potential criteria) which then become useful both in shaping the assessor’s emerging perspective, and in constructing feedback comments’ (Sadler, 2010, p. 546). Of course, it is debatable whether making tacit experience explicit (‘constructing feedback comments’) is as easy as Sadler’s quotation suggests. In many cases the connoisseur can be likened to the skilled craftsperson who demonstrates to the apprentice and they do not have to talk; seeing is the form of communication and is considered sufficient.

[9] In more detail (Sadler, 2008, p. 3): ‘In holistic (also called global) grading, the assessor progressively builds up a complex mental response to a student work. This involves both attending to particular aspects that draw attention to themselves, and allowing an appreciation of the quality of the work as a whole to emerge. The appraiser then makes a qualitative judgment as to its overall quality, and maps that judgment directly to the appropriate point on the grading scale. In addition to assigning the grade, the assessor may provide a rationale for it, perhaps in summary form for the work as a whole, or as running comments on various features of the work. Rationale and feedback statements necessarily invoke one or more criteria, because criteria are constitutive elements of all evaluative explanations or advice. In analytic grading, criteria play a clear front-end framing role. In holistic grading, the assessor’s emergent global judgment dominates. In principle at least, the global judgment is made first; references to criteria follow from reflection on that appraisal.’

Sadler draws attention to the concept of the connoisseur as a skilled craftsperson who can also provide a rich understanding of the judgements required to make an assessment of creativity. This echoes the work of his predecessor Polanyi (1966), who posited that in consciously repeating an action normally undertaken in an automatic manner, its tacit character and also its criteria for success become understood. Following this line of argument an assessor undertaking the same assessment act on repeated occasions would be able to draw upon tacit criteria in the manner suggested by Sadler.[10] But the assessor may not explicitly communicate them in words, leaving much in unspoken practice, and ultimately the student may be left uncertain about the foundation and process by which the assessment judgement is made.

Orr (2007) in her research on assessment in the teaching of fine arts was also interested in connoisseurship, conceptualising it as a form of guild knowledge, or ‘insider knowledge’ possessed by the teacher. It represents a tacit ‘feel for the game’ that is not easily articulated. Assessment is not, however, arbitrary or idiosyncratic.
It is grounded in the social context and draws upon, among other things, corporeal forms of expression. She quotes favourably another researcher’s study of the art critic: ‘staff sometimes demonstrated a difficulty in articulating their opinions and values … they would resort to the use of imprecise and general terms, unconsciously relying on their accompanying non-verbal and gestural behaviour to convey meaning’ (2007, p. 21). While a techno-rationalist view of assessment based upon clearly defined and delimited rubrics might seek to make all things transparent to the student, the connoisseur, on the other hand, would understand that not all can be made transparent and verbalised, as evidenced in Orr’s point that ‘student artwork that was viewed as excellent sometimes explored the boundaries of the discipline’ and two informants who said respectively: ‘I kind of developed an idea about art students, which was that, if they’re doing really well, they should be beyond you in some kind of a way, more than you’, and ‘Well you know a first has got the zing … What the hell would that be?’ (Orr, 2007, pp. 117–118).

[10] Colleagues of Sadler have developed and provided empirical support for his argument utilising a tripartite set of concepts: latent, explicit and meta-criteria (the last mentioned specifying the rules and occasions for the use of latent and explicit criteria) (see Wyatt-Smith & Klenowski, 2013).

This is a slightly different reading of the connoisseur than Sadler’s version. Sadler might be read to mean that the connoisseur endeavours to give an explicit, criteria-based reference in the last instance, even if it is lacking in the initial encounter with the object or subject of assessment. For Orr the connoisseur admits to and reconciles himself/herself with a certain excess or residue, the tacit, that escapes clear articulation and a voice in the act of assessment. The holistic judgement is sometimes known under the term global judgement (Sadler, 2008). It is also regarded as impressionistic, relying in the first instance upon the assessor’s intuitive feeling or sense of the assessment information and the subjective experience of the assessor. The analytical versus holistic distinction also seems to parallel the philosophy of mind distinction on knowledge as applied to a child, namely, some knowledge can be parcelled into discrete testable units and some knowledge is connected to other knowledge in the mind of the knower and gains its significance because of this (Davis, 2006).

Eisner gives a different meaning to assessing in the manner of a connoisseur. He does not seek to understand the manner in which the holistic assessment of the connoisseur might draw upon criteria. He was interested in how to assess the creative component in the student’s work of art. He also extended his argument to the assessment of the student’s work and process of working in all subjects.[11] He defined the assessment connoisseur’s way of working in the following way: ‘Connoisseurship is the art of appreciation. It can be displayed in any realm in which the character, import, or value of objects, situations, and performances is distributed and variable, including educational practice’ (Eisner, 1998, p. 63). By variable we would argue that he means that the process or the product is not expected to conform to a predefined standard or set of criteria. It is to be a creative expression that cannot be planned, predicted or imagined beforehand.
This might sound like the elite definition of creativity, but he means that each and every person, in an art class for example, is capable of creating something of unique value for themselves and of potential value to others. He argued that criteria-based assessment was concerned with learning objectives/outcomes embedded in the curriculum and instructional outcomes, such that the student is considered to be formed by experience with reference to them. By way of contrast, the assessor as connoisseur might be inclined to place a value on the process of creativity and how it was experienced, in addition to the product. Eisner called this a focus on expressive objectives. As two Norwegian-based assessment researchers have put it:

One of the really big challenges for the assessment of music in the classroom involves what Eisner (1985, p. 54) called expressive objectives. Expressive objectives are relevant in learning activities where students explore and experience wonderment, try and fail, are engaged in creativity and allowed to focus upon aspects in the respective activity that hold special, personal interest. These learning activities are strongly evident in improvisation and composition. Here, experience and exploration are central elements and qualities to be developed in student creativity, students’ perceptual skills … etc. For the development of such expressive qualities it will be difficult or impossible to set criteria or behavioral goals in advance, for the simple reason that there exist numerous solutions that can be just as ‘Good’ or as ‘True’. The solution is often to omit this aspect of the formal assessment, or enter more ‘objective’ assessment of the productive and creative activities e.g. Compose 12 strokes, ABA form, should include an intro and one outro, should be pentatonic. (Sætre & Vinge, 2010, p. 172)

[11] An example in another discipline is offered by Chu (2009), where leadership practice is in focus.

So, what does it mean to undertake what at first glance might appear to be impossible: to assess without predefined criteria? Eisner’s proposal is that the connoisseur is like an art critic. First, they retell what they have seen, heard or read in a rich, metaphorical language. This is the descriptive phase. Thereafter, a layer of interpretation is added to try to explain the significance of what they witnessed. In order to do this the interpretation can make reference to the context, and relate the event to the past and the present and to the assessor’s own previous experience of such expressive processes and products. The distinctive aspects are voiced and the role of the assessor as connoisseur is to appreciate the quality in the student’s work, not by comparing it to criteria, or other students (norm-based assessment), but in terms of the student’s previous and present performance and the skills and knowledge it reveals. Adopting his terminology, the focus is not upon instructional objectives, whereby predefined skills and knowledge are acquired, but upon expressive objectives that seek to express skills and knowledge in educational encounters with topics. In these encounters personalised meanings develop (Eisner, 1967; Gottlieb, 2013).

We can see clearly the difference in the way Sadler, Orr and Eisner understand what is at stake with respect to assessment in the manner of the connoisseur. For Sadler it is possible to connect with criteria thinking, for Orr a certain excess or residue escapes verbalisation, while for Eisner the opposite is the case, with the creative voiced, but in metaphors and the drawing out of distinctive features. Our view is that each of these views can be correct, since each depicts a particular set of generative mechanisms and structures supporting one of three ways in which the connoisseur practises assessment: believing in criteria, not believing in criteria or their verbalisation, and lastly not believing in criteria, but believing in verbalisation.

5.6 Closing Comments

In critical realist terms the Torrance tests of creativity are based upon a mechanism and structure that creates an opportunity for the potential expression of creativity and thereafter an assessment of the student’s creativity. An example is when the student is invited to think figuratively with a jellybean as stimulus. In the connoisseur approach the generative mechanisms and structures of the assessment act seek not to select and target the creative in an artificial environment with a standardised test, but to look at the creative process and product in a more authentic situation. Namely, the focus is the process itself, viewing or listening to the product, and what takes place in each encounter between an assessor and the student’s creative process and/or product. The Torrance test arrives at normed scores, whereas the connoisseur looks to the individual student. The latter reminds us of ipsative assessment: assessing the individual against themselves and earlier performances (Hughes, 2014). The following quotation sums up the difference in the generative mechanisms and structures at play: ‘No test can measure the chill that goes up the spine when we hear an emotionally moving performance. Well, perhaps not, but we can assess the skills and knowledge associated with those phenomena’ (Lehmen, as cited in Sætre & Vinge, 2010, p. 172).

We have also argued that, before accounting for the assessment of creativity, it is necessary to reflect upon how creativity is defined and nurtured. We have presented two contrasting definitions and accompanying views of the mechanisms and structures behind each of these definitions: the elitist view of creativity and, on the other hand, a democratic view. The former looks to propel a smaller number of gifted students into talented students, while the latter seeks to open up giftedness to all in some respect. This approach echoes the discussion in Chap. 2 on how new forms of society lead to new forms of assessment, in particular on the mechanisms whereby assessment includes and at the same time also excludes. We also touched upon the different mechanisms at play in Kingore’s tripartite distinction between the high achiever, gifted learner and the creative learner. The point is that the different ways that students learn and with what motivation create an important context for designing assessment acts that are able to capture these differences. The motivational aspect was discussed in Chap. 4 on the connection between motivation, assessment and learning.

Where might one go as an educator interested in creativity and its assessment? We have argued that the role of the educator as a connoisseur is central. In a practical sense, models exist that take up some of the ideas in this chapter and indicate how they can be operationalised. Take, for example, the model proposed by Blamires and Peterson (2014) to map pupil progress in creativity in and across subjects. It has two main components.
Firstly, the teacher notes student examples across different learning activities: (1) questioning and challenging; (2) making connections and seeing relationships; (3) envisaging what things might be; (4) exploring ideas, keeping options open; (5) reflecting critically on ideas, actions and outcomes.[12] Secondly, a number of specific enablers (see Ferrari et al., 2009) should be put in place (assessment tasks that are relaxing and stimulate imagination; a culture of risk taking and non-conformity; a curriculum understanding that makes room for creativity rather than constraining it in a prescriptive manner; individual skills based on a minimum of knowledge; teachers who possess knowledge of creative teaching and learning processes; technology and tools providing the opportunity for interactive learning).

[12] These strands of activity are inspired by Qualifications and Curriculum Authority (2005).

The model may sound like common sense. But when we add a new level of context it gains relevance. We are thinking in particular about the challenge of teaching and assessing in a world that has been marked by COVID-19. Students have had to learn at a distance and teachers have had to design and deliver their lessons remotely. We want our students to remain engaged, inquisitive and creative in how they learn. We end this chapter with a comment on engagement and in particular distraction from learning, which will have a roll-on impact on creativity and assessment. These thoughts on technology add substance to the model proposed by Blamires and Peterson (2014). The thoughts reproduced below have been developed by Simon McCallum, Ed Schofield and Stephen Dobson (see McCallum et al., 2021):

One of the constant challenges in education is keeping the learner engaged, motivated and connected in a world increasingly filled with distractions. Social media, streaming TV and video games all compete for students’ increasingly fragmented attention. COVID-19 lockdowns only increased the opportunity for those distractions to interfere with learning. But, as we look hopefully towards a post-COVID world, perhaps we can take inspiration from the things many students are clearly drawn to, in particular video games. Of course, borrowing from video games and their design to inform educational practice isn’t new. Some have talked this up as “gameducation”, whereby courses are like games with trophies for participation and engagement. It’s clear learning this way can be fun, but there is another important element of that experience that deserves closer examination: “flow”. Gamers (athletes, too) experience this flow state when totally engaged in the game. Living in the moment and the experience, the activity is effortless and there is no sense of time passing. Students can also experience flow, and this is when learning is at its most productive. So, the challenge in education is to plan for and achieve that level of engagement. Flow is and always will be the gold standard. Learning has always been a deeply social activity, with the student connected to the institution, as Nietzsche put it, “by the ear, as a hearer”. Schools relied on classrooms full of children learning the same material together, their shared attention helping to reduce distractions during focused moments of teaching.
Over time, various strategies for combating distraction have been developed, including offering students a smorgasbord of learning experiences, or cutting the length of lectures to account for the tyranny of concentration spans. But COVID-mandated videoconferencing deprives both students and lecturers, and drains the richness from these social interactions. Furthermore, learning mediated by screens simply amplifies the myriad distractions available online. Even with cameras on, we’re not necessarily paying attention to each other, we’re paying attention to the screen. But maybe this is where the qualities that define video games come into their own. After all, gaming is also a deeply social activity that allows for complex interactions and learning without the physical presence of anything more than a screen. Online games have already partly substituted for the things COVID-19 has affected: sports events, concerts and music festivals, parties and weddings. Take the game Among Us, for example, which in September 2020 alone had 200,000 people going online to watch “impostors” try to eliminate “crewmates” from teams before they can complete a set of tasks or identify which players are the impostors. Within the context of the game, the tasks are actually the distractions that prevent players from focusing on who is really an impostor. It is about observation, memory and insight: a game full of learning opportunities that teaches participants how to control distractions. The social cohesion created in the teams of Among Us players offers a template for teachers looking for ways to create engaging digital learning environments. Creating teams, allocating individual tasks that help the team and regularly changing team members all help to engage and stimulate students. With online teaching making it harder for institutions to control the learning environment, it becomes imperative to make learning activities themselves more engaging in a screen-mediated environment.


As Marshall McLuhan famously said, “the medium is the message”. Understanding how games grab and hold attention can help with the design and implementation of new online learning tools. Even some politicians are learning from games and using them to engage with the public. Gamification is also enhancing academic research and teaching. The key lies in our definition of distraction. Screen learning must involve distracting students towards the things that really matter. In education, as in gaming, we can “court risk” without the fear of failing. Rather than admonishing learners for not focusing when sitting at desks in school or in front of screens, we should work within our distracted world. We need to play with distraction, work with distraction and learn with distraction. Paradoxically, distraction may not be the enemy, it could be the gateway to more attentive learning.

References

Ball, O., & Torrance, E. (1984). Streamlines scoring workbook: Figural A. Scholastic Testing Service.
Blamires, M., & Peterson, A. (2014). Can creativity be assessed? Toward an evidence-informed framework for assessing and planning progress in creativity. Cambridge Journal of Education, 44(2), 147–162.
Chu, C. (2009). Developing Pacific leaders within a tertiary education setting through appreciative inquiry: A personal perspective. New Zealand Annual Review of Education, 19, 99–113.
Colvin, G. (2008). Talent is overrated: What really separates world-class performers from everybody else. Nicholas Brealey.
Cosmovici, E. (2006). Talent identification and selection within differentiated educational programs (PhD thesis). Oslo University.
Coyle, D. (2009). The talent code: Greatness isn’t born, it’s grown. Arrow Books.
Csikszentmihalyi, M. (1990). Flow: The psychology of optimal experience. Harper and Row.
Davis, G. (1998). Creativity is forever (4th ed.). Kendall-Hunt.
Davis, A. (2006). High stakes testing and the structure of the mind: A reply to Randall Curren. Journal of Philosophy of Education, 40(1), 1–16.
Dobson, S. (2012). The pedagogue as translator in the classroom. Journal of Philosophy of Education, 12(2), 271–286.
Dobson, S., Brudalen, R., & Tobiassen, H. (2006). Courting risk – The attempt to understand youth cultures. Young: Nordic Journal of Youth Studies, 14(1), 49–59.
Dreyfus, S., & Dreyfus, H. (1980). Five models of the mental activities involved in directed skill acquisition. Storming Media.
Dutchtown Elementary (n.d.). Common myths and truths about gifted students. https://schoolwires.henry.k12.ga.us/Page/111171
Eggen, N., & Nyrønning, S. (2003). Godfoten: Samhandling – Veien til suksess [The good foot: Interaction – The way to success]. Oslo.
Eisner, E. (1967). Instructional and expressive educational objectives: Their formulation and use in curriculum. http://files.eric.ed.gov/fulltext/ED028838.pdf
Eisner, E. (1985). The art of educational evaluation – A personal view. Falmer Press.
Eisner, E. (1998). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. Merrill.
Enerstvedt, R. (1982). Mennesket som virksomhet. Innledning til en teori: Grunnleggende begrepene i samfunnvitenskapene [Humans as active. Introduction to a theory: Fundamental concepts in the social sciences]. Tiden Norsk.


Epstein, D. (2019). Range: How generalists triumph in a specialized world. Macmillan.
Ericsson, K. A., Krampe, R. Th., & Tesch-Romer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 393–394.
Ferrari, A., Cachia, R., & Puni, Y. (2009). Innovation and creativity in education and training: Fostering creative learning and supporting innovative teaching (EU technical note JRC 52374). European Commission Joint Research Centre.
Fullan, M., Quinn, J., & McEachan, J. (2018). Deep learning: Engage the world to change the world. Corwin.
Gadamer, H. G. (1989). Truth and method (2nd rev. ed.) (J. Weinsheimer & D. G. Marshall, Trans.). Continuum. (Original work published 1960).
Gagné, F. (2004). Transforming gifts into talents: The DMGT as a developmental theory. High Ability Studies, 15(2), 119–147.
Getzels, J., & Csikszentmihalyi, M. (1976). The creative artist as an explorer. In A. Rothenberg & C. Hausman (Eds.), The creativity question (pp. 161–175). Duke University Press.
Gladwell, M. (2008). Outliers: The story of success. Little, Brown and Company.
Goffman, E. (1967). Interaction ritual. Anchor Books.
Gottlieb, D. (2013). Eisner’s evaluation in the age of the race to the top. Curriculum and Teaching Dialogue, 15(1–2), 11–25.
Guilford, J. (1956). Structure of intellect. Psychological Review, 53, 267–293.
Guilford, J. P. (1977). Way beyond the IQ. Bearly.
Hassan, M. (1986). Construct validity of Torrance tests of creative thinking: A confirmatory factor-analytic study (PhD thesis, Claremont Graduate School). Dissertation Abstracts International, 46(8-A), 2233.
Hébert, T., Cramond, B., Neumeister, K., Millar, G., & Silvian, A. (2002). E. Paul Torrance: His life, accomplishments, and legacy. University of Connecticut.
Hmelo-Silver, C. (2004). Problem-based learning: What and how do students learn? Educational Psychology Review, 16(3), 235–266.
Hoepfner, R., & Hemenwaz, J. (1973). Test of creative potential. Monitor.
Hughes, G. (2014). Ipsative assessment: Motivation through marking progress. Palgrave Macmillan.
Kafka, F. (1956). The trial (W. Muir & E. Muir, Trans.). Secker and Warburg.
Khatena, J., & Torrance, E. (1973). Thinking creatively with sounds and words: Technical manual. Personnel Press.
Kim, K. H. (2006). Can we trust creativity tests? A review of the Torrance Tests of Creative Thinking (TTCT). Creativity Research Journal, 18(1), 3–14.
Kingore, B. (2004). Differentiation: Simplified, realistic, and effective. Professional Associates Publishing.
Kiraly, D. (2003). From instruction to collaborative construction: A passing fad or the promise of a paradigm shift in translator education? In J. Baer & G. Koby (Eds.), Beyond the ivory tower: Rethinking translation pedagogy (pp. 3–27). John Benjamins.
Lassig, C. J. (2009). Teachers’ attitude towards the gifted: The importance of professional development and school culture. Australian Journal of Gifted Education, 18(2), 32–42.
Lejk, M., & Wyvill, M. (2001). Peer assessment of contributions to a group project: A comparison of holistic and category-based approaches. Assessment and Evaluation in Higher Education, 26(1), 61–72.
MacKinnon, D. (1978). In search of human effectiveness: Identifying and developing creativity. Creative Education Foundation.
McCallum, S., Schofield, E., & Dobson, S. (2021, August 2). Gamers know the power of ‘flow’: What if learners could harness it too? The Conversation. https://theconversation.com/gamers-know-the-power-of-flow-what-if-learners-could-harness-it-too-164943. Accessed 21 Oct 2021.
Ministry of Education. (2020). Arrowtown School – Lighting up minds through project based learning. NZ Curriculum Online. https://nzcurriculum.tki.org.nz/Curriculum-resources/School-snapshots/Arrowtown-School. Accessed 21 Oct 2021.


National Advisory Committee on Creative and Cultural Education. (1999). All our futures: Creativity, culture and education. DfES.
New Atlas. (2015, February 3). Frank Gehry’s ‘paper bag’ – A new architectural icon for Australia? New Atlas. https://newatlas.com/frank-gehry-chau-chak-paper-bag-sydney/35891/. Accessed 21 Oct 2021.
New World Encyclopedia. (2018). J. P. Guilford. https://www.newworldencyclopedia.org/entry/J._P._Guilford. Accessed 21 Oct 2021.
Nietzsche, F. (1909). On the future of our educational institutions. Allen and Unwin.
Nietzsche, F. (1990). Twilight of the idols and the anti-Christ. Penguin Books. Original work published 1889.
Nordberg, D. (2008). Group projects: More learning? Less fair? A conundrum in assessing postgraduate business education. Assessment and Evaluation in Higher Education, 33(5), 481–492.
Orr, S. (2007). Making marks: The artful practice of assessment in fine art (PhD thesis). London University.
Plucker, J. A. (1999). Is the proof in the pudding? Reanalyses of Torrance’s (1958 to present) longitudinal data. Creativity Research Journal, 12, 103–114.
Plummer, F. (1999). Rich assessment tasks: Exploring quality assessment for the school. SCAN, 18(1), 14–19.
Polanyi, M. (1966). The tacit dimension. University of Chicago Press.
Qualifications and Curriculum Authority. (2005). Creativity: Find it, promote it – Promoting pupils’ creative thinking and behaviour across the curriculum at key stages 1, 2 and 3 – Practical materials for schools. QCA.
Renzulli, J. (1978). What makes giftedness? Re-examining a definition. Phi Delta Kappan, 60(3), 180–184.
Rhodes, M. (1961). An analysis of creativity. Phi Delta Kappan, 42, 305–310.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Sadler, R. (2008). Indeterminacy in the use of preset criteria for assessment and grading. Assessment and Evaluation in Higher Education, 34(2), 159–179.
Sadler, R. (2010). Beyond feedback: Developing student capability in complex appraisal. Assessment and Evaluation in Higher Education, 35(5), 535–550.
Sætre, J., & Vinge, J. (2010). Musikk og vurdering [Music and assessment]. In S. Dobson & R. Engh (Eds.), Vurdering for læring i fag [Assessment for learning in subjects] (pp. 155–172). Cappelen Damm Høgskole Forlaget.
Seelye, K. (2019, April 5). Dan Robbins, who made painting as easy as 1-2-3 (and 4-5-6), dies at 93. New York Times. https://www.nytimes.com/2019/04/05/obituaries/dan-robbins-dead.html. Accessed 21 Oct 2021.
Sharp, S. (2006). Deriving individual student marks from a tutor’s assessment of group work. Assessment and Evaluation in Higher Education, 31(3), 329–343.
Starko, A. (2005). Creativity in the classroom: Schools of curious delight. LEA.
Steinsholt, K., & Ness, S. A. (2016). Motstrøms. Åpninger i retning av en levende pedagogikk (Openings in the direction of a living pedagogy). Fabokforlaget.
Syed, M. (2010). Bounce: How champions are made. Fourth Estate.
Szabos, J. (1989). Bright child, gifted learner. Challenge, 34. http://toolbox1.s3-website-us-west-2.amazonaws.com/site_0610/gifted_children4.PDF
Tatarkiewicz, W. (1980). A history of six ideas: An essay in aesthetics (C. Kasparek, Trans.). Martinus Nijhoff.
Te Ture, R., Smith, J., Graham, R., Smith-Graham, V., & Lewthwaite, B. (2006). Small measures for fostering creativity in science investigative planning? STERpapers, 2006, 3–25.
Torrance, E. (1966). Torrance tests of creative thinking: Norms technical manual (Research ed.). Personnel Press.
Torrance, E. P. (1974). Norms-technical manual: Torrance tests of creative thinking. Ginn and Company.


Torrance, E. P. (1981). Empirical validation of criterion-referenced indicators of creative ability through a longitudinal study. Creative Child and Adult Quarterly, 6, 136–140.
Torrance, H. (2001). Assessment for learning: Developing formative assessment in the classroom. Education, 3–13(29), 26–32.
Torrance, E. P., & Safter, H. (1990). The incubation model: Getting beyond the aha! Bearly.
Torrance, E., & Safter, H. (1999). Making the creative leap beyond …. Creative Education Foundation Press.
Treffinger, D., Young, G., Selby, E., & Shepardson, C. (2002). Assessing creativity: A guide for educators. National Research Center on the Gifted and Talented, University of Connecticut.
Treffinger, D., Selby, E., & Crumel, J. (2012). Evaluation of the future problem solving program international (FPSPI). FPSPI. http://fpspi.org/pdf/FPSPI-EvaluationArticle%20-%20treff.pdf. Accessed 5 Mar 2021.
Villalba, E. (2008). On creativity: Toward an understanding of creativity and its measurements. Joint Research Centre, European Commission.
Vygotsky, L. S. (1986). Thought and language. MIT Press.
Wallas, G. (1926). The art of thought. Harcourt-Brace.
Williams, F. (1980). Exercise in divergent thinking: The Williams Scale. DOK Publishers.
Wyatt-Smith, C., & Klenowski, V. (2013). Explicit, latent and meta-criteria: Types of criteria at play in professional judgement practice. Assessment in Education: Policy, Principles and Practice, 20(1), 35–52.

Chapter 6

Challenging the Culture of Formative Assessment: A Critical Appreciation of the Work of Royce Sadler

Whereas moderation relevant for a single assessment task is repeated for subsequent tasks, the ultimate objective is the development of ‘calibrated’ academics … [who] accept responsibility for grading against agreed achievement standards, participate in periodic (but not continuous) checking and recalibration. (Sadler, 2013, p. 12)

Abstract How might the understanding and practice of assessment be progressed? We offer a critical appreciation of the work of Royce Sadler and his elaboration of assessment for, of and as learning. He has been globally cited for his work on formative assessment since his seminal paper in 1989. After exploring the philosophical and theoretical underpinnings of his work, we propose two under-used assessment concepts informed by the work of Bourdieu and Wittgenstein. They are respectively assessment capital and assessment judgements. While Sadler has been a long-time reader of Polanyi and Wittgenstein, he does not explicitly refer to Bourdieu.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_6

Increasingly it is a truism that we live in a world that is both interconnected and highly complex (Boud & Falchikov, 2007). Twenty-first-century skills are much debated in this context and so too the pedagogy of connectivism in a digital world of online learning (Dobson & Scofield, 2020). Twenty-first-century skills include the ability to learn more about how to learn, especially given that we experience ever quicker feedback loops between the point of origin of learning and its assimilation and impact, or its converse, the refusal to learn both content and how to learn as we fear they will both soon become redundant. Moreover, these concerns highlight the connection between assessment and learning and the need to develop the skills of self-referential assessment as we learn about our own learning and its importance for self- and other-directed development.

An early key contributor in this field has been Royce Sadler from Australia. His 1982 article critiqued the cybernetic system of instructional learning as only suitable for lower levels of learning and not suitable for higher levels of more complex learning, where evaluation is central (see Grover, 2016). Throughout the 1980s his work progressed to the much-cited 1989 article where his preference was for the term formative assessment, rather than the term ‘evaluation’. The latter was more suited in his view to the American readership at that time. Hence, he uses the term ‘evaluation to improve academic performance’ (American terminology) to refer to formative assessment (Sadler, 1983). In the same article of 1989, he also introduced ‘self-evaluation’ as a working concept drawn from his action research with his university students. In 2002, he engaged with the topic of learning dispositions, noting how difficult it is to assess them. Most recently he has focused on grading and how it should rely on periodic recalibration by teachers, rather than moderation which has to be repeated on each occasion with subsequent tasks. Even though his leitmotif has clearly been formative assessment, the attentive reader might still identify an interest in the evaluation and design of (micro and classroom) learning systems.

It is not difficult to make the argument that throughout Sadler’s distinguished academic career there has been a strong and clear focus on the topic of formative assessment. However, what is less evident in our view, but still worthy of exploration, is the philosophical and theoretical foundations supporting his approach, and how he has conceptualised the relationship between assessment and learning during his academic career.
With this in mind, the main argument in this chapter is that, if we are to appreciate Sadler’s work, it is necessary to understand the inspiration he has consistently sought from philosophers such as Polanyi and Wittgenstein, and not merely from well-known scholars in the field of educational measurement.1 In order to explore the philosophical and theoretical base of this proposition we pose two questions in this chapter:

• How has Sadler understood the terms assessment of, for and as learning?
• What concepts, building upon the work of Sadler, can we introduce to enrich and progress our understanding of future-directed practices of (formative) assessment?

We begin by presenting an understanding of the debate on formative assessment and how it seems to swing in pendulum fashion between assessment of, for and as learning. At times one of the terms assumes primacy; at other times it is all three in an uneasy alliance. It is important to remind the reader of earlier passages in the book (especially in Chap. 3) where we highlighted that there are many like Sadler who hold a preference for the term formative assessment. The Assessment Reform Group supported the term assessment for learning because they contended it was more precise than the catch-anything term formative assessment. Our view in this book is that the generative mechanisms and structures identified by users of both of these terms share more similarities than differences. For the purposes of this chapter and the argument we are making, we use the terms formative assessment and assessment for learning interchangeably.

1 Sadler has himself suggested in personal communication (23 July 2018) with the authors: ‘Let the problem speak to me, with no prior (known) commitment on my part to some theoretical stance.’ This is clearly a key methodological strength in his approach to assessment; apparently moving more toward eclecticism than dogmatism. However, we would argue that he has consistently sought philosophical inspiration from the philosophers we have nominated. Readers of this chapter might of course identify other sources of inspiration for Sadler’s work than these particular philosophers.

6.1 Understanding the Debate about Assessment of, for and as Learning

It is not uncommon to understand the relationship between assessment and learning through two concepts: summative assessment, also known as assessment of learning, and formative assessment, widely called assessment for learning, even though some seek to distinguish formative from for learning (Dixson & Worrell, 2016; Harlen, 2007; Heron, 2011; Sadler, 1989; Weisler, 2015; Wiliam, 2010). It is also important to note another perspective on assessment, namely assessment as learning. This has drawn an increasing amount of attention from assessment experts (Dann, 2014; Sadler, 2007, 2010; Torrance, 2007; Weurlander et al., 2012). Predating many of these is the work of Earl (2013) who draws attention to the manner in which students can self-monitor, self-correct and make adjustments to their own performance, whether formative or summative. As she aptly puts it, ‘self-assessment is at the heart of the matter’ (p. 25) and additionally supports or even necessitates a balancing of the three assessments, namely assessment for, of and as learning.

As widely understood, assessment of learning is mainly used ‘to evaluate student learning at the completion of a programme of study’ (Jackel et al., 2017, p. 14) or ‘to summarise students’ achievements in order to award some kind of certification’ (Weurlander et al., 2012, p. 747). The purpose is to assess a student’s level of achievement at a specific point in time according to predetermined criteria (Dixson & Worrell, 2016). Some people also understand assessment of learning as judgemental assessment because it involves gathering, interpreting and using evidence to make judgements about students’ achievements (Brown et al., 1997; Harlen, 2007).
Sadler tends to use the term summative assessment when talking of assessment of learning: ‘Summative contrasts with formative assessment in that it is concerned with summing up or summarizing the achievement status of a student, and is geared toward reporting at the end of a course of study especially for purposes of certification’ (1989, p. 120).

Assessment for learning is conceptualised as having the purpose of providing rich, meaningful and timely feedback to students on their learning and progress throughout a program of study (Dixson & Worrell, 2016; Jackel et al., 2017; Weurlander et al., 2012). This kind of assessment is also called formative assessment, Sadler’s preferred term, or developmental assessment, as it is ultimately used as feedback to modify, change or adapt the learning (and teaching) activities throughout a program of study (Brown et al., 1997; Dixson & Worrell, 2016; Sadler, 1989, 1998; Torrance, 2001).

Many hold the view that summative assessment per se is not enough; it represents a kind of one-stop shop that does not sufficiently reflect or support a judgement on a student’s performances, skills and learning process. Such a rationale has been used by educators who propose incorporating formative assessment practices as one of the crucial instruments ‘to shape and improve the student’s competence’ (Sadler, 1989, p. 120). In other words, there is a kind of ‘social drive to help learners’ to succeed in their education (Sadler, 2007, p. 387). Some authors have argued that connecting assessment for learning with summative assessment in a complementary fashion leads to a situation where the summative assessment is based upon an accumulation of the results of formative assessment of students’ performances (Pereira et al., 2015). By doing so, assessment can send ‘the right signals’2 to students, encouraging them to improve their learning, even if it might mean assessment for learning actually comes to reflect mini-assessments of learning as discrete assessment items that are bunched together (Biggs & Tang, 2011, p. 119).

Sadler’s work has explored these conceptual differences or similarities between assessment for and of learning, and this has been mirrored in the work of others, such as the Assessment Reform Group in the UK, on the basis of his foundational work in the 1980s. However, the narrative would be incomplete if we did not consider the concept of assessment as learning. Sadler himself noted in 2007 that his understanding of assessment as learning is inspired by a paper Torrance published in the same year, and upon which he was invited to comment. For Torrance, teachers who operationalise assessment as learning seek merely to comply with criteria that address competence or a set of skills, rather than directing attention to the connection between assessment and learning. For Torrance, learning has its philosophical and theoretical basis in ‘social and intellectual development’ (2007, p. 293).
Teachers tend to practise what Torrance calls ‘convergent assessment’, that is, referring strictly to the criteria previously set, and forget the ‘divergent aspect of assessment’, which concerns the individual differences of students as they exhibit qualities not necessarily recognised by the set criteria. Convergent assessment may imply that students are to a large extent guided to succeed in assessment by demonstrating what they ‘know and can do’ merely by complying with assessment criteria (Torrance, 2007, p. 292).3 Students thus displace learning, and Sadler notes that assessment as learning (Torrance’s term) has ‘a deservedly pejorative ring to it’ (Sadler, 2007, p. 388). Sadler’s understanding carries with it the view that assessment as compliance (learning) cheats the learner and others involved in the activity.

This comment by Sadler, while important to acknowledge, underplays or even negates the different view of assessment as learning found in his 1989 article, even if he does not use this term. He argues that students in working to achieve high-quality work need to develop or possess evaluative skill to ‘compare with some objectivity the quality of what they are producing in relation to the higher standard, and that they develop a store of tactics or moves which can be drawn upon to modify their own work’ (Sadler, 1989, p. 119).4

In summary, it is apparent that assessment as learning, in line with our understanding, possesses two meanings, one undesirable and the other desirable. The former, inspired by Torrance, refers to assessment as learning as criteria compliance, meaning a somewhat mechanical and less reflective compliance. The latter refers to assessment as learning as the development of self-evaluative skills. The latter version entails the student developing meta-cognitive skills to judge and self-assess their own learning and that of peers. This is in line with Earl’s (2013) view cited above and that of Tai et al. (2016).

In his 2007 response to Torrance, Sadler seems somewhat reluctant to acknowledge that assessment as learning is a separate concept in its own right. Instead, he incorporates its meaning into his understanding of formative assessment or assessment for learning. However, it is our argument that the concept of assessment as learning, which considers the process by which the student develops skills in assessment and self-assessment, can be usefully separated out from assessment for learning or formative assessment. A student receiving feedback may simply take it on board without necessarily acquiring and practising a longer-term growth in understanding of the importance of self-assessment and greater self-evaluative skills.

In summary, Sadler is clear in adhering only to summative and formative assessment. He therefore rejects the conceptual difference and utility of the three concepts of assessment for, of and as learning. As we have noted, assessment as learning can have a negative meaning, as assessment compliance supported by teaching to the test, and a positive meaning, as a broader set of assessment skills, such as self-evaluative skills.

2 The signals could be marks which might contaminate the learning process as students focus on the mark more than the learning. But the signals could also be rewards that are not marks, rather forms of encouragement, such as letting the students play a classroom game in the last lesson of the week as a reward for progress.
3 This echoes the debate on and practice of teaching-to-the-test.

4 In highlighting the positive aspect of assessment as learning, Sadler was drawing attention to the fact that the testing-assessment process should in itself be a learning exercise.

6.2 The Inspiration of Polanyi and Wittgenstein

A key point in our critical appreciation of Sadler’s work is to argue that his understanding of formative and summative assessment, which we understand through the concepts of assessment of, for and as learning, is rooted in a deep understanding of the philosophical sources of his inspiration. We are thinking in particular of the concept of tacit knowing (Polanyi) and noticing (Wittgenstein). Specifically, the concern is with the manner in which teachers and students assess knowledge and performance to serve the purposes of the three kinds of assessment. Or, put differently, when teachers and students are seeking to engage in assessment of, for and as learning, what is going on in critical realist terms and how can we both identify and share this understanding? To recall, in this book we are interested in the theoretical import gained by a critical realist approach that looks to understand the generative processes and structures that cast a phenomenon into existence. The phenomenon in question in this case is assessment for, of and as learning.

As early as 1987, Sadler warned teachers that, even though tacit knowledge, a term popularly introduced by Polanyi (1962), is important in formative learning and assessment, standards of achievement should be made explicit and clarified by teachers or assessors. The move should be from tacit to ‘verbal descriptions’ in order to minimise intuitive decisions and to support a more ‘scientific management of education’ (Sadler, 1987, p. 202). As he proposed, alongside ‘standard-referenced assessment’ and ‘norm-referenced and criterion-referenced assessment’, tacit knowledge was one of the methods used to specify and promulgate educational standards, along with numerical cut-offs, exemplars and verbal descriptions (Sadler, 1987). In another article, we can note that Sadler (1985) had already embarked on this so-called scientification (our term) of assessment. He talked about ‘criteria’ and used the concepts of valuation (from Dewey), recognition (not reason) and merit. He also mentioned ‘indirect evidence’ as the existential basis of criteria and ‘the tacit’.

The tacit knowing of assessment, read evaluative knowing, held by the teachers should be shared with the students as they develop, through their own experience, what constitutes good quality assessment practices. Sadler also emphasises that teachers should encourage students in ‘developing tacit evaluative knowledge through experience’ in assessment (1989, p. 136). Polanyi (1962) calls these evaluative skills a subsidiary awareness, which draws subconsciously on a body of tacit evaluative knowledge. These points can be summarised in the adage, ‘I can’t tell you what quality is, but I know it when I see it’, or more simply, ‘We know more than we can say’.
It is important to understand what exactly is at stake as teachers seek to make the tacit evaluative knowledge of the teacher or the student explicit. We must understand that the mere experience of assessment might not actually result in an explicit understanding – the teacher and student might remain trapped in their separate circles of tacit evaluative knowledge and understanding, and this might progress neither assessment nor the learning of the students. A simple but important example of this is the manner in which Aboriginal children in Australia might be required to undertake reading tests, such as the so-called standardised Progressive Achievement Tests (https://www.acer.org/au/pat). They experience an absence of Aboriginal culture and experiences in the material. The teachers, often non-Aboriginal, using this test will remain trapped in their standardised cultural world of tacit knowledge and are unable to bridge the distance to the tacit world of the Aboriginal students, which is not acknowledged (Mika, 2017). It is noteworthy that Yang (2021) in her research found that the reading performance of Aboriginal children increased when the reading items reflected their cultural experiences (Heim, 2023).

One reading of Polanyi might be that there will always be an inescapable level of the tacit. Such a position might encourage a sense of fatalism as to what can be done, and as a consequence a level of mysticism would have to be accepted, whereby the teacher is always seen to possess their own tacit and hidden reasons for assessing as they do. However, we can recall Plato’s (1956) famous Meno dialogue to understand how these circles might be broken down through shared questioning of others and oneself. The Meno dialogue explores learning and the transmission of knowledge, and also the practice of assessment and evaluative knowledge. This is not the transmission of knowledge and evaluative knowledge between people in the first instance, but from different levels within the same person. The learning involves learning how to look within oneself and recollect the already known, but not recognised. Despite using questions initially to induce a state of confusion and doubt in the person questioned, Socrates regarded the dialogue with Meno, the student, as an opportunity to ‘carry out together a joint investigation and inquiry’ (Plato, 1956, p. 80d). That it is a joint investigation must not be read to mean that the participants are equals in knowledge or experience, which is also Polanyi’s and Sadler’s point. Implicit in the view of Socrates, even when he claims his own ignorance, is that he knows more and is more experienced than Meno. Socrates’ standpoint can be summed up by the quotation: ‘Knowledge will not come from teaching but from questioning. He will recover it for himself’ (Plato, 1956, p. 85d). Socrates asks one of Meno’s servants how to solve a complicated geometry question, and leads him to the answer not by explicit teaching but by pointing the servant in a certain direction and asking him for his opinion.

With the Meno spoken dialogue in mind, we might understand the process of moving from tacit to explicit shared evaluative knowledge and skills. As one becomes self-aware of the tacit, it can be shared. In other words, to begin with a psychological process takes place within the student as questions are asked and the student realises their own tacit knowledge, making it explicit to themselves; in Socrates’ sense recollecting what they already knew.
Only after this is sharing possible in a social sense. By this we mean that the student is in a position to formulate the once tacit knowing in a shared inter-subjective language. The separate circles of knowing are broken. Is the explicit evaluative knowledge thus attained of a constant, unvarying character? Will it not change according to the perspective or interests of the teacher or person asking the questions (Dobson, 2017, p. 18)? The answer is most likely in the affirmative, but at least the process is made visible and becomes the object of shared learning and assessment.5 However, the issue is without doubt much more complex and we will have cause to return to the kind of language required, and whether the Meno-inspired argument about self-awareness does in fact result in a higher level of explicitness.

5 Here we are referring to the need to understand the active process of making learning or assessment visible.

Sadler also deepens his understanding of the tacit component in assessment when he considers the way in which teachers develop their ‘professional tacit knowledge’ (1998, p. 82). Specifically, this entails the teacher building a tacit understanding of how to assess a student’s skills in problem solving, analysis or knowledge synthesis. Through the direct experience of giving feedback to students on their performance, the teacher will eventually build a tacit ‘repertoire of tactical moves’ that can be drawn upon on demand (Sadler, 1998, p. 81). On the basis of this teacher-led experience, the teacher may also see more student-informed ways of undertaking or responding to the assessment. In other words, the teacher learns from the student in a vicarious, derivative manner. The repertoire of tacit knowledge, including tacit evaluative knowledge, is accumulated over time by the teacher and it needs to be articulated and shared with colleagues in a community of experts. To the student Sadler proposes it should be transmitted, and we would prefer instead the term shared as indicated in our reference to the Meno dialogue. This supports a more dialogical form of pedagogy. Sadler (2005) offers a word of caution in his later work in relation to tacit knowledge. If no transmission, or sharing in our words, takes place a mystification can develop whereby transparency is undermined and the students become overdependent on the teacher as expert (Sadler, 2005, 2009a, b, 2010, 2011, 2013, 2014). Thus we note that if sharing takes place student de-mystification can progressively ensue.

However, with regard to the concept of tacit knowledge, Polanyi is sometimes critiqued for not including the social and collective aspect of the tacit and only focusing on a psychological understanding (Turner, 2012). Turner (2012) asserts that the tacit world consists of the general tacit – which is socially shared among the members of a society, such as a general understanding of English – and the individual tacit, such as a particular understanding of English, which is possessed by an individual. Accordingly, Polanyi understands the tacit as individualised and therefore ‘transmitted’ in an apostolic, mystical way of teaching. This mystical way is also critiqued by Sadler, even if he uses the term transmission.
Might Sadler be accused of neglecting the social aspect of the tacit, and thus remaining locked into a psychological understanding of the tacit? He does in fact acknowledge the social aspect of the tacit, but in a slightly different manner than Turner. Let us explore this in more detail. Sadler understands the teacher’s tacit knowledge, once again including tacit evaluative knowledge, as a form of guild knowledge which exists in some ‘quiescent and pliable form’ (Sadler, 1989, p. 127). Quiescent is here used in the sense of tacit and dormant, and pliable means the opposite as it can be woken and manipulated. Taking this on board, Sadler indicates that the tacit can be readily accessible by the teacher. As he puts it, citing Polanyi, ‘the apprentice unconsciously picks up the rules of the art, including those that are not explicitly known to the master … Connoisseurship … can be communicated by example, not by precept’ (Sadler, 1989, p. 135).

However, students may possess different tacit knowledges, including evaluative knowledge, and these may be different from those of the teachers. In order that this guild knowledge is also known to the students, it needs to be activated or made explicit through teacher-introduced examples that call upon the student to use evaluative skills and knowledge before, during and after assessment practices. The degree of explicitness might be different in every case and never totally explicit. The teacher-led examples are a form of telling where the tacit can be made explicit and the teacher is a presumed expert in this activity. But in the telling the student as apprentice may not notice what they are expected to notice. Put differently, not all telling leads to learning because the student may not share the understanding of the teacher. Following this line of argument, it becomes important to look at the telling and with this in mind we contend that Sadler seeks another source of inspiration. It enables him to answer the question of both what is identified for assessment by the teacher and how, and how this is shared with the student, that is, made explicit to varying degrees. Sadler’s (2010) main point is clear: too much feedback is concerned with telling, rather than improving students’ skills in assessment.

This moves our attention and argument to the issue of the language used in explicit sharing of evaluative knowledge and skills. Sadler has since the 1980s (re)turned periodically to the work of the language philosopher Wittgenstein for inspiration. For Wittgenstein, the tacit should be understood as a kind of psychological form of shifting perception or noticing noticeable aspects or attributes, such as the person who looks at an image and sees the rabbit ears, while another or the same person might see the duck beak in the next instance (Wittgenstein, 1953/1967, p. 194).6 Wittgenstein also uses other terms such as ‘seeing’ or ‘aspect perception’ and ‘thinking’ to describe this concept of noticing an aspect (Dinishak, 2013). Simply put, noticing involves both seeing and thinking, and it can appear that they take place simultaneously or in reverse order, i.e. there are family resemblances (Wittgenstein, 1953/1967, § 67). Wittgenstein also adds a linguistic component when he talks of the way we speak using multiple language games, for example to assert, to evaluate, to confirm, to appreciate and so on. If we do not understand or recognise the language games chosen in each instance, we risk not being able to break the circle of tacit knowing or assessment skills and they remain tacit and not shared.
Language games are a vehicle that helps us make what we highlight or notice explicit: noticing is thus to name and bring to light. In a similar parallel, an assessment language game points out what we are ‘looking’ at in an ostensive fashion,7 such that it can be assessed. We will return to this presently as we explore how and if assessment games can be forms of assessment capital.

For Sadler, assessment as assessment games (noting that he does not explicitly use the term language game8) rests upon the pivotal role and quality of recognising or what Wittgenstein calls noticing an aspect. This can be not only noticing or bringing to the surface the already-set criteria related to the performance, but also noticing the pertinence of other often tacit aspects of intellectual or skill development amongst students such as disappointment, enthusiasm or courage. This offers a more holistic understanding of how the tacit components are drawn into the ring of perception and valued (Sadler, 1989, 2009b, 2010, 2013, 2014).

6 Sadler (2014, p. 282) sometimes connects noticing to the Wittgenstein term noticing family resemblances.
7 Wittgenstein in later work (1953/1967, § 26–34) was scathing of understandings of language solely and simply as ostensive and pointing, which fail to understand that language is more than merely pointing with words. Language includes a social practice in which pointing might be but one component in a more culturally and context-sensitive language game. For Sadler, noticing is always a social practice.
8 We are conscious of an apparent paradox in the use of the term ‘noticing/looking’ at a rabbit/duck in the later Wittgenstein – it repeats and seems to support in practice his earlier usage of ostensive pointing that he used in the Tractatus logico-philosophicus (1921/2001). But we note that in his later work Philosophical investigations (1953/1967) this is only one language game and not the only language game; and thus, we continue to use it knowing that it is only one language game that might exist in assessment. Others might be promising, speculating, voicing a hunch or a proposed grade, and so on.

Through various stages of the assessment processes, both the teacher and students need to possess the capacity of noticing not merely something of high ‘salience’ and out of the ordinary or extraordinary (e.g. an essay based upon set criteria), but personal or psychological qualities (e.g. resilience or motivation) that are also worth noticing and valuing. The latter are often kept distinct from the former, such that psychological qualities are seen to attract praise in the process, but the final grades tend to focus more upon qualities measured in the criteria, such as critical or analytic thinking. On occasions, there may be good reasons for including such psychological qualities in the work or performance of the student, for example if employment will be later sought in the world of HR, health care, hospitality or different kinds of research where possessing and demonstrating resilience and motivation are central. A criteria-oriented way of noticing, looking for the salience of specified criteria, might be significant for grading academic achievement, for example, in order to maintain ‘grade fidelity’ (Sadler, 2010). This rests upon the proposition that the language game notices and makes explicit one or more of the aspects. Put simply, noticing is the process by which the tacit is made explicit and described in the criteria. Sometimes, the teacher’s noticing is different to the noticing of the student.
The student might be looking the other way, so to speak, or not paying attention to a change in the teacher's voice as they talk of the coming assessment, and as a consequence, fail to notice what is pertinent. In such cases they may not be aware of the language game of noticing proposed by the teacher. A preliminary conclusion is as follows. Sadler is correct to draw attention to the role of the tacit in assessment knowledge and practice. It is our view that he understands the tacit in two ways. In citing Polanyi, we consider he tends to view it as a personal and to a large extent a psychological exercise in moving from the tacit to the explicit. It is a shifting of psychological perception, echoing in many senses the experience of Meno’s servant as suggested above. This is founded upon a psychological theory of learning and assessment. Worthy of further debate is the question: Is the individual capable of undertaking this shift in perception of his or her own volition and if so, how? The second way in which Sadler understands the tacit is as a form of language game: the game of noticing an aspect. Inspired by Wittgenstein, it appears to us that Sadler is suggesting that learning the language games of assessment is to learn what can and cannot be said. It is to make an item explicit with criteria, for example, to use a language that notices it is pertinent. It must be noted, though, that he implies the use of the concept of language game and does not actively or explicitly use it. For us on the contrary, in adopting a Wittgenstein-inspired understanding of noticing it is logical to also look into and adopt another of Wittgenstein’s concepts,
namely the language game. Is it not the case that the language game directs the user to notice or the opposite, not to notice? Moreover, we would also argue, surely there are still aspects that remain resistant to moving from the tacit to the explicit. Let us take the example of the connoisseur, who communicates by example, not by precepts. In sharing the example, the teacher or examiner hopes the student will notice what was tacit, and this will in the process become explicit; that they will ‘get the point’ of the language game. However, to take an example from Australian higher education, sometimes international students may not notice the expectation that they are to critique a theory or might be allowed to disagree with teacher feedback on an assignment without exhibiting blatant disrespect. Thus, they may not join with the teacher in co-noticing or co-understanding the meaning of these things as shared language games. Given such a situation, what should teachers do to build the evaluative knowledge and skills of the student? For Wittgenstein, and this is underlined by Sadler without using the same terminology, to learn a language game is to train and repeat and the student might have to repeat and repeat to get the point: ‘Here the teaching of language is not explanation, but training’ (Wittgenstein, 1953/1967, § 5). This obviously has behaviouristic connotations and is associated with rote learning. Wittgenstein was in this context referring to young children learning the language games of language. But the point seems valid irrespective of the age of the student: it is not merely about psychological perception; it is about language games. In our context, it is the language games of assessment and evaluation. A student has to train in the use of criteria or other forms of noticing. They need to gain experience under the guidance of teachers and peers. 
The language game also requires that the participants learn the rules of the game and the conventions, and these have to be explicitly understood and shared. This means that learning the language games of assessment entails making both the content and the practice of the language game explicit. An important point for further discussion is that to some extent Sadler’s view of the teacher is that he or she is the personification of the connoisseur, and thus the expert in showing the student what assessment is and how evaluative knowledge is developed. But, in so doing, the risk is that a form of transmission-based pedagogy is both revealed and relied upon where the student trains and learns by imitating. It is not a pedagogy that is open-ended and collaborative with a high degree of co-creation and co-ownership. With this in mind, the emphasis on the active engagement of peers in assessing each other’s work is useful. This is something Sadler often identifies as crucial. Returning to our original concern, we would contend that assessment of learning draws heavily upon the language game of noticing criteria, learning objectives and the use of rubrics. In contrast, assessment for learning arguably entails a relationship based upon moving to and fro between tacit and explicit knowledge and evaluative knowledge. Assessment as learning is about the skills of assessment and evaluation required to work in both assessment for and of learning. The Polanyi-inspired conception of assessment of, for and as learning is more disposed to consider the role of psychology, rather than the key role played by the social practice of language games and collaborative learning. Table 6.1 seeks to summarise the differences between a Wittgenstein-inspired and a Polanyi-inspired understanding of assessment, keeping in mind that Sadler has drawn from both.

Table 6.1 Sadler’s understanding of evaluative knowledge and skills

Type of assessment | Key aspects of noticing (Wittgenstein) | Tacit communicated by example (Polanyi)
Assessment of learning | Learning objectives, criteria, rubrics | Knowledge does not uncover the tacit, and focuses only upon the explicit
Assessment for learning | Feedback to move the learning from tacit to explicit knowledge and evaluative knowledge | Change in individual psychological perception is required, to move toward, but never totally, to discover the yet to be discovered, i.e. what constitutes the tacit foreknowledge
Assessment as learning | Complying with criteria and building evaluative knowledge/skills within a shared practice (language game) and training | Unconsciously learning of the tacit through imitating the teacher or expert, learning through the example and not the precept

Note in particular that Polanyi never considers that the tacit can be made totally explicit and, in the adage ‘I can’t tell you what quality is, but I know it when I see it’, we understand a movement toward, but never totally discovering, tacit (fore)knowledge (Polanyi & Sen, 2009). For Polanyi it is a somewhat mystical transmission of the tacit within the individual through example and not through precepts (i.e. general rules), indicating that (i.e. precepts) still predominantly. Sadler draws upon this view, but he seems to be more oriented toward the teacher gaining access to the tacit in themselves and seeking to communicate it to the students. We have sought to explore how Sadler is inspired by the psychological concept of the tacit (Polanyi) and by Wittgenstein’s concept of noticing, which is one particular language game among many alternative language games of assessment. Sadler has undoubtedly advanced a global discussion and understanding of formative assessment over many years, and yet we remain unsure whether he has progressed our understanding as far as it might go. For example, while Sadler notes that the teacher needs to help the student develop evaluative knowledge and skills, he has to some extent underplayed the disparities among students and between students and teachers as regards such knowledge and skills. Secondly, just because something is made explicit and shown to a student does not necessarily mean it is learnt. Lastly, a key component of any assessment is judgement skills based upon evidence from the student, which is weighed up and balanced before an assessment is made and communicated.
Whether this judgement takes on the character of feedback as telling in an ongoing chain of feedback loops or it is assessment of learning in a summative sense, we are dealing with the language game of judgement and this is only one of several possible language games that exist in assessment, such as, but not limited to, promising, speculating, following and voicing a hunch, proposing a comment or grade, and so on. These others remain relatively absent in the debate about assessment practices in general and the debate inspired by Sadler in particular on formative assessment/assessment for learning.

With these points in mind, in the following sections our goal is to ask: what concepts, building upon the work of Sadler, can we introduce to enrich and progress our understanding of future-directed practices of (formative) assessment? We will argue for an understanding of assessment capital in which equality and inequality are central, and in which assessment judgements play an equally crucial role in addressing these concerns.

6.3 Assessment Capital in a Knowledge Society

In this part of our argument, we will draw upon the inspiration of Bourdieu and his concepts of habitus and capital. With these concepts it is possible to understand that assessment for, of and as learning is not necessarily evenly distributed as an evaluative skill and knowledge among teachers and students. They constitute forms of assessment capital that can be accumulated through training and experience. The obverse is that the capital in question might deplete or dissipate if not recalibrated, maintained and actively used over time. In simple terms it might become out of date. Under Bourdieusian logic, sociological systems drive the assessment practices of teachers and students through specific generative mechanisms and structures. As the practices are repeated, they attain a tacit character and become what he calls habitus. They become ‘unconscious schemata’ (Wacquant, 1998, p. 221). This is ‘a system of durable and transposable dispositions which, integrating all past experiences, functions at every moment as a matrix of perceptions, appreciations, and actions’ (Wacquant, 2016, p. 66). In assessment guided by a particular habitus, as it always is, we can grasp ‘the intentionality without intention, the knowledge without cognitive intent, the pre-reflective, infra-conscious mastery that agents acquire in the social world’ (Bourdieu & Wacquant, 1994, as cited in Stahl, 2015, p. 22). It is also important to consider how individuals transform and modify their habitus through their actions and accumulated periods of training and practice; and that habitus is a form of power guiding how one acts when assessing or being assessed. Habitus is, therefore, not something fixed in stone and unchangeable, nor is it neutral; it can be unevenly distributed. Bourdieu’s idea of habitus parallels Polanyi’s terminology of tacit knowledge.
As expressed by the latter, it can be understood as ‘the quiet storehouse of information that exists though it cannot necessarily, or at least easily, be put into words. It is the underlying framework that makes explicit knowledge possible’ (as cited in Johnson, 2016, p. 2). Thus understood, tacit evaluative knowledge and skills as assessment capital, defined below, constitute a framework for assessment that can lead or guide someone to ‘articulate intelligence’ (Polanyi, 1962, p. 79) or the observable behaviour or actions in life. We can also seek to make the tacit, as habitus, explicit, as in the assessment of learning. However, what is made explicit is only what is captured by the criteria, objectives and rubrics. In assessment for learning, there is a steady flow from implicit to explicit and vice versa. For example, a student assessed in English skills
at the end of every week will gradually develop tacit evaluative knowledge and skills as a form of durable disposition, as habitus, determining how they perceive, judge and act in the world. When a teacher offers feedback, the tacit might become explicit and they will understand more deeply what was taken for granted. With assessment as learning the Bourdieusian insight is that evaluative knowledge and skills are obtained by training with examples in a social setting over time, such that they become a deeply ingrained habitus. We should remember that, while Polanyi, like Sadler and Bourdieu, emphasises the role of training to develop the tacit/habitus, Polanyi tends to present a more psychological and individual understanding than the sociological position proposed by Sadler and Bourdieu. Bourdieu’s theory of capital, and assessment capital in this context, is intimately related to his field theory. Individuals follow the rules of the game and exercise them for their own interests in this field. Field can thus be defined as a place where individuals and groups interact, work, and struggle over power – based on a shared set of understandings, beliefs, values, and norms that form the rules of the game. Fields are organized around specific types or combinations of capital. (Broberg, 2015, p. 51)

So essentially a field would include the arena of shared interactions and the systems that operate as agents seek to exercise assessment capital/power for positions in the field. The rules of the game are not always written and, even if they are written, as in the rubrics of assessment, how they are used can remain tacit or part of the habitus and also be differentially drawn upon in the interests of different forms of power. The term ‘game’ in Bourdieu’s perspective is analogous to a card game, where the card holders possess certain cards they can play to out-trump each other. To Wittgenstein, and by extension Sadler, a game is less often understood as out-trumping and differential levels of power. Indeed, for Sadler the language assessment game might be understood differently by teacher and student, but the concept of power remains under-theorised. The exact opposite is the case with Bourdieu’s view, in which power is evident in all places and spaces and on every occasion. Power and the exercise of power through the generative mechanisms and structures of habitus, field and capital reveals or seeks to disguise through its tacit, taken-for-granted nature the origin and maintenance of inequalities between participants in assessment acts. What is gained by using capital and habitus to further progress our understanding of assessment for, of and as learning in the work of Sadler? Assessment of learning constitutes a form of assessment capital in the sense of reaching agreement between teacher and student about what knowledge and skills a student possesses in a summative sense at any point in time along a continuum, and their accompanying evaluative knowledge of this. Assessment for learning is the understanding that assessment capital is transformable and based upon feedback and subsequent adoption or modification to change the level of knowledge and skills possessed over time.
Assessment as learning as assessment capital, on the other hand, is the view that a student or teacher possesses skills in evaluating their own and others’ knowledge and skills at any point in time. For Sadler the goal is always building up assessment capital and a further note is possible with Bourdieu in mind: the different levels of assessment capital between
students and between the student and teacher are an expression of power and the dialectics of power. The levels of assessment capital as assessment for, of and as learning will therefore be differentially distributed and it becomes a moral and learning imperative that the learning and assessment process is directed to reducing these disparities. Moreover, assessment capital can not only be built up but, as suggested above, can also become depleted or dissipated in the sense of ‘use it or lose it’. Such a view is intimately connected with the opportunity to transform and exchange assessment capital with other forms of capital. If we consider assessment as a form of capital, it includes economic components, referring to the value of evaluative knowledge and skills as a measure of future earning potential. Assessment capital as possessed by teacher and student can also be transformed or exchanged into a social component, such as access to social or professional networks of belonging and recognition. Lastly, assessment capital also accrues and can be exchanged for cultural capital in the sense of insights and understanding of shared values in a group of students, teachers and assessors. By using these insights into assessment capital, we become more conscious of how the teacher might be encouraged to transform assessment capital into social and cultural capital in the short term of the university course, or during the period the student is at university. In the longer term after graduation, assessment capital will be more easily measured and transformed into economic capital, through employment prospects promised or realised. Please note, some readers more conversant with Bourdieu’s concepts might prefer the use of the word convert rather than transform, much as one source of capital is converted into another like a currency. Two points remain undiscussed and must be the subject of further research and discussion.
Firstly, how and in what manner can assessment capital disparities be explicitly and consciously reduced? Secondly, what are the consequences of assessment capital disparities in terms of power relations in the school classroom, university tutorial or assessment act? Inspired by Bourdieu, assessment acts will never be power neutral. We can work to reduce the differentials in assessment capital, habitus and its evidence and effects in fields of interaction and systems. But they can never be totally eliminated. By further implication, such a theoretical view suggests that any talk of assessments as neutral and perfectly inclusive of all in every respect is a fallacy. The practice of assessment acts remains a field in which power as assessment capital can accrue or be depleted across and between different participants.

6.4 Assessment Judgements

A key theoretical, practical and moral imperative in Sadler’s work has been to empower the student and also the teacher in assessment. His consistent view over many years has been that the student should develop their meta-cognitive skills to judge and self-assess their own work and that of peers. Following from the previous
section, this would lead to the accruing of assessment capital. But our interest in this section is in the assessment judgements that are so crucial to assessment in any context and not merely in education. Take, for example, the concept of judgement thresholds proposed by Sadler in conversation with Dobson (21 March 2007) and inspired by Tversky and Kahneman (1974). Judgements are understood to be based upon a critical factor or an accumulation of factors to a saturation point that, when reached, results in a judgement.9 An example is purchasing a car after collecting different pieces of information. Klenowski (2007), in a similar vein, has talked of identifying the non-discountable pole in examiner judgements. The non-discountable functions as evidence of a critical threshold criterion and as a consequence it dictates the weighting of the judgement. Sadler (1989), writing about formative assessment, has reflected in more detail upon the topic of examiners’ judgements. As we discussed in Chap. 5, he suggested that when examiners make qualitative judgements on the work of a student two forms of judgement are possible. The judgements are either analytical (with pre-specified criteria and the search for evidence of them in the work of the candidate) or configurational (with a holistic assessment first, and then its substantiation by referring to criteria which may or may not overlap). With configurational judgements, the criteria exist as a tacit pool of potential criteria, from which examiners select according to their relevance to the case at hand. In other words, they become manifest when useful. It is possible to envisage a mixture of the two, or a commuting backwards and forwards between the two: analytical to configurational, and back again. Moving beyond Sadler, we can ask why is it so hard to pinpoint the kind of judgements used by teachers in assessment for, as and of learning?
It might be because assessment as an evaluative language game misconstrues its task, seeking to justify its prescriptive assertions on the basis of different language games, the interrogative and the report language games, both with denotative components. As Lyotard (1986, pp. 27–29) has argued, the denotative component assumes the demonstration of the existence or non-existence of a state of affairs or a particular truth. In some assessment acts, as in seeking to assess a written essay, the truth of a particular assertion can be affirmed, read extracted, from the printed text. But evaluative language games do not deal with such empirical certainties. Nor is the appeal to some predefined and explicit set of grading criteria enough. The criteria on grading sheets must still be interpreted by users, and as such they are potentially vulnerable to the subjective opinions and interests of examiners. These judgements can and do vary in a case-by-case manner, especially when the student work to be assessed has a significant subjective component or the tacit habitus of the examiners varies significantly.

9 Sadler (1981) talked of judgemental bias and the need to understand the different ways in which this might arise, such as an overload of data hindering understanding in a specific situation, what is now more commonly known as cognitive overload, as it has an effect upon cognitive decision making.

The point is simple. Teachers when assessing tend to move between language games with denotative components (through questions and answers and reports) and the evaluative language game when the decision or feedback as a judgement is determined. Note also that even in a multiple choice test the evaluative judgement is still present in determining what counts as a cut-off grade before it is implemented in the allocation of assessment bands, e.g. what is a B as opposed to a C and what counts as a B– as opposed to a C+. The information and hence practice upon which each of these language games rests can be different and mutually exclusive: there is no automatic or necessarily causal connection. It is timely to recall Aristotle’s (1981) famous distinction in Book VI of the Nicomachean ethics between episteme (a scientific, theoretical knowledge denoting a fixed form of ‘know that’, found in language games with denotative components) and phronesis (often translated as practical common sense based upon wisdom, prudence and a ‘know why’). Aristotle explained:

And let it be assumed that there are two parts which grasp a rational principle – one by which we contemplate the kind of things whose originative causes are invariable, and one by which we contemplate variable things; (1981, Book VI, 91–92; link – https://www.sacred-texts.com/cla/ari/nico/nico055.htm)

We consider phronesis in our usage as indicative of the evaluative language game that considers the variable, while episteme concerns itself with the invariable. The teacher may at times use questions, answers and reports to focus upon looking for constancies and certainties, while the judgements made on grades, feedback or decisions may rest upon variables, and change from case to case.
Some, such as Orton (1998), resist this argument, arguing that it is necessary to move from the interrogative and report language games to evaluative language games, because if the latter are not enlisted in assessing candidates they will be treated as objects and not as humans, ‘confusing persons and things’ (Orton, 1998, p. 542). Messick’s (1998) combination of the evidential with values and consequences attempted to do precisely this, including combining the evaluative language game (associated with values and the consequential) with the denotative language game (associated in particular with the evidential, but also with constructs). But in reducing this combination to a unitary judgement, as was his preference, he risked neglecting and overlooking the difficulty of combining them and whether it was at all possible. Markus (1998) has suggested that Messick remains trapped in the impossible project of attempting a synthesis between the realist evidential (the is or that) and the constructivist consequential (the ought). This was despite Messick’s somewhat positivistic rejoinder: ‘If values are justifiable, they are akin to facts, and the tension between the evidential and consequential bases of validity is resolved’ (Messick, 1998, p. 36). Perhaps there is no answer to this paradox, when framed as above10 in the manner of discontinuous, discrete language games. The danger with such an argument, however, is that assessment judgements are no longer seen to rest upon what is
affirmed in the particular essay or piece of work under examination and its connection with theoretical knowledge (episteme), but upon the opinions of examiners and their interests and competences as judgements are made (phronesis). There is arguably no definitive response to such a danger, other than to say that it is worthy of further debate within and outside this current text. One avenue is to engage with the work on students’ and teachers’ evaluative judgements and the bias that both may hold as they make an assessment. The biases may be unconscious, such as valuing what we already sense we understand or own about something learnt or taught, the so-called endowment effect (Joughin et al., 2019). Such views might be usefully discussed as forms of tacit knowledge and evaluative skills. Another avenue is to consider whether all assessment and evaluation is based upon comparison with your own work or that of another. (This line of thought goes back to Thurstone in the 1920s in his work on comparative pairs, whereby we are most comfortable recognising and comparing traits in the different works under consideration; Thurstone, 1927, 1928.) For us, such views might be understood as dependent on what we choose to notice or not notice and how.

10 Inspired by Aristotle’s (1981) concepts of episteme, techne and phronesis.

6.5 Closing Comments

In this chapter, we have drawn attention to how Sadler has sought to understand assessment for, of and as learning with his own particular emphasis on formative assessment. Over many years he has made a number of noteworthy theoretical interventions with implications for assessment practices. It has been our contention that his work can be understood by looking at theories of the tacit as proposed by Polanyi and the work of Wittgenstein on noticing and language games (only partially adopted by Sadler). For Sadler the common thread has been about developing the evaluative skills and knowledge of the teacher and the students; what we have highlighted as assessment as learning. We have also sought to progress the work of Sadler by looking at assessment capital and assessment judgements as two examples of generative mechanisms and structures influencing the enactment of assessment practices. The theory of assessment capital highlights that teachers and students accumulate assessment capital differentially and do not always possess, or come to possess, the same understanding or evaluative skills. As regards assessment judgements, it is also crucial that we consider the intricacies and challenges of combining the evidential with the making of decisions and judgements that have consequences of a formative and summative character. This may also require teachers to consider the ‘calibration and/or moderation’ of their judgements (Sadler, 2013; Wyatt-Smith et al., 2017). Our ideas have commonalities with Wyatt-Smith’s (2009) understanding of assessment as ‘critical inquiry’ that involves a lens (we would write instead a noticing) to characterise what is at play (read instead language game or games), to emphasise how student achievement is evaluated and therefore valued (read assessment capital accumulated and used).

What remains, and this must be the task for the reader, is to critique the views we have presented in this chapter and in so doing engage with and develop additional theoretical and applied approaches to formative assessment. Another important task is to critically work through our ideas with culturally and contextually sensitive examples from different assessment systems located in other education eco-systems. Take, for example, how English is both taught and assessed in many countries in the world; and how those learning and teaching methods have different practices of noticing and possess different assessment capitals that develop in a formative and summative sense. Such an approach remains sensitive to the importance of a comparative case-by-case methodology, without losing sight of the desire to up-scale and standardise our assessment judgements in and through assessment policies. We begin to consider this cultural aspect in the final chapter, but longer case studies must be the subject of a subsequent book to follow in due course.

References Aristotle. (1981). The Nicomachean ethics (J. Thomson, Trans.). Penguin Dissertations. Biggs, J. B., & Tang, C. (2011). Teaching for quality learning at university: What the student does (4th ed.). McGraw-Hill, Society for Research into Higher Education and Open University Press. Boud, D., & Falchikov, N. (Eds.). (2007). Rethinking assessment in higher education: Learning for the longer term. Routledge. Broberg, G. (2015). Seeing is achieving: Assessment practice and student capital (PhD thesis). Arizona State University. Brown, G., Bull, J., & Pendlebury, M. (1997). Assessing student learning in higher education. Routledge. Dann, R. (2014). Assessment as learning: Blurring the boundaries of assessment and learning for theory, policy and practice. Assessment in Education: Principles, Policy and Practice, 21(2), 149–166. Dinishak, J. (2013). Wittgenstein on the place of the concept ‘noticing an aspect’. Philosophical Investigations, 36(4), 320–339. Dixson, D. D., & Worrell, F. C. (2016). Formative and summative assessment in the classroom. Theory Into Practice, 55(2), 153–159. Dobson, S. (2017). Assessing the viva in higher education. Springer. Dobson, S., & Scofield, E. (2020, April 8). The rush to online-ness. Newsroom. https://www.newsroom.co.nz/ideasroom/the-rush-to-online-ness. Accessed 21 Oct 2021. Earl, L. (2013). Assessment as learning: Using classroom assessment to maximize student learning. Corwin. Grover, V. K. (2016). Classroom cybernetics: An approach for effective and efficient classroom teaching. International Journal of Research in Advent Technology, 4(1), 45–52. Harlen, W. (2007). Assessment of learning. Sage. Heron, G. (2011). Examining principles of formative and summative feedback. British Journal of Social Work, 42(2), 276–295. Heim, A. (2023). Cultural perspectives on indigenous students’ reading performance. Springer. https://doi.org/10.1007/978-981-19-9790-7_8 Jackel, B., Pearce, J., Radloff, A., & Edwards, D. (2017). 
Assessment and feedback in higher education: A review of literature for the Higher Education Academy. Higher Education Academy. Johnson, S. (2016). Tacit knowledge: An assessment of Michael Polanyi’s epistemology (MDiv thesis). Regent University.


Joughin, G., Boud, D., & Dawson, P. (2019). Threats to student evaluative judgement and their management. Higher Education Research and Development, 38(3), 537–549.
Klenowski, V. (2007). Evaluation of the effectiveness of the consensus-based standards validation process. Department of Education, Training and the Arts.
Lyotard, J.-F. (1986). The post-modern condition of knowledge. Manchester University Press.
Markus, K. A. (1998). Science, measurement, and validity: Is completion of Samuel Messick’s synthesis possible? Social Indicators Research, 45, 7–34.
Messick, S. (1998). Test validity: A matter of consequence. Social Indicators Research, 45, 35–44.
Mika, C. (2017). Indigenous education and the metaphysics of presence: A worlded philosophy. Routledge.
Orton, R. E. (1998). Samuel Messick’s consequential validity. Philosophy of Education, 1998, 538–545.
Pereira, D., Flores, M. A., & Niklasson, L. (2015). Assessment revisited: A review of research in Assessment and Evaluation in Higher Education. Assessment and Evaluation in Higher Education, 41(7), 1008–1032.
Plato. (1956). Meno. Penguin Dissertations.
Polanyi, M. (1962). Personal knowledge: Toward a post-critical philosophy. Routledge.
Polanyi, M., & Sen, A. (2009). The tacit dimension. University of Chicago Press.
Sadler, D. R. (1981). Intuitive data processing as a potential source of bias in naturalistic evaluations. Educational Evaluation and Policy Analysis, 3(4), 25–31.
Sadler, D. R. (1982). Evaluation criteria as control variables in the design of instructional systems. Instructional Science, 11(3), 265–271.
Sadler, D. R. (1983). Evaluation and the improvement of academic learning. Journal of Higher Education, 54(1), 60–79.
Sadler, D. R. (1985). The origins and functions of evaluative criteria. Educational Theory, 35(3), 285–297.
Sadler, D. R. (1987). Specifying and promulgating achievement standards. Oxford Review of Education, 13(2), 191–209.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119–144.
Sadler, D. R. (1998). Formative assessment: Revisiting the territory. Assessment in Education, 5(1), 77–84.
Sadler, D. R. (2002). Learning dispositions: Can we really assess them? Assessment in Education: Principles, Policy and Practice, 9(1), 45–51.
Sadler, D. R. (2005). Interpretations of criteria-based assessment and grading in higher education. Assessment and Evaluation in Higher Education, 30(2), 175–194.
Sadler, D. R. (2007). Perils in the meticulous specification of goals and assessment criteria. Assessment in Education: Principles, Policy and Practice, 14(3), 387–392.
Sadler, D. R. (2009a). Grade integrity and the representation of academic achievement. Studies in Higher Education, 34(7), 807–826.
Sadler, D. R. (2009b). Indeterminacy in the use of preset criteria for assessment and grading. Assessment and Evaluation in Higher Education, 34(2), 159–179.
Sadler, D. R. (2010). Beyond feedback: Developing student capability in complex appraisal. Assessment and Evaluation in Higher Education, 35(5), 535–550.
Sadler, D. R. (2011). Academic freedom, achievement standards and professional identity. Quality in Higher Education, 17(1), 85–100.
Sadler, D. R. (2013). Assuring academic achievement standards: From moderation to calibration. Assessment in Education: Principles, Policy and Practice, 20(1), 5–19.
Sadler, D. R. (2014). The futility of attempting to codify academic achievement standards. Higher Education, 67(3), 273–288.
Stahl, G. (2015). Egalitarian habitus: Narratives of reconstruction in discourses of aspiration and change. In C. Costa & M. Murphy (Eds.), Bourdieu, habitus and social research: The art of application (pp. 21–38). Palgrave Macmillan.


Tai, J., Canny, B., Haines, T., & Molloy, E. (2016). The role of peer-assisted learning in building evaluative judgement: Opportunities in clinical medical education. Advances in Health Sciences Education: Theory and Practice, 22(3), 659–676.
Thurstone, L. (1927). A law of comparative judgement. Psychological Review, 34, 278–286.
Thurstone, L. (1928). Attitudes can be measured. American Journal of Sociology, 33(4), 529–554.
Torrance, H. (2001). Assessment for learning: Developing formative assessment in the classroom. Education 3–13, 29(3), 26–32. https://doi.org/10.1080/03004270185200331
Torrance, H. (2007). Assessment as learning? How the use of explicit learning objectives, assessment criteria and feedback in post-secondary education and training can come to dominate learning. Assessment in Education: Principles, Policy and Practice, 14(3), 281–294.
Turner, S. (2012). Making the tacit explicit. Journal for the Theory of Social Behaviour, 42(4), 385–402.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.
Wacquant, L. (1998). Pierre Bourdieu. In R. Stones (Ed.), Key sociological thinkers (pp. 215–229). Macmillan Education.
Wacquant, L. (2016). A concise genealogy and anatomy of habitus. The Sociological Review, 64(1), 64–72.
Weisler, S. (2015). Some perspectives on assessment of student learning. Journal of Assessment and Institutional Effectiveness, 5(2), 117–130.
Weurlander, M., Söderberg, M., Scheja, M., Hult, H., & Wernerson, A. (2012). Exploring formative assessment as a tool for learning: Students’ experiences of different methods of formative assessment. Assessment and Evaluation in Higher Education, 37(6), 747–760.
Wiliam, D. (2010). The role of formative assessment in effective learning environments. In H. Dumont, D. Istance, & F. Benavides (Eds.), Educational research and innovation: The nature of learning: Using research to inspire practice (pp. 135–159). OECD Publishing.
Wittgenstein, L. (1967). Philosophical investigations (G. E. M. Anscombe, Trans.). Basil Blackwell. (Original work published 1953).
Wittgenstein, L. (2001). Tractatus logico-philosophicus (D. Pears & B. McGuinness, Trans.). Routledge. (Original work published 1921).
Wyatt-Smith, C. (2009). Toward theorising assessment as critical inquiry. In C. Wyatt-Smith & J. J. Cumming (Eds.), Educational assessment in the 21st century: Connecting theory and practice (pp. 83–102). Springer.
Wyatt-Smith, C., Alexander, C., Fishburn, D., & McMahon, P. (2017). Standards of practice to standards of evidence: Developing assessment capable teachers. Assessment in Education: Principles, Policy and Practice, 24(2), 250–270.
Yang, G. Y. (2021). An investigation of how cultural factors influence Australian Aboriginal students’ reading performance: An exploratory and participatory case study (PhD thesis). University of South Australia.

Chapter 7

Moving Assessment in New Directions

You could say: disorder is when nothing is in the right place. Whereas order is when the right place has nothing at all. These days, you tend to find order where there isn’t anything. It’s a symptom of deprivation. (Brecht, 1956/2019, p. 13)

Ocean between tūpuna (grandfather) and mokopuna (grandchild) on the porch always to be re-painted peeling a story silver feathered silence strong backed and stubborn carried by the whistling spring to Ocean our Ocean the sail cloth stained red red the fish heads thrown back to bait the dreams of other childhoods of eyes turned down and precious thoughts kept for another day of arms moved inward in sleep our shelter. (Stephen Dobson, unpublished poem)

Abstract This chapter weaves together the different threads of preceding chapters. Clusters of generative mechanisms and structures from each chapter are re-presented, along with the corresponding kinds of assessment acts and practices. How to transform the assessment practices of an institution, in this case a university, is discussed using the concept of language games of assessment, along with assessment capital and assessment habitus. A layer of complexity is added for the reader when different world views of assessment are introduced into the discussion. This involves the drawing together of (a) Māori language games and forms of life connected with aromatawai, which means ‘to take notice of’ or ‘pay attention to’ and ‘to examine closely’ and (b) traditional western language games of assessment that include assessing in a valid, reliable, authentic, equitable and informative (transparent) manner.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2_7


The first quotation is intriguing for several reasons. We will draw attention to one. It is by the famous dramatist Bertolt Brecht who always wanted those attending his plays to be actively involved. Theatre for him was not a catharsis so that we could return to our lives outside the theatre feeling emotionally cleansed and ready to carry on as normal. Accordingly, theatre was not intended to be a relaxing entertainment; it was to be transformative, making us question the existing state of things and as a result to see and think differently. This approach matches and inspires our desire that assessment should not be merely an accompaniment to a period of study and learning in lessons or some other activity that is considered more important and enjoyable (or ‘fun’). For us assessment, like Brecht’s theatre, can be a source of learning that is valuable in itself. We might even say it should have didactic value and teach us something. If we ask somebody to recount a situation where they have had an oral assessment or a group presentation, they will most likely remember it vividly. They will give an account of the event, what they learnt from the assessment itself and how it is also a good memory hook for the knowledge, skills and values they were sharing with the listener at that time. By comparison, the sitter of multiple written exams may not as easily lay their finger on such things. They might all merge into each other, with the detailed knowledge quickly becoming blurred. The conversation soon turns to cramming and short-term memory and the need to have quick recall and good cognitive processing skills. The rationale adopted is that assessment requires learning how to think and perform more quickly with shorter and shorter reaction times. Historically the viva also has had a clear didactic goal and has taught the participant lessons. 
As Dobson recounts:

In Greco-Roman times, public disputation among philosophers had become less a give and take, where questioner and answerer exchanged positions. For example, Plotinus (205–70 CE), a neo-Platonist, allowed his students to interrupt with questions. But it was not the more open Socratian question and answer; rather he dealt with questions such that they furthered his teaching on a particular theme. The role of questioner and questioned was fixed. The philosophical disputation therefore had an increasingly different goal in mind, one which was protreptic. It resulted in the instruction and socialisation of less advanced students into a more domesticated and controlled dialectical way of arguing. (2017, p. 9)

As this excerpt indicates, citing a word we rarely use today, namely protreptic, investigating the history of assessment reveals different twists and turns as education becomes a clear objective. Protreptic means to instruct and persuade; it derives from the Greek protreptikos, to turn forward, urge on. A particular assessment practice and its accompanying language game can of course discipline and control participants, and this would be one reading of the poem above, as newcomers to a profession, such as teachers, might adopt the assessment practices and games of their more experienced and older teachers (between tūpuna and mokopuna, i.e. between the older and the younger, grandparents and grandchildren). Another reading would be that when an experienced person recounts an assessment practice to younger counterparts it may not evoke a fond memory. Instead, it might suggest, ‘you are lucky these days, your assessments are easier and
require less learning and a lesser demonstration of it.’ This may be what is shared, even when it is not the actual case and contains an exaggeration that cannot be proved or disproved. Accordingly, the form of assessment can discipline and control participants, even though teachers and students are at the same time learning something from the experience of assessment itself. Clearly stated rubrics can be considered at times to be one such mechanism of discipline and control, defining what is in scope and out of scope to be assessed. With this in mind, a major reason for our chapter on assessment and connoisseurship was not only to seek an answer to the relatively simple question: How can creativity be assessed? Our deeper intention was to understand that assessing with rubrics is only one side of the practice of assessment. Using experience as a measure and assessment resource suggested for us that the assessor can also decide to be a connoisseur who draws upon a wide set of reference points and information including their personal experiences to value, judge and therefore assess the work of the student. Where the supporter of rubric-guided assessment is clear about what can and cannot earn recognition and grading points, the assessor as connoisseur always has to be ready to consider how to assess something they might not have anticipated. Put differently, they might themselves learn something from the act of assessment, so that it becomes a memorable act in itself for them and not a routine and potentially burdensome pile of assignments to mark. We might be accused of proposing a simple dichotomy between the assessor as connoisseur and those who represent the other side. The latter may include those who routinise and ‘quality assure’ the assessment by controlling those who wish to break the rules expressed in the assessment policy, for example through cheating or asking for endless extensions. 
Such a position would of course be an oversimplification, as we will always need to strive to follow and reach agreement on the language games of assessment and the rules, norms and policies governing assessment actions in different situations. However, if, as a consequence, assessment is ‘only’ used to enforce and reinforce the behaviour of those assessing and those being assessed according to existing rules and regulations, we will miss the potential richness of assessment. What is missed includes assessment that can consider unexpected situations and address the needs and diversity of all the students undertaking assessment. To take a simple example, given appropriate forms of assessment, students who come from strong oral backgrounds might achieve grades as good as those of students used to written assignments. This suggests that multiple ways of assessing the same constructs (e.g. knowledge of geography or history or music) should be permitted. In the currently fashionable parlance of the language games of assessment, we are talking of ensuring accessible assessment (Round Table on Information Access for People with Print Disabilities, 2019). Moreover, assessment is embodied in the habitus of individuals and institutions. It is carried in our bones, so to speak, and as such is not always easily verbalised. It is tacitly known and can be mobilised when relevant to make assessments. As the old adage suggests, we are not always able to say or describe what characterises quality, but we know it when we see it. The example of assessing students in
performance-based disciplines and professions comes to mind, such as those studying vocational subjects, sport or the arts or even something like tourism. As we noted in an earlier chapter, citing two Norwegian-based assessment researchers:

One of the really big challenges for the assessment of music in the classroom involves what Eisner (1985, p. 54) called expressive objectives. Expressive objectives are relevant in learning activities where students explore and experience wonderment, try and fail, are engaged in creativity and allowed to focus upon aspects in the respective activity that holds special, personal interest. These learning activities are strongly evident in improvisation and composition. Here, experience and exploration are central elements and qualities to be developed in student creativity, students’ perceptual skills … etc. For the development of such expressive qualities it will be difficult or impossible to set criteria or behavioral goals in advance, for the simple reason that there exist numerous solutions that can be just as ‘good’ or as ‘true’. (Særte & Vinge, 2010, p. 172; translation by Dobson)

It is important to acknowledge the continual ebb and flow between assessment governed by practising the rules and using the given rubric and something more. The latter could include assessment seeking to respond sensitively and creatively to the diversity of students and giving them all an equal opportunity to demonstrate the full extent of their knowledge, skills and values. A dialectic between these two positions is required and not a dichotomy. There will always be a place and time for giving more weight to one or the other. This is very different from excluding one in favour of the other. Of course, we acknowledge that this is easy to identify as a goal, but much more demanding to implement as an assessment act and accompanying set of practices voiced as an assessment policy.

7.1 What Is Assessment?

One reply to this question might be to draw attention to assessment of, for and as learning as three generative mechanisms and accompanying structures of assessment. By this we mean it might be possible to introduce, grow and identify for each of these types of assessment specific cultures and practices with accompanying language games, assessment capitals and habituses embodied by practitioners, students and institutions. Some might choose to reduce assessment as learning to a special case of assessment for learning, thus retaining only a simple dichotomy between assessment for learning and assessment of learning. We have argued for the retention of this tripartite distinction, for one key reason. We are happy to agree that assessment for learning obviously intends that the recipient will act on the feedback and take it to heart, so to speak. However, on a deeper level assessment for learning is dependent on one party offering feedback to another, whether it is teacher to student, student to fellow student or even an automated chatbot offering immediate feedback. Assessment as learning relies upon a different generative mechanism and structure. It relies on individuals learning of their own accord from assessment. Put differently, this self-referential aspect or, in the terms of Bourdieu, self-reflexiveness is the leitmotif of assessment as learning.


Assessment of learning is a final point or summary of knowledge, skills and values acquired or, as noted, what remains after everything learnt has been forgotten (i.e. those identity-forming elements of twenty-first-century soft skills). The foundation of assessment for learning is that the learning is still taking place and it is not too late to focus on the process and make adjustments to learning. These three theoretical understandings of assessment carry with them implications for power and the relations between those involved. Assessment of learning suits those who can perform well in the final summing up of what has been learnt, under the pressure of the ticking clock in an exam for example. Correspondingly, the marker of these assessment actions has the sole power to set the grades and those subjected to the interpretation of rubrics may always feel (justified or not) that some form of tacit evaluative judgement and secret language game belonging only to the initiated has been at play. Assessment for learning allocates the power more equally, as it is hoped that the recipient is motivated to react to the feedback received. As two assessment scholars put it in their definition of feedback for learning: ‘Information which a learner can confirm, add to, overwrite, tune, or restructure … domain knowledge, meta-cognitive knowledge, beliefs about self and tasks’ (Winne & Butler, 1994, p. 5741). We return to our point above that assessment as learning is in our view the most overtly democratic and participatory. It rests upon the self-assessment skills developed by the student. Of course, not all students will take this opportunity to heart. Caution is also required. As Zhudi and Dobson (2021) have pointed out in reflecting upon Indonesia’s Freedom to Learn policy, it is noble in intention, but much harder to realise in practice. In late 2019, Indonesia sought to revolutionise the country’s education system – long criticised for its focus on rote learning. 
A series of national education policies dubbed the ‘Merdeka Belajar’ movement, or ‘Freedom to Learn’ were launched. Indonesia’s Freedom to Learn movement is not the first of its kind. Norway, for instance, implemented similar policies on customised learning in 1994. Reform94 sought to give teenage students more control of their learning. The policy focused on regularly giving students more choice and responsibility to work with teachers in designing their learning activities. However, national evaluations (Trippestad, 2011) found that stronger students possessed sufficient self-motivation to learn by themselves. Most other students did not. Students were reliant on teachers deciding what and how they were to be taught and assessed. When online learning surged due to COVID-19, many recognised a similar pattern of students becoming even more isolated and left without guidance. The pandemic taught us the importance of digital learning environments that are well designed to retain students’ attention. For some teachers, this might mean breaking up sessions that usually last for hours into 30-min or shorter bursts of taught lessons presenting learning material, followed by students undertaking self-directed individual or group activities spread across the remainder of the school week or the day. This way, students are able to engage with supporting digital learning resources to learn about DNA, for instance, at their own pace.


Practices such as these have been long employed in New Zealand’s largest school, called Te Kura (2021) and are cornerstones of a well-designed online learning environment. Te Kura was founded in 1922. In pre-digital times, teaching resources and assignments were sent to and returned by students via (snail) mail for grading and feedback. Te Kura’s student cohort draws from across the whole of New Zealand and we could think of it as belonging to what was once the correspondence school tradition for those who could not attend school for whatever reason or because a particular subject was not offered in the normal face-to-face school in their area. Long before the pandemic, Te Kura had adopted teaching models varying from fully online to face-to-face sessions when required or possible. The learning resources and platform were tried and tested, and teachers already knew how to manage the difficult task of engaging students when COVID-19 hit. With the flick of a switch, they were able to move fully online. In seeking to answer the question ‘What is assessment?’ it is not therefore enough to be satisfied with the suggestion that we focus just upon the three terms assessment for, of and as learning. Each of the chapters in this book has drawn attention to different generative mechanisms and structures of assessment. Assessment for, of and as learning are one such cluster of generative mechanisms giving rise to structures (and accompanying language games). Another cluster concerns the challenge and skills, let us call it assessment literacy and experience, to measure creativity in all school subjects and in particular in those that are vocational, artistic or practice based. This evidences the importance of the assessor acting in the manner of a connoisseur. Following this line of argumentation the contention is that assessment takes on the appearance and reality of being an art and we might even call those assessing in such a way ‘assessment artists’. 
They are interested in what Eisner (1985) calls expressive objectives, or more simply what takes place in the process, along with the final product. The list of generative mechanisms and structures (and accompanying language games) is not finite, and in the remainder of this chapter we will highlight other clusters we have to date uncovered in this book. Ultimately, they are but a snapshot of what we as two authors have been exposed to in our experience of lifelong learning in different times, places, cultures and settings, some more institutionalised than others.

7.2 What Can We Learn About the Relationship Between Society and Assessment Practices?

In ‘New forms of society: New forms of assessment’ (Chap. 2) we considered the dialectical relationship whereby generative mechanisms and structures in society impact the practices of assessment and vice versa. This is very much the topic of Stobart’s book from 2008, Testing times: The uses and abuses of assessment, although he does not use the conceptual framework of critical realism to support his
argument. We noted in the introduction that his book is eclectic in theoretical style, drawing upon the theories of assessment and constructs pertinent to a selected topic, and then moving to new topics, theories and constructs. However, there is something of additional note in his book: the elegant manner in which he argues that assessment is a social construction that constructs both pupils and teachers. It forms their identities as successful, weak or just plain average and this is something that they take with them into other spheres of life. As such, assessment for Stobart should always be understood as a potential opportunity for learning. Too much of a focus on grades, targets and achievement can mean that this opportunity is largely missed and there is little evidence of a backwash effect upon teaching and learning. As he writes, ‘assessment does not objectively measure what is already there, but rather creates and shapes what is measured – it is capable of “making up people”’ (Stobart, 2008, p. 1). Following Stobart’s basic proposition as indicative of the interrelationship between societal and assessment generative mechanisms and structures, we traced our ever-increasing dependence on the knowledge-based society, as reflected in changes in the manner in which knowledge is assessed. Thus understood, in the last 70 years criterion-based assessment has gained a greater and greater foothold with the move to competence-based assessment with threshold levels of achievement. Norm-based assessment has by no means disappeared and we still have evidence of it in popular university disciplines such as psychology and law. A norm-based grade cut-off is often used to thin out the number of students permitted to study in each successive year in professional degree-based programs. 
We see here how societies and institutions use assessment to select and exclude those who would otherwise be permitted to accrue greater amounts of cultural capital, materialised in the form of transcripts and parchments with symbolic value. There are further examples of the persistence of norm-based measures of knowledge, skills and values. Consider the well-known view that intelligence levels across a society are not stable, resulting in the need for the difficulty level of IQ tests to be periodically recalibrated to ensure a similar number of people continue to achieve similar test scores over time. Another example we considered is the so-called ‘diploma disease’, whereby more qualifications are required over time for the same jobs in the labour market. This is also indicative of societal changes that similarly impact upon the value placed upon different kinds of qualifications, or in this case the need for formal qualifications. The global interest in micro-credentials suggests resistance to norm-based measures of entry to courses of study. Just as IQ tests have to be periodically recalibrated as generations perform better (the Flynn effect; Baker et al., 2015), universities have traditionally sought to make entry to their programs and qualifications dependent on grades. But it is not so simple, and universities have also shown interest in micro-credentials as a pathway into their degrees, accepting them as credit towards courses. This is indicative of the way in which the assessment mechanism, in this case the micro-credential, leads to changes in assessment such as guidelines by national bodies on what is to be considered a micro-credential and its value. The New Zealand Qualifications Authority is a case in point and the regulations as of 2021 are as follows:

At a minimum, micro-credentials will be subject to the same requirements as training schemes or assessment standards and will also be required to:

• be 5–40 credits in size
• have strong evidence of need from employers, industry and/or community
• not duplicate current quality assured learning approved by NZQA
• be reviewed annually to confirm they continue to meet their intended purpose. (NZQA, n.d.)

We have talked of an ebb and flow between greater control over assessment through rule-governed rubrics setting assessment parameters and the opposite, the need at times to loosen the control as other ways of assessing are required, such as the need to consider a wider set of knowledge, skills, values and experience possessed by student and assessor. What we witness here is another form of this ebb and flow. On the one hand, it is the movement brought about as societal pressures through a norm-based use of assessment include some and exclude others. On the other hand, through societal pressures working in the opposite direction, there is an interest in forms of qualifications and assessment that have fewer barriers to entry, progression and completion.

7.3 Why Is Motivation So Important in Learning and Assessment?

It is not unusual to find among those offering professional development in assessment, and those taking it, an interest in theories and practices of motivation and learning. It is summed up by the sigh, ‘If only I, as a student or as a teacher, knew the key to being totally engaged or making others engaged, whether in face-to-face learning or online’. Of course, as some might argue, the answer may lie with gamers and the dream to make all learning like games, so-called gameducation or gamification. McCallum et al. (2021) have suggested an answer in their short piece entitled ‘Gamers know the power of “flow” – What if learners could harness it too?’ In their words:

Gamers (athletes, too) experience this flow state when totally engaged in the game. Living in the moment and the experience, the activity is effortless and there is no sense of time passing. Students can also experience flow, and this is when learning is at its most productive. So, the challenge in education is to plan for and achieve that level of engagement. Flow is and always will be the gold standard … As Marshall McLuhan famously said, ‘the medium is the message’. Understanding how games grab and hold attention can help with the design and implementation of new online learning tools.

What we are alluding to is in many senses the holy trinity of education, namely a triangular relationship between, first, the social practices of assessment, second, motivation and learning, and last, the self. All are required and each interacts to strengthen
or alternatively weaken the other. This triangle suggests the bringing together of three distinct traditions of thought and practice. For some, primacy is to be given to motivation as the engine of learning, and assessment as a social practice is required in some form or other to make a judgement on how far the learning has progressed. The self in this triangle is at times viewed as a passive vessel or carrier of the motivation and learning, where learning might also be of a tacit character (i.e. ‘what remains after I have forgotten everything I have learnt’) and at times the self is used as a visual and corporeal metaphor, where the face is frowning and the body contorting as motivation and learning are clearly visible and connected with embodiment. We are reminded of the movement known as body-based pedagogy, which has shown great promise in teaching subjects such as numeracy. This tradition draws upon drama and the use of the body (Garrett et al., 2018). Bjørkvold (1992) was an early precursor of recent debates on body-based learning, showing how motivation and learning have throughout history been woven into cultures through music and the body, from toddlers to teenagers and then through adults to old age. We have also proposed, with critical realism in mind, that an active and holistic meaning of the self is mirrored in the generative mechanisms and structures of what is basically an uneasy unity of different parts. This considers the self as an enduring dispositional entity, expressed as laminated or structured by conscious, unconscious and affective forces. These elements of the self in combination give rise to a sense of self, or habitus in the embodied terminology of Bourdieu, characterised by self-esteem, self-belief and a willingness (or the opposite) to reach out and include others and other forms of shared experience.
The motivation and learning side of the triangle refers to different generative mechanisms and structures where motivation can be a cause or alternatively an effect of learning. In Chap. 4 on motivation, we had cause to present and reflect upon different theories of motivation from the cognitive to the emotional, from the individual to the shared, from the intrinsic to the extrinsic, and the whole idea of self-efficacy. To bring these together we drew upon framing questions suggested by Broussard and Garrison (2004):

• Can I do this task?
• Do I want to do this task and why?
• What do I have to do to succeed in this task?

However, as we have suggested above, some motivation and learning is not directly task oriented. It is tacit. By this we mean it just follows or accompanies the activity, as when we play football and over time are motivated to learn the importance of greater concentration, strategic insight, ball skills and what it means to belong to a team. Over time the motivation and learning become ritualised, automatic and embodied, indicative of what some might call muscle memory. They become part of a learner’s or teacher’s habitus at the personal level of the self, but also, as we have suggested in our later chapters using Indonesian examples, they become part of the institutional habitus in which the learning takes place and to which the individual contributes.
So, what might this all mean with respect to the generative mechanisms and structures of assessment as the final side of the holy trinity identified above? In our chapter on assessment and motivation we walked the reader through the different traditions of assessing group work. One point we made was the change in the level of comfort with group grades: in the 1970s and 1980s, and even into the 1990s, participants were more than happy to receive a group grade for their work, whereas those in the following years have been increasingly keen to be graded for their particular efforts within the group process and their specific contribution to the final group product or presentation. This trend touches upon the manner in which the individual in group study may or may not buy into the need for the collective grade and the very real possibility of free riders. The dynamics of group work on student projects teach about the role of motivating the other and taking responsibility for not only your own learning but that of other group members. It is apt in this connection to recall the phrase that has passed into Norwegian folklore, coined by one of the country’s most influential football trainers, Nils Arne Eggen. It goes something like this in a free translation into English, ‘By all means work on your own area of expertise, where you are strongest and most creative, but never forget to make the others play better’ (Eggen & Nyrønning, 2003). The point is that as the group of students work on a set project an interweaving inevitably occurs between personal motivation and the motivation of the group as a whole. In the scholarship on motivation this is framed as the debate on personal self-efficacy and collective self-efficacy.
Let us recall how Sartre (1991) famously described this as a dialectical process where a group can move in pendulum fashion from what he called a serial group entity (individuals regarding themselves as individuals in competition with others) to a group as collective (group-in-fusion) and back again. For him the key factor in forming the group as a collective was an external threat to the group that unified and strengthened their shared sense of resolve, purpose and equality. The evidence for the generative mechanisms and structures he identified was not psychological tests or experiments, but an understanding of history and the French Revolution. The revolutionaries were able to collectively act as a group-in-fusion when they faced a shared outer threat – they stormed the Bastille. However, in the following period this sense of group we-ness broke down and members of the group began to fear and mistrust each other. The external threat for students in group work might be found in the motivation to achieve a good shared grade or the fear that if they behave too individualistically they might let their fellow group members down. In the philosophical tugs and pulls between Sartre and his one-time close friend Merleau-Ponty we find this played out differently. The early work of Sartre, in his novel Nausea (2000) and in Being and Nothingness (1984), struggles with the existential understanding of motivation that is to be measured and assessed by the individual and all they are worth. In his later work on dialectical reason and revolutions he talks of the moments of collective we-ness that intersperse what is fundamentally still a fallback to the responsibility of the individual to make choices and assessments about historical and daily moments and how to act. In other words, throughout the work of Sartre there is a tacit understanding that motivation is ultimately a
question for the individual and to what extent they are or are not motivated by external threats faced by the group or by individuals acting as threatening other. Merleau-Ponty’s view is different. Our sense of connectedness is paramount in our motivations, and it is always experienced through our bodies and how they position us with and through others to undertake actions, use language and make assessments and evaluations. Merleau-Ponty (1968) introduced the term chiasm, by which he meant the shared inter-corporeal space of bodies, signs (signifier, signified) and history working together and giving rise to corporeal experiences of touched–touching, seen–seeing, speaking–spoken to and so on. A good example of this is the handshake or the conversation. In the former, who is touching and who is touched are indivisible. So too in a conversation, where the moment of speaking is indivisible from listening, as the speaker also listens to their own voice and listens to how they are heard and received by the other. Before an argument against this is raised, let it also be added that the inter-corporeal space might be mediated or facilitated by digital, online interfaces. With Merleau-Ponty’s approach in hand we arrive at the view that the assessment and motivation of the individual are inevitably interwoven with others and the group. Bourdieu’s institutional habitus arguably stands in a direct lineage from Merleau-Ponty. When we look at assessment the point is simple: considering it as something that can be tightly drawn around an individual is to immediately ignore and abstract away from the social and cultural context or field (Bourdieu) in which it is played out. Assessing oneself and others is thus imbued with the desire to assess the group in all its fullness as a site of motivation and learning and as a site in which the self can be played out as a sense of personal and shared belonging. We are now ready to consider this challenge in the next section of this chapter.

7.4 How We Value and Assess Ourselves and Others and with What Language

Most have come across the saying, ‘Do we assess what we value or only value what we are able or have chosen to assess?’ This is an important reminder and worth asking ourselves periodically as a check and balance that can spur us to recalibrate or modify our assessment acts and practices. It highlights that assessment is intimately intertwined in dialectical fashion with the values of a society, institution, group or individual. Another way of putting it is to say that assessment is intertwined with the assessment cultures we create to mirror our values. A good example is the SAT university entrance exam in the USA, which has periodically been re-examined to see whether it could work in a more inclusive manner for diverse groups. There is in such a view an acknowledgement that there is no such thing as a culturally neutral form of assessment, even though we aspire to reduce the effect of one culture privileging itself over other cultures, including cultural disparities of a socio-economic, gender, ethnic or some other character.
We addressed these concerns, admittedly somewhat indirectly or obliquely, in Chap. 6 where we sought to praise and critically appraise the work of Sadler. As we noted, from the 1980s through his work on formative assessment and related issues he has been the flagbearer for acknowledging the importance of assessing what we value and not merely valuing what we are able or willing to assess. Of course, before Sadler, Messick (1989) had also highlighted the importance of values in his work on assessment validity, where it was understood as part of an ongoing process of assessment validation and not a once-and-for-all judgement on a form of assessment or its practice. Accordingly, Messick was concerned with the ‘appropriateness, meaningfulness and usefulness’ of the assessment and he quoted Cronbach on this point: ‘the argument must link concepts, evidence, social and personal consequences, and values’ (1989, p. 19). Sadler’s work on assessment in the view of many has been unrivalled and in Chap. 6 we positioned his work in the debate on formative assessment, or assessment for learning as it is sometimes known. We asked if it were possible or even necessary to move beyond these terms and practices. Our argument is that in the work of Sadler himself we find an interest in the language game of noticing an aspect. Noticing plays a central role in the later work of Wittgenstein, and Sadler also makes reference to this philosopher and his concept. Sadler (2013, p. 58) quotes this famous extract: ‘I contemplate a face and then suddenly notice its likeness to another. I see that it has not changed; and yet I see it differently. I call this experience “noticing an aspect”’ (Wittgenstein, 1953/1967, II, 193c). Sadler, however, does not focus explicitly on Wittgenstein’s concept of language games. This is something which we consider important and a missed opportunity to progress the understanding of assessment acts and practices. 
Wittgenstein suggests that language is not simply pointing at things out there in the world, such as the words ‘multiple choice test’ conjuring up a pen and paper or electronic test with multiple answers to each question. Language is not simply a window to a reality outside the window. The words themselves encompass something more, a whole practice and way of acting connected with the word. Simply put, the multiple choice test is a language game and in being this it incorporates and signals a world of associated practices. So too with formative assessment and assessment for, of and as learning. These language games bring with them associated practices which are of course not fixed and can change according to cultures and traditions. Wittgenstein defined language games as follows: ‘Here the term “language-game” is meant to bring into prominence the fact that the speaking of language is part of an activity, or of a form of life’ (1953/1967, p. 11). The associated practices are in this sense forms of life. We have sought to emphasise the importance of directing attention in a way that Sadler does not, even though he does highlight some key concepts from Wittgenstein, to the many different language games of assessment and what they implore or command us to do in playing them. The number of possible language games of assessment is endless, but we in our institutions and practices are called upon to play language games that are actualised and realised in culturally specific ways. We as learners or assessors must acknowledge the boundaries of these games and the
values and norms that are created and enacted. In sum, to move beyond the tacit and implicit global acknowledgement of formative assessment and assessment for, of and as learning, along with other language games of assessment, there is a strong case for stepping back for a moment. What language games are we seeking to create and obey when we make assessments, evaluations and judgements? The words we use are not neutral, which answers the question of whether assessment can ever be culturally or value neutral. This is not possible because all assessment is played out in the language games and forms of life of assessment. Some participants are more equipped to play them with language game insights they have acquired over time. In the words of Wittgenstein, to learn a language game is to train and repeat and the student might have to repeat and repeat to get the point: ‘Here the teaching of language is not explanation, but training’ (Wittgenstein, 1953/1967, § 5). We must of course be cautious. This turn of phrase has behaviouristic connotations and is associated with rote learning. Wittgenstein was in this context referring to young children learning the language games of language. But the point seems valid irrespective of the age of the student: learning is not merely about psychological perception; it is about language games. In our context, these are the language games of assessment and evaluation. A student has to train in the use of criteria or other forms of noticing. They need to gain experience under the guidance of teachers and peers. The conceptual point we make to move the debate on formative assessment and assessment for learning forwards is to highlight something that is under-theorised in the work of Sadler and those who have followed his lead. The adoption and practice of formative assessment and assessment for learning is far from neutral. There is no one-size-fits-all within or across institutions or nations. 
To raise awareness of the disparities in not only the practice of formative assessment and assessment for learning, but assessment practices in general, we argued that the concepts of assessment habitus and assessment capital can be coined from the inspiration of Bourdieu. He was not interested in assessment directly, but we contend it is possible to think conceptually in this manner with this inspiration. Our argument is that over time assessment practices at the individual and institutional level in, say, a faculty or department become second nature and embodied as forms of habitus (i.e. habits) providing guidance on how assessment is practised. The practices are not fixed, as the individual habitus of the academic is modified by, and in turn also modifies, the institutional habitus. Moreover, we would contend that, just as a teacher and a student might possess different amounts of economic, social and cultural capital, so might assessment capital, as the understanding and skills of assessment, what some might call assessment literacy, vary between the teacher and student and between students. A simple way of understanding this is to say first-semester undergraduate students at university will possess relatively little assessment capital compared with final-year students and likewise an early career academic will possess less assessment capital than a senior academic who has taught for many years. This is not to say of course that the assessment capital of the more experienced student or academic is necessarily fit for purpose if conditions change. For example, with the arrival of COVID-19 and the need to undertake more online teaching and more assessment mediated by a
screen, the younger students and younger academics were often more used to a seamless connection with the digital world in all aspects of their lives and thus better equipped to manage the changes required. In assessment capital terms, even if their capital were less than that of the more mature students and senior academics, it was more easily exchanged to fit the needs of the online world.

7.5 Transforming Assessment Practices: Challenges and Opportunities

An understanding of the differences in assessment capital and insight and training in assessment language games is crucial if an institution is seeking to change assessment acts and practices through wider policy change. Put differently, assessment habitus and assessment capital are not quickly or easily changed. They can be changed for sure, but resistance is to be expected even when the rationale is to future proof against the unexpected and trends which are emergent, but yet to become prominent. The objection will be raised and voiced as follows: ‘The way I have conducted assessment in my course or program has functioned well for many years, so why change things in the direction of what might be a passing fad?’ Such a resistance to changing assessment practices might also be voiced in terms such as, ‘I am used to age-based assessment mapped to the curriculum rather than seeking to offer teaching and assessment that reflects the developmental needs of the individual in a personalised progression plan.’ New technologies, such as computer adaptive testing and other forms of automated feedback, provide ways to customise learning and assessment to address these needs (Stanley, 2013). We are reminded of the Norwegian Education Act, which anticipated this intention to introduce learning progressions as a declared policy for many years, well before the affordances offered by new technologies: ‘Teaching must be adapted to the abilities and prerequisites of the individual student, the apprentice, the trainee letter candidate and the apprentice candidate’ (§ 1–3).1 Nevertheless, despite the advances in assessment offered by new technologies, resistance can be rooted in other concerns. For example, digital exams are often considered expensive to introduce if software is required to monitor integrity issues and reduce cheating.

1 Primary and Secondary School Act (Lov om grunnskolen og den vidaregåande opplæringa) (opplæringslova). Translation by Dobson.

These arguments against adopting changes in assessment may not in themselves be enough to halt change. This is especially the case when external events present themselves, such as reduced face-to-face teaching and assessment with the arrival of COVID-19. Digital teaching and learning as well as digital assessment become an imperative. Following the argument made in later chapters in this book, changing the language games of assessment in use and circulation in different media and platforms in a university, school or other educational institution always remains a distinct possibility. To quote a commonly made point, ‘our language of feedback can always be improved.’ We can learn new language games of feedback. Some might even talk in religious terms about a leap of faith as a new set of assessment language games requires an accompanying new paradigm of understanding and practice. Recalling Kuhn’s (1996) work on paradigms, if there is a dominant assessment paradigm we might wait for its lack of efficacy to become obvious, so that a new and radically different assessment paradigm with its own language games might rise to the top. Inspired by Brecht we might look for the cracks: ‘disorder is when nothing is in the right place’. Riffing on Wittgenstein’s understanding of noticing, we seek through a radically new assessment policy to change the language games that are noticed and applied. This will also require transforming the assessment capital, understood as assessment literacy, possessed by participants. In a university setting it might be the students who demand the use of different assessment language games, or younger and also hopefully more senior academics who embrace the benefits of adopting these new assessment language games. For example, academics will no longer have large piles of exam papers to mark in short time frames if the assessment of group projects and presentations becomes preferred. There is also an important participant often not mentioned, namely the group of university administrators who might also be supporters of different assessment language games when they result in simpler, more efficient administrative procedures and, importantly, fewer student complaints to investigate and process.
In what follows we will detail some of what has happened in one university which desired to transform the current spread of assessment capital, habitus and language games across the whole university. The university in question is one in which Dobson worked for several years. The university has about 20,000 students and all the major faculties you would expect to find in a university with the exception of medicine. There is a mix of professional programs, sciences and the humanities. It is important at the outset to be clear that the terms assessment capital, habitus and language games belong to our book and have not been used by this university as it attempted to change its assessment policies and practices. This said, what has been attempted can without doubt be understood as seeking to change the way in which assessment is understood, as the concepts and language games of assessment are used differently, and there will be an accompanying need for professional development to transform, and some might even say improve, the assessment capital, habitus and practices adhered to by academics and professional administrators. With changes in assessment practices a washback is anticipated amongst students, that is, if assessments change they will learn differently. An unpublished discussion paper prepared by Dobson, Walbran and Schofield for the university’s Academic Board identified the assessment challenges facing the university in 2020, summarised in Fig. 7.1. These challenges are by no means unknown in other universities around the world. To mention a couple of points by way of example, how can assessment practices ensure academic integrity and avoid at the same time privileging certain cultures or socio-economic backgrounds? It was felt by many in the university that the current assessment policies were not empowering academics, professional administrators or students. The assessment literacy owned by each of these groups of participants was considered highly variable and not necessarily well-suited to addressing the challenges highlighted in Fig. 7.1. There was also an ambition to try to second guess what an assessment framework and accompanying assessment policies might look like to future proof the university, as best as possible, against as yet unforeseen challenges. For example, the university might need to assess and evaluate micro-credentials if they become increasingly important in a future global knowledge ecosystem.

Fig. 7.1 Challenges to assessment practices

The authors also suggested that it was important not to think of assessment in isolation. Sometimes assessment had been viewed more as an afterthought in the development of new programs and courses or as in some way distinct and different to teaching and learning pedagogies, student agency and the objectives of courses and programs with respect to knowledge, skills and values. An interrelationship was suggested between these, as visualised in Fig. 7.2. In this figure assessment judgements refers to assessment types, such as assessment for, of and as learning, and assessment principles, which include equitable and inclusive (assessment is responsive, appropriate and accessible for the diverse needs of learners), transparent (assessment is explicit, with a logical relationship between outcomes and the grades associated with different standards of performance), manageable (assessment can be completed by all students and staff within the time frame of the relevant teaching period), sustainable (assessment pays attention to the overall workload and wellbeing of students and staff) and valid (assessment measures what it purports to assess).

Fig. 7.2 Assessment judgements interrelated with other components of teaching and learning

After the discussion paper had been presented and debated at a regular monthly meeting of the Academic Board where student representatives and academics are permitted to attend, a round of university-wide consultation was undertaken with
students, their associations, academics from different disciplines and professional administrators. What became clear was that any changes in the assessment policies and practices of the university could not be one size fits all. What might work in a law faculty might not be fit for purpose in a faculty of health or a faculty of humanities. There was also the challenge that many professional programs had to meet assessment and accreditation requirements set by professional bodies external to the university. It was also clear that different faculties and programs of study within them had traditions of assessment, such as applied group projects requiring group assessment in an engineering faculty, which were less typical in, say, a faculty of the humanities where assessment tended to be based on the performance and assessment of individual students. What the authors realised in this work, which is ongoing, is that, using the terminology of this book, the assessment capital and assessment habitus of individuals, programs and disciplines vary considerably. To address this, the authors considered that the new version of the university-wide assessment regulations, guidelines and framework with principles and forms of assessment had to be broad enough to cater for these differences and, importantly, that professional development should be offered to all at the individual, program and discipline level. This would empower participants to develop and undertake assessment in accordance with their own disciplinary and professional needs and yet also remain within the broad boundaries and requirements set by the revised assessment regulations of the university as a whole.

Fig. 7.3 The kete of assessment language games

A key instrument proposed was a hyperlinked visualisation of the assessment types and principles with links to the regulations, procedures and guidelines. This was to be a one-stop university-wide platform for communicating what we would call the language games of assessment in the university. It would be easy to update and would bring together and harmonise what were previously, in several instances, multiple different ways of defining the same language games of assessment. These definitions were previously scattered across the different webpages of faculties, programs and courses. The visualisation was inspired by a so-called kete (a Māori woven basket) that interweaves the different language games of assessment (see Fig. 7.3). As noted, this work is ongoing and, while it does not use the terms assessment language games, assessment capital and habitus, this is what the authors have directly in mind. The hyperlinked, digital platform of the kete is a vehicle to circulate the language games of assessment across the university and make them equally accessible to academics, professional administrators and students. To increase inclusivity the goal will also be to introduce Te Reo (Māori language) into the kete and greater articulation of a Māori world view. It should be kept in mind that New Zealand is bound by the Treaty of Waitangi (Te Tiriti o Waitangi), signed in 1840 by Māori leaders and the British Crown. This is underpinned by a number
of principles including government in the interests of Māori, self-management, equality, co-operation and redress. However, the whole idea of seeking a translation of the terms in the kete, if that is what is undertaken, is not without controversy. Would this reduce a Māori worldview and understanding of assessment to westernised ethnocentric terms? Translation is never neutral and never a simple mirroring of one language or its concepts in another. There might be purchase, as Dobson (2012) has suggested, in seeking instead to create a new inter-linguistic meaning as the two languages intermingle and bridge one another to create a new entity. Thus conceived, the two different world views are distinct language games, each with its own encompassing form of life, which might be bridged to create a new language game and form of life. Form of life is a term introduced by Wittgenstein to embrace the social and cultural practices of the activity. How might this look in practice to avoid the reduction of one by the other linguistically and conceptually? We have in mind the example of the road being traversed through public consultation by the NZQA (New Zealand Qualifications Authority; in Te Reo or Māori language: Mana Tohu Mātauranga o Aotearoa) which is responsible for educational assessment and assurance of the quality of qualifications at all educational levels in the country and in accordance with the transnational Qualification Framework (NZQA, 2016). The title of the draft document is Consultation on draft principles of assessment and aromatawai (NZQA, 2021), where aromatawai refers to a teaching, learning and assessment approach based on Māori values, beliefs and aspirations.
A somewhat simple summary view of this might be the focus on the child as always embedded in family and community relations, unlike in the western approach, where the child is immediately on commencing formal education introduced to the view that they learn as independent individuals and their achievement is to be judged and communicated as individuals. Anything else could conceivably be considered an integrity issue as the work assessed cannot be traced to only a single author. Considering what we have already said about noticing in this and earlier chapters, it is interesting that the language game of aromatawai conveys its role in learning, teaching and assessment as follows: aro means ‘to take notice of’ or ‘pay attention to’, while matawai means ‘to examine closely’. The document presents a visual model (reproduced in Fig. 7.4). It seeks to combine six Māori concepts or, as we would propose, language games (Rangatiratanga – teaching, learning and assessment reflects the world views of the student, Whanaungatanga – student success supported by positive mana enhancing relations, Manaakitanga – interactions are respectful and enhance the wellbeing of the student, Pūkengatanga – teaching practices are authentic and meet the aspirations of the student and their families, Kaitiakitanga – quality teaching, learning and assessment practices, and Te Reo – the value of Te Reo Māori language and customs is recognised) with more traditional assessment language games (valid, reliable, authentic, equitable and informative). There is much to commend in the visual clarity and the unambiguous language of the NZQA draft. The six Te Reo Māori concepts are language games positioned in the centre and convey the importance of different relationships, while the


Fig. 7.4 The five principles of assessment and aromatawai

surrounding five concepts seem to be more about making judgements on assessment acts. Thus understood, the former are about the process in which assessment is conducted and the latter are about input and output measures. Both are obviously important in themselves. But we would not want the latter to be understood as the language and science of assessment and the former as the culture of assessment, which would indicate and attract different levels of assessment capital and accompanying symbolic value. The new NZQA principles (the previous ones date to 1996) in this draft iteration are to a great extent visualised as two interacting traditions, one informed by aromatawai and one informed by the traditional technical language of assessment. In many senses, the visualisation recognises the Māori right to be autonomous, as stated in the New Zealand founding document, the Treaty of Waitangi/Te Tiriti o


Waitangi signed in 1840 (Waitangi Tribunal, 2020). What is anticipated in coming iterations is a stronger weaving together, clarification and articulation of the shared treaty foundation acknowledged in all aspects of society. This brings us back to the point about the university kete presented above. It has clearly incorporated the Māori understanding of a weave but has not introduced a wider Māori world view with accompanying language games and forms of life. This must be a task for its next iteration as the university attempts to transform its assessment practices. In summarising this section, the chapter and the whole book, we have drawn attention to considering assessment in terms of language games and forms of life (i.e. social practices with cultures, rules and norms) and the need to introduce new language games of assessment where appropriate.
• From a slow start in chapters on traditional topics, including the connection between society and assessment, the debate on and practices of assessment for, of and as learning, and the relationship between assessment, motivation, learning and the self, we have proposed understanding the role of assessment through the language game of the connoisseur and not merely through the eyes of assessors using only rubrics.
• Chapter 6, praising and appraising the contribution of Sadler, allowed us additionally to propose a new understanding of assessment in terms of the language games of assessment, and it also briefly mentioned assessment capital and assessment habitus.
• The whole book has been framed by a desire to use the language game of critical realism to investigate the generative mechanisms and structures supporting assessment acts and their different practices. We have contended that, in much of the assessment work undertaken in different educational settings, too little time or effort is devoted to revealing the manner in which assessment acts and practices are driven by different generative mechanisms and structures.
The closing discussion of the kete and the direction signalled by the forthcoming new version of the NZQA Principles of assessment and aromatawai indicates that the interweaving of different cultures of assessment, each reflecting their respective generative mechanisms and structures, will give rise to new language games of assessment and new forms of life. As a consequence, the ensuing assessment acts with all their family resemblances to existing acts will be worthy of noticing and further discussion.

References
Baker, P., Eslinger, P., Benavides, M., Peters, E., Dieckmann, N., & Leon, J. (2015). The cognitive impact of the education revolution: A possible cause of the Flynn Effect on population IQ. Intelligence, 49, 144–158.
Bjørkvold, J.-R. (1992). The muse within: Creativity and communication, song and play from childhood through maturity. Harper Collins.


Brecht, B. (2019). Refugee conversations (trans. R. Fursland). Bloomsbury (Original work published 1956).
Broussard, S., & Garrison, M. (2004). The relationship between classroom motivation and academic achievement in elementary school-aged children. Family and Consumer Sciences Research Journal, 33(2), 106–120.
Dobson, S. (2012). The pedagogue as translator in the classroom. Journal of Philosophy of Education, 12(2), 271–286.
Dobson, S. (2017). Assessing the viva in higher education: Chasing moments of truth. Springer.
Eggen, N., & Nyrønning, S. (2003). Godfoten. Samhandling – Veien til suksess [The good foot. Interaction – The way to success]. Aschehoug.
Eisner, E. (1985). The art of educational evaluation: A personal view. Falmer Press.
Garrett, R., Dawson, K., Meiners, J., & Wrench, A. (2018). Creative and body-based learning: Redesigning pedagogies in mathematics. Journal for Learning Through the Arts, 14(1), 1–20.
Kuhn, T. S. (1996). The structure of scientific revolutions (3rd ed.). University of Chicago Press.
McCallum, S., Schofield, E., & Dobson, S. (2021, August 2). Gamers know the power of ‘flow’ – What if learners could harness it too? The Conversation. https://theconversation.com/gamers-know-the-power-of-flow-what-if-learners-could-harness-it-too-164943. Accessed 21 Oct 2021.
Merleau-Ponty, M. (1968). The visible and the invisible. Northwestern University Press.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan.
New Zealand Qualifications Authority. (2016). The New Zealand qualifications framework. NZQA. https://www.nzqa.govt.nz/assets/Studying-in-NZ/New-Zealand-Qualification-Framework/requirements-nzqf.pdf. Accessed 21 Oct 2021.
New Zealand Qualifications Authority. (2021). Consultation on draft principles of assessment and Aromatawai. https://www.nzqa.govt.nz/about-us/consultations-and-reviews/draft-principles-of-assessment-and-aromatawai/. Accessed 21 Oct 2021.
New Zealand Qualifications Authority. (n.d.). Micro-credentials. https://www.nzqa.govt.nz/providers-partners/approval-accreditation-and-registration/micro-credentials/. Accessed 21 Aug 2021.
Round Table on Information Access for People with Print Disabilities. (2019). Guidelines for Accessible Assessment. The Round Table. http://printdisability.org/guidelines/guidelines-for-accessible-assessment-2019/. Accessed 8 Aug 2021.
Sadler, D. R. (2013). Opening up feedback: Teaching learners to see. In S. Merry, M. Price, D. Carless, & M. Taras (Eds.), Reconceptualising feedback in higher education: Developing dialogue with students (pp. 54–63). Routledge.
Særte, J., & Vinge, J. (2010). Musikk og vurdering [Music and assessment]. In S. Dobson & R. Engh (Eds.), Vurdering for læring i fag [Assessment for learning in subjects] (pp. 155–172). Cappelen Damm Høgskole Forlaget.
Sartre, J.-P. (1984). Being and nothingness: An essay on phenomenological ontology. Washington Square Press.
Sartre, J.-P. (1991). Critique of dialectical reason. Volume 1: Theory of practical ensembles. Verso.
Sartre, J.-P. (2000). Nausea (R. Baldick, Trans.). Penguin Classics.
Stanley, G. (2013). Foreword. In G. Masters (Ed.), Reforming educational assessment: Imperatives, principles and challenges (pp. iii–v). Australian Council for Educational Research.
Stobart, G. (2008). Testing times: The uses and abuses of assessment. Routledge.
Te Kura. (2021). About Te Kura. https://www.tekura.school.nz/about-us/who-we-are/about-te-kura/. Accessed 21 Oct 2021.
Trippestad, T. (2011). The rhetoric of a reform: The construction of ‘public’, ‘management’ and the ‘new’ in Norwegian education reforms of the 1990s. Policy Futures in Education, 9(5), 631–642.
Waitangi Tribunal. (2020). The Waitangi Tribunal and the Treaty of Waitangi/Te Tiriti o Waitangi. https://www.waitangitribunal.govt.nz/treaty-of-waitangi/. Accessed 21 Oct 2021.


Winne, P. H., & Butler, D. (1994). Student cognitive processing and learning. In T. Husen & T. Postlethwaite (Eds.), The international encyclopedia of education (pp. 5739–5745). Pergamon.
Wittgenstein, L. (1967). Philosophical investigations (3rd ed., trans. G. E. M. Anscombe). Macmillan (Original work published 1953).
Zhudi, M., & Dobson, S. (2021, July 27). Indonesia’s ‘Freedom to Learn’ movement at risk as students lose attention amid digital learning: How do we reclaim their drive to learn? The Conversation. https://theconversation.com/indonesias-freedom-to-learn-movement-at-risk-as-students-lose-attention-amid-digital-learning-how-do-we-reclaim-their-drive-to-learn-161866. Accessed 21 Oct 2021.

Glossary

Assessment act
An action by student(s) or teacher(s), written, verbal or corporeal, that may or may not have as its stated goal an assessment of a student, individually or collectively, but with assessment as the outcome. The assessment act also includes assessment-related activity, such as the formulation and setting of curriculum-related learning goals. It also includes the effect of language, such as the illocutionary (e.g. in language we all agree we are talking about assessment and what is at stake) and perlocutionary potential (e.g. the teacher offers feedback and the students act in accordance with words given in the feedback) (Austin, 1976). Accordingly, the assessment act is the connected chain of events leading up to the assessment, the actual assessment, and the events afterwards when feedback may be given, and the results can have individual and social consequences for stakeholders (e.g. students, teachers, examiners, school principals and education policy makers). This is an aspect of language games. The social aspect of the assessment act additionally evokes the manner in which it contributes to and draws upon the accumulated assessment habitus of both the individual and the institution. Habitus is defined below. The point to note here is that the assessment act is connected implicitly and explicitly with accompanying assessment practices that simultaneously express themselves individually and socially.

Assessment capital
Assessment capital is a term we introduce which is inspired by Bourdieu’s concept of capital, particularly cultural capital. It is a type or form of (embodied) cultural capital designating accumulated ‘experience and knowledge’ (Bourdieu, 1984, p. 23) regarding assessment and predominantly inculcated during academic interactions. More specifically, it manifests in self-evaluative skills and knowledge regarding one’s learning.
For students, it is their accrued personal qualities or capacity to assess their own learning, accumulated from their active involvement in education (including, more significantly, higher education) and, predominantly, assessment for/as learning. Accordingly, assessment capital for teachers or academics possesses two dimensions: internal – for assessing their teaching – and external – for assessing the students’ learning. Sadler (1989) understood this as self-evaluative knowledge and skills of students in assessment. Simply put, assessment capital is part of embodied cultural capital – it works as cultural capital within the practices of assessment.

© Springer Nature Switzerland AG 2023 S. R. Dobson, F. A. Fudiyartanto, Transforming Assessment in Education, The Enabling Power of Assessment 10, https://doi.org/10.1007/978-3-031-26991-2

Assessment habitus
Introduced in Chap. 6, this is a concept we consider new and which is inspired by Bourdieu’s concept of habitus. Assessment habitus is in our sense a limited application of Bourdieu’s habitus with exclusive reference to assessment, that is, habitus regarding assessment. By designation, assessment habitus is here understood as the dispositions or system of tendencies about assessment inculcated by social agents (individuals or institutions) from societies throughout their lives in order to behave accordingly within current and/or future educational or academic life in educational settings. It concerns assessing or being assessed and how we act in such situations with an embodied ‘habit of mind’ accumulated over time.

Assessment theory
Assessment theory seeks a greater theoretical understanding of assessment practices and their accompanying assessment acts by drawing upon general theories of learning and specific theories of assessment practice. If the desire is to evaluate learning either during or after a period of learning, then assessment theory quickly becomes associated with general theories of learning, such as theories of motivation, identity formation (bildung) and social mechanisms and structures of interaction. Specific theories of assessment practice concern what is commonly known as assessment for, of and as learning. This also makes reference to a limited number of theoretical assessment principles: validity, reliability, fairness (framed in terms of justice and equity), transparency and accountability.
The goal is not merely to evaluate assessment practices from a principled standpoint; what might be called the desire to occupy a higher principled moral ground. Appropriating assessment principles might also provide theoretical insight into the generative mechanisms and structures supporting the practice of standardised testing. For example, if one teacher defends using standardised testing in their classroom because they regard it as fair, this can be taken as evidence of how an assessment principle might itself contribute as a generative mechanism and structure to the maintenance and justification of standardised testing.

Bridging
If critical realism provides a focal point for generative mechanisms and structures and assessment theory looks at more general theories of learning, motivation and the formation of identity, along with theories of assessment practice (e.g. assessment for learning and principles), then the bridging of assessment theory with critical realism, the topic of this book, requires a mediator – a process that bridges. This is supplied by the assessment acts of stakeholders (teachers, students) within and also outside the classroom (involving parents, school principals, policy makers and politicians). It must however be underlined that, in making the stakeholder the mediator, the goal is not to reincarnate a form of anthropic thinking with the actor as the origin and focal point. The focus on generative mechanisms and structures represents an attempt to tone down the centrality of the actor, even though it might be asserted that the actor subjected to them is capable of modifying their impact.


Capital
According to Bourdieu, capital is a ‘species of power whose possession commands access to the specific profits that are at stake in the field’ (Bourdieu & Wacquant, 1992b, p. 97). From this quotation, we understand capital as a form (or token) of power or authority that is required to gain positions or recognition in a social space or social field and has an ‘exchange/able value’ or ‘efficacy’ (Bourdieu, 2004, p. 16; Skeggs, 2004b, p. 75). In the original theory there were three types of capital, namely economic, cultural and social capital, which can be mutually transformed into each other. According to Bourdieu (2004, p. 16), economic capital is ‘immediately and directly convertible into money and may be institutionalized in the form of property rights’. Cultural capital is accumulated ‘experience and knowledge’ (Bourdieu, 1984, p. 23), which ‘may be institutionalized in the form of educational qualifications’. This study understands cultural capital as symbolic entitlement to knowledge, skills, experiences or expertise possessed by an individual. Social capital is ‘the possession of a durable network … or institutionalized relationships … or membership in a group …, a “credential” which entitles them to credit’ (Bourdieu, 2004, p. 21). So, social capital is a symbolic entitlement to being a member of a group which, to a great extent, allows persons with socially recognised credentials (social status, respect or legitimacy) to appropriate credit accordingly, and may be institutionalised in the form of a title (e.g. Dr) or a proper name (Central Queensland University alumni).

Critical realism
Critical realism seeks to transcend causal relations at merely the level of the empirically observed or experienced (appearances) and to reveal the generative mechanisms and structures supporting the existence of a phenomenon in the realm of the real.
Moreover, a mechanism might still exist in the realm of the actual, even if it is not visible, because it could be activated but not perceived, not activated or potentially countered by other mechanisms. Empiricists/positivists would by contrast say that the causal mechanism was not in existence if it was not functioning and evident. In the terminology of critical realism three realms are to be conceived: the empirically observed, the real and the actually taking place and not necessarily observed as stated below. A key goal is to create a greater understanding of generative mechanisms and structures (the real) that support and govern how assessment acts (the empirically observed and experienced), as an example, are played out and realised, even if they are beyond observation (actually taking place or operating but not observed) by different stakeholders, such as teachers, students, school principals, parents, researchers and policy makers. For example, a parent teaches a child to read a sign on the side of a building and it is not evident they are learning (not actually visible and operating), but the child does learn (empirically experience) and remembers this for a later occasion. The activity of the learning is the real of mechanisms and processes as the learning takes place. In this book we focus upon the central mechanisms and processes connecting capital and languages with assessment.

Dialectic
A major critical realist text for us is Dialectic: The pulse of freedom by Bhaskar (1993). In it he elaborates on the ontological depth of the world in terms of the longitudinal (the dialectic unfolding through non-identity, absence, totality


and transformative agency), the lateral (expression in space-time, and corporeally), and the four-planar social cube (of material transactions with nature, social interaction between agents, social structures influencing social relations through power, discourse and norms, and the self laminated or structured by consciousness, unconsciousness and affective forces). A small note is in order: dialectic in a Hegelian sense is the bringing together of different and contradictory entities, so they are inevitably intertwined. A duality keeps entities close but they are different and separate. Put simply, both/and (dialectic) rather than either/or (duality).¹ Dialectic is also understood in logic as argumentation by syllogism, whereby a conclusion is reached based upon assumed or given propositions.

Fallacies
Failure to maintain the distinction between the transitive world of knowing (epistemological) and the intransitive world of being (ontological) by conflating them or reducing one to the other can lead to a number of fallacies with respect to the knowledge produced. The epistemic fallacy occurs when appearances, as events, are taken to be evidence of causal knowledge. The anthropic fallacy of anthropocentrism is evident in the view that all knowledge somehow originates or is mediated through the subject. This connects with the ontic fallacy in the sense of the determination of Being by being, to cite Heideggerian terminology and reference how we live our existence (e.g. living life in an anxious manner can colour how we similarly know and experience the world as anxiety). There is also the linguistic fallacy, typical of some strains of postmodernism and discourse analysis, where the intransitive is collapsed and the world becomes discourse; a reality outside of discourse is discounted. We draw upon some terms from Heidegger and the reader new to his work must note the lower and upper case meaning of the words being and Being.
According to Heidegger (1927/1962), emotions reveal the existential state of a person. For him this existential state is immanent in concrete, lived everyday existence, what he termed the ontic, the that-which-is of entities, das Seiende in German, usually translated simply as ‘being’. The existential refers to how a person lives the ontic in a certain manner (e.g. with joy or anxiety). Heidegger called this Being (das Sein), as it refers to the Being of being. To gain access to it, one could ask someone how they are, and how they are faring, and listen to the respondent’s tone, as an emotional mood colouring the reply (Heidegger, 1927/1962, pp. 173, 205). If the person is nervous and shaky in their voice it indicates an existential Being of angst and uncertainty. The challenge in making use of Heidegger’s existential (psychoanalytic) analysis is that it reveals only the state of the person in phenomenological fashion and says nothing of the ethical quality of the existential experience, before, during or after it takes place.

¹ For Aristotle dialectic has a specific meaning connected with reasoning: ‘An argument in which, certain things having been assumed, something other than these follows of necessity from their truth, without needing any term from outside.’ (1949: §24, b18–22)

Field
Field is a social space where individuals interact with one another and, for Bourdieu, is organised by a system of operations or ‘the rule of the game’ or ‘regularities’ (Bourdieu & Wacquant, 1992b, pp. 18, 98). Field is therefore theorised by Bourdieu as ‘a network, or configuration, of objective relations between positions … in the structure of the distribution of species of power (or capital) whose possession commands access to the specific profits that are at stake in the field’ (Bourdieu & Wacquant, 1992b, p. 97). This conception implies that the positions in the field are interrelated by objective reasons in the form of the rules of the games in the field. The positions are determined and distributed by power or authority that is legitimate in the field. As a social space, field refers more to the actual ‘locus’ or contextual boundaries where society is working (Bourdieu, 1989; Bourdieu & Wacquant, 1992b; Costa & Murphy, 2015). Bourdieu adds that ‘a field consists of a set of objective historical relations between positions anchored in certain forms of power (or capital)’ (Bourdieu & Wacquant, 1992b, p. 16). As fields ‘are organized around specific types or combinations of capital’ (Broberg, 2015, p. 51), so field would essentially include the arena of interactions and the systems it operates by agents exercising capital/power for positions in the field. There are forces and struggles in this social field; struggle and power relations to secure positions or influence. In the field of higher education, the type of power or authority that is most legitimate is not money – economic capital – but symbolic and cultural capital.

Habitus
Bourdieu defines habitus as ‘a system of lasting and transposable dispositions which, integrating past-experiences, functions at every moment as a matrix of perceptions, appreciations and actions’ (Bourdieu & Wacquant, 1992b, p. 18). He wants to emphasise a combination of ‘a system of dispositions, that is of permanent manners of being, seeing, acting and thinking’ and a system of ‘long-lasting (rather than permanent) schemes or schemata or structures of perception, conception and action’ (Bourdieu, 2016, p. 43).
Habitus is understood in this study as dispositions (systems of propensity, tendency, inclination) working as the mechanism of an individual’s behaviour (e.g. perceptions, appreciations, actions) that are gradually ingrained from the societies s/he is involved in through the process of inculcation (Bourdieu & Wacquant, 1992b; Maton, 2008). Mills (2013, p. 44) calls it embodied ‘habit of mind’.

Language games
Theorised by Wittgenstein (1953/1967) and Lyotard and Thébaud (1985), language games are believed to be intimately linked to assessment as a social practice. According to Wittgenstein, ‘Here the term “language-game” is meant to bring into prominence the fact that the speaking of language is part of an activity, or of a form of life’ (1953/1967, p. 11) or in our terminology a social practice and thus constitutes generative mechanisms and structures. In a simpler expression perhaps, language game is the concept that language – a word, a sentence and so on – means something only when it is in context and can mean something different in another context based on conventional rules. The word ‘game’ also implies ‘the rules of the game’ (Bourdieu & Wacquant, 1992b, p. 18), or the necessary knowledge to participate in a social field, such as in the assessment world. It involves noticing and thinking altogether (Dinishak, 2013; Wittgenstein, 1967). The language that is used in assessment, feedback, questions, commands and so on only means something when the people (the teachers and the students) know the game in which they are taking part. Language games


are a vehicle that helps us make what we highlight or notice in assessment – quality, standards, excellence and the like – explicit; noticing is thus naming and bringing to light what is assessed. Thus understood, we can refer to all the (language) activities in assessment as the language game of assessment or simply the assessment game. Wittgenstein (1953/1967, § 2) distinguished between simpler, more primitive language games and more advanced, complicated ones. We have not attempted this as it leads to the problem of defining the terms primitive and advanced and degrees of difference between the two, along with the implication that certain social practices are more primitive, and as a consequence less worthy, than others.

Ontology and epistemology
Critical realism posits a distinction between the objects of knowledge (ontology) and the conditions for knowledge about the objects (epistemology), in the sense that ‘any theory of knowledge presupposes an ontology in the sense of an account of what the world must be like for knowledge, under the descriptions given it by the theory’ (Bhaskar, 1993, p. 205; see also 1998, p. 8). To take an example, if our assessment theory is about peer assessment and it identifies as a key component interaction between students working in groups, then the ontological standpoint presupposes that interaction takes place in order that the theory, as epistemology, can both pick it up and notice it and explain the existence of precisely these things. The ontological–epistemological distinction makes it possible to conceptualise a relatively enduring form of being in the realm of ontology, the intransitive, as opposed to a historically specific and modifiable knowledge, the transitive world of knowing. In Bhaskar’s early work the transitive is limited to scientific knowledge, and is reasoned and causal in character.
In Dialectic: The pulse of freedom the transitive knowledge dimension is widened: ‘we are concerned with the truth, ground, reason or purpose of things, not propositions … the concept of the transitive dimension should be metacritically extended to incorporate the whole material and cultural infra-/intra-/superstructure of society’ (Bhaskar, 1993, p. 218).

Symbolic capital
When discussing capital, Bourdieu also outlines symbolic capital, which he theorises as ‘symbolic properties’ (Bourdieu, 1984, p. 20) or ‘symbolic benefits’ (Bourdieu & Wacquant, 2013, p. 293). It is applicable to every kind of capital (economic, cultural and social) when it is exemplified, represented or gained ‘symbolically’ (Bourdieu, 2004, p. 27) – for example a BA transcript from MIT holds symbolic capital. This concept of ‘symbolic effects of capital’ is only legitimate when it receives recognition within the habitus and with the specific logic of the field (Bourdieu, 1997, p. 242; Bourdieu & Wacquant, 2013). Alongside cultural and social capital, this type of capital is most significant in the field of education where institutions and academics are positioned based on their symbolic capital, whether they have prestige or otherwise (Bourdieu, 1988b). Symbolic capital has symbolic efficacy or symbolic exchange value, which entitles the possessor to symbolic profits and benefits (Bourdieu, 2004; Bourdieu & Wacquant, 1992b; Skeggs, 2004a, b).


Tacit knowing and tacit knowledge
Tacit knowing as a skill and tacit knowledge can be summarised in the adage, ‘I can’t tell you what quality is, but I know it when I see it’, or more simply, ‘We know more than we can say’. Conceptualised by Polanyi (1962), we understand tacit knowledge as human knowledge that is hard to articulate explicitly and thus becomes part of one’s subsidiary awareness. An example is our knowledge about a person’s face. It is recognisable among a thousand or million others, but we cannot tell how we recognise the face we know. As opposed to explicit knowledge, tacit knowledge can also be referred to as ‘the quiet storehouse of information that exists though it cannot necessarily, or at least easily, be put into words’ (Johnson, 2016, p. 2). In the terminology of critical realism the tacit belongs to the real of generative processes and structures. Thus understood, the tacit knowing of assessment, read evaluative knowing in the case of teachers or other professionals, is the generative processes and structures. According to Turner (2012, p. 387), ‘there are two families of tacit knowledge’, namely collective and personal tacit knowledge. Collective tacit knowledge is general tacit knowledge which is shared among the members of a society, such as a general understanding of English language enabling English-speaking people to understand each other. Personal tacit knowledge refers to the tacit knowledge of an individual, as in our individual understanding of English which is different from anyone else’s English. In regard to assessment, there are aspects of quality/criteria/standards which are (or should be) shared by teachers and students. However, especially in the actual practices of assessment, each teacher and student possesses ‘personal’ tacit knowledge of quality, which is very likely to be different.
Thus understood, the personal tacit knowledge of assessment held by teachers should be shared with students by ‘consecrating’ assessment practices continuously to ensure they are recognised and respected. In other words, each institution suggests its collective tacit knowledge of assessment for teachers and students.

References
Aristotle. (1949). Prior and posterior analytics. Oxford University Press.
Atkinson, W. (2011b). From sociological fictions to social fictions: Some Bourdieusian reflections on the concepts of ‘institutional habitus’ and ‘family habitus’. British Journal of Sociology of Education, 32, 331–347.
Atkinson, W. (2013). Some further (orthodox?) Bourdieusian reflections on the notions of ‘institutional habitus’ and ‘family habitus’: A reply to Burke, Emmerich, and Ingram. British Journal of Sociology of Education, 34, 183–189.
Austin, J. (1976). How to do things with words. Oxford University Press.
Bhaskar, R. (1993). Dialectic: The pulse of freedom. Verso.
Bhaskar, R. (1998). The possibility of naturalism: A philosophical critique of the contemporary human sciences (3rd ed.). Routledge.
Bourdieu, P. (1984). Distinction: A social critique of the judgement of taste. Harvard University Press.


Bourdieu, P. (1988b). Homo academicus. Stanford University Press.
Bourdieu, P. (1989). Social space and symbolic power. Sociological Theory, 7, 14–25.
Bourdieu, P. (1997). Pascalian meditations. Stanford University Press.
Bourdieu, P. (2004). The forms of capital. In S. J. Ball (Ed.), The RoutledgeFalmer reader in sociology of education (pp. 15–29). RoutledgeFalmer.
Bourdieu, P. (2016). Habitus. In J. Hillier & E. Rooksby (Eds.), Habitus: a sense of place (2nd ed., pp. 43–49). Routledge.
Bourdieu, P., & Wacquant, L. (1992b). An invitation to reflexive sociology (3rd ed.). University of Chicago Press.
Bourdieu, P., & Wacquant, L. (2013). Symbolic capital and social classes. Journal of Classical Sociology, 13(2), 292–302.
Broberg, G. (2015). Seeing is achieving: Assessment practice and student capital (PhD thesis). Arizona State University.
Burke, C. T., Emmerich, N., & Ingram, N. (2013b). Well-founded social fictions: A defence of the concepts of institutional and familial habitus. British Journal of Sociology of Education, 34, 165–182.
Byrd, D. (2019b). Uncovering hegemony in higher education: A critical appraisal of the use of ‘institutional habitus’ in empirical scholarship. Review of Educational Research, 89(2), 171–210.
Costa, C., & Murphy, M. (2015). Bourdieu, habitus and social research: The art of application. Palgrave Macmillan.
Dinishak, J. (2013). Wittgenstein on the place of the concept ‘noticing an aspect’. Philosophical Investigations, 36(4), 320–339.
Heidegger, M. (1962). Being and time. Blackwell (Original work published 1927).
Johnson, S. (2016). Tacit knowledge: An assessment of Michael Polanyi’s epistemology (MDiv thesis). Regent University.
Lyotard, J.-F., & Thébaud, J.-L. (1985). Just gaming. University of Minnesota Press.
Maton, K. (2008). Habitus. In M. Grenfell (Ed.), Pierre Bourdieu: Key concepts (pp. 49–66). Acumen.
Mills, C. (2013). A Bourdieusian analysis of teachers’ changing dispositions towards social justice: The limitations of practicum placements in pre-service teacher education. Asia-Pacific Journal of Teacher Education, 41, 41–54.
Polanyi, M. (1962). Personal knowledge: Towards a post-critical philosophy. Routledge.
Reay, D. (1998b). ‘Always knowing’ and ‘never being sure’: Familial and institutional habituses and higher education choice. Journal of Education Policy, 13, 519–529.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Skeggs, B. (2004a). Context and background: Pierre Bourdieu’s analysis of class, gender and sexuality. Sociological Review, 52(2S), 19–33.
Skeggs, B. (2004b). Exchange, value and affect: Bourdieu and ‘the self’. Sociological Review, 52(2S), 75–95.
Tarabini, A., Curran, M., & Fontdevila, C. (2017b). Institutional habitus in context: Implementation, development and impacts in two compulsory secondary schools in Barcelona. British Journal of Sociology of Education, 38(8), 1177–1189.
Turner, S. (2012). Making the tacit explicit. Journal for the Theory of Social Behaviour, 42(4), 385–402.
Wittgenstein, L. (1967). Philosophical investigations (3rd ed., trans. G. E. M. Anscombe). Macmillan (Original work published 1953).