Defending Assessment Security in a Digital World
Defending Assessment Security in a Digital World explores the phenomenon of e-cheating and identifies ways to bolster assessment to ensure that it is secured against threats posed by technology. Taking a multi-disciplinary approach, the book develops the concept of assessment security through research from cybersecurity, game studies, artificial intelligence and surveillance studies. Throughout, there is a rigorous examination of the ways people cheat in different contexts, and the effectiveness of different approaches at stopping cheating. This evidence informs the development of standards and metrics for assessment security, and ways that assessment design can help address e-cheating. Its new concept of assessment security both complements and challenges traditional notions of academic integrity. By focusing on proactive, principles-based approaches, the book equips educators, technologists and policymakers to address both current e-cheating as well as future threats.

Phillip Dawson leads research into academic integrity at the Centre for Research in Assessment and Digital Learning (CRADLE), at Deakin University in Melbourne, Australia. He uses his background in assessment and cybersecurity to protect education from cheating. His work involves unorthodox methods like computer hacking and paying professional cheaters.
Defending Assessment Security in a Digital World
Preventing E-Cheating and Supporting Academic Integrity in Higher Education
Phillip Dawson
First published 2021 by Routledge, 2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN and by Routledge, 52 Vanderbilt Avenue, New York, NY 10017

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2021 Phillip Dawson

The right of Phillip Dawson to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Disclaimer: References are made to use of the Internet in which material, products and services of potentially illegal nature may be found; the author and publisher maintain no responsibility or liability for any loss, damage or other outcome from searching for, or attempting to procure, any material, product or service, regardless of intent or method.

British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
Names: Dawson, Phillip, author.
Title: Defending assessment security in a digital world : preventing e-cheating and supporting academic integrity in higher education / Phillip Dawson.
Description: Abingdon, Oxon ; New York, NY : Routledge, 2021. | Includes bibliographical references and index.
Identifiers: LCCN 2020019978 (print) | LCCN 2020019979 (ebook) | ISBN 9780367341541 (hardback) | ISBN 9780367341527 (paperback) | ISBN 9780429324178 (ebook)
Subjects: LCSH: Cheating (Education)–Technological innovations. | Cheating (Education)–Prevention. | Educational technology–Security measures. | Universities and colleges–Examinations. | College students–Conduct of life. | Education, Higher–Moral and ethical aspects.
Classification: LCC LB3609 .D39 2021 (print) | LCC LB3609 (ebook) | DDC 371.2601/3–dc23
LC record available at https://lccn.loc.gov/2020019978
LC ebook record available at https://lccn.loc.gov/2020019979

ISBN: 978-0-367-34154-1 (hbk)
ISBN: 978-0-367-34152-7 (pbk)
ISBN: 978-0-429-32417-8 (ebk)

Typeset in Bembo by SPi Global, India
Contents

List of Tables
Acknowledgements

1 E-Cheating
  Defining e-cheating
  What is different about e-cheating?
  Academic integrity, fraud, hacking and e-cheating
  An affordance-based taxonomy of e-cheating approaches
  Providing access to unauthorised information
  Cognitive offloading to a tool
  Outsourcing work to a person
  Disrupting the assessment process
  E-cheating: a significant problem, but what can we do about it?
  Things to do

2 Assessment security
  What is assessment?
  Two key features of assessment security
  Authentication
  Control of circumstances
  (Im)perfect assessment security
  Approaches to improving assessment security
  Detection, evidence and penalties
  Technology
  Assessment design
  Legal approaches
  Problems with assessment security
  Things to do

3 The E-Cheating lifecycle and how to disrupt it
  Awareness raising
  Students don't just find e-cheating; e-cheating finds students
  Review sites, discount codes and seduction
  'Free' e-cheating
  Approaches for disrupting e-cheating awareness raising
  Purchasing
  Contract cheating websites
  E-cheating hardware websites
  Gig economy websites
  Large-scale e-commerce sites
  Dark web
  Approaches for disrupting e-cheating purchases
  Content production
  Human cheating content production
  Computerised cheating content production
  Approaches for disrupting e-cheating content production
  Delivery, submission and after-sales care
  Delivery
  Submission
  After-sales care
  Approaches for disrupting e-cheating delivery, submission and after-sales care
  Things to do

4 Cybersecurity, E-Cheating and assessment security
  Cybersecurity basics
  Implications of cybersecurity for assessment security
  Cybersecurity depends on an adversarial mindset
  Assessment security has a harder authentication problem to solve than regular cybersecurity's authentication problem
  The contexts of assessment security impose some challenging constraints for security
  Assessment security depends on hardware security and software security
  Assessment security will always be imperfect, but that doesn't mean we shouldn't try
  Security through obscurity is not enough
  Once an expert makes software that can cheat, everybody can cheat
  Attack is easier than defence
  Attackers don't need to account for laws, policy, ethics, student rights or public opinion
  Improving assessment security through cybersecurity
  Penetration testing
  Disclosure and transparency
  Working with students
  Working with cybersecurity
  Things to do

5 Lessons from E-Cheating in games and gambling
  Cheating in online games
  Cheating in esports
  Cheating in online gambling
  Anti-cheating technologies in games and gambling
  What lessons can we learn?
  Even with much bigger budgets, e-cheating cannot be completely defeated
  Schneier is right: once an expert writes software that can do something, anybody can do that something
  Honest players are a big asset in reducing cheating
  Anti-cheating technologies reify particular understandings of e-cheating
  Independent bodies can regulate and keep records on individuals that span contexts
  Some degree of secrecy and undercover operations is necessary to keep ahead of e-cheating
  Anti-cheating can be offered as a service
  To understand e-cheating's future, look to games and gambling
  Things to do

6 E-Cheating, assessment security and artificial intelligence
  What AI is and what it can and cannot do
  AI for e-cheating
  AI can already meet some learning outcomes for students; do we still need to assess them?
  What if AI becomes so good at meeting outcomes for students that we can't spot it?
  Where is the boundary between AI help and AI cheating?
  AI for assessment security
  Can AI detect e-cheating?
  Can AI be the decision-maker?
  AI is imperfect but it scales well
  What will humans and machines need to do?
  Things to do

7 Surveillance and the weaponisation of academic integrity
  Academic integrity, e-cheating, assessment security and surveillance culture
  Who will surveil students and to what end?
  Routine surveillance will catch too many
  Trust, surveillance and gamification
  The future of surveillance is weaponisation
  Alternatives to surveillance
  Living with surveillance
  Things to do

8 Metrics and standards for assessment security
  The problems of metricless assessment security
  Difficulty to cheat metrics
  Cost to cheat
  Limitations of cheating difficulty metrics
  Detection accuracy metrics
  Limitations of detection accuracy metrics
  Proof metrics
  Limitations of proof metrics
  Prevalence metrics
  Limitations of prevalence metrics
  Learning, teaching, assessment and student experience metrics
  Limitations of learning, teaching, assessment and student experience metrics
  Towards standards for assessment security
  Things to do

9 Structuring and designing assessment for security
  Assessment design trumps assessment security
  Assessment design decisions
  What doesn't work in designing assessment for security
  Obvious mistakes that weaken assessment security
  Reusing the same assessment task
  Unsupervised online tests
  Take-home 'one right answer' or lower-level tasks
  Poor examination practices
  Invigilation and assessment security
  Group work and assessment security
  Authentic restrictions and assessment security
  Programmatic assessment security
  Dialogic feedback and assessment security
  Random assessment security checks
  Assessment design and quality control processes
  Things to do

10 Conclusion: Securing assessment against E-Cheating
  Focus on assessment security alongside academic integrity
  Take an affordance-based approach to understanding e-cheating, and a principles-based approach to stopping it
  Perfect is the enemy of good, so aim for incremental improvements
  Build a culture of evidence and disclosure
  Resist assessment conservatism and assessment security theatre

References
Index

Tables

2.1 Summary of technologies for assessment security.
2.2 Relationship between learning outcomes levels and e-cheating threats.
8.1 Cost to cheat and cost to subcontract in five assessments.
8.2 Comparing pre- and post-workshop sensitivity and specificity scores. Numbers in brackets are 95% CIs.
8.3 Example of the sort of data necessary for calculating sensitivity and specificity from detection studies.
8.4 Six levels of supervision and identity verification.
8.5 Hypothetical standards for assessment security.
Acknowledgements
This book would not have been possible without the efforts of many people. My wife Sam has heard many versions of the ideas in this book over many years. Thanks for the back-and-forth that has sharpened them so much. Thanks for the practical support in taking care of our boys during my writing stay-cation over the summer. But most of all, thanks for sharing the highs and lows of all of this with such love. A big thank you as well to my eldest son Syd for asking me to explain the ideas in this book more clearly, and to my youngest son Larry for your general enthusiasm about life.

Wendy Sutherland-Smith has been my academic integrity guru and dear colleague for the past five or so years. Thanks so much not just for collaborating with me on such interesting work, but also for being a model of ethical scholarship.

Paige Mahoney provided excellent copy editing and proofreading support for this project. Thanks for saving me and the future readers of this text from my idiosyncrasies. If readers spot anything inelegant, please know that Paige tried to fix it but I was too stubborn.

Several dear colleagues reviewed chapters in areas of their expertise. A huge thank you to Wendy Sutherland-Smith, Nick Patterson, Robbie Fordyce, Margaret Bearman, Peter Rogers, Mark Ricksen and Edd Pitt. Your generosity and specialist knowledge made this a better book. Any errors that remain are of course mine and mine alone.

Finally, a big thank you to my colleagues at the Centre for Research in Assessment and Digital Learning, and at Deakin University more broadly. Extra special thanks to David Boud for supporting me to take time to write this book, Rola Ajjawi for doing my job while I was away, and Beverley Oliver for providing funding that got me started in academic integrity research.
1 E-Cheating
We are in the midst of a global panic about student cheating. Concerns about students outsourcing their work to 'contract cheating' websites, which produce bespoke assignments for a fee, regularly appear in the media. Alarming statistics are presented, such as 16% of students admitting to having paid someone online to do their work and only 1% of those cheating being caught (Lee, 2019). Cheating devices are being sold to students online; machine learning tools are being used by students to do their homework (Dawson, in press); and YouTube stars make big money selling product placement advertising for cheating services (Jeffreys & Main, 2018). One thing unites this new context for cheating: technology. Whether it's managing the logistics of providing cheating services, like an essay mill's e-commerce site, pushing cheating to students through social media, or sophisticated artificial intelligence producing papers for students, technology is pervasive in the current cheating landscape. On the flip side, technology is also heralded as part of the solution to cheating, ranging from locking down students' computers while they complete assignments (Dawson, 2016) to flying drones above exam halls to spot cheaters (Reuters, 2017). Technology poses both great promise and great peril for the problem of cheating. And this is without considering the potential benefits of future technologies to both cheaters and those who wish to stop them.

This is a book about technologised cheating, or 'e-cheating', and what we can do about it. The book is primarily targeted at researchers, educators, technologists and leaders in the field of higher and professional education. But it's also a topic of basic human interest; who doesn't love a good story about cheating? The book explores the current and likely future states of e-cheating, what can be done to secure assessment in this new world, and some of the challenges we are likely to face along the way.

Why a book specifically focused on e-cheating? Firstly, because with new technologies come new ways to cheat. The rise of the World Wide Web in the late 1990s was associated with a rise in copy-paste plagiarism. Similar panics about new technologies can be found as far back as Socrates complaining about the emergence of writing as producing students who appear to know things they actually do not know (see Plato's dialogue 'Phaedrus', sections 257c–279c). New technologies challenge norms about what is acceptable in education.
The first step in addressing new cheating technologies is a systematic interrogation of technology and cheating. In addition to interrogating technologised cheating, it is also essential to understand and question technologies that are being used to address cheating. In my work, I have tried to empirically test the claims of vendors of anti-cheating technologies. I've published about my successful hacking and cheating attempts at computer-based exams (Dawson, 2016). I've tested out new machine learning tools to see if they really can build an authorship profile on a student to see if the student has contract cheated (Dawson, Sutherland-Smith, & Ricksen, 2020). I'd like to give you a third example here, about replicating the types of cheating approaches I've seen used successfully in online exams, but legally I'm not allowed to do so. This is infuriating. Just as we need to scrutinise cheaters and cheating businesses, we need permission to question and study those on the other side, who profit from attempts to stop cheating. Do their anti-cheating technologies actually work?

The study of e-cheating benefits from its adjacent technological disciplines, such as artificial intelligence and cybersecurity, which offer us new metaphors and methodologies to study cheating. These include the notion of 'penetration testing' a system to find its vulnerabilities; the holding of hacking competitions with bounties for those who can demonstrate successful hacks; and the study of 'social engineering' to find non-technical breaches (Schneier, 2018). As Schneier notes, with the integration of computers into all facets of society, 'computer security will become everything security' (p. 7), which means that the lessons of cybersecurity apply to cheating as much as anywhere else.

Why care so much about cheating? I could ask the same question of you reading this book – what motivated you to read about cheating? There are many answers to this question, often quite personal ones. Cheating might just feel fundamentally unfair or unethical. You might believe cheating damages learning and the reputations of educational systems. Philosophers have argued extensively about the moral wrongness of cheating (Bouville, 2009). But the biggest concern I have about cheating is a fairly pragmatic one: it leads to people being accredited as having met particular learning outcomes they have not actually met. When I go to the doctor or drop my kids off at school, I place great trust in the capabilities of professionals whom I hope have earned their qualifications honestly. Cheating creates an unsafe situation where incompetent and dishonest professionals walk amongst us.

This is a book about the minority of students who e-cheat; the vast majority of students do not e-cheat or even cheat. However, for reasons of fairness, public safety, trust in education systems and productivity, we need to do our best to stop the small minority that do. So what proportion of students e-cheat? Little data is available that specifically targets e-cheating. Some e-cheating behaviours are considered in large-scale self-report surveys such as Bretag et al.'s (2018) survey of Australian university students, which found that roughly 6% of students admitted to having engaged in 'contract cheating', such as the use of websites that provide custom-made assignments (Bretag et al., 2018). A synthesis of the literature by Newton (2018) found a much higher proportion on average, as high as 16% since 2014; however, there was a very significant range in the findings of individual research studies, with ten studies finding prevalence of 20% or higher, and almost double that number of studies finding prevalence of 1% or less. There is much less data about newer e-cheating behaviours, such as the use of unauthorised electronic tools, or the employment of computer-based exam impersonators. But what we know from contract cheating tells us that prevalence probably varies significantly between contexts and over time.
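As a side note on how much statistical uncertainty sits behind self-report figures like these, the sketch below computes a Wilson score interval for a hypothetical survey in which 6% of 10,000 respondents admit to contract cheating. The survey size and counts are assumptions for illustration only, and the interval only captures sampling error; it says nothing about under-reporting, which is the larger problem with self-report data.

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (z = 1.96 for 95% coverage)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - margin, centre + margin

# Hypothetical figures: 600 of 10,000 surveyed students self-report contract cheating.
low, high = wilson_interval(successes=600, n=10_000)
print(f"Estimated prevalence: 6.0% (95% CI {low:.1%} to {high:.1%})")
```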
You might have noticed this chapter uses the word 'cheating' a lot, and it has not yet used the term 'academic integrity'. This is intentional and unashamed. The field of academic integrity has tended to focus on the positive: education about the importance of integrity; about citing sources; on awareness campaigns and honour codes; and on cultural change (Fishman, 2014; McCabe, Treviño, & Butterfield, 1999, 2002). This is important work and I do not in any way wish to diminish it. However, the field's discomfort with the systematic study of the ways students cheat, including the seedy underbellies of cheating sites and hacker forums, has led to a somewhat dichotomous situation: you're either pro-integrity or anti-cheating. This is a false dichotomy. We can, and we must, both promote academic integrity and seek to detect cheating. In my Australian context, doing both is legally required of universities (Australian Government, 2015). This book's focus on e-cheating – in addition to the promotion of academic integrity – allows us to explore the negative, with a focus on technology.

Before continuing, it's important to acknowledge that cheating is contextual and socially constructed. In that spirit, here's a bit about me, my biases and my worldview. Firstly, I come from the research area of assessment and feedback. I'm currently Associate Director of the Centre for Research in Assessment and Digital Learning (CRADLE) at Deakin University. I view cheating as an educational assessment problem, and situate my work on cheating and academic integrity within my broader work on assessment in higher education. I take a pragmatic approach (Badley, 2003) to research and am comfortable working with qualitative and quantitative methods; I think that with a bit of lateral thinking, evidence from a range of paradigms can be compatible. I'm located in Australia, which seems to have engaged rather extensively with academic integrity. When it comes to cheating, I am something of a sinner: my mum helped me cheat on a poster when I was in grade four; my dad gave me too much help on a story I was writing in grade seven; and I have Googled the answers to some online compliance quizzes. But apart from that I think I'm clean. Some things that others consider cheating (e.g. reading a study guide rather than the primary text, which McKibban & Burdsal, 2013 mention as a cheating behaviour) I don't consider cheating, and some things that I think are cheating (e.g. using an online auto-paraphrase tool rather than paraphrasing manually) others might think are ok. I'm an online privacy obsessive: I use Virtual Private Networks (VPNs) to protect my Internet traffic; I have a sticker over my webcams to physically stop spying; and I've been a member of the Electronic Frontier Foundation. Finally, Turnitin has provided funding to my research centre for the direct costs of some of my research (but not for this book). I'm aware of some of the tensions in the above positioning. All of this is to say that you should read what I have to say in this book with a critical eye.
This chapter introduces and defines e-cheating, which is important as it's a concept that is used across the whole book. The chapter goes on to explore ways in which e-cheating is currently happening, and what the best evidence is about its prevalence. The chapter includes a taxonomy for e-cheating, which is used later when discussing what can be done to prevent or detect different types of e-cheating. It also includes a roadmap for reading the rest of the book.
Defining e-cheating

This is not the first text to use the term e-cheating. Other authors have studied technologised cheating before, and prepending an 'e-' to cheating is a somewhat obvious move – think e-assessment, e-bikes and e-waste. The term e-cheating has been used rather loosely, often with an implicit, straightforward definition that e-cheating means technologised cheating (e.g. Gao, 2012; K. O. Jones, Reid, & Bartlett, 2008a, 2008b; Khan, 2010; Khan & Balasubramanian, 2012). Some other researchers have been more specific, defining e-cheating as:

• 'student violations of academic integrity through the use of any technology oriented device' (D. L. King & Case, 2014, p. 21);
• 'using information technology (IT) to aid in the process of cheating in a class. This includes the use of personal digital assistants (PDAs), camera or picture cell phones, two-way pagers, programmable calculators, computers, the Internet, and so on to gain an unfair advantage' (D. L. King & Case, 2007, p. 71);
• 'Cheating through the use of electronic resources' (Styron & Styron, 2010, p. 38);
• 'the use of the World Wide Web to assist with cheating' (Rogers, 2006, p. 207); and
• 'the habit of students engaging ICT devices to indulge in examination misconduct' (Omonijo, 2012, p. 388).
Discussion of e-cheating has largely been restricted to relatively niche journals. At the time of writing, there is little use of the term e-cheating amongst the top educational technology journals: no mention in the British Journal of Educational Technology, one mention in The Internet and Higher Education (a brief and undefined discussion by Sellani & Harrington, 2002) and one mention in Computers & Education (a passing mention by Parks, Lowry, Wigand, Agarwal, & Williams, 2018). In the top higher education research journals, there has been a similar lack of engagement with the term, with no mention in Higher Education, Studies in Higher Education or Assessment & Evaluation in Higher Education. The top academic integrity journals are similarly bereft of mentions of the term e-cheating, with no mentions in the International Journal of Educational Integrity, and one mention in the Journal of Academic Ethics (Molnar, 2015, a mention that is also in passing and undefined). Given that e-cheating is often used without definition, and that the available definitions are diverse, there is a need to define it in this book before moving forward. I define e-cheating as: cheating that uses or is enabled by digital technology. To be clear, this includes hardware such as computers, smartphones, smartwatches and earpieces. It also includes software and online tools such as
automated algebra solvers and auto-paraphrasing tools. E-cheating includes the use of technology like e-commerce platforms to logistically enable cheating and, as such, contract cheating counts as e-cheating where it is performed through technology.

To avoid any doubt, I am of the view that cheating is socially constructed, and as such there are few absolutes. The acts and tools mentioned in the previous paragraph are only e-cheating where they are forbidden, or where a reasonable person would think they are forbidden. Educators often incorporate some of the tools mentioned previously in their assessment designs, and in these cases they are obviously not e-cheating; for example, in a later-year engineering assignment, students might be expected to use software to solve some time-consuming algebra problems to enable them to focus on more advanced concepts. However, educators are often unaware of the full range of possible e-cheating approaches, which creates a grey area. To take the example of auto-paraphrase tools, which students can use to rewrite text and fool text-matching tools like Turnitin (Rogerson & McCarthy, 2017): while few educational institutions explicitly prohibit these, I think most educators would probably regard them as cheating. As new technologies emerge, educators and institutions will likely become aware of them long after their uptake by students. It is therefore important to take a principles-based view of what e-cheating is, rather than attempting to categorically rule in or out every specific cheating approach or technology.

In our current digital world, exclusively offline education is becoming less common. This is likely true for cheating as well. Does this mean that most cheating is now e-cheating – paralleling the earlier quote from Schneier that 'computer security will become everything security'? If we take the example of bringing unauthorised notes into an examination: if those notes are scrawled on a piece of paper with a pen, it is not e-cheating. But what if those notes are typed and printed out? What if the student uses one of the many WikiHow guides on exam cheating to learn new ways to bring the notes into the exam? What if the notes are shared online? What if the student downloads a template to make the notes look like a label for their water bottle? For the purposes of this book, if the cheating involves technology, then it is e-cheating and it is relevant. However, the book will be focused primarily on those types of e-cheating where technology is more prominent, such as cheating in online exams, contract cheating, and the use of unauthorised digital tools like auto-paraphrase software.
What is different about e-cheating?

Have you ever heard a frustrated educator say 'I wish the cheating students would just apply the same ingenuity to their studies as they do to cheating'? This is common in the conversations I have with educators when we talk about cheating. It is even part of conversations I have had with some computer-based exams vendors, who claim that only somebody very skilled, and who tried very hard, would be able to cheat on their exams. My response is usually this: e-cheating is different because it does not require any cleverness or hard work on the part of students. E-cheating can be packaged up and shared freely.
E-cheating can be bought with a credit card. E-cheating is never more than a web search away. E-cheating can be anonymous. E-cheating is pushed to students by social media and online advertisements. E-cheating usually rates higher on web searches than legitimate help (try a search for 'essay help' in your favourite search engine). E-cheating is different to traditional cheating, and it requires us to think differently.

The technological affordances of copying and sharing are particularly transformative for e-cheating. Some countries are so afraid of these affordances that they shut down the Internet nationally during high-stakes examinations, out of fear that students will use it to share exam answers (Al Jazeera News, 2018; Gebre, 2019). Websites where students share answers to their homework have turned into multibillion dollar enterprises (I would provide citations here but I'm too scared to list them for fear they'll come after me). However, copying and sharing applies not just to the content that is being assessed, but also to the approaches that can be used to cheat. A search for 'how to cheat in [your favourite online exam tool]' will likely yield many how-to guides. These can require some degree of skill to implement; however, a significant and under-explored threat posed by copying and sharing is the packaging up of software tools that automate cheating, particularly in online exams. As I have demonstrated in my work on computer-based exam cheating (Dawson, 2016), it is possible to package up cheating approaches so they can be implemented by non-technical students who would otherwise lack the capability to cheat in this way. This is the same problem computer security faces with 'script kiddies' – people who lack the ability to hack systems themselves but just download pre-built hacking tools (Schneier, 2018). Just as hackers no longer need to deeply understand what they are doing in order to cause great damage, students who wish to cheat can have much of the process automated for them. Copying and sharing therefore makes e-cheating a much greater threat than traditional cheating.

E-commerce, and its associated technologies and processes, also makes e-cheating more of a problem. Contract cheating – the outsourcing of assessed work – is a good example of a cheating approach that's almost as old as the written word but has become much more of a problem in recent years. The advent of e-commerce sites and online advertising has facilitated sophisticated business models and logistics, helping contract cheating flourish into a billion-dollar industry (BBC News, 2019b). This business can be conducted with far greater privacy than face-to-face contract cheating, through anonymous payment methods like cryptocurrencies and privacy-enhancing technologies like encryption. But even without those more advanced technologies, basic web e-commerce allows students to anonymously recruit a writer who they will never meet. So while contract cheating has been around for a very long time, thanks to technology it has never been easier to cheat in this way.

E-assessment has also created opportunities for a range of new e-cheating approaches that were not even a possibility before assessment was technologised. Before the advent of online exams, the threat of someone hacking into the exam to enable them to cheat was not a significant problem. Nor was the threat of students giving their login credentials to a third party to have them sit the exam.
Yes, there were always offline analogues of these cheating approaches, but the computerisation of assessment now means there are new, additional ways to cheat.

Finally, e-cheating is different because it changes and adapts at a different pace to traditional cheating. As part of the digital world of education, e-cheating innovates at the speed of technology. New approaches to e-cheating are developed by technology startups dedicated solely to the e-cheating industry (e.g. Monorean, 2019). New approaches to e-cheating are shared by students in forums. E-cheating professionals have their own discussion forums where they share their experiences and learn how to overcome new anti-cheating technologies. Review and comparison sites allow potential cheating students to shop around for the best deal. All of this means that educators, administrators, researchers and policymakers need to be kept up to date. It also means that we need to take new, principles-based approaches, rather than 'playing whack-a-mole' as each new threat emerges.
Academic integrity, fraud, hacking and e-cheating

E-cheating is important, but what is its relationship with other concepts? The most obvious connection is with academic integrity, a concept that has been studied for decades. The International Center for Academic Integrity defines academic integrity as 'a commitment to five fundamental values: honesty, trust, fairness, respect, and responsibility' (Fishman, 2014). This is a positive mission, and one that has had significant impact on international practice. However, as acknowledged by the international academic integrity community, such a mission will never, on its own, completely stop cheating in general and e-cheating in particular. There will always be some individuals who are determined to cheat, and approaches like education around the fundamental values, and getting students to sign honour codes, will not stop them (McCabe et al., 2002). There is a need for parallel and complementary work into e-cheating. This includes e-cheating methods, the detection of e-cheating, and evidencing/proving cases of e-cheating. Compared to academic integrity, the study of e-cheating sounds negative. However, we have a responsibility to our communities to take reasonable measures to detect e-cheating, develop and implement measures to deter cheating, as well as continue to promote learning with integrity.

E-cheating also has connections with fraud, which the Oxford English Dictionary defines as 'Criminal deception; the using of false representations to obtain an unjust advantage or to injure the rights or interests of another' (Oxford English Dictionary, 2020b). Fraud could be considered the superset of e-cheating, and fraud laws could theoretically be used to prosecute e-cheating cases (Draper & Newton, 2017). However, fraud also includes a range of activities in education that go beyond our scope here, which makes it a less useful term than e-cheating. For example, in 'Operation Varsity Blues', several celebrities were charged with fraud for offences relating to bribing university officials and inflating grades (Colvin, 2019; Paris, 2019). While important to address, these matters are more in the legal domain than the educational domain, so they are not the focus of this book.

Hacking is also adjacent to, and overlaps with, e-cheating. In a computing sense, hacking can mean writing computer code, finding workarounds to problems, or
subverting or taking over a computer system (Oxford English Dictionary, 2020c). This last sense is the type of hacking that connects strongly with e-cheating. The ability to break into an online exam tool or a learning management system provides many e-cheating opportunities. Hacking is thus one of the tools that can enable e-cheating. However, not all instances of hacking into educational institutions and their systems are considered e-cheating in this book. As with fraud, more general cybercrime – such as breaking into a university's student database and changing a student's grades – is not considered e-cheating. The solutions to that problem are primarily in the domain of cybersecurity, rather than anything educational.
An affordance-based taxonomy of e-cheating approaches

Where regular cheating had centuries to evolve, e-cheating has developed rapidly over mere decades. It will likely continue to change rapidly as technology and our digital world evolve. The study of e-cheating therefore needs to consider not just the approaches of today, but the approaches of tomorrow. Rather than focusing on particular e-cheating approaches, which will change over time, we need to focus on the impacts of different approaches to cheating. Rather than focusing on just the approaches that are completely feasible right now, we need to be forward-looking and consider what might evolve in the near future. In this section, I propose a taxonomy to consider e-cheating approaches, characterised by their affordances. An affordance is '[a] property of an object or an aspect of the environment, esp. relating to its potential utility' (Oxford English Dictionary, 2020a). Focusing on the affordances that might threaten assessment, rather than the e-cheating approaches we are currently aware of, allows us to manage the e-cheating threat in a principles-based rather than reactionary way. This means we can think forward about what is allowed or not; we also won't be surprised when new threats occur within a particular affordance category.

Introducing another typology is necessary for this book as existing typologies tend to focus on cheating activities rather than their consequences. For example, McKibban and Burdsal's (2013) empirical work identifies five cheating factors (creative padding; interactive cheating; false personal excuses; taking credit for others' work; and exam cheating), but none of them would facilitate comparison in terms of the impact of the student's activity. Their paper also focuses on some activities that are arguably just strategic behaviour on the part of students (e.g. reading a condensed version of a text rather than the prescribed text), and it does not consider modern technology or contract cheating. Focusing on specific cheating activities, rather than on the impacts of those activities, can lead to us being blindsided when new approaches emerge.

The taxonomy in this chapter consists of four fundamental e-cheating affordances:

• Providing access to unauthorised information;
• Cognitive offloading to a tool;
• Outsourcing work to a person; and
• Disrupting the assessment process.
Each of these high-level affordances will be discussed in the next section, along with examples. Before that, however, I wish to again emphasise one important caveat: these approaches are cheating only insofar as they are disallowed in a particular setting, either through policy, cultural norms or educator instructions. Many effective assessment approaches make productive use of the same affordances as e-cheating approaches do.

Providing access to unauthorised information

Some e-cheating approaches afford students access to information they should not have. This is the equivalent of sneaking notes into an exam hall, in that it provides the student with an unfair advantage; it also invalidates the assessor's judgement, which was meant to be made on student work without access to such materials. There are two main subcategories of this affordance: breaching information restrictions; and accessing questions, exemplars, solutions or answers.

Breaching information restrictions

This affordance applies only to assessment tasks that depend on restricting student access to information which students would usually have access to – for example, online examinations that do not allow students to use a web browser. The easiest way to thwart this particular affordance of e-cheating is to design assessment tasks that do not depend on information restrictions. There are numerous examples of e-cheating approaches that breach information restrictions:

INFORMATION-SMUGGLING DEVICES
A range of devices are available online for the sole purpose of bringing unauthorised information into exam settings. An exhaustive list would be unfeasible but could include earpieces that connect to an MP3 player; wristwatches that can be used to store information; and calculators that can store hundreds of thousands of words in a searchable file (Ruby Devices, 2019). In terms of their ability to breach information restrictions, these devices are simply a new twist on the old approach of sneaking notes into an examination or writing formulae on your arm.
RUNNING AN ONLINE EXAMINATION WITHIN A VIRTUAL MACHINE, AND RUNNING A WEB BROWSER IN THE HOST OPERATING SYSTEM

Many online examination tools restrict the programs that can be used by a student in their exam. I have previously demonstrated that it is possible to run some online exam tools within a virtual machine. You might have encountered a virtual machine before if you have ever run Windows on a Mac, within a window, to allow you to use some specialist software that only runs on Windows – I used to do this to run a particular version of the NVivo qualitative data analysis program. I have previously demonstrated the virtual machine approach can allow access to web browsers and other unauthorised information (Dawson, 2016). At the time of writing,
cybersecurity researchers have also demonstrated this approach is currently functional with some online exam tools, despite vendors being made aware of the problem (Chase, 2018).
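One reason this cat-and-mouse game persists is that detecting a virtual machine from inside exam software is heuristic at best. The sketch below is a minimal, illustrative example of the kind of check a tool might run – looking for well-known hypervisor strings in hardware identifiers – and is not the method used by any particular vendor; determined users can spoof these identifiers, which is part of why the problem is hard.

```python
import platform
import subprocess

# Strings commonly found in the hardware identifiers of popular hypervisors.
# Illustrative only: the list is not exhaustive and identifiers can be spoofed.
HYPERVISOR_MARKERS = ("virtualbox", "vmware", "kvm", "qemu", "hyper-v", "xen", "parallels")

def looks_like_virtual_machine() -> bool:
    """Heuristically flag whether this program appears to be running in a VM."""
    candidates = [platform.platform(), platform.processor()]
    system = platform.system()
    if system == "Linux":
        try:
            # On Linux guests the DMI product name often names the hypervisor.
            with open("/sys/class/dmi/id/product_name") as f:
                candidates.append(f.read())
        except OSError:
            pass
    elif system == "Windows":
        try:
            result = subprocess.run(
                ["wmic", "computersystem", "get", "manufacturer,model"],
                capture_output=True, text=True, timeout=5,
            )
            candidates.append(result.stdout)
        except (OSError, subprocess.TimeoutExpired):
            pass
    haystack = " ".join(candidates).lower()
    return any(marker in haystack for marker in HYPERVISOR_MARKERS)

if __name__ == "__main__":
    print("Virtual machine markers found:", looks_like_virtual_machine())
```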
TEXT INPUT HARDWARE

Online exam tools can be set up to deny students access to their computer's filesystem. One method of bypassing this information restriction is to use hardware approaches, such as a USB key injector (HakShop, 2014), which is a device that tells a computer it is a USB keyboard but is really a tiny computer the size and shape of a USB drive. These devices can input specific text at a specific rate at a particular time, which the host computer is not able to distinguish from text typed on a keyboard. These tools can be hidden inside a computer and used to bring notes, 'cheat sheets' or completed essays into an online exam environment (Dawson, 2016).

Accessing questions, solutions, answers and exemplars

The previous subcategory focused on 'real world' information the student would normally have access to but is denied under some assessment conditions. In contrast, this subcategory is about secret assessment-related information that the student should not be able to access. This includes e-cheating approaches that provide students access to exam questions ahead of time, to allow them to prepare; access to the solutions or answers for an assignment; or access to previous student work to use as an unauthorised exemplar. Specific approaches that provide this affordance include:
'BRAINDUMPS' OF EXAM QUESTIONS

Several websites exist where people 'dump' the contents of their brains immediately after taking an examination, trying to remember as many questions as possible (for example, BrainDumps.com, 2019). These sites then combine questions from multiple test-takers in an attempt to reconstruct the examination, so that future test-takers can practise on the actual test questions. Braindump sites are very popular with industry certification tests.
ESSAY BANKS AND ASSIGNMENT TRADING SITES

These websites store past student work, which current students can purchase either for a fee or for credits that are earned by uploading work. When educators reuse assignments from year to year, these sites provide students with access to solutions for their assignments. There is some debate about the moral and legal standing of these sites. In Australia, there is a current campaign called 'Right to Publish' (Responsible Student Publishers Association, 2019) that argues for students' rights to share and profit from their work, which they claim is being breached by universities.
FORENSIC ANALYSIS TO IDENTIFY QUESTIONS AND ANSWERS
When computer programs are running they store information in memory and can communicate with other devices on a network. It is possible for other computer programs to access what is in the computer's memory or to intercept communications between machines. These and other forensic approaches can be used to identify exam questions and answers. In my work I have established the potential feasibility of a 'cold boot' attack, in which a computer's memory contents are analysed after it has been turned off and on again (Dawson, 2016). This range of approaches is much more challenging to implement for the first time than other approaches; however, if they are implemented by one person they could be packaged up and shared with other less technical users.

Cognitive offloading to a tool

When I was a child, using a calculator in some exams was considered cheating. The educational justification for this was that I would not be carrying around a calculator with me every day, so I needed to learn to do the work in my head. Variations of this same logic are used to justify restrictions on the sorts of tools students are allowed to use in their assessments. As has been discussed earlier, concerns about student use of tools in education date back to Socrates' criticism of writing. Be it writing things down to avoid having to remember them, or using a computer program to solve a difficult algebra problem so that you don't have to do it yourself, people engage in all manner of 'cognitive offloading'; that is, 'the use of physical action to alter the information processing requirements of a task so as to reduce cognitive demand' (Risko & Gilbert, 2016, p. 677). Cognitive offloading poses serious challenges for assessment, as educators and institutions are often unclear about which cognitive offloads are allowed and which are not (Dawson, in press). However, there are some examples of cognitive offloading that are generally forbidden by default in any context where they are not expressly permitted:

Essay spinning, auto-paraphrase and back-translation tools

These tools rewrite text for students, effectively allowing students to offload the task of paraphrasing. Some of these tools are adaptations of Search Engine Optimisation (SEO) tools, which are used by webmasters to increase traffic to a website by making it appear that there is more unique content – when actually, the content is just robotically paraphrased versions of the same content (Lancaster & Clarke, 2009; Rogerson & McCarthy, 2017; Spinbot.com, 2016). Students use these tools to avoid having to manually paraphrase (at best) or to reuse someone else's work without getting caught for plagiarism (at worst). In addition to tools specifically designed to paraphrase, students have also used translation tools for paraphrasing. This approach is known as back-translation (M. Jones & Sheridan, 2015), and involves translating text from one language (e.g. English) to another (e.g. Spanish) and then converting the translated text back to the original language. The usefulness of this particular approach appears to be diminishing, however, as
translation tools have become more sophisticated and are producing paraphrased text closer to the original source text; take, for example, this text from Jones and Sheridan's (2015) paper:

Source text: 'While many universities around the world would probably claim that their publicly stated policies and procedures will act as a deterrent to any student contemplating plagiarism, their publication alone is unlikely to cut any sway with would-be plagiarists'. (J. Williams, 2005, p. 8, as cited by Jones & Sheridan, 2015)

Back-translation in Jones and Sheridan (2015): 'Whereas many universities anywhere in the world would demand probably that their indicated policies and procedures public will act like impediment to any student who contemplate plagiarism, its publication is only little probable to cut any shaking with the supposed plagiarists'.

Back-translation in December 2019: 'While many universities around the world would probably claim that their publicly declared policies and procedures will act as a deterrent to any student who contemplates plagiarism, its publication alone is unlikely to affect potential plagiarists'.

As translation tools have become more sophisticated, they have become less useful as auto-paraphrase tools; the back-translated text that they return is very similar to the original text. This may make dedicated paraphrasing tools more attractive; take, for example, this paraphrasing of the same original text from Jones and Sheridan's (2015) paper by spinbot.com:

Auto paraphrase by spinbot.com in January 2020: 'While numerous colleges around the globe would likely guarantee that their openly expressed approaches and systems will go about as an obstruction to any understudy pondering copyright infringement, their production alone is probably not going to cut any influence with would-be literary thieves'.

To me, this reads like the sort of text a high school student might produce after discovering the thesaurus feature in their word processor. However, I have sadly seen worse in my 15 years working as an academic, and for most of that time it would not have made me suspect plagiarism.
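The narrowing gap between source and back-translation can be made concrete with a rough similarity measure. The sketch below is an illustration only, using Python's standard difflib rather than any purpose-built plagiarism metric, and compares the two back-translations quoted above against the source text; the 2019 version scores far closer to the original than the 2015 version, which is the trend described here.

```python
from difflib import SequenceMatcher

source = ("While many universities around the world would probably claim that their "
          "publicly stated policies and procedures will act as a deterrent to any "
          "student contemplating plagiarism, their publication alone is unlikely to "
          "cut any sway with would-be plagiarists.")

back_translation_2015 = ("Whereas many universities anywhere in the world would demand "
                         "probably that their indicated policies and procedures public "
                         "will act like impediment to any student who contemplate "
                         "plagiarism, its publication is only little probable to cut "
                         "any shaking with the supposed plagiarists.")

back_translation_2019 = ("While many universities around the world would probably claim "
                         "that their publicly declared policies and procedures will act "
                         "as a deterrent to any student who contemplates plagiarism, its "
                         "publication alone is unlikely to affect potential plagiarists.")

# A rough, character-level measure – not how commercial text-matching tools work.
def similarity(a: str, b: str) -> float:
    """Similarity ratio between two strings, from 0.0 (unrelated) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# The 2019 back-translation tracks the source far more closely than the 2015 one.
print(f"2015 vs source: {similarity(source, back_translation_2015):.2f}")
print(f"2019 vs source: {similarity(source, back_translation_2019):.2f}")
```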
Translation tools

A key capability in foreign language courses is the ability to translate text from one language into another. Similarly, when studying in a foreign language, the ability to produce text in the language of instruction is a key outcome that may be challenging. These problems can be addressed, to a degree, through online translation tools. Where the specific outcomes being assessed include the ability to produce work in a particular language, the unauthorised use of online translation tools would amount to e-cheating. The previous back-translated examples demonstrate how far computer translation has come in a short period of time; in the future, it may be that all but the most fluent second-language speakers may be able to produce better text through computer translations of their first language, rather than attempting to write in their second language.

Computer Algebra Solvers

In some disciplines the ability to solve complex algebraic problems is a key learning outcome. These problems are often assessed in take-home assignments, and students have started using online tools to solve these problems. Computer Algebra Solvers have not been the subject of much e-cheating research so far, but they have been the topic of some panic in the popular media (e.g. Biddle, 2017). To some extent this panic may just be a more modern version of the 'you won't carry a calculator everywhere in your pocket' concern raised in my childhood.

Bespoke essay writing tools

A range of tools now exist that can produce content for students on any topic. These tools conduct online research, find sources, and write and structure an essay. There are some reports online of students gaining passing (though not good) grades from using these tools. Some of these tools are particularly troubling in that they invent source material to quote from and, as such, have the potential to create 'fake news', as some media sources have identified (Hern, 2019). For example, when fed the text from J. Williams (2005) that Jones and Sheridan's (2015) paper used with back-translation tools, the Talk to Transformer tool (based on the OpenAI GPT-2 language model) produced the following text:

While many universities around the world would probably claim that their publicly stated policies and procedures will act as a deterrent to any student contemplating plagiarism, their publication alone is unlikely to cut any sway with would-be plagiarists. It's generally all about an individual's perceptions and experiences. Meanwhile, John Oliver in the new HBO drama 'Last Week Tonight' has become a prominent champion for plagiarism in the world of journalism, and after delivering a great satire of the rise of cheating that was just published in the Hollywood Reporter, he's definitely on a platform to spread awareness about cheating in journalism, (text in italics is original material produced by Talk to Transformer, A. King, 2019)

Outsourcing work to a person

This set of e-cheating approaches involves getting someone else to do the assessed task for you. They are probably the oldest types of cheating, and some e-cheating
approaches to outsourcing are merely streamlined or enhanced versions of older offline approaches. Other approaches, however, are new and have been enabled by digital technology. Examples of outsourcing approaches include the following:

Contract cheating

When a student gets another student to complete all or part of their work it is known as contract cheating (Lancaster & Clarke, 2015). This is the case whether or not there is a formal contract in place. However, for the sake of this example, it is worth focusing on the formal contract cheating industry that produces bespoke assignments for students in exchange for money. Much of this contract cheating industry occurs online, which enables it to be highly sophisticated and geographically dispersed (Amigud, 2019; Lancaster, 2019b; Rowland, Slade, Wong, & Whiting, 2017). While face-to-face contract cheating is also a significant problem, digital technology allows writers to: be located anywhere in the world; exchange payment anonymously; recruit customers through targeted advertisements and social media; and manage workflow through e-commerce tools.

Online course-taking

A range of sites offer to take entire online courses for students. In exchange for a fee, these sites will complete all assessed work, including examinations. Students need to provide their usernames and passwords to these sites, which poses significant risks, and is likely one of the methods used to obtain user accounts for sale on hacker websites (e.g. mpgh.net, 2019) – student login credentials are particularly valuable as they provide access to a range of resources (e.g. libraries) and discounts (e.g. discounted copies of software). These online course-taking sites say they will sit online exams for students, even those that are remote proctored, with no risk to the student in terms of getting caught; and while I'd like to verify this claim, I have not found any remote proctored exam vendors who are willing to let me try.

Remote coaches

Rather than completely outsourcing work to another party, some students opt to do the physical part of an assessment themselves but have someone else do the thinking. This approach is analogous to the clichéd romantic comedy premise of having someone on the phone telling you what to say during a date. Through technology, you can livestream the contents of an exam (paper or online) to a coach who can then tell you what to write, perhaps through a hidden earpiece. You can also buy devices that look like regular scientific calculators but actually connect to an instant messaging service, so you can tell your coach what the questions are and they can answer them in real time. Businesses have been set up to sell hardware specifically for this purpose, such as the Monorean cheating earpiece (Monorean, 2019) and the Ruby calculator that allows instant messaging (Ruby Devices, 2019).
Expert input

Across the Internet, there are a range of places where experts gather to answer questions from members of the public. The subreddit (similar to a discussion forum, on the website Reddit) /r/AskHistorians is one such example of an expert forum. On that subreddit, users ask history questions that are answered by qualified historians. Thanks to careful moderation and vetting of experts, only high-quality answers are allowed. Students use these communities to outsource their homework so often that many have now adopted 'no homework' policies, though these are challenging to enforce as it can be difficult to distinguish outsourcing from legitimate curiosity.

Disrupting the assessment process

Some approaches to e-cheating provide students with an advantage by causing problems for the assessor, with assessment technologies, or for other students. There are two main subcategories of this e-cheating affordance: thwarting anti-cheating approaches and causing problems for students or teachers.

Thwarting anti-cheating approaches

A range of anti-cheating approaches are currently in use. Many of these are technology-based, such as text-matching software, remote proctoring software, and website blocking. Students use a range of approaches to thwart these technologies, including:

BYPASSING TEXT-MATCHING SOFTWARE
The auto-paraphrase tools discussed earlier successfully defeat text-matching tools, as they produce original text that cannot be found in a database (M. Jones & Sheridan, 2015; Lancaster & Clarke, 2009; Rogerson & McCarthy, 2017). Other approaches have also been attempted, and one text-matching provider says their tool is robust against approaches including swapping letters for similar letters in other languages; using a thesaurus; putting an entire piece of work within quotation marks; or converting text into an image (Turnitin, 2019).
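The reason word-by-word 'spinning' is so effective against this class of tool can be seen in a toy version of text matching. The sketch below is a deliberately simplified illustration – commercial tools use far more sophisticated fingerprinting – that compares overlapping word n-grams between a submission and a source; verbatim copying produces a high overlap, while the spun text from the spinbot.com example above shares almost none of its five-word sequences with the original.

```python
def word_ngrams(text: str, n: int = 5) -> set:
    """Overlapping word n-grams (lower-cased) as a crude document fingerprint."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(submission: str, source: str, n: int = 5) -> float:
    """Fraction of the submission's n-grams that also appear in the source."""
    sub, src = word_ngrams(submission, n), word_ngrams(source, n)
    return len(sub & src) / len(sub) if sub else 0.0

original = ("While many universities around the world would probably claim that their "
            "publicly stated policies and procedures will act as a deterrent to any "
            "student contemplating plagiarism")
spun = ("While numerous colleges around the globe would likely guarantee that their "
        "openly expressed approaches and systems will go about as an obstruction to "
        "any understudy pondering copyright infringement")

print(overlap(original, original))  # 1.0 – verbatim copying is easy to flag
print(overlap(spun, original))      # close to 0.0 – spun text shares few exact n-grams
```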
Remote proctoring tools provide ways to restrict what students can do with their computers, as well as placing restrictions on the environment within which students can take an examination. The approaches used by these tools include monitoring with webcams and microphones; keystroke biometrics; monitoring the software running on the student's computer; and restricting the websites a student can visit. A range of approaches have been proposed to cheat in these exams, including hacking and designing rooms that permit cheating in ways the remote proctor cannot detect. Guides on how to defeat remote proctoring can be found easily online (e.g. Binstein, 2015); however, it is worth noting that guides to defeat some tools quickly become outdated as vendors respond.
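For readers who have not encountered 'keystroke biometrics', one of the monitoring channels listed above, the underlying idea is simply to compare the timing rhythm of a typist against a stored profile. The following sketch uses made-up timestamps and an arbitrary tolerance; it illustrates the general idea only and is not how any particular proctoring vendor implements it.

```python
import statistics

def typing_profile(key_times: list) -> dict:
    """Summarise inter-key intervals (seconds between keystrokes)."""
    gaps = [b - a for a, b in zip(key_times, key_times[1:])]
    return {"mean_gap": statistics.mean(gaps), "stdev_gap": statistics.stdev(gaps)}

def looks_like_same_typist(profile: dict, session: dict, tolerance: float = 0.5) -> bool:
    """Very crude check: is the session's typing rhythm within tolerance of the profile?"""
    return abs(session["mean_gap"] - profile["mean_gap"]) <= tolerance * profile["mean_gap"]

# Made-up timestamps: the enrolled student types with roughly 0.2s between keys...
enrolled = typing_profile([0.0, 0.2, 0.4, 0.61, 0.79, 1.0])
# ...but this exam session shows a much slower, different rhythm.
exam_session = typing_profile([0.0, 0.5, 1.1, 1.6, 2.2, 2.7])
print(looks_like_same_typist(enrolled, exam_session))  # False: flag the session for review
```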
BYPASSING WEBSITE BLOCKING WITH A VPN
The machine on which I am writing this book has been blocked by my university from accessing sites it has placed in the category of 'Academic Fraud'. While I am careful to use the official workaround to bypass this blocking, which involves submitting a request to IT, I know a much easier workaround: using a Virtual Private Network (VPN). VPNs encrypt Internet traffic and funnel it via a computer located somewhere else, which allows users to bypass website blocking. This approach can easily be used to thwart blocklists that prohibit access to cheating sites. Used in reverse, a VPN can also be used to make it appear that traffic originates within a university network, which can be used by online exam cheaters to hide the fact that they are located in other countries.
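To illustrate in miniature why destination-based blocking is so easy to defeat, the following sketch (using invented hostnames, not my university's actual block list) shows what a network filter 'sees' once traffic is tunnelled through a VPN: every connection appears to go to the VPN endpoint, so the ban-list is never triggered.

```python
# A minimal sketch, with hypothetical hostnames, of a destination-based ban-list check.
# It is not any particular firewall's implementation.
BAN_LIST = {"essay-mill.example", "exam-answers.example"}  # the 'Academic Fraud' category

def request_allowed(destination_host: str) -> bool:
    """Return True if the filter would let a connection through."""
    return destination_host not in BAN_LIST

# Without a VPN, the filter sees the real destination and blocks it:
print(request_allowed("essay-mill.example"))    # False

# With a VPN, every connection appears to go to the VPN endpoint;
# the real destination travels inside the encrypted tunnel:
print(request_allowed("vpn-endpoint.example"))  # True
```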
PULLING THE PLUG

While some online exam technologies are resistant to a loss of Internet connectivity, all are reliant on electricity. Online exam vendors need to program in logic to handle power outages: will their tool terminate the exam if the power goes out? Will it recommence when the student logs back in? Will the time of the power outage count towards the exam time? While it might be tempting to just end the exam when the power goes out, this is likely unpalatable to the student body as a whole, which needs to know that in the event of equipment failure, crashes or power outages, their exam can continue from where they left off. However, if the online exam vendor does anything other than terminate the exam, they open up the door to students exploiting this leniency. This is a common concern in online games and gambling, and it is discussed in greater detail in Chapter 5.

Causing problems for students and teachers

A student underprepared for a face-to-face exam might be tempted to trigger the fire alarm or phone in a bomb threat; doing so causes problems for everybody involved but might buy the student some time. There are a range of e-cheating approaches that utilise the same underlying mechanism of causing problems for other people to gain an advantage; here is one example:

DENIAL OF SERVICE ATTACKS
Denial of service attacks attempt to stop legitimate users of a service from using that service. One of the most common forms of denial of service is a Distributed Denial of Service (DDoS) attack, which often involves targeting a particular online service with huge numbers of requests from a huge number of devices (Mirkovic & Reiher, 2004). A range of DDoS tools can be downloaded which have varying degrees of effectiveness, and they can be used to target all manner of e-assessment tools, ranging from online exams to learning management systems. Using these approaches, students can claim they were unable to complete their assessment because the technology did not work. While vendors of some tools may invest
signifcantly in reducing the risk of DDoS attacks on their tools, the interconnectedness of systems may mean attackers can functionally deny service to one tool by disabling another. For example, a DDoS attack on a university’s WiFi, student records database or single sign-on authentication software may be enough to block access to an exam, causing its cancellation. Similarly, if an assignment depends on students accessing a particular journal article, the use of a browser plugin such as a download manager can be enough to trigger the journal publisher’s DDoS protection and block access for all students at the institution. I have spoken with academics and students who have accidentally triggered institution-wide blocks by major publishers simply by downloading a hundred or so articles in rapid succession.
E-cheating: a significant problem, but what can we do about it?

This chapter has put forth a case that e-cheating is here, it is different to cheating in the past, and we need to do something about it. The remainder of the book delves deeper into the problem of e-cheating and identifies ways to address it, drawing from a range of other disciplines and industries along the way.

So, what can we do about e-cheating? Chapter 2 proposes a way to address e-cheating through focusing on assessment security. This is a complementary concept to academic integrity but, where academic integrity has a positive, educative mission, assessment security has a somewhat negative, adversarial mission in that it aims to detect cheating and make cheating very difficult. Chapter 3 delves deeper into how e-cheating happens. It explores the e-cheating lifecycle, from how students become aware of e-cheating through to transactions, content production and submission of cheated work. The chapter also considers how to disrupt each stage of the e-cheating lifecycle.

The next four chapters explore what we can learn about securing assessment from the perspective of different disciplines and industries. Chapter 4 argues for the need to consider cybersecurity as an essential underpinning of assessment security, expanding on Schneier's argument that cybersecurity is now everything security (Schneier, 2018). Chapter 5 draws on lessons learnt from two industries that have been dealing with e-cheating for decades: online videogames and online gambling. Chapter 6 asks how Artificial Intelligence (AI) is affecting both e-cheating and assessment security, and what the future implications of more advanced AI may be. Chapter 7 builds on work from surveillance studies and explores some of the darker sides of assessment security.

The next three chapters draw together ideas from across the book to propose ways to improve assessment security. Chapter 8 proposes ways we might be able to think about the effectiveness of different approaches to assessment security through metrics and standards. Chapter 9 explores different ways to structure and design assessment to improve assessment security. Chapter 10 concludes the book with five actionable suggestions to improve assessment security while maintaining awareness of the potential for unintended negative consequences.
I recommend you read the remaining chapters sequentially. However, if you prefer to pick and choose, I strongly recommend you read Chapter 2 before continuing on to any other chapter, as the concept of assessment security underpins the remainder of the book.
Things to do

In this book, each chapter has a list of readings, suggestions or activities that you can follow up to learn more or put your learning into action. For this chapter I recommend you:
• Familiarise yourself with your academic integrity landscape. At a minimum this should include any academic integrity policies or procedures that are in place at your institution. Do you think these adequately cover e-cheating? Is their coverage of e-cheating forward-looking like the taxonomy, or is it instead focused on dealing with particular current threats?
• Browse the International Center for Academic Integrity's website: https://www.academicintegrity.org/. While I have advocated for more of a focus on cheating in this chapter, it is important to maintain a strong grounding in academic integrity. At a minimum, the Fundamental Values of Academic Integrity (Fishman, 2014) should be considered essential reading.
• Find out how to cheat at your institution. A likely first step by any potential cheating student at your institution will be to search 'how to cheat' along with the tools your organisation uses. It's important you familiarise yourself with those approaches.
• Search the course codes for the classes you teach and a range of cheating-related terms, like 'cheat', 'buy', 'answers' and 'notes'. Many educators are shocked when they realise how much cheating is happening in public. But be careful not to view your search results as comprehensive; it's likely that much of the cheating that is happening on your course is not indexed by search engines.
Note 1 This is attributed to one of three self-citations within that paper (it is unclear which); however, it cannot be found within those papers.
2 Assessment security
The previous chapter introduced the concept of e-cheating, and it built a taxonomy of approaches to e-cheating. This chapter asks: what can we do to secure assessment against somebody who wants to (e-)cheat? The chapter identifies the key features of assessment security, and examines the different approaches used to secure assessment against cheating. Just like Chapter 1, this one focuses on something negative: stopping cheating through securing assessment. And also like the previous chapter, there is a need to disambiguate our focus here from academic integrity. The simplest way of distinguishing the two concepts is that assessment security starts where academic integrity finishes. Academic integrity is about: encouraging students not to cheat; equipping them with the capabilities they need to work with integrity; building a culture of integrity so that students value doing the right thing; and creating policies, procedures and pedagogies that integrate integrity and nudge students away from cheating. Academic integrity is not primarily focused on detecting cheating or punishing students who cheat. To reiterate my message from the previous chapter, I support the positive mission of the field of academic integrity. But academic integrity on its own is not enough.

Cheating is a complex social problem. Borrowing from the idea of the 'social determinants of health' (Marmot, 2005), there are also 'social determinants of cheating'. Work is needed to address those determinants. When I give talks about cheating, people in question time often want to discuss what they think are the root causes of cheating: neoliberalism; university fees creating a student-as-consumer culture; universities' reliance on international students; broadening participation; class sizes; and many other complex factors. Those are not the focus of this book, and while I agree that some of them are likely causes of cheating, they do not stop us from needing to secure assessment. Just as the study of the social determinants of health needs to co-exist with the study of medicine, the need to address the determinants of cheating does not absolve us of the need to secure assessment.

Assessment security is a foundational topic for the rest of this book, so it's important to define it up front. For our purposes, assessment security is defined as: measures taken to harden assessment against attempts to cheat. This includes approaches to detect and evidence attempts to cheat, as well as measures to make cheating more difficult. Assessment security is a multidisciplinary field, spanning education, sociology, psychology and cybersecurity.
Assessment security is adversarial in that it focuses on detecting and stopping those who wish to cheat, rather than on convincing them of the error of their ways. Securing assessment matters for a range of reasons. Firstly, society needs to be able to trust that graduates have met the learning outcomes of their awarded qualifications. The doctor I visit, my children's teachers, the builder of my home: all pose great danger to me if they have cheated to become certified. On its own, the promotion of academic integrity cannot assure us they did not cheat; if we wish to have any faith in the qualifications we award, systematic measures need to be in place to develop and ensure assessment security. Secondly, honest students have a right to expect that we take adequate measures to ensure their less honest peers have not cheated. This is a matter of fairness and justice. Thirdly, the reputation of educational institutions rests on the security of their assessment. No matter how much positive culture exists at an institution, if assessment security is not a priority we will never know if that culture is really effective in preventing cheating. Academic integrity is like stopping a bank robbery by addressing the social determinants of crime; assessment security is about guards and vaults. Both are important and symbiotic.

But most importantly, assessment security is about knowing if what we are doing, or what we might like to do, actually stops cheating. Many of the assessment security approaches discussed in this chapter are detrimental to learning, teaching or the student experience. If you have never sat an online remote proctored examination, I encourage you to try one out: you will be monitored relentlessly and some of your usual exam-taking behaviours, like staring off into space while you think, might raise red flags around potential cheating. Anti-cheating technology comes at a cost. Assessment security gives us a set of concepts to ask: how well does this anti-cheating approach actually stop cheating, and do we really want to accept all the negatives that come with it?
What is assessment?

This chapter is about securing assessment – but what actually is assessment? Definitions of assessment and related terms abound in the literature, spanning half a century at least (Cookson, 2017). Etymologically and historically, 'assessment' and related terms are closer to torture, taxation and legal concepts than anything educational (Nelson & Dawson, 2014). There are geographic boundaries in views of assessment; what is called assessment in most of the current/former British Commonwealth is called 'classroom assessment' in North America. I agree with Joughin's (2009) definition, that assessment means '[making] judgements about students' work, inferring from this what they have the capacity to do in the assessed domain, and thus what they know, value, or are capable of doing' (p. 16).

Adopting Joughin's definition has some implications for the study of assessment security. Firstly, this definition is judgement-centric. This means that assessment security is ultimately about securing those judgements and what leads to those judgements. Anything that could invalidate an assessor's judgement is therefore a concern for assessment security; the taxonomy in the previous chapter provides a structure for thinking about affordances that can invalidate assessor judgement.
Secondly, Joughin’s defnition is silent on who is making the judgement.Assessment is often thought of as the domain of the educator by default, but assessment judgements are also made by: answer keys on multiple choice question tools; other students, as in peer assessment; students themselves, as in self-assessment; external parties, as in the external examining processes common in the UK; and machine learning tools like automated essay scoring systems.All of these are therefore within the scope of assessment security. Finally, Joughin’s defnition makes no assumptions about the purpose of assessment. While assessment conversations can sometimes default to assuming a summative, certifying or credentialing purpose, assessment also serves other purposes (Boud, 2000).Assessment plays a signifcant role in guiding and shaping student learning, under the banner of formative assessment or assessment for learning (Carless, 2017); Ramsden (1992) observed that from the student’s perspective, assessment ‘always defnes the actual curriculum’ (p. 187) and is therefore very infuential on what students actually learn. Finally, assessment serves a sustainable purpose, in that it needs to develop student capability to function without formal university assessment upon graduation (Boud, 2000). This means students need to develop their ‘evaluative judgement’, which is their own understanding of what quality work looks like and the ability to make decisions about the quality of their own work and the work of others (Boud,Ajjawi, Dawson, & Tai, 2018a;Tai,Ajjawi, Boud, Dawson, & Panadero, 2018). This broad view of assessment means that assessment security is similarly broad. We are not just concerned with stopping cheating in high-stakes exams; we might also be concerned with preventing students from cheating themselves out of learning in low-stakes formative tasks. This, however, does not mean we must completely secure every act of assessment; for a consideration of this and other assessment design matters, see Chapter 9.
Two key features of assessment security

Consistent with Joughin's judgement-centric view of assessment, securing an act of assessment is about securing judgement with respect to what particular students have done and therefore are capable of. Operationally, assessment security is comprised of two key features: authentication, which refers to knowing that students are who they say they are and that they have produced their work themselves; and control of circumstances, which refers to knowing that the work was produced under particular conditions. Both features are required for assessment to be secure.

Authentication

The authentication feature of assessment security focuses on ensuring that the person being assessed is the person we think they are. We check students' ID cards when they sit in-person exams in an attempt to achieve the authentication feature of assessment security. If the student looks enough like the person on the piece of plastic they brought with them, then we check them off a list and think we have a positive authentication. This has obvious flaws; Australian investigative journalists found that exam impersonators were simply bringing their own fake student ID
cards into exams (Potaka & Huang, 2018). But however ineffective that approach may be, variations of it have been in practice for a long time, demonstrating education's longstanding interest in authentication.

The authentication component of assessment security directly addresses the 'outsourcing work to a person' affordances in the previous chapter's taxonomy. This means that authentication is not just concerned with outsourcing when the student completely outsources a task (e.g. by hiring an exam impersonator). Authentication also covers circumstances where a student outsources part of a task, or where the student is physically present but essentially just transcribing (e.g. by having an exam coach tell them what to write through a hidden earpiece). Authentication in assessment security is therefore defined as: determining if the student being assessed completed the required parts of the assessment themselves.

As framed here, authentication is not just a problem of 'who is this person?'; it is also a problem of 'who did this work?' Society has a history of addressing the first problem; from passport control at an airport to bouncers checking IDs at a nightclub, confirming people's identities is a common feature of security that has seen significant infrastructure and human resources expenditure. But authenticating who produced a piece of work has seen much less investment, and it is likely a much harder problem. While analogies are made between identity-authentication approaches like fingerprinting, and work-authentication approaches like 'fingerprinting' of your writing style, the latter group of approaches are more probabilistic and difficult to prove (see the section on Stylometrics later in this chapter). This means that many approaches to work-authentication rely on monitoring the production of work in progress to reduce opportunities for outsourcing.

The authentication problem faced in assessment security is different to the authentication problem faced in many other contexts. When it comes to my email service authenticating me, I do not want anybody else to be able to log in, so I work collaboratively with my email provider in ensuring nobody else can. I don't share my passwords; I use a two-factor authentication tool that requires me to confirm login attempts on my phone; and I am careful to avoid using my account on insecure devices or networks. In assessment security, we are trying to solve a much harder authentication problem, one where students may be actively trying to authenticate somebody else so that that person can help them cheat. Where authentication in the traditional sense is collaborative with the legitimate user, authentication in assessment security is both collaborative and adversarial with the student.

Control of circumstances

Even perfect authentication won't protect an algebra assessment from the range of free online algebra solvers, and it won't protect a closed-book examination from a vocabulary crib sheet brought into a language exam by a student. Taking a judgement-centric view of assessment, this extra assistance invalidates the assessor's judgement, as the assessor is no longer viewing what the student is capable of in the expected circumstances; instead, they are viewing what the student is capable of with additional supports. Assessment security therefore needs to ensure that an assessment is conducted within the circumstances that are expected. The control of
circumstances element of assessment security can be defined as: ensuring that the circumstances an assessment is meant to be conducted within are enforced. It can be broken down into three parts: information; tools; and availability.

Information

The information environment of an assessment is an important part of that assessment's circumstances. In some disciplines, it is routine to conduct assessments that rely heavily on information restriction; for example, first-year medical science courses typically incorporate anatomy assessments that assess students' memorisation of names of parts of the human body. Courses that depend heavily on memorisation of lower-level content – what Biggs (1999) would call unistructural or multistructural learning outcomes – tend to rely on information restriction in assessment. When students have access to this proscribed information during an assessment, the assessor is no longer able to judge what they are capable of without the proscribed information. Assessment security therefore depends on restricting access to information – at least for some assessments.

Information restriction of this more general kind becomes less important when assessing higher-level learning outcomes. The relational and extended abstract levels of learning outcomes (Biggs, 1999) are concerned with applying and extending knowledge. This sort of knowledge cannot be written onto a cheat sheet and smuggled into an exam, and it cannot be found through a search engine (assuming the task is novel). Sure, tips and explanations can be helpful, and these can be found online. But this information is not as damaging to the assessment process as basic factual information can be to the assessment of lower-order learning outcomes. The extent to which information restriction matters to assessment security is therefore very dependent on the type of assessment and the outcomes it assesses.

In addition to general information restrictions, there is a specific set of information restrictions around assessment information. Even the most permissive open-book exam environment, focused exclusively on higher-order learning outcomes, can be invalidated if students are able to access assessment information that was meant to be kept secret. This might include model answers or marking guides. Assessment security therefore also depends on restricting student access to special assessment-specific information.

Tools

As discussed in the previous chapter, there are a range of tools students can use to perform cognitive offloading of aspects of an assessment task. Assessment security depends on limiting access to those tools which are not meant to be used with a specific task. As with information restrictions, the allowed tools vary with the challenge inherent in the assessment task. In a basic arithmetic test, a calculator might be proscribed, whereas in a basic statistics course a calculator might be encouraged but statistical software disallowed.

Control of circumstances requires an awareness of the full range of tools available. As experts in a field, assessors should know what tools practitioners in their
field use. Assessors need to decide if students are allowed to use these real-world tools. However, there may also be a range of tools of which students are aware but assessors are not. Assessment security therefore relies on assessment designers keeping up to date with the tools that could be used to conduct a task, requires up-to-date disciplinary expertise, and is not something that can be completely handed over to a generalist assessment security expert.

A note about restrictions

Control of circumstances is partially about restrictions. As an assessment researcher, particularly one from the assessment for learning tradition, restrictions make me uneasy. A substantial body of educational research promotes the idea that assessment should be authentic, and within that research the majority focuses on assessment that realistically represents what practitioners in a field would actually do (Villarroel, Bloxham, Bruna, Bruna, & Herrera-Seda, 2018); this includes the physical context of the assessment, such as access to the same tools or information a practitioner would use (Gulikers, Bastiaens, & Kirschner, 2004). Cognizant of that research, I am not promoting the uncritical use of restrictions; instead, I am suggesting that where restrictions are in place they need to be carefully chosen and enforced – otherwise, they are somewhat meaningless.

Availability

In the cybersecurity literature, availability refers to systems being able to be used for their intended purposes when they need to be used; this is discussed further in Chapter 4. Assessment similarly needs to be able to be undertaken when a student needs to undertake it, and any assessment security technologies need to function equitably for all students. Availability in an assessment security context means ensuring all assessment-related systems and resources are accessible by, and applied to, all students as intended. Much of the responsibility for availability will fall within dedicated cybersecurity and systems administration teams at institutions and technology companies, so it will receive limited attention in this chapter.
(Im)perfect assessment security

With all the e-cheating approaches described in the previous chapter, and all the challenges posed by assessment security in this chapter, it could be easy to become dismayed. If you've reached the conclusion that perfect assessment security is impossible, then please understand that I share this conclusion. Previous forms of assessment security were not perfect, and new forms are not either. This mirrors security in other contexts. The banking and airline industries are high-stakes in terms of security, and they invest substantially in security – but they are still not perfectly secure. However, they are a lot more secure than if they had never invested in security. The same attitude needs to be adopted in assessment security: we will never achieve perfection, but we should not stop trying to improve assessment security.
What would perfect assessment security look like? Essentially, it would be assessment where we could: know with absolute certainty who undertook an assessment; ensure that they did the work themselves; know the precise circumstances within which they undertook the task; detect absolutely any cheating they might try; and prove any attempts at cheating. This is not just impossible – it is also somewhat dangerous as a goal. If an uncompromising view of perfect assessment security is adopted, decision-makers may be swayed by promises from vendors that their products are cheat-proof. The impossibility of perfect assessment security might also lead to inaction, because the problem seems so big and unsolvable.

What might an achievable compromise look like? Perhaps we might perform identity checks against a gold-standard identity database to assure identity. We might compare submitted work against work we have seen a student produce previously, to be sure they have done the work themselves. We might also design tasks to be robust and valid even when cognitive offloading tools are used. None of these approaches can possibly deliver perfect assessment security, especially individually. But each of them can provide incremental gains. For information on how we might compare different approaches to assessment security, see Chapter 8.
Approaches to improving assessment security

We've established what assessment security is, and that it will likely always be imperfect. The remainder of this chapter focuses on strategies that can be used to improve assessment security.

Detection, evidence and penalties

With some approaches to e-cheating, students can reasonably expect not to get caught. Through freedom of information requests to universities in the UK, journalists established that the average university caught five cases of contract cheating in the 2017–18 academic year (Lee, 2019). They contrasted this with results from Newton's (2018) meta-analysis which found 15.7% of students admitted to having contract cheated at some stage. The journalists concluded that less than 1% of contract cheating was getting caught. While more robust studies have been published on the prevalence of contract cheating that did not make it into Newton's meta-analysis, finding prevalence around a third of Newton's (e.g. Bretag et al., 2018), the takeaway is still overwhelmingly the same: at most universities, if you contract cheat you won't get caught. This means that the promises made by contract cheating sites – that their work is undetectable – are practically true most of the time. There is a need for cheating detection approaches that actually work, and penalties that are appropriate and enforced.

Part of the detection problem relates to not being aware of cheating, and therefore not noticing it. Taking contract cheating as the example, it is unsurprising that so few cases are caught. There are multiple studies in the literature where university markers were given contract cheating assignments, under the guise of studies into marking, and asked to assess them (Lines, 2016; Medway, Roper, & Gillooly, 2018). In these studies, markers never raised any concerns about contract cheating.
However, in my work with Wendy Sutherland-Smith, we have found that simply by asking markers to look for contract cheating they are able to spot it most of the time (Dawson & Sutherland-Smith, 2018, 2019). Simply making frontline academic staff aware of contract cheating is probably one of the most cost-effective approaches to improving detection.

The problem with improved detection, in the sense that I'm using it here, is that suspicion and proof are two very different things. The burden of proof at my university and many others in Australia is 'on the balance of probabilities' (i.e. it is more probable than not) (Deakin University, 2011) rather than that of 'beyond reasonable doubt', which is used in the criminal justice system. But how do you prove that e-cheating probably occurred? The types of detection used in our research (Dawson & Sutherland-Smith, 2018, 2019) certainly don't result in sufficient evidence to prove anything. Academics say that they often don't report contract cheating because they believe it is impossible to prove (Harper et al., 2018). Work is underway across a range of institutions and technology vendors to develop ways of proving contract cheating. But ultimately, it will come down to how acceptable the evidence is to decision-makers. This means that, in addition to being a problem of detection, assessment security is also a problem of evidencing; a detection approach might have 100% accuracy, but if it cannot provide evidence in a way that convinces human decision-makers, then it cannot provide real assessment security. The challenges of detection and proof are discussed further in Chapter 8.

Finally, penalties for cheating need to be appropriate. (Yes, I'm aware of the use of a 'weasel word' here in 'appropriate'.) I wouldn't advocate for either harsh or lenient penalties, but rather penalties that fit the circumstances. I support expulsion for particularly egregious cheating like contract cheating, and more educative approaches for first-time copy-paste plagiarism. Whatever the penalties are, for them to have any realistic deterrent effect students need to be made aware of them upfront. While it is helpful for awareness-raising to have the media report on cases where penalties are applied (e.g. 'Deakin University students kicked out for "contract cheating"', Jacks, 2016), the fact that the application of penalties for contract cheating warrants any media attention at all suggests that the public are not aware of the prevalence of, or penalties for, cheating.

Technology

Many technological approaches to countering e-cheating have been proposed, and some are currently in use. This section explores the four main groups of anti-cheating technologies and how they support the different components of assessment security. A summary of these approaches, along with how they address particular assessment security features, their degree of invasiveness, and examples of each group, is presented in Table 2.1.

Surveillance

Some anti-cheating technologies monitor what students do to determine if they have cheated. The most obvious of these is remote proctored examinations.
Table 2.1 Summary of technologies for assessment security

Approach: Surveillance
Assessment security features: Authentication through verification of identity and monitoring for outsourcing; control of circumstances through monitoring of student activity
Invasiveness: High: students are monitored
Example technologies: Remote proctored examinations; assignment writing platforms; biometrics; analytics

Approach: Lockdown
Assessment security features: Control of circumstances through disabling features on devices or networks
Invasiveness: High: students are denied full use of their devices or networks
Example technologies: Online exam platforms; lockdown browsers; exam operating systems; assessment apps; network blocks/outages

Approach: Stylometrics
Assessment security features: Authentication through comparing student writing style against writing they have produced previously
Invasiveness: Moderate: involves building profiles of students
Example technologies: Stylometrics products from text-matching software vendors

Approach: Content matching
Assessment security features: Authentication through comparing student submissions to work anybody has produced previously
Invasiveness: Low: student work is retained
Example technologies: Text-matching tools; Google Reverse Image Search
These are analogous to in-person proctored examinations, where a proctor (also called an invigilator in British English) monitors what students do in order to spot any attempts to cheat. In remote proctored examinations, students sit an examination on a computer and they are monitored through one or more of the following: webcam; keyboard biometrics; microphone; analysis of network traffic and computer memory; or anything else that can be monitored through technology. But remote proctored examinations are just one surveillance approach currently in use. The online assessment platform Cadmus (2019) used to be marketed as an anti-contract-cheating environment that monitored students while they completed their assignments. It similarly used keystroke biometrics and logged IP addresses of students to identify any instances of contract cheating. Surveillance technologies are also used for in-person assessments, such as drones in some examinations (Reuters, 2017). There is broad scope for surveillance as an assessment security approach, simply because monitoring can identify cheating, and it is a likely growth area for assessment security technologies.

One key distinction in the surveillance technology space is between human and machine surveillance. In the remote proctored exams domain, for instance, there are human-proctored exams and exams that are proctored by Artificial Intelligence (AI). An advantage of AI-proctored exams is that they can scale much more efficiently in terms of cost than human-proctored exams. Another key distinction is
between live proctoring, which occurs during the examination, and recorded proctoring, which can occur later. Combinations of these different approaches can be used as well, such as online exams that are monitored live by an AI proctor, with any suspicious incidents later analysed by a human proctor.

One potentially radical and dystopian approach to surveillance would be the adaptation of 'stealth assessment' (Shute & Ventura, 2013) to stealth assessment security. Stealth assessment is largely used to assess learning outcomes while students play video games; as students achieve learning outcomes, they are measured in the background without their knowledge. This means students can focus on learning rather than discrete acts of assessment. Stealth assessment security might involve continual monitoring of students' activities while they use their computers, in order to identify any problematic behaviour in terms of assessment security. To a much more limited extent, this approach has been trialled – and abandoned – by Cadmus: they monitored students' keystrokes and IP addresses while students used their online assignment writing platform. When I asked them why they abandoned this approach, they gave three reasons. Firstly, it demonstrated distrust towards students. Secondly, students disliked being surveilled. And thirdly, it did not actually work at stopping cheating. For more discussion of the potential issues with surveillance technologies and academic integrity, see Chapter 7.

One surveillance approach that is more mundane than stealth assessment security is simply the intelligent use of pre-existing data to identify cheating problems. There is a long history of comparing students' responses to Multiple Choice Questions (MCQs) with their neighbours' responses, to identify potential exam hall cheating; where a group of nearby students all choose the same distractor answers, this can be used to raise suspicions about over-the-shoulder cheating (Belleza & Belleza, 1989). Statistical, learning analytics and artificial intelligence approaches have similarly been used to compare student achievement over time and identify anomalous over-performance that may be indicative of cheating (Clare, Walker, & Hobson, 2017; D'Souza & Siegfeldt, 2017). Use of existing data is likely a fruitful avenue for assessment security; however, the sorts of re-analyses presented here are likely to be probabilistic rather than conclusive.

In terms of assessment security, surveillance technologies can directly address both authentication and control of circumstances. They achieve authentication through checks of identity, with some tools checking against stored databases and other tools checking against ID cards students show. Control of circumstances is achieved through monitoring what students do and identifying any aberrant behaviour. Many surveillance approaches also use lockdown approaches, which are discussed in the next subsection.

One important caveat about surveillance technologies is that there is very limited evidence in favour of their effectiveness. Taking remote proctored exams as an example, I am aware of no peer reviewed studies of cheating that involve skilled cheaters attempting to cheat in this type of examination. I have received research funding to conduct such a study, involving hiring expert cheaters to attempt to cheat in remote proctored examinations; however, I had to return the funding unspent after all three remote proctoring companies I approached explicitly said I
must not do such a study as it is against their terms of service. This reluctance to prove the effectiveness of the surveillance approach is troubling, particularly given the range of cheating success stories available online from both expert security researchers and everyday students. There is some evidence that students might cheat less when they think they are being surveilled in an online exam (Brothen & Peterson, 2012), but this is quite different to the sort of hard-nosed penetration test discussed in Chapter 4. There is also evidence that keystroke biometrics can effectively distinguish between transcription (which might be contract cheating) and more 'normal' generative writing (Trezise, Ryan, de Barba, & Kennedy, 2019), but this has not been studied in an authentic cheating scenario where students are motivated to cheat. The efficacy of surveillance approaches may depend to an extent on convincing students that surveillance is effective, and on the degree of difficulty of bypassing the surveillance. Promises from surveillance vendors should be taken critically, and they should be weighed against the potential harms to student experience, trust, and pedagogy.

Lockdown

While surveillance technologies aim to catch students doing the wrong thing, lockdown technologies stop students from doing the wrong thing by disabling functionality. Lockdown technologies include: web browsers that are restricted to visiting only certain websites; live-boot operating systems that take complete control of a computer; apps for mobile operating systems that prevent other device functions from being used; and network blocking to prevent access to part or all of the Internet.

From a cybersecurity perspective, lockdown can adopt either a ban-list or an allow-list model. With a ban-list model, everything is accessible except what is on the ban-list; with an allow-list model, nothing is accessible except what is on the allow-list (also known as 'blacklist' and 'whitelist', Bacik, 2016). Both models are used in assessment security lockdown approaches. Live-boot exam environments (e.g. Bring Your Own Device eExams, Transforming Exams, 2014) are often set up to allow only prescribed software to be used. My university's WiFi takes a ban-list approach, allowing students access to the entire web, except for sites on a ban-list that includes cheating-related sites. In the most extreme case, an allow-list or ban-list might be written to completely disallow all network traffic. Lockdown approaches primarily focus on control of circumstances: assessment security is enhanced by disabling access to certain functionalities, or only enabling access to certain functionality.

The effectiveness of lockdown approaches is questionable. In my own work I have demonstrated it is possible to bypass the lockdown functionality of some computer-based exams (Dawson, 2016). Network-based lockdowns, in the form of blocking particular sites, are also of questionable functionality when Virtual Private Networks (VPNs) can bypass site blocking. As with surveillance approaches, lockdown comes at a significant cost pedagogically and in terms of trust and the student experience, so evidence of its effectiveness should be carefully considered.
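As a minimal sketch of the difference between the two models, using invented hostnames rather than any real product's configuration, a ban-list permits anything it has not anticipated, whereas an allow-list denies anything it has not anticipated:

```python
# A minimal sketch of ban-list versus allow-list filtering.
# Hostnames are invented; no real vendor's lists are represented.
BAN_LIST = {"essay-mill.example", "answers-for-sale.example"}
ALLOW_LIST = {"exam-platform.example", "university-library.example"}

def ban_list_allows(host: str) -> bool:
    # Everything is accessible unless it is explicitly banned.
    return host not in BAN_LIST

def allow_list_allows(host: str) -> bool:
    # Nothing is accessible unless it is explicitly allowed.
    return host in ALLOW_LIST

# A brand-new cheating site slips past the ban-list but not the allow-list:
print(ban_list_allows("new-cheating-site.example"))    # True
print(allow_list_allows("new-cheating-site.example"))  # False
```

This asymmetry is part of why ban-lists tend to lag behind newly created cheating sites, while allow-lists shift the maintenance burden onto whoever curates the list and are more likely to block legitimate resources.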
Stylometrics

A range of tools that are rapidly becoming available build up a profile of the sort of writing a student produces, and compare new submissions against that established profile. This field is known as stylometrics (Ding, Fung, Iqbal, & Cheung, 2019; Juola, 2017). It uses linguistic markers including: readability scores; punctuation preferences; complexity; sentence length; preferences for certain words; vocabulary diversity; and other computable features. I have recently published a study with colleagues into the effectiveness of one such tool, Turnitin's Authorship product (Dawson et al., 2020). Turnitin is not alone in this space; some researchers are experimenting in developing tools (Juola, 2017), and some other vendors of text-matching products have also made tools available (e.g. UniCheck's EMMA).

While stylometrics as a discipline has been focused on the written word, for the purposes of considering assessment security it is valuable to think how the underlying idea of stylometry might also apply to other types of assessment. Could it one day be possible to produce stylometry for video or audio assignments? This is a new frontier for stylometrics in assessment security, but one that will become more important as assessment task types continue to diverge from the traditional written essay or report. Contract cheating sites offer to produce all manner of assignments for students, and if our primary means of authenticating authorship only functions with text, this will leave other task types comparatively less secure.

Stylometric approaches attempt to ensure authentication by comparing new work against what is believed to be the student's past body of work. I can attest that at least some stylometrics tools have been the subject of study. In our team's work on an early alpha version of Turnitin's Authorship tool, we found that markers who used the tool accurately spotted significantly more contract cheating, without an increase in 'false positives' (legitimate work being flagged as contract cheated). However, this is still a relatively new field, and we were working on an alpha version, so our results should be taken with caution. There are also some obvious issues with stylometric approaches: what if a student simply gets substantially better, and writes differently as a result? What about different genres of work? And, my favourite, from a cheating student response to a news article: what if the student just uses the same contract cheating writer throughout their entire degree (Fung, 2019)? To an extent, stylometrics are another expression of distrust towards students, as with surveillance and lockdown approaches. However, in comparison to lockdown and surveillance, stylometrics are much less invasive, as they do not involve monitoring students in real-time or interfering with the student's work processes.
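As a rough illustration of the kind of 'computable features' listed above, the following toy sketch derives a few simple style markers from a text and compares a new submission against a student's profile. It is a miniature of the general idea under my own simplifying assumptions, not a description of how Turnitin's Authorship, EMMA or any other product actually works.

```python
import statistics

def style_features(text: str) -> dict:
    """Derive a few simple, computable style markers from a text."""
    words = text.split()
    sentences = [s for s in text.replace("?", ".").replace("!", ".").split(".") if s.strip()]
    return {
        "mean_sentence_length": len(words) / max(len(sentences), 1),
        "mean_word_length": statistics.mean(len(w) for w in words),
        "vocabulary_diversity": len({w.lower() for w in words}) / len(words),
        "commas_per_word": text.count(",") / len(words),
    }

def style_distance(profile: dict, submission: dict) -> float:
    """Crude distance between a student's profile and a new submission."""
    return sum(abs(profile[k] - submission[k]) for k in profile)

# 'profile' would normally be built from work the student is known to have written.
profile = style_features("Short sentences. I write plainly. I rarely use commas.")
new_work = style_features("Notwithstanding the aforementioned considerations, it is "
                          "demonstrably evident, in numerous respects, that the findings cohere.")
print(style_distance(profile, new_work))  # larger values suggest a style mismatch
```

Real tools use many more markers and proper statistical models, but the comparison remains probabilistic, which is part of why stylometric evidence is harder to prove than identity evidence.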
Content matching

The use of text-matching tools (e.g. Turnitin) is routine in some higher education contexts. These tools compare the text in a student's assignment with a large database of text from publicly available sources, subscription sources, and work that has previously been submitted to the service. Text-matching tools are believed to act as an effective way to both detect and deter copy-paste plagiarism (Heckler, Rice, & Hobson Bryan, 2013). I have labelled this section 'Content matching' as, with a forward-looking view, it is important to consider what might be possible beyond just matching exact text excerpts to existing entries in a database.

Tools already exist to match images with other images. Google's reverse image search will match an uploaded image with other images; the matches need not be exact or in the original file format or resolution. Similarly, YouTube and other video services use algorithms to match audio content to avoid copyright breaches. While these are not assessment security tools, they are proof of concept for what could be part of broader content-matching tools for assessment security. Specific content-matching tools do exist for some discipline-specific assessment types, such as tools that detect computer software source code plagiarism (Pandit & Toksha, 2020). There is a need for vendors of text-matching tools to expand their scope beyond just text, and start to incorporate other content types.

When I used to teach information literacy courses I tasked my students with achieving as high as possible a similarity score on Turnitin, and then paraphrasing that work so that it was still plagiarism but received as low a score as possible. We then discussed the deeper concepts of plagiarism, and how we need to move beyond an obsession with exact matches toward a more sophisticated view of plagiarism that considers ideas and attribution. Text-matching has been focused on exact matches, or matches that have been subject to a degree of permutation. This makes these tools far less effective against cheating by paraphrasing, whether it's manual paraphrasing or auto-paraphrasing with a tool. But work toward the 'semantic web' (Berners-Lee, Hendler, & Lassila, 2001), a machine-readable version of the web, could theoretically allow content-matching for ideas. If I take a paragraph from a journal article, paraphrase it, and use it in a new article without citing the original article, this would elude text-matching, but could be caught by some sort of semantic content-matching tool. In the long term, with increasing sophistication and adoption of paraphrasing tools, it will be necessary for text-matching tools to adopt semantic approaches, otherwise they will only catch the most blatant copy-paste plagiarism.

Content matching approaches attempt to provide control of circumstances, particularly control of the information circumstances, by checking if work students say they have produced matches with existing content. Content matching has demonstrated some degree of effectiveness, and it has also been around long enough to come with its own share of problems. Content-matching tools tend to produce quantitative reports, which have led to an obsession with 'scores' as the arbiter of the presence – or absence – of cheating. While these scores can be helpful, their role in assessment security should not be to produce an automatic decision; instead, they should be used by a person as evidence to make a decision. Similar to stylometrics, content matching approaches are also an expression of distrust; however, as content-matching approaches have become more commonly used they have come to be seen by many as just part of education. They are not, however, without their critics; see Chapter 7 for some criticisms of content matching. Also similar to stylometrics, content-matching approaches are much less invasive than surveillance or lockdown approaches, as content matching is usually incorporated into marking and feedback processes.
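To show in miniature why exact-match approaches miss paraphrased work, here is a toy content-matching check based on overlapping word trigrams. It illustrates the general matching idea only; it is not the algorithm used by Turnitin or any other vendor.

```python
def trigrams(text: str) -> set:
    """Word trigrams, lowercased and stripped of basic punctuation."""
    words = [w.strip(".,;:'\"").lower() for w in text.split()]
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def overlap_score(submission: str, source: str) -> float:
    """Fraction of the submission's trigrams that also appear in the source."""
    sub = trigrams(submission)
    return len(sub & trigrams(source)) / max(len(sub), 1)

source = "Assessment security is comprised of two key features: authentication and control of circumstances."
copied = "Assessment security is comprised of two key features, which are discussed below."
paraphrased = "Securing assessment rests on a pair of core elements: knowing who did the work and under what conditions."

print(overlap_score(copied, source))       # high: many shared trigrams
print(overlap_score(paraphrased, source))  # near zero, despite expressing the same idea
```

The paraphrase defeats the check entirely despite carrying the same idea, which is exactly the gap that semantic approaches would need to fill.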
Assessment design

If assessment is what we are ultimately aiming to secure, perhaps there are some ways we can design assessment so that it is inherently more secure. On face value this sounds like a reasonable proposal. Arguments have been put forward in popular media that particular types of assessment are more or less prone to contract cheating – for example, Race's (2018) piece in Times Higher Education argued that banning the assessed essay would do a lot to reduce contract cheating. But what actually works in assessment design to improve assessment security?

One of the most common suggestions for addressing contract cheating in particular is making assessment more authentic. Authentic assessment is assessment that represents the real world of the discipline (Villarroel et al., 2018), and it has sometimes been touted as more challenging to contract cheat. The UK Quality Assurance Agency's guidance document on responding to contract cheating lists 'Consider authentic assessment' under its recommendations for preventing contract cheating (UK Quality Assurance Agency, 2017). I used to agree with this position (Dawson, 2015); however, as time has gone on, I haven't been able to find any evidence in support of it. And there is now mounting evidence that authenticity will not stop contract cheating. In one recent study, Ellis et al. (2020) analysed data from contract cheating sites against characteristics attributed to authentic assessment. They found that contract cheating sites appeared to have no problem dealing in authentic assessments, and concluded that authenticity does not stop contract cheating.

One key distinction to draw here is between assessment security and student likelihood to cheat. In a large-scale student survey study, Bretag et al. (2019) examined which types of assessment students said they were more or less likely to cheat on. Students said they were much less likely to cheat on vivas, personalised and unique tasks, in-class tasks, and reflections on practicums. Students were most likely to cheat on tasks requiring research, analysis and thinking skills, as well as in series of small graded tasks. In acting on this data, assessment designers might use more tasks students are less likely to cheat on and fewer tasks they are more likely to cheat on. This is probably sensible, but it falls more in the domain of academic integrity than assessment security; it is data about students musing on their likelihood of cheating in hypothetical circumstances, rather than data about how well actual attempts to cheat can be stopped or detected.

Assessment design is a complex act. In a project I co-led with Margaret Bearman, we examined the actual assessment design practices of Australian university academics (Bearman et al., 2016). We found that academics balance a range of factors when designing and implementing new assessments, and that they did pay some consideration to assuring academic integrity (Bearman et al., 2017). However, it is important to note that academic integrity and addressing cheating were never the primary considerations of our assessment designers; other factors, such as ensuring the assessment addressed particular learning outcomes or incorporated effective feedback, were much more influential. Academic integrity was seen as something that sometimes needed to be traded off against other priorities; for example, some educators set low-stakes online quizzes as an individual task for students, knowing
fully that many students would engage in unauthorised collaboration and cheat on the quiz. For these academics, this presented an opportunity for students to engage in peer learning, which they valued more highly than assessment security for this task. Assessment security is thus but one feature to be balanced against the many other features of a particular assessment task. Not all assessment tasks need to be secure.

Taking a 'constructive alignment' approach to assessment design, the identification of intended learning outcomes is perhaps the most important step in designing an assessment (Biggs & Tang, 2007): we need to know what we are assessing before we figure out how we will assess it. I would argue that learning outcomes are also very influential in assessment security, as they identify what needs to be secured, and what can be left insecure. Take, for example, an introductory course on academic writing compared to a research project. The introductory course might include learning outcomes related to referencing and expression, whereas the research project might focus more on methodology and analysis. In the introductory course, tools such as bibliographic software and auto-paraphrasing tools might be explicitly forbidden, as these automate some of the learning outcomes that are being assessed. But for the research writing course, these outcomes are not what is being assessed, so it is not important to secure the assessment against these particular tools. Assessment security therefore needs to be outcomes-focused, lest it become obsessed with securing outcomes that are not actually the focus of assessment.

The level of learning outcome being assessed is also very influential for assessment security. Put simply, it is much harder to secure the assessment of lower-order learning outcomes, because they rely on denying students access to publicly available information. If we want to assess a learner's ability to recall the periodic table, we can only do that in a context where they do not have access to the periodic table; in a geographically dispersed blended educational environment, this is very difficult to assure. However, for higher-level learning outcomes, where problem-solving or creativity are more important than recall, denying students access to materials may be much less important. Here, the assessment security challenge is much smaller. Table 2.2 shows the relationship between levels of learning outcomes and the e-cheating threats to which they are vulnerable.

As Table 2.2 illustrates, extended abstract outcomes – which are the highest level of learning outcome and are associated with verbs like hypothesise, formulate and reflect (Biggs, 1999) – are vulnerable only to a limited set of e-cheating approaches: outsourcing and disrupting. They are not vulnerable to students accessing general unauthorised information, as they are about creating new knowledge. Extended abstract outcomes are not vulnerable to cognitive offloading, as they are about uniquely human creative tasks that cannot be performed by a tool (Bearman & Luckin, in press). They are not vulnerable to assessment-specific unauthorised information because having a solution, exemplar, marking guide or other secret information should not be enough to advantage a student who is creating new and original ideas. Relational outcomes, the next level of learning outcome, are vulnerable to e-cheating approaches that work on extended abstract outcomes; but in addition, they can be vulnerable to cognitive offloading because they are often
34 Assessment security Table 2.2 Relationship between learning outcomes levels and e-cheating threats Learning outcome level
E-cheating threats
Extended abstract, e.g. hypothesise, formulate, refect Relational, e.g. analyse, justify, apply
Outsourcing Disrupting All of the above, plus Cognitive ofoading Unauthorised information (assessment specifc)
Unistructural/multistructural, e.g. describe, list, identify
All of the above, plus Unauthorised information
activities that can be computerised. Relational outcomes are associated with verbs like analyse, justify and apply, and can sometimes have one ‘right answer’ and so can be vulnerable to cheating approaches that involve revealing secret assessment information, such as sharing solutions. But the assessment of relational outcomes is not invalidated by accessing more general information, as relational outcomes are not about memorisation of content, they are about its application.The lowest levels of outcomes, unistructural and multistructural,are vulnerable to all types of e-cheating, because their assessment relies on demonstrating memorisation, which can be invalidated through all types of cheating. Some inferences can be drawn from the relationship between learning outcome levels and e-cheating vulnerabilities. Firstly, higher-order outcomes are easier to secure. Secondly, since a single assessment task often assesses multiple outcomes, each assessment is only as secure as its lowest-level learning outcomes. If a task is primarily designed to assess extended abstract thinking but it also assesses memorisation (unistructural/multistructural), then the task is vulnerable to students cheating by accessing general unauthorised information. If the lower-level outcomes are removed by allowing access to information, or perhaps moved to another task conducted under exam conditions, the task no longer needs to be secured against the use of general unauthorised information.This plays out in the choice between open-book exams and closed-book exams.Two recent systematic reviews of the literature have found a range of motivations for the choice between openand closed-book exams (Durning et al., 2016; Johanns, Dinkens, & Moore, 2017), but a recurring theme is that when lower-level outcomes are not as important, allowing students to ofoad them to ‘cheat sheets’ can refocus the exam onto higher-level outcomes. When students are encouraged to bring outside information into an exam by design, there becomes less need to protect the exam from students bringing in outside information. This section has largely focused on individual tasks. However, there is much more to assessment security than a focus on securing each individual act of assessment. Chapter 9 goes into much greater depth about what can be done to secure assessment through structures and designs. In particular, it proposes re-thinking the unit of analysis of assessment security from the task to the programme (degree course), and focusing on ensuring each degree-level outcome has been assessed
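Before moving on, the inference that a task is only as secure as its lowest-level outcome can be made concrete with a small sketch. The code below is illustrative only: the level names and threat labels are taken from Table 2.2, but the function and data structure are mine, assuming a task is described simply by the outcome levels it assesses.

```python
# Threat sets per SOLO-style outcome level, as summarised in Table 2.2.
THREATS_BY_LEVEL = {
    "extended abstract": {"outsourcing", "disrupting"},
    "relational": {"outsourcing", "disrupting",
                   "cognitive offloading",
                   "unauthorised information (assessment specific)"},
    "unistructural/multistructural": {"outsourcing", "disrupting",
                                      "cognitive offloading",
                                      "unauthorised information (assessment specific)",
                                      "unauthorised information (general)"},
}

def threats_for_task(outcome_levels):
    """Return every e-cheating threat a task must be secured against,
    given the outcome levels it assesses (union across levels)."""
    threats = set()
    for level in outcome_levels:
        threats |= THREATS_BY_LEVEL[level]
    return threats

# A task that mostly assesses extended abstract thinking but also tests
# recall inherits the full threat set of its lowest-level outcome.
print(threats_for_task(["extended abstract", "unistructural/multistructural"]))
```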
Legal approaches

At the time of writing, several jurisdictions around the world are in the process of making cheating illegal. There is moderate support from academics for making the provision of cheating services illegal, and even some support for making student use of cheating services illegal (Awdry & Newton, 2019). Contract cheating is presently illegal in several US states, as well as in New Zealand (Amigud & Dawson, 2019). Legal approaches would hold promise for improving assessment security only if they made it more difficult to procure cheating services. So do these laws achieve that goal?

In a study I co-authored (Amigud & Dawson, 2019), we sought to test out the effectiveness of legal approaches at reducing the supply of contract cheating. Lead study author Alexander Amigud posted tweets that he expected would draw out the bots that monitor Twitter for struggling students (as in his other work on Twitter bots and cheating, Amigud, 2019). We found that the majority of responses offering to provide contract cheating came from jurisdictions where the provision of contract cheating was illegal. This is the only empirical study into the effectiveness of these sorts of laws that I am aware of, and it is not particularly positive.

Legal approaches can come with unintended consequences. The laws initially proposed in Australia, for example, would have led to the criminalisation of friends and family members who merely offered to do part of a student's assignment. Fortunately, these laws met with a strong response from across the sector, including a petition, submissions to government and media outcry. The version of these laws that will likely be enacted will not criminalise non-commercial cheating services. However, they may lead to other unintended consequences. At best, the Australian legislation will effectively move commercial contract cheating offshore; in my submission to the Australian government's initial consultation on the matter I noted:

If the proposed legislation was effective at moving contract cheating offshore, students would expose themselves to the risks of dealing with overseas businesses with unethical practices. Depending on the locations of those businesses, students may be at greater risk of breaches of privacy, identity theft, fraud and blackmail, compared to when dealing with Australian businesses. Students may be reluctant to contact police if they are victims of these crimes if contract cheating provision was illegal, and police may be able to provide only limited help when dealing with overseas businesses.

It is reasonably well-established that contract cheating is a globalised industry, and much of it operates in countries with far poorer information privacy and consumer protection laws than Australia (Lancaster, 2019b). Given the lack of any evidence in favour of legislation as an approach that can actually reduce the supply of contract cheating, the emerging evidence that it might not be effective, and the potential for unintended consequences, I do not support legislation as an assessment security approach.
Problems with assessment security

This chapter has introduced the concept of assessment security. To conclude, it is worth taking a critical look at some of the problems with this approach. The issue that most concerns me is that a focus on assessment security could lead to steps backward in terms of the quality of assessment we use. Assessment serves multiple purposes, and these can be in tension with each other. Too much of a focus on assessment security might lead us to use assessments that are less valid or reliable, poorer for student learning, or damaging in terms of their impact on students' ability to self-assess.

One possible future I see if we focus too much on assessment security is one I call 'assessment conservatism'. Over recent decades there has been a shift toward more progressive, learning-focused approaches to assessment and, in particular, away from high-stakes examinations and exclusively summative assessment. This shift has been beneficial for learning, teaching and assessment (Boud, 1995, 2007; Boud & Soler, 2015; Carless, 2017; Sadler, 1989). I worry that if we allow ourselves to be seduced by a cheating panic, many of these gains may be undone.

Students may also suffer if we overemphasise assessment security. Our anti-cheating approaches might restrict them from learning in the ways they learn best. An overemphasis on individual assessment for the sake of security might be interpreted as an instruction to not collaborate or seek formative feedback. The data we gather on students through surveillance might be misused. We might also inculcate in students an acceptance of surveillance as part of everyday life. These matters are taken up elsewhere in the book, particularly in Chapter 7.

To address these problems, it is important that we remain critical of assessment security and carefully weigh it against all the other components of assessment design, much as assessment designers already appear to do (Bearman et al., 2017). Assessment security may be vitally important for some tasks, but for others we may be willing to be more permissive. Now that the two core concepts of this book – e-cheating and assessment security – have been introduced, the next chapter explores the actual processes of e-cheating in greater depth, to understand what we can do to disrupt it.
Things to do

• Try to break your assessment – or, even better, get someone else to try and break your assessment. Until you have good evidence on what the potential holes are in your current assessment security you won't really know what can be done to improve it.
• Audit your assessment against the features of assessment security. What authentication approaches do you take to ensure that students are who they say they are and that they did the work themselves? How do you control the circumstances of your tasks?
• Find out what assessment security technologies you have available. Do they cover both authentication and control of circumstances? What claims are made about the extent to which they support assessment security? What evidence is provided to support those claims?
• Read John Biggs' (1999) classic paper What the Student Does: teaching for enhanced learning. Using the Structure of Observed Learning Outcome (SOLO) taxonomy presented in that paper, consider the learning outcomes your assessments focus on. Are they sufficiently high-level? If you have predominantly low-level outcomes, consider the added challenges they pose to assessment security.
3
The E-Cheating lifecycle and how to disrupt it
The previous chapter took an assessment-centric view of addressing the problem of e-cheating and proposed the concept of assessment security. By contrast, this chapter focuses on the processes of e-cheating and how they can be disrupted. Where the key actors in the previous chapter were assessment designers and policy-makers, the key actors in this chapter are cheating students and those who help them to cheat. If we want to stop cheating, we really need to know how people cheat.

The literature on how cheating actually happens is limited. The most famous account of life on the provider side of cheating is Ed Dante's 'The Shadow Scholar' (Dante, 2010), which appeared in The Chronicle of Higher Education in 2010. This was the first introduction for many to the world of essay ghostwriting, and it was straight from the writer's mouth. Ed Dante was a pseudonym for Dave Tomar, who later wrote a tell-all book on the topic (Tomar, 2012). Dante/Tomar introduced us to the human side of contract cheating. But the best current source for understanding the daily lives of professional assignment writers is the forum site EssayScam.org, which hosts conversations about all manner of cheating experiences. I strongly encourage anybody interested in addressing e-cheating to browse through these forums to get a sense for how cheating writers think and what the everyday realities of their job are.

There is some literature on the customer side of the cheating business too. Our research group has bought many contract cheating assignments, and my colleagues have written an article about what that experience was like (Sutherland-Smith & Dullaghan, 2019). Their article title, 'You don't always get what you pay for: User experiences of engaging with contract cheating sites', neatly summarises their key finding. While contract cheating sites use sophisticated marketing and make many promises (Rowland et al., 2017), the reality of the customer experience doesn't live up to the hype.

This chapter proposes a systematic approach for thinking about e-cheating: as a lifecycle. It breaks the lifecycle down into a series of steps, and asks what can be done at each step to disrupt e-cheating. It is important to note that this lifecycle is not necessarily linear, and students may iterate through different steps in different orders.
Awareness raising

In the lifecycle of intentional cheating, the first step is developing the intent to cheat and an awareness of ways to cheat. A range of studies from criminological perspectives demonstrate that the social element of cheating is powerful: social learning about cheating and ways to rationalise cheating better predict cheating than theories about self-control (Stogner, Miller, & Marcum, 2013; Vowell & Chen, 2004). In non-educational lab experiments (e.g. Gino, Gu, & Zhong, 2009) and studies of online social networks (Woo, Kang, Kim, & Park, 2018), the problem of cheating contagion has been raised: once cheating takes root within a group, it spreads. But how is the first seed of cheating planted within a group?

Students don't just find e-cheating; e-cheating finds students

Advertising was, and still is, one of the major ways contract cheating sites reach new customers. In 2007 Google announced they were no longer allowing advertisements from contract cheating sites (Coughlan, 2007); however, as late as August 2016, advertisements were still showing up in Google searches. I remember this date because at the time I was visiting Thomas Lancaster, one of the originators of the term 'contract cheating' (Lancaster & Clarke, 2007), and we were logged in to the Google AdSense platform, looking at the price of different cheating-related terms; the costs were very high, which we thought indicated cheating services spent significant sums on advertising. Enforcing these sorts of bans is challenging, and a journalistic investigation found Facebook, YouTube and Google still profit from selling advertising to contract cheating services (Christian, 2019).

Beyond formal advertising, there are many other ways cheating providers get their message out to potential customers. Social media is a significant source of new customers for cheating providers (Amigud, 2019; Amigud & Lancaster, 2019a; Lancaster, 2019c), spanning microblogging sites like Twitter, social networking sites like Facebook, and messaging services like WeChat and WhatsApp. Amigud (2019) found that cheating providers monitor Twitter using bots, searching out students who tweet particular words or combinations of words. These included bigrams (two-word phrases) that are clearly connected with an intent to cheat, such as 'pay someone'; however, they also included many bigrams that are only indicative of academic difficulties, like 'haven't started' and 'hate statistics'. These bots proactively push the idea of cheating to students who haven't even given cheating a thought.

Social media influencers are also a significant vector used to promote cheating. In 2018, a BBC investigation found contract cheating product placements by more than 250 YouTubers across 1,400 videos, with 700 million views in total (Jeffreys & Main, 2018). These videos included paid advertisements where the YouTube stars themselves advertised EduBirdie, a Ukrainian contract cheating site. YouTube responded by saying that, while content creators are allowed to advertise in their videos, they must not include advertisements that breach YouTube's guidelines for advertisements – which include a prohibition on essay writing services. Google,
YouTube's parent company, includes these services under the category of 'Enabling dishonest behaviour', with the example of 'paper-writing or exam-taking services' (Google, 2019). YouTube has since taken down the videos.

Promotional copy about cheating services has also found its way into the hands of journalists, who adapt or republish material provided by cheating services. A Google search for "according to a new survey conducted by edubirdie" (quotation marks included for an exact match of the phrase) quickly reveals 50 online news sources that have published remarkably similar content under the guise of a survey about astrology and cheating in romantic relationships. Sites (re-)publishing this content include mainstream sites with substantial readership, including Cosmopolitan (Andrews, 2019b), MSN Lifestyle (Andrews, 2019c) and Yahoo Finance (Andrews, 2019a). None of the mentions of EduBirdie that I have seen in these articles, or other articles on their survey, have been in any way critical of EduBirdie; most have also included a hyperlink directly to EduBirdie's website. This sort of presence in mainstream media may serve not only to alert potential EduBirdie customers to the existence of their service, but also to legitimise it.

While much e-cheating advertising is online, some businesses have also ventured into in-person advertising. At the time of writing, EduBirdie is advertising a role with the euphemistic job title of 'Glory Days Conservation Specialist'. An excerpt from the job summary reads:

EduBirdie's Glory Days Conservation Specialist will serve as the head ambassador for EduBirdie, driving awareness through recruitment of student evangelists at major educational institutions around the country. Throughout the academic year, the Glory Days Conservation Specialist will visit 1–2 colleges per month, planning and hosting events with on-campus groups like fraternities, sororities, sports teams and other prominent organizations to create and execute social events at local bars, clubs, or entertainment centers. During spring break season, the Glory Days Conservation Specialist will travel to popular hotspots across the United States for additional meet-and-greets with students from around the country to oversee ambassador recruitment and customer acquisition. EduBirdie is looking for a professional with a college mentality, who knows how to relate to student populations and keep the party going all day and night. The ideal candidate was considered the Van Wilder during college, is very outgoing and is ready to let their inner party animal back out of its cage. Instagram following of 5,000 or more a plus. (EduBirdie, 2019)

There is no opening or closing date attached to this job ad, and at the time of writing it has been open for the better part of a year. Given EduBirdie's clever marketing strategies, there is every possibility the job itself does not exist and it is merely an attempt to garner more press; if so, this strategy has been successful, as media outlets have covered the ad (e.g. McClear, 2019). EduBirdie's multi-pronged guerrilla marketing campaign demonstrates that even bans on advertising will not stop cheating service providers from getting their message out.
But avoiding advertisements, media articles, social media and college parties is not enough to shelter a student from the campaigns of cheating websites. Through Search Engine Optimisation (SEO), cheating businesses manage to outrank legitimate academic support services on many search terms. When I Google search 'help with Harvard referencing' with a sufficiently anonymised web browser and Internet connection (to avoid search engines tailoring results to my interests), my university's page on Harvard referencing is the sixth result, which indicates they have done good work in ensuring it reaches students in need of help. However, it is outranked by cheating site UK Essays, who have the third result on the page. Some other terms have an even higher proportion of cheating sites to legitimate help; the entire first page of my Google search of 'help with my essay' was full of cheating sites: five advertisements for cheating sites (clearly in breach of Google's policy) and every non-ad result also a cheating site. Three of those search results have star ratings presented alongside the search results, all of which are 4.9 out of 5. Whether it's an expression of exasperation on Twitter or a search for help with referencing or essay writing, e-cheating services use sophisticated methods to find students when they are most vulnerable.
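To make concrete how simple this kind of targeting can be, here is a minimal sketch of the bigram matching that Amigud (2019) describes. It is illustrative only: the bigram lists are drawn from the phrases quoted above, and the function is my own construction, not the code of any actual bot.

```python
# Watched bigrams: some signal intent to cheat, others only academic distress.
INTENT_BIGRAMS = {"pay someone"}
DISTRESS_BIGRAMS = {"haven't started", "hate statistics"}

def flag_tweet(text):
    """Return which watched bigrams appear in a tweet, if any."""
    lowered = text.lower()
    return {b for b in INTENT_BIGRAMS | DISTRESS_BIGRAMS if b in lowered}

print(flag_tweet("I haven't started my stats assignment and I hate statistics"))
# flags both "haven't started" and "hate statistics", even though neither
# phrase expresses any intent to cheat
```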
Review sites, discount codes and seduction

I consider myself a sophisticated online shopper. When I have decided that I want to buy something, I obsessively research online and compare different vendors. I carefully consider the promises different sellers make. I go to comparison websites. I watch YouTube reviews. All of these features of modern e-commerce are available to e-cheating consumers – but the problem is, most of them are bogus.

There are a range of online comparison sites offering reviews of various contract cheating sites. Many of the reviews are obviously fake, and many sites providing reviews are quick to point out that other sites have fake reviews. There are even sites that attempt to list fake review sites for contract cheating. One of these sites lists 'signs of fraud' in essay writing review sites:

In my opinion, essay writing service review website is fraudulent if
• it's designed for driving traffic to websites owned by the same company
• it posts fake negative reviews of competitors.
Thus, it doesn't matter whichever "top service" you choose. They're all the same. (Anderson, 2019)

That review site claims to have had takedown requests from some contract cheating companies. However, it is difficult to know if this is yet another fake review site, masquerading as an anti-fake review site. Generally, consumers rely heavily on online reviews when purchasing online, which makes fake reviews highly
problematic (Malbon, 2013). And the challenge of determining the quality of a cheating product is not restricted to contract cheating – e-cheating hardware is also difficult to appraise. Taking the 'Monorean Pro' exam cheating earpiece (Monorean, 2019) as an example, there are glowing videos from the device's creator on YouTube, alongside two videos from a dissatisfied customer who claims it does not work and that he wasted $600 on it.

Affiliate marketing (Duffy, 2005) is likely a key driver of the fake review problem and the spread of contract cheating more broadly. In these schemes, commission is paid to 'affiliates' who drive traffic to a site that leads to purchases. Contract cheating sites are known to engage in affiliate marketing, sometimes through promoting discount codes on Twitter (Lancaster, 2017). This means that in addition to writers and business owners who profit from contract cheating there are networks of entrepreneurs seeking to profit from the promotion of cheating.

Once potential cheating students visit an e-cheating site, how do the sites persuade them to buy? Rowland et al. (2017) examined 11 contract cheating sites against Diaz and Koutra's (2013) framework for analysing the persuasive features of websites. Rowland et al. found that the sites are sophisticated in their persuasions to cheat. Sites attempt to convince students that assessment stress is a problem for which their cheating products are the solution. They also aim to convince students of the quality of their products, something that is common with other e-cheating businesses. Ultimately, e-cheating sites do all they can to convert the student into a purchaser. With fake reviews, discount codes, affiliate marketing, sponsored content, plugs from YouTube influencers and comparison-shopping sites, e-cheating is part of a mature online marketplace that has all the hallmarks of legitimate e-commerce.

'Free' e-cheating

Most of the discussion of e-cheating in this section has focused on fee-for-service e-cheating. But what about free e-cheating? Several businesses do their best to convince students that they are 'free' because no money is changing hands. However, these businesses make their profits in other ways, including charging after a 'free trial', posting uploaded student work on essay mills (Bailey, 2013), or requiring students to upload their own work in exchange for credits (Rogerson, 2014). Even sites like WikiHow, which host a range of free e-cheating how-to guides, make money off advertising. It is very challenging to find entirely non-commercial e-cheating. Perhaps most e-cheating provision and advice is motivated by desires other than simply helping students cheat.

Approaches for disrupting e-cheating awareness raising

There are four key ways to disrupt e-cheating at the awareness-raising stage:

Approach 1: Lobbying platforms to disallow the promotion of e-cheating

E-cheating awareness raising makes use of existing platforms: advertising providers like Google AdSense; social media like Twitter; web hosting like Amazon Web
Services; news aggregators like Reddit; video sharing sites like YouTube; messaging tools like WeChat; and a variety of mainstream media sources. Some of these platforms already have rules that prohibit the promotion of cheating services; however, they are not particularly effective. More sophisticated lobbying by the academic integrity community, in conjunction with peak bodies and industry groups, could pressure platforms into actually enforcing their bans. These platforms already enforce effective bans against the promotion of media piracy, hate speech and child pornography; it is time for e-cheating bans to be enforced as well.

Approach 2: Blocking access to e-cheating sites

Internet service providers, be they universities or private Internet companies, are able to block access to particular websites. My university has an extensive blocklist which forbids access to a range of sites, with a specific category for 'academic fraud'. I have two reservations about this approach. Firstly, it does not work particularly well, as it is fairly trivial to bypass blocklists using a VPN. Secondly, blocklists make it harder for educators to investigate potential contract cheating; this could, however, be addressed by not applying the blocklist to educators' Internet access.

Approach 3: Legislation against the promotion of e-cheating

At the time of writing, Australia is considering legislation that would effectively ban the promotion of cheating services. This would include advertising and likely other promotion considered in this section on awareness raising. While these laws might have some effect on Australian businesses, and businesses willing to cooperate with Australian law, it would likely have little effect on overseas e-cheating firms. A purely legal approach would need an international legal partnership for it to have any meaningful effect. By contrast, lobbying platforms to disallow the promotion of e-cheating could have better international effect without requiring widespread international legal cooperation.

Approach 4: Promotion of academic integrity

Instead of focusing solely on stopping the e-cheating message from getting out, educators can also focus on getting the positive message of academic integrity to students. This approach is favoured by Rowland et al. (2017) – having reviewed the persuasive features of e-cheating sites, they encourage educators to: talk about trust; talk about interactivity and support; talk about quality; and talk about cost. This approach of openly and honestly discussing the ways that contract cheating sites attempt to persuade students may help equip students next time they are bombarded with offers to cheat.
Purchasing If a student’s chosen e-cheating vendor is a commercial provider ofering products for sale, the next step is making a purchase. While the previous section largely
focused on dedicated e-commerce cheating sites such as EduBirdie, e-cheating can be purchased from a wide variety of storefronts including gig economy sites like Fiverr and marketplaces like eBay. As the negotiation and purchasing that happens on each type of storefront varies, this section considers each category in turn.

Contract cheating websites

The sort of contract cheating site students are most likely to encounter if they search the web is a dedicated contract cheating site selling assignments from a particular company. On these sites, students usually fill out a web form with details about the assignment they wish to have produced, and have the opportunity to upload any attachments. The sites calculate a price based on a number of different variables (e.g. word count, year level, quality) and then students pay, usually through a credit card or a payment processor like PayPal.

Thanks to careful sleuthing around the infrastructure that supports these sites, Ellis, Zucker, and Randall (2018) have provided significant insights into how these sites operate. Firstly, Ellis et al. (2018) have identified that while student customers see a range of different-looking sites – what they call 'client sites' – these are actually just student-facing frontends for a smaller number of real businesses. While students may interact with a site claiming to operate in a particular country or specialising in a particular discipline, the site is often just a front for a much larger enterprise. Secondly, Ellis et al. (2018) have identified the existence of what they call 'writer sites', which recruit and manage people who produce the content students purchase. The third type, 'master sites', offer to set up contract cheating websites using this architecture – for a fee, of course. If you've ever looked at a few contract cheating websites and thought they looked like they were produced from the same template, they may have been purchased from one of these master sites.

Ordering a contract cheating assignment through sites using the architecture discovered by Ellis et al. (2018) happens through a web form. These forms usually incorporate most of the following inputs:

the subject (discipline) area, the type of document (eg essay, term paper, dissertation chapter), the number of words which is also measured in pages, the line spacing, the urgency (measured in time), the academic level (ranging from High School to PhD), the required number of sources, the referencing style and the preferred language style (US or UK English). Clients are invited to enter information into a window entitled 'order description' which instructs them to 'type your instructions here'. On the live sites, clients are required to enter in their names and contact information and to register with a password for the site. Clients are also given the opportunity to upload additional files at a later stage of their order. The final stage of the ordering process adds up the costing of the variables for the academic work and requests payment from the customer. (Ellis et al., 2018, p. 10)

The ordering experience on 18 contract cheating websites has also been investigated by Sutherland-Smith and Dullaghan (2019); they found the order forms to
be very similar on eight of those sites. Pricing seems to be based on four variables (Ellis et al., 2018): discipline; length; turnaround time; and academic level. While discipline surcharges vary, and some sites do not charge additional costs for particular disciplines, the general pattern seems to be that 'Science, Technology, Engineering and Mathematics (STEM) subjects like Physics, Chemistry, Aeronautics and Statistics are more expensive than Humanities, Arts and Social Sciences (HASS) subjects like English, History, Business and Sociology' (Ellis et al., 2018, p. 10). Most sites also offer optional extras, including 'VIP support' and 'Written by Top 10 Writers' (Ellis et al., 2018, p. 10). For a more detailed investigation into how these sites operate behind the scenes, Ellis et al.'s (2018) paper has many more fascinating details.

When students purchase contract cheating assignments, they are required to agree to the terms of use of the site. Contract cheating sites' terms of use should be concerning to any potential buyers. While most sites offer refunds (Rowland et al., 2017), some explicitly do not; sites may even threaten to contact the student's institution if they attempt a credit card chargeback to obtain a refund (Sutherland-Smith & Dullaghan, 2019). Many sites explicitly disallow using the assignments for submission, claiming their products are only for reference (Sutherland-Smith & Dullaghan, 2019). Most sites examined by Sutherland-Smith and Dullaghan (2019) featured a privacy policy; however, around half of the sites stated that they 'can or will pass on users' personal details to third parties' (Sutherland-Smith & Dullaghan, 2019, p. 10). In e-commerce more broadly, consumers generally do not read the terms of service of sites they interact with – for instance, less than 1% of participants in a study by Bakos, Marotta-Wurgler, and Trossen (2014) – and, if this is also true for contract cheating, students are agreeing to terms that might surprise them.
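The general shape of this pricing model can be illustrated with a small, entirely hypothetical calculator. The four inputs mirror the variables Ellis et al. (2018) identify, but every rate and multiplier below is invented for illustration; real sites set their own values and layer optional extras on top.

```python
# Invented baseline figures; only the structure (length x discipline x
# level x urgency) reflects the pricing variables described above.
BASE_RATE_PER_PAGE = 20.00
DISCIPLINE_SURCHARGE = {"STEM": 1.3, "HASS": 1.0}
LEVEL_MULTIPLIER = {"high school": 0.8, "undergraduate": 1.0, "phd": 1.6}

def quote(pages, discipline, level, hours_until_due):
    """Return an illustrative price for a contract-cheated assignment."""
    urgency = 2.0 if hours_until_due < 24 else 1.0
    return round(pages * BASE_RATE_PER_PAGE
                 * DISCIPLINE_SURCHARGE[discipline]
                 * LEVEL_MULTIPLIER[level]
                 * urgency, 2)

# An urgent PhD-level physics order costs far more than a non-urgent
# undergraduate sociology essay of the same length.
print(quote(8, "STEM", "phd", 12), quote(8, "HASS", "undergraduate", 120))
```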
E-cheating hardware websites

Websites selling cheating hardware are much more diverse than contract cheating sites. There does not appear to be a general template. Some sites sell just one product, such as the 'Ruby Calculator Ultimate Edition: The Ultimate Exam Cheating Device' (Ruby Devices, 2019) or a small range of devices like the 'Monorean' range of exam cheating earpieces (Monorean, 2019). Such sites claim to be the manufacturers of these devices. Other sites sell a whole range of cheating hardware, such as 24Kupi (2019), which at the time of writing sells 22 different cheating products and accessories. These include: cheating watches; low-tech cheating pens that include a hidden paper scroll; high-tech cheating pens that include a hidden digital display; cheating calculators; cheating earpieces; and miniature button cameras for livestreaming. If you want a complete setup you can buy a bundle, and if your purchase is a gift, they also offer to sell you a gift box. The degree of sophistication of these websites is much lower than the contract cheating sites; however, they typically still offer worldwide shipping as well as payment through credit cards. Some also accept Bitcoin or payments through processors like PayPal.

The terms of e-cheating hardware vendors are concerning. Some are difficult to find, such as 24Kupi's, which yields a 404 Page Not Found error if not accessed directly from the site's homepage, or Ruby Devices', which is only accessible from a link partway through a product page. Purchasers also agree to some interesting terms. 24Kupi purchasers agree that disputes will be resolved through binding arbitration in Croatia. Monorean disclaims any responsibility for 'the possible illegal use of the products marketed on its website'. Ruby Devices purchasers agree that their cheating calculator, marketed as 'The Ultimate Exam Cheating Device', must not be used in 'any educational examination or test'. They are also prohibited from warranty claims if the calculator does not match the description in the product's manual; this is done to protect the appearance of the calculator, presumably so that invigilators and cheating researchers do not learn what it looks like. The terms of these e-cheating hardware sites thus explicitly prohibit the primary use of these devices – despite both sites displaying the word 'cheating' prominently on product pages.

Gig economy websites

In the 'gig economy', workers are re-cast as independent contractors who produce individual items of work, often mediated through technology (Friedman, 2014). Much contract cheating happens on gig economy sites like Fiverr (Lancaster, 2019b). According to Lancaster's (2019b) analysis of Fiverr, there are two main models for connecting contract cheating buyers and sellers on Fiverr: the 'request model', in which the student puts up a request for which potential writers bid; and the 'gig model', in which writers put up an offer to do students' assignments under a set pricing structure. The gig model appears to be the more commonly used model. Lancaster (2019b) examined a subset of 93 gig model vendors, who had collectively produced 4,294 orders, and found that the 'typical 2,000 words essay' cost $59.54 (with a relatively large standard deviation of $37.53) and that these values scaled fairly linearly to longer and shorter assignments. In contrast with dedicated contract cheating sites and e-cheating hardware vendor sites, the gig economy vendors profiled by Lancaster (2019b) were blatant in presenting as cheating services and explicitly discussed use of their products as assignments for submission.

From the purchaser's perspective, sites like Fiverr may be more attractive than dealing with contract cheating websites. The consumer protections offered by gig economy sites, such as review/rating systems and dispute resolution, may be reassuring when compared to the promises made on individual cheating sites. Gig economy sites therefore add an air of credibility and trust to cheating transactions. They also bring with them terms of service that may be more familiar. However, individual vendors can still add their own terms to these, such as disclaimers against any liability for the grades received by students who submit their products (Lancaster, 2019b).

Large-scale e-commerce sites

Just as contract cheating services can be bought direct from dedicated websites or through intermediaries like Fiverr, e-cheating hardware products can be bought direct from vendors or through platforms like eBay, Amazon and AliExpress. And, just as with gig economy sites, large-scale e-commerce sites may add a degree of perceived trustworthiness and consumer protection for purchasers of e-cheating products. They may
also provide substantial savings. Taking, for example, the Monorean set of cheating earpieces, which costs $399: similar devices are available on Amazon for less than 10% of that price, some of which have thousands of positive reviews, compared to the limited reviews available for Monorean's products. Even cheaper products are available on AliExpress; however, it appears certain keywords are blocked, with searches for 'cheating earpiece', 'cheating watch' and 'cheating calculator' returning zero results. While these sorts of products are definitely for sale on AliExpress, they appear to be blocked from search results using the same approach AliExpress uses to block searches for counterfeit products (e.g. a search for 'rayban' returns zero responses despite there being countless counterfeit Ray-Ban sunglasses on AliExpress).

While the review and rating functions of these large-scale e-commerce sites may be reassuring to potential purchasers, they are also vulnerable to fake reviews, as discussed in the Awareness raising section. Taking one product, the 'NENRENT S570 Bluetooth Earbud, Smallest Mini Invisible V4.1 Wireless Bluetooth Earpiece…' (Amazon, 2019) as an example: despite having more than 5,000 reviews, the fake review spotting service Fakespot (2019) awards this product an 'F' (the lowest possible score) because there are too few reviews that appear legitimate. Potential cheating students should carefully consider whether the reviews for their cheating devices may be faked, and therefore of limited usefulness. However, even if these devices disappoint, purchasers still have access to the dispute resolution and refund services offered by large platforms, which may be more consumer-friendly than those of cheating device vendors.

Dark web

The 'dark web' is a part of the web that is only accessible through The Onion Router (TOR), a networking technology that offers a high degree of anonymity (Broséus et al., 2016). On the dark web there is a range of marketplaces, known as 'darknet markets', where a variety of products can be purchased using anonymous cryptocurrencies like Bitcoin. Darknet markets are often used to sell illegal products and services, most notably illicit drugs (Broséus et al., 2016). At present, there is only a limited amount of e-cheating provision happening on the dark web, possibly because there is no real reason to conduct cheating there. While laws exist in some parts of the world that prohibit the provision of cheating products and services, these laws are not routinely enforced and they do not appear to be particularly effective (Amigud & Dawson, 2019). However, if such legislation becomes more widespread and is enforced more routinely, it is likely that cheating will join illicit drugs and other illegal activities on the dark web. If cheating moves to the dark web, it may be much harder to control, detect or study, due to the degree of anonymity and secrecy offered by darknet markets and cryptocurrencies.

Approaches for disrupting e-cheating purchases

If students can be stopped from e-cheating at the purchasing stage, it could potentially prevent them from e-cheating at all. Here are two suggestions for disrupting e-cheating purchases:
Approach 1: Lobbying payment processors

E-cheating purchases rely heavily on payment processors. PayPal has announced a ban on payments to cheating services; however, this is quite difficult to operationalise as the boundaries between a cheating service and a tutoring service can be difficult to determine at scale (Coughlan, 2019). At the time of writing, many cheating services advertise PayPal as a payment method. Some banks and credit card companies already ban gambling (D. Hughes, 2019) and cryptocurrency (Aslam, 2019) purchases; they should similarly ban e-cheating purchases. Until payment processors institute effective bans for cheating services, they are profiting from cheating. While some current users of PayPal and credit card payments for e-cheating may switch over to Bitcoin, it is much less simple to set up, especially on the purchaser's side, so it is still likely to disrupt many e-cheating purchases.

Approach 2: Lobbying gig economy platforms

Similar to the sorts of lobbying action that is needed against payment processors, sites like Fiverr also need to be convinced to stop profiting from cheating. Gig economy platforms face similar challenges to payment processors and advertising platforms in terms of identifying cheating. One site, Airtasker, has addressed this by simply banning all academic help:

Academic activities and assignment based tasks are strictly prohibited. This includes but is not limited to completing an assignment, test or course on someones behalf. Additionally, academic assistance in the form of proofreading, editing, planning, structuring, referencing and giving feedback on someone else's academic work is also not supported. (Airtasker, 2019)

It appears this ban is working moderately well. At the time of writing, searching terms like 'essay' on Airtasker provides many results that appear to breach these rules, such as requests to edit student essays; however, none are overtly contract cheating. This contrasts with the blatant contract cheating found by Lancaster (2019b) on Fiverr. Platform bans such as the approach taken by Airtasker might be somewhat effective in disrupting cheating purchases.
Content production

In those forms of e-cheating where content is being procured by students – those falling under the 'cognitive offloading' and 'outsourcing' categories in the taxonomy in the first chapter – the e-cheating provider needs to produce that content. But how is that content produced, and who produces it?

Human cheating content production

There is little in the way of scholarly literature about the everyday experiences of people who produce cheating content for students. However, there are a lot of
tell-all experiences documented, starting with Ed Dante, aka Dave Tomar (Dante, 2010; Tomar, 2012). As discussed earlier, the forums on the contract cheating industry site EssayScam (2019) contain substantial discussion about the realities of content production. One of the most interesting and comprehensive discussions of life as a contract cheating writer is in the form of an AMA (Ask Me Anything) on Reddit by three writers from the site Unemployed Professors (Reddit user /u/unemployedprofessors, 2014). In this AMA, the three writers discuss how they write large volumes of work for students. Years later, one of those apparently unemployed professors wrote a book for students on how to write essays, building on what they had learnt throughout their career as a writer (using the pen name Deleuzienne, 2018). There are thus several discussions about the experience of content production, but few scholarly sources.

As Deleuzienne's (2018) book is presented as a how-to guide for writing in areas one is not an expert in, and explicitly discusses the strategies they use to write for students, it presents one of the best available snapshots of a cheating writer's actual work practices. Overall, Deleuzienne communicates a 'work smarter, not harder' message, aiming for efficiencies. Some of these efficiencies are usual academic practices: use a citation manager like EndNote; export references into it from Google Scholar; use synonyms with search tools; keep a checklist to ensure compliance with APA formatting. But some of the work practices Deleuzienne uses and encourages are poor academic practice. For example, they encourage: taking the sources from Wikipedia articles and citing them unread; citing articles for which you have only read the abstract; writing New York for publisher locations rather than putting in the effort to find them; and using padding techniques to meet word counts. Taken overall, the book suggests that the content production process used by Deleuzienne to generate more than 5,000 assignments cuts a lot of corners.

The assignments our research team has procured for study bear a lot of the hallmarks of the slapdash processes described by Deleuzienne (2018). Across three studies, we have examined contract cheating assignments, by providing them to human markers for marking (Dawson & Sutherland-Smith, 2018; Sutherland-Smith & Dullaghan, 2019), using them with markers in training courses on detecting contract cheating (Dawson & Sutherland-Smith, 2019), and analysing them with stylometric software (Dawson et al., 2020). Our studies suggest that the assignments produced by contract cheating sites are probably not produced by experts in the field. In our first study, a pilot study in the discipline of psychology (Dawson & Sutherland-Smith, 2018), our markers noted that the writers tended to 'medicalise' the assignment rather than approaching it from a psychology perspective. As someone from outside the discipline of psychology, this was something I did not spot when I attempted to identify which tasks were contract cheated and which were not. This finding is supported by investigations my colleagues conducted into one of the writers our team had used:

One site claimed that premium writers are specialists in a topic and discipline, 'You will never have a person without a background in physics writing a paper on thermodynamics. Every writer assigned to our clients' papers has
experience and academic expertise in the subjects they write on'. However, the site feedback for a particular writer we investigated, indicated that he completed quality work in sociology, movies/theatre studies, political science, ethics, philosophy and marketing. It is hard to imagine that one writer possesses a requisite degree of expertise over such a diverse range of disciplines. (Sutherland-Smith & Dullaghan, 2019, p. 1155)

In addition to the challenges of identifying the real disciplinary expertise of authors, it is also challenging to determine where writers are actually located. However, there are reasons for suspicion that writers' claimed locations are not accurate. Sutherland-Smith and Dullaghan (2019) found that ostensibly Australian contract cheating sites used North American terminology such as 'sophomore' and 'freshman'. Most damningly, Ellis et al. (2018) found that some sites claiming to be Australian actually use the same PayPal IDs and backend writer sites as sites claiming to be located in other countries. Fiverr likely has similar deceptions around writer location, with Lancaster (2019b) finding inconsistencies in the details of writers' profiles in terms of where they claimed to be located. However, the location data in Lancaster's Fiverr sample seems potentially more honest, as it lists a range of countries in which writers live, including many that are not likely to be chosen to appeal to buyers. Of 93 writers studied by Lancaster, 28 provided their location as Kenya, slightly more than the next largest country group, 25 writers in the USA. Lancaster notes that the 25 writers purportedly in the USA might not really be located there, and notes some inconsistencies in some of their profiles. The only other country with more than 10 writers in Lancaster's dataset was Pakistan, with 15 writers. Looking at what is known about the countries where writers are located, two themes emerge: writers are often not located in their advertised location; and there are likely to be clusters of writers in countries where labour costs are low and English is an official language.

In addition to their locations and disciplinary expertise (or lack thereof), Lancaster (2019b) provides some further detective work around the motives and profiles of contract cheating content producers. He identifies nine 'Groups of individuals observed acting as academic ghostwriters and enabling contract cheating', along with examples from the literature or private communications to support the existence of these groups. His nine categories are:

1. accidental academic ghost writers, who complete writing work without knowing it is for cheating students;
2. business opportunists, who are just in it for the money;
3. desperate individuals, who really need a job;
4. would-be academics, who would like an academic job but can't get one, often despite advanced qualifications;
5. internationally qualified academic ghost writers, who have studied abroad, returned home, and now provide writing services;
6. career academic ghost writers, who have chosen contract cheating as a career;
7. student peers, who write for classmates;
8. previous graduates, who specialise in a particular course at a particular university; and
9. friends and family members, who 'have a vested interest in the success of a student' (Lancaster, 2019b, p. 74).

Each of these categories of writer is likely to work in a different way, and produce different types of content. A limitation of some of the research discussed in this section is that it largely focuses on a subset of these writers, and tends to exclude student peers, previous graduates, and friends and family. Their content production processes are likely somewhat different to other categories, as they may have greater familiarity with the discipline and/or stronger connections with the cheating students.

One key promise made by many contract cheating services is that their work will be 'plagiarism free' (Lancaster, 2019b; Rowland et al., 2017). Operationally, this means that it will achieve a sufficiently low score when put through content-matching tools like Turnitin. While this is a far cry from actually being plagiarism free – contract cheating necessarily involves passing off someone else's work as your own – it is enough to pass through the frontline defences against plagiarism. In order to provide these services, contract cheating providers need accounts with content-matching providers. While content-matching providers are unlikely to intentionally sell to cheating services, some shady online resellers in hacking forums (e.g. mpgh.net, 2019) offer to provide accounts for as little as $100 per year. On such forums users also sell and share approaches to reducing their Turnitin scores.
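To clarify what a 'sufficiently low score' means in general terms, the toy sketch below shows the basic idea behind content matching: counting overlapping word sequences between a submission and known source text. It is emphatically not Turnitin's actual algorithm, which is proprietary; it simply illustrates the kind of similarity score a ghostwriter is trying to keep low.

```python
def ngrams(text, n=5):
    """Return the set of word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(submission, source, n=5):
    """Fraction of the submission's n-grams that also appear in the source."""
    sub = ngrams(submission, n)
    if not sub:
        return 0.0
    return len(sub & ngrams(source, n)) / len(sub)

# Roughly 0.57: four of the submission's seven 3-grams also occur in the source.
print(similarity("the cat sat on the mat and then slept",
                 "the cat sat on the mat all day", n=3))
```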
Computerised cheating content production

While there are some general themes to human content production, the landscape of computerised content production is much broader. The approaches taken by computer algebra solvers and auto-paraphrase tools are entirely different. Text-based methods like bespoke essay writing, translation, and auto-paraphrase tools appear to have the most commonality, in that they use computational linguistics and large databases of source content. However, since the mechanics of each approach are so different and complex, they are not discussed here in any depth.

Approaches for disrupting e-cheating content production

Stopping the people and tools that produce cheating content might help disrupt the e-cheating lifecycle. Here are two ways the content production process can be disrupted:

Approach 1: Remote proctoring of the content production process

As discussed in Chapter 2, tools exist that can monitor students throughout the process of producing their work. The use of these tools may disrupt the
outsourcing of student work to other people or tools. However, such approaches come at a cost to honest students, who may complain about being monitored and surveilled (Parkville Station, 2016; Stojanovic-Hill, 2017). Surveillance approaches like these are discussed in more depth in Chapter 7.

Approach 2: Lobbying content-matching services to secure their products

Unlike payment processors or advertising platforms, who profit from cheating, content-matching services are disadvantaged by the use of their products for cheating. However, the fact that so many contract cheating providers still advertise their use of particular tools (especially Turnitin) suggests that content-matching services have not been effective in shutting down cheating providers' access. An industry-wide ban on contract cheating services accessing content-matching tools could somewhat disrupt the business models of these sites.
Delivery, submission and after-sales care

We are now at the final step in the e-cheating lifecycle – if we can't disrupt it at this stage, e-cheating is likely to happen. In this step, the student receives their e-cheating materials and submits cheated work. They also receive what cheating sites might think of as 'after-sales care', though the realities can be somewhat darker. In this section we consider each of these subphases in turn, before a discussion on how this last phase can be disrupted.

Delivery

As you might expect, the mode of delivery depends on the goods being ordered. Virtual goods of contract cheated work tend to be delivered electronically, such as through email. Physical goods tend to be delivered by post. But what students might not expect is that some goods are not delivered by the requested date. Sutherland-Smith and Dullaghan (2019) write about multiple sites being late and non-communicative about delays:

One site left a message several days after the user deadline passed claiming that the work would be delivered in five hours. It was not. The assignment was eventually delivered one week later. The same site also delivered a separate order eight days late. Three sites were unable to meet user deadlines and sites requested extensions. Site-requested extensions varied from 24 hours to three days. One site requested, and was granted, a two day extension but delivered the assignment a further two days late, with no explanation. This finding is critical, as students should be aware that not all sites deliver what they promise, or when they promise to do so. (Sutherland-Smith & Dullaghan, 2019, p. 1155)

In addition to providing work late, contract cheating sites often provide work that does not meet the required standard. Sutherland-Smith and Dullaghan (2019)
found that, despite all 18 sites used promising that they could undertake any type of assignment, 22% provided work that clearly did not meet the assignment requirements or the specific requests made when purchasing. In two cases some very basic material was missing, and the writers were asked to make revisions; in both cases revisions were provided within 24 hours. Tomar's (2012) tell-all book also provides evidence that writers do actually undertake revisions in some circumstances. Ultimately, however, 52% of 54 assignments failed to meet the university's pass standard, according to markers who would usually mark such work. This range of bad experiences and poor-quality products reinforces Sutherland-Smith and Dullaghan's (2019) headline finding: that 'you don't always get what you pay for'.

Submission

This is the stage where a student submits something that they have cheated on for assessment. It might be an exam paper they have cheated on with assistance from a cheating calculator, a contract cheated assignment, or one of the many other types of cheating. At some stage, the student crosses a line from using the e-cheating affordances discussed in Chapter 2 to submitting work and claiming it as their own, falsely declaring – either explicitly or implicitly – that they have not cheated. In my opinion, submission is the step in the e-cheating lifecycle where a student actually cheats. If, for example, a student purchases a contract cheated assignment or a cheater's earpiece, but does not use it, no actual cheating has occurred. What they have done is something lesser, perhaps analogous to conspiracy charges in criminal law; they have made an agreement and taken steps toward cheating, but have not actually cheated. It is unclear how many students have a revelation at this stage and abandon their plans for cheating – and how many students perhaps never intended to actually cheat but just wanted to experiment. Synchronous modes of assessment such as examinations, where cheating can occur any time during the examination but not prior, are an exception; in such cases, students can only conspire to cheat before an exam begins, but once it has commenced cheating may occur at any time before final submission.

One criticism of the work that I have conducted with Wendy Sutherland-Smith (Dawson & Sutherland-Smith, 2018, 2019) has been that we get markers to mark unedited assignments from contract cheating sites. With the exception of anonymising these assignments, so that they don't seem like anomalies alongside anonymised student work, we don't amend the assignments we have purchased. The extent to which cheating students edit assignments they have purchased from contract cheating sites is unknown; however, it is probably a very significant step in the e-cheating lifecycle. The assignments we have purchased are often poor quality and lack some basic features; for example, Sutherland-Smith and Dullaghan (2019) found that, despite providing a template to six cheating sites for an assignment, only three of them actually used the template. A cheating student would be wise to ensure that their purchased assignment at least meets these basic requirements.

In addition to ensuring an assignment meets the requirements of the task and does not appear anomalous, I suspect that smart cheating students also adjust the metadata of their purchased assignments. Metadata is data used to describe other data and, in the context of a Word document purchased from a contract cheating site, it includes the name and affiliation of the author. If students leave this metadata as provided by their contract cheating writer, it may alert careful readers that the document was produced by a well-known contract cheating service. Other artefacts of the production process are also likely to be attended to by careful students, such as updating file paths for any embedded objects so that they do not appear to have been written on the computer of a cheating service. There are other less well-known last steps students would be wise to use at this 'cleaning' stage, but I prefer not to mention them here for fear of creating a contract cheating how-to guide.
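Markers can look at the same metadata from the other side. The sketch below shows one way an educator might inspect the author and last-modified-by names recorded in a submitted file: a .docx is a zip archive, and its docProps/core.xml part holds these core properties. The file name is hypothetical, and anomalous metadata is at most a prompt for a conversation, never proof of cheating.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def core_properties(path):
    """Return the author and last-modified-by fields of a .docx file."""
    with zipfile.ZipFile(path) as docx:
        root = ET.fromstring(docx.read("docProps/core.xml"))
    return {
        "author": root.findtext("dc:creator", default="", namespaces=NS),
        "last_modified_by": root.findtext("cp:lastModifiedBy", default="", namespaces=NS),
    }

print(core_properties("submission.docx"))  # hypothetical file name
```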
data and, in the context of a Word document purchased from a contract cheating site, it includes the name and affiliation of the author. If students leave this metadata as provided by their contract cheating writer, it may alert careful readers that the document was produced by a well-known contract cheating service. Other artefacts of the production process are also likely to be attended to by careful students, such as updating file paths for any embedded objects so that they do not appear to have been written on the computer of a cheating service. There are other less well-known last steps students would be wise to use at this 'cleaning' stage, but I prefer not to mention them here for fear of creating a contract cheating how-to guide.

As a final word on submission: many in academia have traded stories about students who make a fatal blunder at the point of submission. My favourite version of this story is the one where the student receives an email from a contract cheating site with their essay attached, which they simply forward to their lecturer for submission. But this sort of serendipitous stupidity is not common and, while a fun type of story, it might also be dangerous. When we can easily think up an example or instance of something, we tend to think that it's more likely to occur; in behavioural economics this is referred to as the availability heuristic (Kahneman, 2011). The ease with which we can conjure up stories of contract cheating being caught may make us believe we catch the majority of it, which is certainly not the case.

After-sales care

The student has obtained their cheating products and used them to submit cheated work. At this point, they have now cheated. They might naïvely think that their cheating experience is finished – but they might be wrong. Cheating vendors provide a range of services after purchases have been made. Some of these might be desirable, but some are definitely not in the student's interests.

Cheating hardware vendors claim to provide various warranty services. However, it is difficult to determine how these warranties play out in practice. Taking Monorean (2019) as an example, there are instances online of users who claim the product does not work, but these users have not had their frustrations resolved satisfactorily by the manufacturer (e.g. YouTube videos by stratus star, 2017). By contrast, the eBay feedback for the Ruby exam cheating calculator suggests there are many satisfied customers, with all feedback being positive. Cheating students may find they have vastly different experiences obtaining after-sales support from different vendors.

As modern e-commerce enterprises, some e-cheating sites ask for feedback. In Sutherland-Smith and Dullaghan's experience (2019), almost all contract cheating sites asked for feedback on the quality of the work they provided. This feedback included surveys, requests for testimonials and star ratings. The trustworthiness of this feedback is questionable, however, as some sites had pre-populated star ratings with five stars and comment fields with generic positive comments. Given Sutherland-Smith and Dullaghan's experiences with the overwhelmingly poor quality of work from contract cheating writers, and the overwhelmingly high star ratings contract cheating websites show, I am suspicious that these survey results are filtered or tampered with.
Students who contract-cheat tend to do it more than once; in one study, the rate of multiple use of contract cheating was 62.5% for those who had used it at least once (Curtis & Clare, 2017). Contract cheating sites capitalise on this potential for repeat business by marketing to existing customers. The after-sales marketing approaches of some contract cheating sites are particularly dogged, spanning private messages on the contract cheating sites, emails, phone calls and SMS messages (Sutherland-Smith & Dullaghan, 2019); one site was so persistent that Dullaghan had to change the phone number he had listed on his account because the site kept calling and messaging him.

The darkest side of the e-cheating after-sales experience is blackmail. Several stories in the media and on contract cheating forums report cases where students have successfully contract cheated, but later received threats from cheating sites that the student would be reported to their university if they did not pay more money (Lancaster, 2019a). At the time of writing, the first journal article on blackmail in contract cheating I am aware of has just been published online (Yorke, Sefcik, & Veeran-Colton, in press), which finds that blackmail is real and does happen to students. However, 90% of students who elected to cheat in a hypothetical scenario in that study were unaware of the potential threat of blackmail. In addition to this one journal article, there has also been some discussion on blogs and at conferences. Lancaster (2019a) found there were more than 50 posts mentioning blackmail on the EssayScam (2019) forums. The most disturbing element of blackmail is the threat that it may be long term, extending beyond the student's graduation and into their working life. The threat of a qualification being revoked may allow cheating sites to exploit students throughout their careers.

Approaches for disrupting e-cheating delivery, submission and after-sales care

If students successfully complete this step, e-cheating will have occurred, so it is very important we do all we can to disrupt it. Here are four approaches to disrupting e-cheating at this final step:

Approach 1: E-cheating awareness-raising amongst staff

When markers are marking e-cheated work, evidence suggests that they generally don't spot it. Taking contract cheating as an example: in studies where markers have been given contract cheated work to mark, but have not been alerted to the potential presence of contract cheating, they generally do not find it; multiple studies essentially report zero percent detection rates (Lines, 2016; Medway et al., 2018). However, in our studies where markers have simply been alerted to the potential presence of contract cheating and asked to spot it, detection rates have risen to the point where, overall, most contract cheating is detected (Dawson & Sutherland-Smith, 2018, 2019; Dawson et al., 2020). Whether this finding extends beyond contract cheating to other types of e-cheating is unknown, but the suggestion that we need to be aware of, and looking for, new types of e-cheating does sound reasonable.
Approach 2: E-cheating detection approaches

In our studies, we have demonstrated statistically significant improvements to contract cheating detection rates when markers are trained to spot contract cheating (Dawson & Sutherland-Smith, 2019), as well as when they use stylometrics software (Dawson et al., 2020). Just as the previous generation of e-cheating – copy-paste plagiarism – required staff training and technologies to disrupt, there will likely be a need to provide training and resourcing for staff to address current and future generations of e-cheating.

Approach 3: Dialogic feedback approaches

Existing large-scale approaches to assessment and feedback in higher education have tended toward a model where students submit work, often online, which is marked at another time and in another place. Students then receive marks and comments on their work (Boud & Molloy, 2013b). There are critiques of this approach from a feedback perspective: it views feedback as something teachers do, rather than as a learning process for students (Boud & Molloy, 2013a); and it does little to develop students' capabilities to meaningfully engage with feedback (Carless & Boud, 2018; Winstone & Carless, 2019). In response, some scholars advocate for dialogic feedback processes (Carless, 2016), which in the context of teacher-student feedback are two-way conversations. While I am not aware of any literature on dialogic feedback as a strategy for disrupting the submission of e-cheated work, some disciplines already employ this approach as a way to address e-cheating. When I taught undergraduate computer science, student out-of-class work was marked in-class. We engaged with students in feedback dialogues, and asked them to explain how key parts of their work functioned. It quickly became clear which students did not know how particular parts of their computer programs worked. Dialogic feedback approaches might help in disrupting e-cheating and, even if they don't, they are an example of an approach to detection that is not detrimental to learning.

Approach 4: Amnesties

Students who are being blackmailed may feel that they have no choice but to comply with whatever the blackmailer wants. In response to this threat, some institutions have offered amnesties or assistance for students who are being blackmailed. The University of New South Wales allows students to report if they are being blackmailed and, while the default penalty for a student who does so is a zero mark for any cheated tasks, the university offers to support the student, contact the blackmailer, provide legal advice, and call the police (ARC UNSW Student Life, 2019). While this approach does not in and of itself discourage contract cheating, it does provide a way to support students who are already contract cheating to change their ways.

The lifecycle view of e-cheating presented in this chapter allows us to consider e-cheating at different times and from different perspectives. Much of what has
been discussed in this chapter relates to the particular affordances of the digital world. The next chapter takes a deeper look into a particular consequence of cheating happening in the digital world: the ways that e-cheating and assessment security are necessarily entangled with cybersecurity.
Things to do

• Consider each step of the e-cheating lifecycle. Which of them do you think you have any control or influence over? If you're a customer of any of the vendors I suggest lobbying, at the very least you can contact those companies and ask for change.
• Share some of the bad news with students. The seductive messages used by e-cheating vendors are incongruous with the realities of their products: e-cheating often doesn't work; isn't delivered on time; is of poor quality; and can result in blackmail. In addition to disseminating positive academic integrity messages, we also need to alert students to the negatives of e-cheating.
• Expose yourself to some of the awareness-raising approaches used by e-cheating vendors. You can do this by: tweeting about the difficulties you are having completing a fictitious assignment and seeing how long it takes for contract cheating writers to offer their services; searching online for legitimate help with assignments and seeing what sorts of services you uncover; searching on newer social media platforms for cheating-related content; or changing your demographic details on a social media platform to be similar to a student's and seeing how the advertising you receive changes. If you do go looking for this sort of information, I recommend you use the 'private browsing' mode on your web browser, as it will stop search engines and platforms from using the various data they already have about you to tailor what you see; nobody wants to spend their precious advertising budget marketing cheating services to academics.
• Read reviews of e-cheating vendors, and run them through a fake review spotting tool like Fakespot (https://www.fakespot.com/). Examine the reviews on e-cheating comparison shopping sites – do you think these are likely to fool your students?
• Compare a number of contract cheating sites to see if you can find commonalities between them. As discovered by Ellis et al. (2018), these sites are highly templated; are some of the sites that try to present as tailor-made for your students really just fronts for some larger entity?
• Explore each of the different types of e-cheating marketplaces. If you need help accessing TOR to search the darknet markets, consider downloading the TOR Browser Bundle (Tor Project, 2019). You will need to find the darknet markets yourself, as these tend to get shut down somewhat frequently.
• Read Deleuzienne's (2018) book on academic writing to get an understanding of the work processes of a contract cheating writer. Some of the approaches advocated for by Deleuzienne are, in my opinion, poor scholarship – do you agree?
• If you have a budget, and are happy navigating the ethical quandary of financially supporting e-cheating businesses, consider purchasing some e-cheating products. I am aware of many academics who have learnt a lot from paying contract cheating firms to produce their assignments, but you might also consider purchasing e-cheating hardware if you make heavy use of examinations. The quality of the products you purchase might help you reconsider some of your assessment approaches.
• Raise awareness amongst your colleagues and teaching teams about the possibility of particular e-cheating approaches. You might also consider approaches to improving detection at the submission stage, such as training or software (a small illustration of one such software check appears after this list). If you want to try training, our team's open-access paper includes a free workshop design you can adapt to suit your needs (Dawson & Sutherland-Smith, 2019).
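One low-tech software check at the submission stage picks up the document metadata discussed earlier in this chapter. A .docx file is simply a zip archive, and its author-related properties live in docProps/core.xml, so they can be read with the Python standard library alone. The sketch below is my own minimal illustration, not software used in any of the studies cited in this chapter; the file name is hypothetical, and a mismatch between these fields and the submitting student proves nothing on its own; it is only a prompt for a conversation.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml inside a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def core_properties(path):
    """Return author-related core properties of a .docx file."""
    with zipfile.ZipFile(path) as docx:
        root = ET.fromstring(docx.read("docProps/core.xml"))
    return {
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
        "last_modified_by": root.findtext("cp:lastModifiedBy", default="", namespaces=NS),
        "created": root.findtext("dcterms:created", default="", namespaces=NS),
        "modified": root.findtext("dcterms:modified", default="", namespaces=NS),
    }


if __name__ == "__main__":
    # 'submission.docx' is a placeholder; point this at any submitted file.
    print(core_properties("submission.docx"))
```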
4
Cybersecurity, E-Cheating and assessment security
We are living in a post-digital world; that is, one in which the 'disruption' promised by technology has already happened (Cramer, 2015). We are no longer waiting for our world to be transformed by technology as it already has been. As part of the world, education in general – and assessment in particular – has also been disrupted, transformed, and technologised. While technology holds great promise for assessment, it places assessment security in peril. This chapter argues that the additional cybersecurity threats posed by the digital world need to be taken seriously and engaged with systematically. More optimistically, the chapter also explores how cybersecurity approaches might help us think differently about assessment security.

Schneier (2018) states the impact of the post-digital world on security eloquently, if bluntly. His argument can be distilled into three points:

1. Everything is becoming a computer. While we have long used 'technologies', such as cars or telephones, these have not been underpinned by general-purpose networked computers until relatively recently.
2. Everything is becoming vulnerable. Scenes of hackers remotely controlling a car are no longer restricted to action movies; they are now real life (Greenberg, 2015). Because so much of everyday life is now enabled by devices that are, at their heart, general-purpose computers, they are all as vulnerable as a general-purpose computer.
3. Cybersecurity is everything security. Lessons learned from the field of cybersecurity are now applicable everywhere. All forms of security are underpinned and enabled by cybersecurity principles.

Since assessment security is part of what Schneier calls 'everything security', the rules of cybersecurity apply to it as much as they apply elsewhere. Any time a computer is involved (which is practically all the time), assessment security is subject to a range of threats that are not typically considered by the academic integrity community. But what is cybersecurity, and what does it have to offer assessment security?
Cybersecurity basics I use the term ‘cybersecurity’ to represent the security of information, computers and computer networks. The term has been criticised by some as a buzzword
(Futter, 2018), but it remains in large-scale use, including by academia, governments and industry. A precise (but somewhat complex) definition of cybersecurity, adapted from previous definitions is:

Cybersecurity is the organization and collection of resources, processes, and structures used to protect cyberspace and cyberspace-enabled systems from occurrences that misalign de jure from de facto property rights. (Craigen, Diakun-Thibault, & Purse, 2014, p. 17)

Cybersecurity is therefore about ownership and control. It is about ensuring systems remain in the state which their owners wish them to be in. And this is achieved through 'organization and collection of resources, processes, and structures', an approach that is not solely the domain of computer scientists. Many different practitioners and researchers work on improving cybersecurity, including people in non-technical professions. If we take a Schneier-inspired line of argumentation and view assessment security as being inextricably linked with and underpinned by cybersecurity, this means that academic integrity specialists are also key players in assessment cybersecurity.

The acronym CIA is often used to represent the three basic elements of cybersecurity: Confidentiality, Integrity and Availability (Simmonds, Sandilands, & van Ekert, 2004). Confidentiality has overlaps with popular understandings of privacy, in that it relates to controlling access to information. Confidentiality is crucial for assessment security, as a lack of confidentiality is what underpins some of the e-cheating affordances discussed in Chapter 1 to do with accessing secret assessment information. Integrity means ensuring information is not edited or deleted without permission. If a student improperly changes their mark for an assignment in an online gradebook, it's an attack against the integrity of that data. It is worth noting that the cybersecurity concept of integrity is much narrower than the meaning of the term within an academic integrity context; in the CIA triad, integrity is not a primarily ethical matter. Finally, availability means that systems can be used when required. Availability concerns the entirety of a system and its data – and with good reason. In one study, the availability of an online proctoring tool was compromised for part of the examination period, but students were still able to sit the test online without proctoring; those who completed the test without proctoring scored significantly higher (Brothen & Peterson, 2012). Availability is therefore of vital importance for assessment security systems, especially those that rely on surveillance or restrictions.

In addition to the basic CIA triad, three additional terms have come into use more recently to reflect the broader concerns of the field of cybersecurity (Simmonds et al., 2004): Authentication, Access control, and Non-repudiation. Authentication involves ensuring users of a system are the people they claim to be. Approaches to ensuring authentication include passwords and, more recently, two-factor authentication methods that use passwords with a second system, like an SMS message or an authentication app. As discussed in Chapter 2, authentication is essential for assessment security, as assessment involves making judgements about what a particular person is capable of, and we need to be sure we are
judging the right person. Access control involves restrictions on which parts of a system particular users can access. In an assessment security context, the teacher might be able to access examination questions and their accompanying answers at any time, whereas the student might only be able to access the questions, and only within a particular time window. Non-repudiation means not being able to deny one's actions. This is essential in an online examination context, as we need to be able to prove that a student did actually choose particular options, and that they chose to submit their work.
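To make the integrity and non-repudiation ideas concrete, here is a minimal sketch of how a submission system might record what was received, from which account, and when. This is my own illustration under assumed names (the key, function names and fields are invented), not a description of any particular assessment platform. Note the limitation flagged in the comments: an HMAC with a shared server key demonstrates integrity and authenticity, whereas genuine non-repudiation would normally call for an asymmetric digital signature, so that the signer cannot later deny producing the record.

```python
import hashlib
import hmac
import json
import time

# Hypothetical server-side secret; in practice this would live in the
# institution's key management system, not in source code.
SERVER_KEY = b"replace-with-a-long-random-secret"


def submission_receipt(student_id, file_bytes):
    """Record what was submitted, by which account, and when."""
    record = {
        "student_id": student_id,
        # Integrity: any later change to the file changes this digest.
        "sha256": hashlib.sha256(file_bytes).hexdigest(),
        "submitted_at": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    # Authenticity tag over the record. An HMAC only shows the record was made
    # by a holder of SERVER_KEY; true non-repudiation would use a signature.
    record["tag"] = hmac.new(SERVER_KEY, payload, hashlib.sha256).hexdigest()
    return record


def verify_receipt(record):
    """Check that a receipt has not been altered since it was issued."""
    payload = json.dumps(
        {k: v for k, v in record.items() if k != "tag"}, sort_keys=True
    ).encode()
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record.get("tag", ""), expected)
```

A receipt along these lines gives a marker or an appeals panel something to point to if a student later denies having submitted a particular file, which is the non-repudiation concern described above.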
Implications of cybersecurity for assessment security

The connections between cybersecurity and assessment security are clear: without cybersecurity, in a post-digital world we have no assessment security. But what are the implications of this?

Cybersecurity depends on an adversarial mindset

Academic integrity and cybersecurity have fundamental philosophical differences. At its heart, academic integrity believes people are good and that, with the right measures, they can be persuaded or educated into acting ethically. Our team has described this elsewhere as the 'positive mission of academic integrity' (Dawson & Sutherland-Smith, 2018, p. 287). In contrast, cybersecurity is based on the premise that some people may try to break things for a variety of motivations, including profit, curiosity or malevolence (Schneier, 2018). Cybersecurity is an adversarial field (Craigen et al., 2014); at its heart, it believes some people will always try to exploit weaknesses in a system. Following this mindset, we will not stop computer hackers by convincing them to act morally, and it is infeasible utopianism to think we can fix all of the social determinants of their misdeeds. Assessment security adopts cybersecurity's adversarial mindset, and begins from the premise that some students who are tempted to cheat will not be won over by the benefits of academic integrity. To stop these students from cheating, and to protect the public good of education, we need to act in an adversarial manner.

Assessment security has a harder authentication problem to solve than regular cybersecurity's authentication problem

In traditional cybersecurity authentication, the user does not want anybody else successfully authenticating as them. I do not want someone else logging in to my email, online banking, or social media accounts; therefore, I behave in a way that supports the authentication efforts of these providers. I do not share passwords, I take care to only use trusted computers when I log in to these accounts, and I use any additional security that is available, such as two-factor authentication. The authentication problem faced by assessment security is a much harder problem, to the extent that calling it 'authentication' is possibly unhelpful, as it makes it sound easy. Assessment security has to work both co-operatively and adversarially with legitimate users: for instance, it has to ensure a student's account is not accessed by
somebody else against their will, but it also has to ensure a student's account is not accessed by somebody else with the student's consent. This is a much harder problem to solve, as no amount of password security can stop somebody from simply giving their password to somebody else. Two-factor authentication is similarly vulnerable to a cheating student sharing whatever the additional factor is (e.g. codes from an authenticator app or SMS). Even biometric information might be vulnerable to intentional sharing. Given the difficulties that are still faced by the field of cybersecurity in the authentication space – every time someone's account is misused by somebody else it is essentially a breach of authentication – it is unsurprising that authentication is so challenging for assessment security, as the problem is even harder in this space.

The contexts of assessment security impose some challenging constraints for security

If you are trying to design a secure Automated Teller Machine (ATM), you face several design constraints, but ultimately you have a great deal of control. You get to choose the hardware, and you get to decide what inputs and outputs the system can take. In contrast, if you are designing an online assessment system that students can use on their own devices, you have far less control and you must operate under some challenging constraints. Online assessment systems need to function on a wide range of target devices, ranging from computers with many different operating systems through to smartphones and tablets running different versions of their own different operating systems. Your online assessment system needs to interoperate with a wide range of peripheral devices, including keyboards, microphones, webcams, USB flash drives and monitors. Amidst all of this diversity, you will also need to ensure that your system adheres to legal requirements around privacy and accessibility in education.

These features pose challenges for cybersecurity. For example, for accessibility reasons you might need to allow the user to plug a keyboard in; but computers cannot tell the difference between a keyboard and a device like the 'USB Rubber Ducky' (HakShop, 2014), a type of USB key injector that can be programmed to virtually type any arbitrary text at any moment. You might need the user to use a webcam, but if they bring their own webcam it might not function in the ways you expect, and it is open to tampering. You could try using the increasingly proliferating fingerprint scanners present in many computers and portable devices, but even these can be configured to misbehave. The 'in the wild' setting of assessment security makes it a challenging context to provide cybersecurity.

Assessment security depends on hardware security and software security

As mentioned previously, allowing users the freedom to bring their own device poses cyber challenges for assessment security. Assessment applications are a type of software and, at a fundamental level, software security is always dependent on hardware security (Barkley, 1994). When the legitimate user is a potential adversary, user-owned or user-administered devices provide weaker cybersecurity because
they can be tampered with. I have demonstrated this in my work with Bring Your Own Device (BYOD) online examinations (Dawson, 2016). By running online exams on my own computer, I was able to: use additional hardware that allowed me to bring in extra materials; run the examination environment within a virtual machine, allowing me to hand over control of the exam to someone else; and capture or broadcast the examination to other people. For matters of convenience, pedagogy, logistics and the student experience, the future of online examinations and e-assessment in general is likely to involve student-owned hardware, and I support this. However, we need to understand the assessment security that is given up when students administer their own assessment technology.

Assessment security will always be imperfect, but that doesn't mean we shouldn't try

Perfect cybersecurity is practically unattainable (Schneier, 2007). Yes, there are approaches to encryption that are perfectly secure. The one-time pad, which is one of the oldest encryption approaches, achieves perfect secrecy through an encryption key which is used only once and is as long as the message it is used to encrypt (Bellovin, 2011). But when the mathematics of encryption come into contact with the realities of large hardware and software projects, there is just too much complexity for perfect security to be feasible (Schneier, 2018). With the additional contextual layers that assessment adds to cybersecurity, it might be easiest to say that it's all too hard and abandon any hope of assessment security. However, just because perfection is impossible or infeasible, doesn't mean we shouldn't try. As with gains in cybersecurity (Schneier, 2018), gains in assessment security are likely to be incremental and small but, over the long term, contribute to significant improvements.

Security through obscurity is not enough

The approach of hiding how a system works in an attempt at improving its security is termed 'security through obscurity' (Hoepman & Jacobs, 2007). In and of itself, security through obscurity is not a bad thing, as it can act as an additional layer in the security of a system. But it's not very effective. Hiding a safe behind a picture in my house might make it harder for a burglar to find but, once found, the hiding place will do nothing to improve the security of the safe. If the burglar tells their friends where my safe is, they have effectively defeated this additional layer of security. There is little I can do to adapt, short of finding a new place to install my safe. The problem with security through obscurity is that, when people accidentally learn how your system works, or if someone tells them, that layer of security is no longer functional. For example, one online exams vendor I cannot name used to distribute their exam questions on a USB flash drive from which students live-booted the exam. The test questions were 'secure' against being taken away from the examination venue because the student is unlikely to know where to look on the USB to find them. This works up until the point the student stumbles upon their location; after that point they can tell their friends, and so on, until anybody
can find and share the exam questions. Security through obscurity, on its own, is not an effective approach to assessment security.

Once an expert makes software that can cheat, everybody can cheat

Schneier (2018) notes that in the past, security breaches required expertise to implement. If I know how to pick locks then I can break into houses, but it requires expertise that not everybody has. If you don't already know how to pick locks and want me to help you break into houses, I either have to do it for you or teach you how to do it. The landscape of modern cybersecurity is completely different. On sites like mpgh.net (2019) you can download software that can hack for you. As Schneier (2018) notes, experts can now make software that allows everyday users to do what an expert can do. This is one of the most troubling implications of cybersecurity for assessment security. When I initially raised my concerns about the weaknesses within one particular BYOD exam tool with its vendor several years ago, they rebuffed me by saying that only a very clever, tech-savvy student could do the things I described (Dawson, 2016). However, I told the vendor the same thing Schneier (2018) says: if I build this hack into a piece of software, then every student will be able to do what I can do.

Attack is easier than defence

One unfortunate reality of cybersecurity is that attacking is easier than defending (Schneier, 2018). The defender's job is to ensure there are no vulnerabilities, all of the time. This is a difficult, practically impossible job. The attacker's job is to find one vulnerability that works some of the time, and exploit it. This is a much easier job, and it gets easier the more attackers there are and the more time the attackers have to work on finding a problem. Adapting this to the context of assessment, it means that assessment security is a harder problem than cheating. Assessment security experts need to work on stopping all possible ways of cheating, whereas cheaters only need to find one way that works. Securing all acts of assessment all the time is likely an infeasible challenge; Chapter 9 considers how we might reduce the scope of the assessment security problem by being more targeted in using scarce assessment security resources.

Attackers don't need to account for laws, policy, ethics, student rights or public opinion

Educational institutions are highly regulated, and they must follow a range of rules around both their educational activities and their information technology practices. In contrast, attackers do not need to follow these rules. For example, attackers can blackmail or infringe copyright if it helps achieve a goal. These are not legal (or ethical) options for educational institutions. Attackers can surveil connections and computers to identify ways to attack; educational institutions must operate within privacy frameworks that limit the sorts of monitoring they are allowed to undertake. Even if some practices might be ethically defensible and legally allowable,
educational institutions can still sometimes face significant resistance from students or the general public when they implement some assessment security approaches. For example, when the University of Sydney introduced the Cadmus system, which monitored student completion of assessment tasks, it faced strong resistance from students, which may have contributed to the university abandoning the use of Cadmus at that time (Mason, 2020; Stojanovic-Hill, 2017). As Schneier (2018) notes, attackers are at an advantage because they do not need to follow the rules. Importantly, I am not advocating for a softening of any rules around institutional practices; however, it is important to recognise that the assessment security playing field is not level.
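Before turning to responses, it is worth making the earlier authentication point concrete. The codes produced by authenticator apps are typically time-based one-time passwords; the sketch below is a standard RFC 6238-style construction written purely for illustration (the shared secret is made up, and this is not any particular vendor's implementation). The point is that the cryptography is not the weak link for assessment security: a student who wants someone else to sit their assessment can simply read the six-digit code off their phone and pass it along; this is the consensual sharing problem described above.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """Generate a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# 'JBSWY3DPEHPK3PXP' is a made-up example secret. The code changes every 30
# seconds, but nothing in the scheme stops the legitimate account holder from
# forwarding it to whoever is sitting the assessment on their behalf.
print(totp("JBSWY3DPEHPK3PXP"))
```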
Improving assessment security through cybersecurity

Given the implications of cybersecurity for assessment security, what can we do? Quite a lot, actually. Cybersecurity is not just a discipline of problems; it is a discipline of 'resources, processes, and structures' (Craigen et al., 2014, p. 17). In this section, I identify some of the approaches from cybersecurity that can be adapted to assessment security.

Penetration testing

An adversarial approach to assessment security means we need to understand the approaches an adversary might take. One of the key approaches used in cybersecurity to understand the vulnerabilities of a system is penetration testing; it involves systematically trying different types of attacks (Arkin, Stender, & McGraw, 2005). Preparing for a penetration test requires thinking about the potential attackers and the potential flaws in a system (Bishop, 2007). When I have attempted to cheat in computer-based examinations, I have adopted a penetration testing approach (Dawson, 2016). But although penetration testing comes from the field of cybersecurity, its underlying ethos can be applied to assessment security more broadly – even to completely offline assessment. We need to think like somebody who is trying to break our assessment security. What would they do?

The claims that can be made by penetration testing are existential rather than universal (Bishop, 2007). If we conduct penetration testing on a system and find some vulnerabilities, we now know those vulnerabilities exist (an existential claim). Penetration testing cannot, however, prove that all vulnerabilities have been found (a universal claim); similarly, if we do not identify any vulnerabilities through penetration testing this doesn't mean that a system is secure. It just means we didn't find any vulnerabilities this time. This means that, while penetration testing can give an indication as to the security of a system, it is at best a hunch. What penetration testing is good for, however, is identifying new vulnerabilities that might be able to be fixed.

In the field of cybersecurity there are professionals whose entire jobs revolve around penetration testing and associated activities. These roles typically require degree qualifications in computer science or information technology, as well as relevant industry certifications (Potter & Vickers, 2015). Educational institutions
could similarly consider specialist roles for assessment security penetration testers, either as employees or consultants. In cybersecurity these roles are sometimes filled by former illegal hackers, who are recognised as having a particular set of skills that are hard to come by (Mitnick & Simon, 2009). Following cybersecurity's lead, educational institutions could consider employing experienced cheating students as potential penetration testers for assessment security. This would not be without its controversies and ethical problems.

One key challenge to penetration testing is that some vendors do not like outsiders testing their products in this way (Schneier, 2018). I have experienced this first-hand in an assessment security context. In 2019, I was awarded an internal grant by my university to test the security of various online examinations tools. I was to work with professional cheaters hired through contract cheating sites, and pay them to attempt to cheat with a range of online examination tools. When my university approached the vendors to inform them about this project, all three vendors explicitly disallowed me from doing this work. They claimed it was against their terms of service. In case there were any ambiguities, one vendor even made special amendments to the terms of service, just for my university, explicitly disallowing any testing of this type. As a result, my university's solicitors advised me against conducting this research, so I had to return the funds unspent. Unfortunately, attackers do not need to obey the law, as discussed earlier. While I was barred from conducting this research, I know that others have conducted penetration testing against online examinations tools without consideration for their terms of service. If legitimate researchers are unable to penetration test assessment security products, this means the only people conducting penetration testing are people willing to break the law – and these are probably the sorts of people online exam vendors and educational institutions should fear the most.

Disclosure and transparency

What should a researcher do if they find an assessment security vulnerability? In the world of cybersecurity, the problem of disclosure has been the subject of debate for decades. Different parties prefer different approaches. In his essay on disclosure, Schneier (2007) first puts forward the case for secrecy, an approach where researchers disclose vulnerabilities to the vendor of the product but nobody else. The vendor can then fix the vulnerability before anybody else finds out about it. Schneier (2007) notes that secrecy is premised on the false assumption that other people are unlikely to independently discover a vulnerability unless the researcher tells them about it. From a secrecy standpoint, the problem 'is less the vulnerability itself and more the information about the vulnerability' (Schneier, 2007). Secrecy proponents believe that vendors will fix problems when they are made aware of them, and that it is irresponsible for researchers to make those vulnerabilities public before they are fixed.

There are two key problems with a secrecy approach to cybersecurity. Firstly, recent history has taught us that independent hackers often discover vulnerabilities researchers have also discovered; in fact, hackers sometimes get there first (Schneier, 2018). Secondly, vendors have a poor track record when it comes to
fixing vulnerabilities within a reasonable amount of time (Schneier, 2018). This means that we end up with the worst of both worlds: the bad guys know how to break in, but the vendors don't bother fixing the vulnerability because there is no pressure to do so.

At the other extreme, some researchers advocate for immediate full public disclosure of vulnerabilities (Schneier, 2007). In the full disclosure model, as soon as a researcher finds a vulnerability, they release it publicly. In effect this forces the vendor's hand, as now everybody knows about the problem and, if the vendor doesn't act quickly, they will lose customers. Schneier is an advocate of the full disclosure model. But critics of full disclosure argue that it '[unlocks] a pandora's box where unsavory individuals are able to rapidly and easily produce exploits, and compromise vulnerable systems' (M. Hughes, 2015). Remembering that, in cybersecurity, attack is easier than defence, it can be quicker to exploit a newly disclosed vulnerability than it is to fix it.

Responsible disclosure is something of a compromise between secrecy and full disclosure, where a researcher tells a vendor about a vulnerability and gives them a period of time to fix it before the vulnerability is made public (Bergman, 2015). Advocates for responsible disclosure argue that it still provides motivation for vendors to fix problems, but it also gives vendors time to fix those problems (M. Hughes, 2015). There has been some debate about whether responsible disclosure should be enshrined in law (Bergman, 2015).

Returning to assessment security (not that we ever left; remember, cybersecurity is everything security!), what should a researcher do when they discover new e-cheating approaches or other assessment security vulnerabilities? Should they tell the responsible vendor or institution and nobody else (secrecy), and risk no action being taken? Should they immediately tell the public (full disclosure), and risk alerting students to new ways to cheat? Or should they negotiate a timeline for disclosure (responsible disclosure), and provide the vendor or institution time to fix it? This is a debate that needs numerous voices: vendors; educational leaders; academic integrity experts; cybersecurity researchers; and, most importantly, students.

Whatever approach to disclosure is adopted by the assessment security community, it needs to be communicated to researchers, journals and ethics committees. When I submitted my paper on hacking online exams, one reviewer took issue with me disclosing this information in a paper. They had a legitimate viewpoint: wasn't I just telling students how to cheat? I addressed this by pointing out that I would never provide a cheating how-to guide (although those are readily available online); rather, that I was pointing out classes of vulnerabilities that need addressing. But we need to ensure that the processes that surround assessment security research, like peer review and ethical approvals, are mindful of whatever preferred stance is taken to disclosure. In some countries, when large organisations have cybersecurity breaches there are processes in place for supporting whistleblowers or requirements to publicly disclose (Pacella, 2016). Should universities similarly have processes in place to protect whistleblowers who publicly disclose large-scale assessment security breaches?

In addition to the need for researchers to be able to disclose potential assessment security vulnerabilities, we need to think as a sector about the potential
benefits and risks of transparency around assessment security. I am of the opinion that, as a predominantly publicly funded sector, we have a duty to the public to be transparent about cheating. However, a move to transparency would require cooperation by an entire sector; there is too much to lose for an individual institution to be the first to admit to rampant cheating. Journalists in the UK have already demonstrated that they are able to obtain this information through Freedom of Information (FOI) requests (Lee, 2019); if universities do not act proactively they may face trial by media instead of carefully managed disclosure.

Working with students

Students are meant to be the ultimate beneficiaries of what we do in academic integrity. In improving assessment security, we need to work with students. A first step might involve educating students about the cybersecurity implications of e-cheating. Sites that provide e-cheating services can have poor cybersecurity, which can lead to breaches of cheating students' privacy (Ellis, Zucker, & Randall, 2018). Through poor cybersecurity practices on the part of the MyMaster contract cheating site, journalists were able to gather data that ultimately led to more than 70 Australian students being suspended, and some having degrees revoked (Visentin, 2015). This was an Australian contract cheating operation that actually operated within Australia and was required to abide by Australian laws, which include strict privacy and cybersecurity protections. The protections afforded by cheating providers operating outside of jurisdictions like Australia may be far less. Students who provide cheating sites with sensitive information should know that they face great risk, thanks to those sites' poor cybersecurity practices.

Some e-cheating approaches require students to engage in poor cybersecurity practices themselves. When hiring someone else to take their online course, students need to hand over their login credentials. This provides cheaters with full access to whatever that student can access, potentially including: emails; financial information on the university's administrative systems; personal information like their address; and private messages on the university's learning management system. But to make matters worse, particularly devious e-cheaters can access university server systems using the student's login details and run malicious software as the student. This might include engaging in media piracy or other illegal acts – all under the student's name. Some of these approaches can persist long after the student has changed their password. Students need to be made aware of the consequences of sharing their account details with cheating services.

In addition to educating students about the poor cybersecurity practices of cheating sites, we should also be aware that some assessment security approaches might inadvertently normalise poor cybersecurity practices to students. Some online exams and remote proctoring platforms rely on students essentially handing over complete control of their computers to a commercial vendor. These vendors can access anything on the student's device and surveil everything they do. Some students are understandably wary about this intrusion (Singer, 2015). When universities use these approaches they are, to an extent, legitimising briefly handing over control of your entire computer, and by extension your entire online life, to a
corporation. This is far more power than students give to even Google or Facebook. Is this an attitude to cybersecurity that we want to teach to students? This topic is further explored in Chapter 7, with a particular focus on how surveillance at university might induct students into society's broader cultures of surveillance.

Working with cybersecurity

It is unreasonable to expect academic integrity researchers and professionals to suddenly become experts in cybersecurity. But it is perfectly reasonable to expect academic integrity as a field to make more meaningful engagements with cybersecurity. This is happening, albeit slowly. Papers are beginning to adopt cybersecurity approaches to academic integrity problems, especially to the challenges of securing online examinations (Dawson, 2016; Sindre & Vegendla, 2015). Articles are even being published in cybersecurity journals about academic integrity technologies (Amigud, Arnedo-Moreno, Daradoumis, & Guerrero-Roldan, 2018). But on a practical, local level, there is a need for greater connection and co-operation between the two fields. They tend to have different terminologies and concerns. With the notion of assessment security, it is my intention to advance a concept that appeals to both fields. Continuing the drive to connect assessment security with other fields that have already been thinking about ideas that concern us for a long time, the next chapter connects with research in games and gambling.
Things to do

• Read Schneier's (2018) book, Click here to kill everybody: Security and survival in a hyper-connected world. It is an accessible snapshot of the current state of cybersecurity. I can personally recommend the audiobook, read by Roger Wayne. As an alternative, Schneier gave a 50-minute talk at Google in 2018 about the ideas in the book, which is available at https://youtu.be/GkJCI3_jbtg.
• Find a guide on penetration testing, and try the overall approach out on your assessment security. There are guides in peer-reviewed journals (e.g. Tang, 2014), as well as more how-to oriented guides accessible via web searches. There are also several Massive Online Open Courses (MOOCs) available on cybersecurity in general as well as specific MOOCs on penetration testing. Remember, penetration testing can only identify vulnerabilities and approaches to exploit them; it cannot guarantee that no other vulnerabilities exist.
• Find out who is responsible for cybersecurity at your institution and take them out for a coffee (this is what we do in Melbourne; substitute for other drinks as appropriate). Find out what their thoughts are about the big threats to assessment security at your institution. See if they would be willing to collaborate on your next academic integrity or assessment security project.
5
Lessons from E-Cheating in games and gambling
E-cheating is an existential threat to education – but arguably no more of a threat to education than it is to other industries that rely upon knowing who their customers are and the conditions their products are being used within. This chapter continues the previous chapter's mission of connecting assessment security with concepts from other fields facing similar challenges, taking games and gambling as two pertinent cases. As with education, these two industries had a history of pre-digital cheating and a set of approaches to deal with those forms of cheating. And, also similar to education, games and gambling have faced a range of new types of cheating as they have become more digitally enabled. But unlike education, games and gambling businesses and organisations have invested substantial sums in addressing e-cheating – as much as 20–25% of their budgets (Ashton, 2019). This chapter covers the ways in which e-cheating happens in online games and online gambling, as well as the anti-cheating technologies that are used to prevent, detect and prosecute e-cheating. The chapter then identifies key lessons that can be learnt from how these industries have dealt with e-cheating.
Cheating in online games

The video games industry is huge. In 2018, the video games industry had greater revenues than the music industry and the film industry combined, market capitalisation of almost a trillion dollars, and 2.4 billion players (D'Argenio, 2018b; Gough, 2019; Nakamura, 2019). The biggest entertainment product release of all time in terms of revenue is a video game, Grand Theft Auto V, which at $6 billion in revenue surpasses even Star Wars and Gone With the Wind, whose revenues only come to $4 billion each after adjusting for inflation (Batchelor, 2018; MarketWatch, 2018). Cheating in video games therefore threatens an almost trillion-dollar industry, with billions of customers.

Video gamers have a complex relationship with cheating. As in educational cheating, the definition of cheating is socially constructed, and it varies from context to context (Consalvo, 2007). Based on interview and survey data from gamers, Consalvo (2007) argues that most players give themselves permission to cheat sometimes – perhaps to get through an unfairly difficult or boring part of a single-player game – and that cheating can be a creative or positive act.
When gamers push the boundaries of games in a collaborative, social setting it can be its own unique type of gameplay (Apperley, 2011). Some cheating is therefore regarded as acceptable practice in games, especially single-player and non-competitive games. But online competitive games are different (Consalvo, 2007). In these settings, players depend on a shared set of rules that are enforced by the game. While cheating in a vintage single-player game (like typing in a code to get more money in the city-building simulator Sim City) might be a matter of personal preference, when someone cheats in a competitive online game (like using an 'aimbot' to give them perfect aim in the online first-person shooter Fortnite), or a single-player game with a competitive scoreboard (like gaining infinite lives in Candy Crush Saga), it ruins everybody else's game. It doesn't matter if there are not really any stakes in a game beyond topping the global scoreboard; cheating gives you an advantage over other players. Cheating in online multiplayer games is closer to cheating in a chess tournament than it is to cheating in the solo card game Solitaire. For the rest of this chapter, when referring to cheating in online games we will focus on competitive online gaming.

Cheating in online games can provide a range of advantages to a player, depending on the type of cheat and the type of game. In the evenings between writing this book, I have been playing the 2019 first-person shooter Call of Duty: Modern Warfare. On encountering players who were blatantly cheating, I went to the Multiplayer Game Hacking forums (mpgh.net, 2019) to find out what they have been up to. On those forums I found promises for hacks that can provide all manner of advantages. These include aimbots, which can greatly assist with aiming accuracy, as well as other subtler hacks, like tricks to fool the game into thinking you are playing with a console controller rather than with a mouse and keyboard. This prompts the game to provide you with assistance and pair you up with other players who use controllers.

Many of the cheating approaches used in online games involve software that is downloaded to assist the player. While some of these cheats are free, many of the more effective and less detectable cheats are only available by purchasing (mpgh.net, 2019). The sites that sell these cheats make grand promises that their cheats are undetectable and cannot result in the user being banned, with marketing copy that is reminiscent of the promises made by contract cheating sites.

In addition to downloading specialist software, some gamers manipulate weaknesses in the game itself – referred to as 'exploits', 'glitches', or 'cheesing' (Moeller, Esplin, & Conway, 2009) – which provide some particular advantage; for some older competitive games there are extensive lists of exploits that have been discovered and patched over time (e.g. the 2007 game 'Team Fortress 2', Team Fortress Wiki, 2019). Exploits are a somewhat murky territory, with some competitive games communities embracing them as game features – for instance, the 'speed running' community that uses exploits in titles like Super Mario Bros to complete games in minimal time (Gach, 2016). Some other communities and publishers view exploits as a bannable offence, tantamount to cheating (e.g. the game building platform Roblox, Roblox Wikia, 2020). This again echoes the
socially constructed nature of cheating, and the murky boundaries between cheating and simply being strategic.

As with e-cheating in educational settings, there are also hardware approaches to e-cheating in games. If you have ever played a fighting game like Mortal Kombat, Tekken or Street Fighter, you might have struggled to remember the long sequence of button presses required to execute particular special moves. Even if you could remember the ten or more button presses required to execute a particular move or sequence of moves, accurately and reliably pushing those buttons in the right sequence is challenging. To remedy this, some players use special keyboards or controllers that can record a series of moves and then play them back; this is effectively the same e-cheating approach as the 'USB Rubber Ducky' (HakShop, 2014) device discussed in the previous chapter that can type arbitrary text for students in an exam.

Another e-cheating approach with parallels to educational e-cheating is the use of a 'lag switch', which temporarily interrupts a network connection. Many games attempt to shield players from the negative effects of unreliable Internet connections, and lag switches exploit these protections. For example, a player might hide behind cover in a shooting game, turn on their lag switch, and then be free to move around without being shot at until they turn their lag switch off. This method of hardware e-cheating can be directly applied to any synchronous e-assessment task like a remote-proctored examination to exploit any protections those platforms have against network instability. Willett and Hogan (2019) draw a distinction here between two types of cheating: 'cheating the game' and 'cheating the network'. Attacks like the lag switch fall into the latter category, alongside simply pulling out a network cable before losing a game.

There are also methods of cheating that require no special hardware or software. 'Screen peeking' is one such method, and it originated in the days of split-screen multiplayer gaming where players would use one console plugged into a TV to competitively play games; each player would be given a segment of the screen that presented their character's perspective of the gameworld (Kubesch, Lankes, & Maurer, 2019). In split-screen games, a single screen (television, monitor, etc.) is split into sections, one for each player. If you look at another player's section, you are screen peeking. Screen peeking was often termed 'screen cheating' (Kubesch et al., 2019), because it gave you a significant advantage. Players came up with a variety of approaches to stop screen cheating, even going as far as making cardboard dividers to attach to a television. However, screen peeking was not always considered cheating and some games even encouraged it, with the manual for Super Mario Kart recommending you 'Always keep an eye on your opponent's screen' (Nintendo EAD, 1992). However, many players considered screen peeking in Super Mario Kart to be screen cheating. This again exemplifies Consalvo's (2007) contextual and relational understanding of cheating.

In the modern age, where 'streamers' play games that are livestreamed over the Internet, screen cheating has a new incarnation: stream sniping. Players who cheat in this way watch the livestream of a streamer while playing against that streamer, providing the same advantage that screen cheating provides but without needing to be in the same location. Players have been banned from games
for stream sniping; however, it is notoriously difficult to prove, and bans for stream sniping can wrongly target innocent players (Frank, 2017).

Being banned from an online game for cheating can have very serious consequences. When you have been playing a particular online game for a long time, a ban threatens to disconnect you from a range of social connections, in-game rank or status, and virtual possessions. Bans can take a variety of forms. At the lowest level, you can be simply kicked off a particular multiplayer game server but allowed to reconnect immediately or after some set amount of time. Players can often administer these processes through voting when they think a particular player is cheating. More serious bans might block your account from a server, or a group of servers, for life. Even more seriously, your account could be banned from a game, or just its online multiplayer functionality, forever. At the most extreme end, your account on a platform – like your Xbox or a PC gaming platform like Steam – could be banned from online play for life. This represents a significant penalty for some users; according to the Steam platform I have spent $1,194.07 on games in the decade I have used it, and a ban would result in me losing access to multiplayer in many of those games.

Bans are just one approach to penalising cheating. Penalties for cheating in competitive online games can be even more severe and, in rare cases, can include legal action by game publishers (Jeong, 2018) or even prison sentences (Hatmaker, 2018). Legal action has also been taken against producers of cheating software, as happened to 'Bossland', who made cheating software for the Massively Multiplayer Online Role Playing Game (MMORPG) World of Warcraft (Zhang, 2019). Some countries take cheating particularly seriously, most notably South Korea, with their Game Industry Promotion Act and Information and Communication Technology Protection Law each threatening maximum penalties of $43,000 in fines and a five-year prison sentence for game cheaters. On the less extreme end, Grand Theft Auto V punishes players' more minor infractions by labelling them as 'bad sports' (IGN, 2016). The bad sport label applies for a limited amount of time, and comes with penalties that can include wearing a dunce cap in-game or only being allowed to play with other bad sports. Penalties for cheating in games are thus diverse, and range from ridicule through to exclusion, financial penalties and imprisonment.
Cheating in esports

In the last two decades, there has been a consistent increase in commercial interest in 'esports': organised competitive video games in which professional gamers play against each other in tournaments that are often livestreamed (Taylor, 2012). In 2019, global esports revenues (including advertising, broadcasting rights and sponsorships) were projected to be $1.1 billion, with 454 million viewers (Russ, 2019). The stakes in esports are higher than in casual gaming, with real money on the line for winners. The 100 largest prize pools in esports history were all worth at least a million dollars and, at the time of writing, there are 83 players who have earnt at least a million dollars from esports (Esports Earnings, 2019). These prize pools are often provided by game publishers; for instance, Epic Games contributed a $100 million prize pool in the 2017–2018 Fortnite competitive season (Thier, 2018).
Therefore, the stakes are very high in esports – which heightens the importance of detecting cheating. Given the size of the prize pools available, it is perhaps unsurprising that esports has been the site of several high-profile cheating and integrity cases. In December 2018, twelve professional players of the game PlayerUnknown's Battlegrounds (PUBG) received a range of penalties, including bans, for using cheating software (Fitch, 2019). In October 2018, a player in a Counter-Strike: Global Offensive (CSGO) tournament had to be physically restrained when he was caught cheating, to stop him from deleting his cheating software (D'Argenio, 2018a). Six Australian esports professionals were arrested in August 2019 under the offences of 'engaging in conduct that corrupts a betting outcome' and 'use of corrupt conduct information for betting purposes' for allegedly engaging in 'match fixing' by intentionally losing games of CSGO (Zhou, 2019). These are but a few of the cheating and corruption scandals that have rocked the esports world in recent years.

Multiple entities have taken responsibility for addressing the problems of cheating and corruption in esports; for instance, ESL (formerly known as the Electronic Sports League), which was founded in 2000 and is the largest and oldest esports company still in operation (Graham, 2015). In addition to addressing cheating and corruption in esports, ESL has collaborated with Germany's National Anti-Doping Agency in developing approaches to deal with drug use in esports (McCambridge, 2017). Around 20–25% of ESL's budget goes towards anti-cheating measures, which include anti-cheating technology as well as special investigators:

The conflict against cheaters could be the basis for the script of a crime series. ESL has people who are not, on paper, related to the company; going undercover into online cheating boards. Similar to the virus industry, the goal is to get the latest cheats and reverse engineer them. "We basically have a cheating lab setup," [Marcel Menge, SVP Play & Platforms at ESL] said. "It has its own internet connection, so none of our IP addresses can be linked to that room. Stuff like that is required to really compete with the cheat developers. It really is a big market." (Ashton, 2019)

In addition to game publishers and tournament organisers combatting cheating, independent bodies have formed to act as regulators of esports integrity. The largest of these is the Esports Integrity Commission (ESIC), established in 2016 (ESIC, 2019c). Just as the International Center for Academic Integrity has its 'fundamental values' (Fishman, 2014), ESIC also has a set of principles: integrity and respect; fair process; implementation, education and enforcement in standardised codes; recognition of sanctions; sharing of information; and confidentiality (ESIC, 2019c). In addition, the ESIC maintains a number of codes, including a Code of Ethics, a Code of Conduct, an Anti-Corruption Code and an Anti-Doping Code. The Code of Conduct in particular is worth further discussion here, as it outlines different levels of offences, methods for reporting offences, standards of proof and evidence, the types of sanctions that can be applied, and appeals processes (ESIC, 2019a). This code is more comprehensive than some universities' academic
integrity procedures and, in some respects, more useful to its target audiences, as it details precisely what will happen where cheating is alleged and has an easily understandable schedule of penalties. The ESIC even explicitly details the substances that are prohibited under its anti-doping procedure (ESIC, 2019b); this is also much more advanced than the academic integrity sector, whose engagement with anti-doping has been relatively minimal (Dunn, Dawson, Bearman, & Tai, in press).

However, despite the existence of bodies that claim responsibility for stopping cheating in esports, there has been little in the way of player education about integrity (with the possible exception of ESL's anti-doping work). Instead, much of the activity in this area has focused on working with tournament organisers and vendors. ESIC Commissioner Ian Smith has stated that the industry generally deals well with the threat of cheating due to the 'very real commercial threat' it poses; he regards their response as appropriately 'swift and merciless' (Ashton, 2019).

In another parallel to education, there is a range of groups in esports with different motivations for addressing cheating. ESIC Commissioner Smith notes that, while tournament organisers care most about stopping the few top percentage players from gaining an unfair advantage, game publishers are instead motivated by stopping players who cheat to ruin other players' fun. This results in publishers focusing on prevention, whereas tournament organisers focus instead on detection (Ashton, 2019). The different interested parties in educational cheating – such as students, universities, accrediting bodies, employers, and educational technology companies – may also have different and potentially conflicting motivations in addressing educational e-cheating.
Cheating in online gambling

Since the 1990s, online gambling has grown from a few websites used by a small number of gamblers to a major global industry used by millions. In 2017 the online gambling market was worth $45.8 billion, with estimates that it would almost double in size by 2024 (Lock, 2019). While smaller than the video games industry, the stakes are arguably higher in online gambling because money is always on the line. The online gambling industry faces many problems, including unfair or illegal practices on the part of both players and business operators (R. J. Williams, Wood, & Parke, 2012). Players report being concerned they will be cheated, not only by other players but by the online gambling businesses themselves (Gainsbury, Parke, & Suhonen, 2013). How do players know that the other players on a virtual poker table aren't collaborating, or if the virtual roulette wheel they are using is fair? Cheating in online gambling is a complex, multifaceted problem, made more complex by gambling's contested legal and moral standing.

As in online games and esports, specially-written cheating software can present a significant problem for online gambling. Casinos have the opportunity to fix the odds in ways that are neither transparent nor legal, which is cheating on the casino's part. While players do not have that opportunity, or the chance to perform sleight-of-hand moves with cards or chips, in games of skill they can still call in outside help. In much the same way that cognitive offloading to a tool or outsourcing work
to a person can provide students with access to unauthorised information, gamblers can run bots that tell them how to play or get advice from expert players. In a face-to-face gambling situation these approaches would be easier to detect, but online they are difficult to identify. As with gaming more generally, it is challenging to tell the difference between a human and a bot in an online gambling situation (Altman, 2013; Kotkov, Pandey, & Semenov, 2018). This is troubling because bots are able to perform much better than the average player in games like poker. For example, in 2017 the Libratus bot won a 20-day poker tournament against four world champions (Solon, 2017). The ability for anybody to play at world champion level, simply by running a computer program, greatly threatens the integrity of poker.

As with online games and educational e-cheating, hardware approaches can also be useful when cheating in online gambling. Some of the simplest approaches fall under Willett and Hogan's (2019) 'cheating the network' category. Taking online poker as an example, players can exploit the disconnect protection built into the game, which is intended to help players whose connection has been interrupted (w34z3l, 2015). In many online poker games, if a player has already contributed money to the pot and their network connection is interrupted, they will be kept in the hand without having to match any additional bets made by other players. The maximum amount they can win is restricted based on how much they have contributed, but they no longer have to risk more money to have a chance at winning the hand; to use poker terminology, they are treated as if they had gone 'all in' on the hand, but with smaller stakes. To combat this cheating approach, some platforms restrict the number of times a particular gambler can use disconnect protections. While this approach likely lessens the overuse of disconnect protections, it still permits the strategy to some extent, and it risks unfairly labelling players with poor Internet connections as cheaters.

Just as face-to-face education is now threatened by e-cheating, face-to-face gambling is also now subject to e-cheating hardware attacks. For instance, Bursztein (2016) obtained a high-end poker cheating device for $1,300 that looks like a regular smartphone and is used alongside a range of accessories – like a hidden camera that looks like a car key fob and a hidden earpiece and/or haptic feedback device – to tell the player who has the best hand. Bursztein, an experienced security researcher and Google's Anti-Fraud and Abuse Research Lead, noted that:

High-end poker cheating devices are real and highly efficient. The complexity and the build quality of these devices as well as the number of games they support indicate that there is a very profitable and active black market for gambling cheating devices. (Bursztein, 2016)

Hardware cheating of this type may require a significant investment, but it can potentially provide enough information to know whether you will win any particular hand. There are analogous approaches in online gambling that provide the same information to a player. Malware such as Win32/Spy.Odlanor (Lipovsky, 2015) has been designed to specifically target users of large poker sites and transmit what is on their screens to the malware's distributors. This same malware, adapted to an
educational high-stakes testing situation, would enable the e-cheating affordance of providing access to unauthorised information, as discussed in Chapter 1.

The potential benefits of cheating in online gambling are significant enough to warrant attention from criminals. McMullan and Rege (2010) offer three classifications for gambling cybercriminals. 'Cybernomads' predominantly use pre-packaged cheats developed by others and tend to operate alone. While they are part of a subculture, with online meeting places, they rarely collaborate outside of subcontracting arrangements. 'Dot.cons' are teams that join together for particular projects, which can be one-off or long-term collaborations. McMullan and Rege (2010) cite the case of Chris 'BluffMagCV' Vaughn, who engaged in 'seat-stealing' – that is, he sold his seat in an online tournament to a more experienced collaborator in exchange for a percentage of the prize. This results in the player appearing to completely change their playstyle late in a tournament. Vaughn and his collaborator were detected by a change in IP address and denied the $1 million prize money they had earnt. Had they been joining the online tournament from the same Internet connection, perhaps via a VPN, they would not have been detected in this way. The third group of cybercriminals are 'assemblages': larger-scale criminal organisations that engage in activities like extortion, phishing, identity fraud and money laundering. They are structured as ongoing enterprises and engage in long-term fraud rather than the 'short cons' of dot.cons. Educational e-cheating equivalents for all three groups can easily be recognised: the lone cheating student as a cybernomad; a group of students colluding to cheat a particular online exam as a dot.con; and a contract cheating site as an assemblage.
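The way Vaughn and his collaborator were caught suggests one simple kind of signal an operator might look for: the same tournament entry appearing from two unrelated networks. The sketch below is purely illustrative; the names, thresholds and example addresses are mine rather than any operator's actual system, and, as noted above, the same heuristic misses collaborators who share a connection and can wrongly flag players who legitimately reconnect from a new network.

    from dataclasses import dataclass

    @dataclass
    class SessionEvent:
        """One observation of a player's connection during a single tournament."""
        hand_number: int
        ip_address: str

    def network_prefix(ip: str) -> str:
        # Compare /24 prefixes rather than full addresses, so ordinary
        # home-router address churn is less likely to trigger a flag.
        return ".".join(ip.split(".")[:3])

    def flag_possible_seat_stealing(events) -> bool:
        """Flag a tournament entry whose connection changes network mid-event.

        This is an indicator, not proof: a player switching from home wifi to a
        phone hotspot looks identical, and collaborators who share one
        connection (or a VPN) will not be seen at all.
        """
        prefixes = {network_prefix(event.ip_address) for event in events}
        return len(prefixes) > 1

    # Example: one account seen from two unrelated networks within one tournament.
    history = [SessionEvent(10, "203.0.113.25"), SessionEvent(250, "198.51.100.7")]
    print(flag_possible_seat_stealing(history))  # True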
Anti-cheating technologies in games and gambling

The two main approaches used to thwart cheating in games and gambling are very similar to the lockdown and surveillance approaches discussed in Chapter 2. Both of these approaches have much longer histories in games and gambling than in education.

Lockdown approaches depend on restricting the functionality available on a device during a gaming or gambling session. Games consoles and smartphones are useful examples of lockdown in this context. While a general-purpose computer allows for a range of software to be installed by the user, games consoles and smartphones can be set up to only allow software that has been digitally signed and/or distributed by a central authority to be installed. For example, my Nintendo Switch only allows software to be installed in two ways: through a cartridge that has been digitally signed by Nintendo, or through software purchased on Nintendo's online store. Similarly, an iPhone only allows software purchased through Apple's App Store to be installed. Both Nintendo and Apple are highly vigilant in patching any vulnerabilities that allow circumvention of these restrictions, and both have taken legal action against people who share ways to 'jailbreak' their devices to allow the use of third party software (BBC News, 2019a; Doolan, 2018). Although there are some cases of gamers cheating on consoles and smartphones, such as manipulating GPS data in the augmented reality game Pokemon Go (Zhao & Chen, 2017), gaming on these platforms has less of
a reputation for cheating than gaming on more open platforms like PCs, thanks to the use of lockdown approaches. However, while platform manufacturers may claim these high levels of restriction are motivated by a desire to stop cheating, it is likely that restrictions against unauthorised software are also motivated by a desire to prevent piracy.

On the PC platform, where lockdown approaches are less prevalent, numerous surveillance tools are instead used to see if game players are cheating. These programs largely monitor software that is running or installed on the player's computer, in order to determine if prohibited or suspicious software is also running. Common examples owned by major game publishers include Valve Anti-Cheat (VAC), which is installed alongside many games on the Steam platform, and Easy Anti-Cheat, which is installed alongside the game Fortnite amongst others. Competitive gaming organisers also operate their own anti-cheating software, such as FACEIT Anti-Cheat and ESL Anticheat. Some other groups also offer standalone anti-cheat tools that are not attached to any particular publisher or games organiser, such as Kaspersky, who are better known as an antivirus company.

In addition to monitoring for cheating software that is being run on the gamer's computer, anti-cheat software also employs a range of other approaches. In 2014 VAC was controversially found to be monitoring users' Domain Name System (DNS) history (McCormick, 2014). DNS is a part of the underlying Internet infrastructure that is used to convert human-readable domain names (like philldawson.com) into machine-readable IP addresses (like 162.241.218.103). At the time, the story was reported as if Valve were monitoring users' Internet history, leading to the following Q&A being posted by Valve's president:

1. Do we send your browsing history to Valve? No.
2. Do we care what porn sites you visit? Oh, dear god, no. My brain just melted.
3. Is Valve using its market success to go evil? I don't think so, but you have to make the call if we are trustworthy. We try really hard to earn and keep your trust. (G. Newell, 2014)
The reason for Valve surveilling its users' DNS requests was the piracy prevention technology used by cheating software creators. The particular type of high-end cheating software Valve were attempting to defeat 'phoned home' on a regular basis to ensure that it was being used by a licensed user – that is, someone who had paid to cheat. By scanning for a particular DNS request in the user's history, VAC was able to identify 570 cheaters. In defending Valve's approach, G. Newell (2014) stated that secrecy is necessary for the viability of anti-cheating software. He also argued that scandals which paint VAC in a negative light are a form of 'social engineering' by cheaters against VAC; rather than trying to create new ways to cheat (which he regards as very difficult), cheaters can simply erode trust by '[casting] VAC in a sinister light' (G. Newell, 2014). Parallels could be drawn with debates
currently underway in education around large-scale anti-cheating software vendors like text-matching and remote proctoring companies. Constant criticism of assessment security technology, including by cheating businesses, may serve to undermine trust in those technologies, which is good for cheaters. This doesn't, however, mean the criticisms raised are not valid.

Some anti-cheat tools, such as VAC, also attempt to detect if they are being run within a Virtual Machine (VM). Such detection is important because running a game within a VM allows the user to more effectively control what information is available to VAC, and thus run other software on their computer which could not otherwise be detected by VAC. In this the anti-cheat providers appear to be ahead of some online exam vendors, who do not routinely detect VMs (Chase, 2018; Dawson, 2016). The inability to run games within a VM does upset some legitimate users, such as those who run alternative operating systems, but it appears to be a necessary step; otherwise, essentially any cheating tool can be run by cheaters. (A sketch of the kind of check such tools might perform appears at the end of this section.)

Gambling anti-cheat tools use some of the same approaches that video game anti-cheat tools use, in particular looking for the presence of known cheating tools. Some gambling sites advertise the banning of players who have cheated, and in some cases they have even refunded gamblers who participated in matches where it was later found that cheating had occurred (J. Newell, 2019). However, there has been some debate around what counts as cheating and whether the measures taken to stop it are effective or worthwhile. Taking the site Partypoker as one example, it regards the use of Heads Up Displays (HUDs), which offer advice on what move to make, as cheating. To prevent the use of HUDs, Partypoker removed the ability for users to download their game history: a feature that is necessary for the creation of HUDs, but also valuable for players hoping to improve their poker skills (Smyth, 2019). Again, as in education, we see the difficulty in striking a balance between the interests of different parties and different goals.
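The VM detection mentioned above can be illustrated with a couple of widely known heuristics. The sketch below is not how VAC or any exam platform actually works (those implementations are proprietary); it is a minimal, Linux-specific illustration that checks firmware strings and the CPU's hypervisor flag, and a determined cheater could evade it.

    import pathlib
    import subprocess

    # Vendor strings commonly reported in the firmware (DMI) tables of popular hypervisors.
    HYPERVISOR_MARKERS = ("vmware", "virtualbox", "qemu", "kvm", "xen", "parallels")

    def dmi_strings():
        """Read the system vendor and product name exposed by the firmware (Linux only)."""
        values = []
        for name in ("sys_vendor", "product_name"):
            path = pathlib.Path("/sys/class/dmi/id") / name
            if path.exists():
                values.append(path.read_text(errors="ignore").strip().lower())
        return values

    def cpu_reports_hypervisor() -> bool:
        """Check whether lscpu reports a hypervisor vendor or the 'hypervisor' CPU flag."""
        try:
            output = subprocess.run(["lscpu"], capture_output=True, text=True, check=True).stdout
        except (OSError, subprocess.CalledProcessError):
            return False
        return "hypervisor" in output.lower()

    def looks_like_a_vm() -> bool:
        firmware = dmi_strings()
        if any(marker in value for value in firmware for marker in HYPERVISOR_MARKERS):
            return True
        return cpu_reports_hypervisor()

    if __name__ == "__main__":
        print("Possible virtual machine" if looks_like_a_vm() else "No obvious VM markers")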
What lessons can we learn?

Cheating in online games and gambling has been happening for decades, and these two fields contain many possible approaches for addressing e-cheating, along with evidence of their effects. This section explores some possible lessons education could learn from these parallel fields for addressing e-cheating.

Even with much bigger budgets, e-cheating cannot be completely defeated

In Chapter 2, I argued that perfect assessment security was likely impossible. In this chapter, I have provided examples from video games and gambling sites with huge revenues to demonstrate that, even with substantial resourcing, e-cheating is still nearly impossible to defeat. Epic Games might have $100 million to spend on prizes at a tournament, but they cannot completely ensure it will be free of cheating. However, just as with assessment security, there are still significant and ongoing efforts being made to secure games and gambling from e-cheating.
Schneier is right: once an expert writes software that can do something, anybody can do that something

Chapter 4 discusses Schneier's (2018) observation that experts can create software that allows individual users to exploit vulnerabilities which previously only experts could exploit. While there are examples of pre-packaged e-cheating tools in education, some of which are discussed in the next chapter, these tools have seen relatively little engagement from the academic integrity community, with the possible exception of auto-paraphrase tools. By contrast, there are many examples of e-cheating tools for games and gambling, and combatting those tools is the mainstay of anti-cheating in these fields. I regard it as only a matter of time until specialist e-cheating software becomes one of the biggest threats to assessment security, and a major focus for academic anti-cheating software vendors. Assessment security should regard e-cheating in games and gambling as proof of concept for how bad academic e-cheating might get.

Honest players are a big asset in reducing cheating

It could be easy to see non-cheating gamers and gamblers as non-agents in addressing cheating, by viewing them as mere victims of cheaters and games/gambling companies. However, honest players hate cheating, and they will do what they can to find cheaters and report them. Videogames often provide built-in functionality to report other players anonymously, both for cheating and for other types of misconduct. They even sometimes provide democratic systems for players to vote to kick another player out of the game temporarily for cheating. While voting to punish students would probably veer too close to mob rule, universities should increasingly take the view that students are active agents in assessment security. Students see new types of cheating, and new cheaters, before staff see them, and just like honest players, honest students tend to hate cheating.

Anti-cheating technologies reify particular understandings of e-cheating

As in education, cheating in games and gambling is socially constructed. However, Consalvo (2007) notes that anti-cheating technologies in games work to 'define, stabilize, and secure specific definitions of cheating that occasionally may be at cross-purposes with each other as well as with player interests and activities' (p. 12). This means that sometimes the views of big games vendors trump the views of communities of gamers who wish to play together; there is no space for 'house rules' when the very definition of cheating has been turned into lines of code. Yes, sometimes vendors do respond to community outcry and change their rules, but ultimately the vendors write the rules.

Similar criticisms have been made of assessment security technologies. Williamson (2019) argues that Turnitin works to 'automate mistrust' and contributes to the 'robotization' of higher education and academic writing. Through codifying academic integrity into algorithms, Williamson argues, a particular view of plagiarism has become standardised. Similarly, Introna (2015) argues that tools like Turnitin act to 'govern' academic
writing and make originality appear calculable; however, what they really show is 'the difference between skillful copiers and unskillful copiers' (p. 36). Just as in games, this computerisation of detection has led to an ongoing struggle between algorithmic and human ownership of decisions and processes around cheating. For example, Turnitin's similarity index only indicates the extent to which a student's work matches what is in a database; but there is a persistent view amongst some academics that a high similarity index means a student's work was plagiarised (Meo & Talha, 2019; Sutherland-Smith & Carr, 2005). Consalvo's observation that anti-cheating technologies reify particular understandings of cheating unfortunately transfers well to education.

Independent bodies can regulate and keep records on individuals that span contexts

One of the challenges of anti-cheating in games is that a player who gets caught cheating in one game can simply move on to the next game. At the highest levels of esports, where there are millions of dollars on the line, this could be disastrous. For this reason, the Esports Integrity Commission keeps records across the entirety of esports (ESIC, 2019c). In contrast, educational institutions tend to keep their own records. Students who are expelled from an institution for cheating can simply move on to the next institution with no record of their cheating. In Australia, students can even apply to transfer recognition of their prior studies; so, if a student cheated through two years of a degree and was only caught and expelled in one course module, they can likely get credit for almost two years of study at their next institution. An international body for educational integrity such as the ICAI could consider taking on the same record-keeping role that the ESIC plays in esports, but instead around educational cheating.

Some degree of secrecy and undercover operations is necessary to keep ahead of e-cheating

Across games companies, esports organisers and gambling sites, there are examples of clandestine anti-cheating operations taking place. These are different operations to day-to-day cheat-busting; they are more akin to research and development or 'undercover' investigations. Thanks to these operations, many new approaches to e-cheating tend not to last particularly long in games and gambling. By contrast, education seems wary about secret or undercover operations. My group's work has been sensationalised as being undercover; my local newspaper The Age ran a story headlined 'Academics go undercover to spot the telltale signs of a cheater', which began with the line 'Academics at a Victorian university just did exactly what they told their students they must never do – cheat' (Cook, 2018). We have been criticised publicly and privately for taking this sort of approach in our work; however, we are of the view that, just as in the games and gambling industry, it is not enough to observe e-cheating from the outside. A robust programme of work into securing assessment will require us to buy assignments, purchase cheating tools, and try to break into exam systems.
Anti-cheating can be offered as a service

In the video games industry, anti-cheating is provided as a software service. This means that each new game does not need to be accompanied by brand new anti-cheating software; instead, developers integrate with an existing tool. For example, Valve offers VAC to developers who use their Steam games distribution platform, and publicly documents how to incorporate VAC into new games (Valve, n.d.). At the time of writing over 100 games incorporate VAC (Valve, 2020). While some assessment security technologies offer integrations, particularly remote proctoring platforms like ProctorU (2020a), none appear to offer publicly available Application Programming Interfaces (APIs). In assessment security, integrations are bespoke rather than standardised, and they therefore require much more substantial investment to develop. In addition to offering standardised APIs specific to assessment security, educational technology companies could also consider using some existing tools from the games industry. For example, some online exams platforms are able to be run inside a VM (Chase, 2018; Dawson, 2016); if their vendors were to use VAC, they would be able to immediately stop many types of virtualisation cheating. Collaborations between the games and education industries have the potential to develop more robust anti-cheating tools than either industry is able to produce on their own. (A sketch of what a standardised assessment security API integration could look like appears at the end of this chapter.)

To understand e-cheating's future, look to games and gambling

As a final overall learning, I think we should consider cheating in video games and online gambling as the cutting edge of e-cheating, and our likely future. The e-cheating approaches, mitigation strategies and debates that have taken place in education in recent decades are echoes of what has occurred in games and gambling in previous decades. To see where e-cheating might go in education in the next decade, we should look to where e-cheating is in games and gambling right now. Continuing the future-oriented focus of this book, the next chapter examines the implications of artificial intelligence for e-cheating and assessment security.
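Returning to the point above about standardised APIs: the sketch below shows roughly what an integration with an assessment security service could look like if such a public API existed. Everything here is hypothetical; the service name, URL, endpoint and response fields are invented purely for illustration, since no vendor currently appears to publish an API of this kind.

    import requests  # third-party HTTP library

    # Hypothetical service details, invented purely for illustration.
    BASE_URL = "https://api.example-assessment-security.invalid/v1"
    API_KEY = "replace-with-a-real-key"

    def submit_for_screening(student_id: str, assignment_id: str, text: str) -> dict:
        """Send one submission to the (hypothetical) screening service and return its flags."""
        response = requests.post(
            f"{BASE_URL}/screenings",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"student_id": student_id, "assignment_id": assignment_id, "text": text},
            timeout=30,
        )
        response.raise_for_status()
        # e.g. {"flags": ["authorship_mismatch"], "requires_human_review": true}
        return response.json()

A learning management system could call something like this at the point of submission and route any flags to a human decision-maker, rather than acting on them automatically.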
Things to do

• Read a text on cheating in computer games, such as Consalvo's (2007) book Cheating: Gaining advantage in videogames.
• Find a newly-released video game and search for cheats for that game. How easy are they to find? Are the cheats free, or paid? What claims are made about the detectability of those cheats?
• Visit a site like Multiplayer Game Hacking (mpgh.net, 2019) and look at the overlap between game cheating and other types of cheating, including educational cheating.
Note

1 In this chapter the names of several software products, mostly games, are mentioned. For the sake of saving trees, references are not provided to these products; all can be found easily through a web search.
6
E-Cheating, assessment security and artificial intelligence
As the boundaries between what humans and computers can do become blurred, we need to think carefully about what humans and computers need to do. In an age where Artificial Intelligence (AI) can paraphrase, does a student really need to learn to paraphrase? And in an age where AI can detect the presence or absence of contract cheating more accurately than a person, should we let computers handle decision-making? This chapter considers the roles of humans and AI in e-cheating and assessment security. It begins with a consideration of what AI actually is and what it can and cannot do. It then considers some AI-driven e-cheating challenges, as well as how AI can help assessment security. The chapter concludes by considering what humans and machines really need to do.
What AI is and what it can and cannot do

According to the Oxford English Dictionary, AI can be defined as "[t]he capacity of computers or other machines to exhibit or simulate intelligent behaviour; the field of study concerned with this" (Oxford English Dictionary, 2020d). Taking the 'artificial' part of AI as the more straightforward component of the AI bigram, many scholars have focused instead on defining the 'intelligence' part. Legg and Hutter (2007) reviewed approximately 70 definitions of intelligence and identified that, across those definitions, intelligence is largely considered:

• A property that an individual agent has as it interacts with its environment or environments.
• Related to the agent's ability to succeed or profit with respect to some goal or objective.
• Depends on the agent's ability to adapt to different objectives and environments (Legg & Hutter, 2007, p. 22).
Drawing these features together, Legg and Hutter (2007) provide a definition: 'Intelligence measures an agent's ability to achieve goals in a wide range of environments' (p. 22). AI is an artificial form of this intelligence, commonly implemented in a mixture of hardware and software.
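In a companion paper on 'universal intelligence', Legg and Hutter also give a formal counterpart to this definition. The notation below is a sketch reconstructed from that related work rather than a quotation of the definition above:

\[
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
\]

Here $E$ is a set of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V_{\mu}^{\pi}$ is the expected reward an agent $\pi$ accumulates in $\mu$: an agent is more intelligent the better it performs across many environments, with simpler environments weighted more heavily.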
The problem with defining AI is that, even though precise definitions exist, the term AI tends to be used fairly loosely, or as a way to build hype around much more mundane technologies. In a now-deleted tweet, Twitter user Baron Schwartz (@xaprb) is reported to have made this much-repeated joke:

When you're fundraising, it's AI
When you're hiring, it's ML1
When you're implementing, it's linear regression
When you're debugging, it's printf()
(Jensen, 2018, citing a now-deleted tweet by @xaprb)

This joke is funny because it shows how AI is often used as a buzzword to attract funds or attention, when what is really happening behind the scenes is actually simple statistics (like linear regression) and amateurish development practices (in this case, the use of a function called printf() for debugging). Since most commercial AI assessment security products are closed-source, with little public documentation about their underlying technologies, it is hard to say to what extent these technologies are actually AI. However, for the purposes of this chapter I will take a broad and inclusive view of what AI is – from high level aspirational goals of fully-functioning intelligence that is indistinguishable from humans, through to systems that use simpler algorithms in narrowly-defined contexts.

This chapter takes a forward-looking view of what AI can do, rather than being restricted to what has been proven possible already. The sort of future-oriented writing about artificial intelligence I am engaging in here can sometimes age poorly. At the time of writing I can tell you I live in a world where cars can self-drive on the open road (somewhat), my email account does a reasonable job of filtering out spam emails, and when I use voice commands to ask my phone to do something it mostly does the right thing. None of these things were true when I completed my degree in artificial intelligence in 2005. It is likely that by 2035 many more things will be possible – but how do we predict that?

One of the earliest debates about the capability of AI was spawned by Dreyfus' paper 'Alchemy and AI' (Dreyfus, 1965) and subsequent book What computers can't do: The limits of artificial intelligence (Dreyfus, 1972). Dreyfus was dismayed by the grand claims about what AI would be able to do. He begins his paper (Dreyfus, 1965) by identifying four predictions made by Herbert Simon in 1957 (later published as Simon & Newell, 1958) that were due to happen by 1967:
• That within ten years a digital computer will be the world's chess champion, unless the rules bar it from competition.
• That within ten years a digital computer will discover and prove an important new mathematical theorem.
• That within ten years a digital computer will write music that will be accepted by critics as possessing considerable aesthetic value.
• That within ten years most theories in psychology will take the form of computer programs, or of qualitative statements about the characteristics of computer programs. (Dreyfus, 1965, pp. 3–4, citing Simon & Newell, 1958, pp. 7–8)
Unfortunately, by 1967 none of these predictions had come to pass; however, there has since been some progress towards them. For example, IBM's Deep Blue AI beat reigning world champion Garry Kasparov in one out of six chess games they played in 1996 (Campbell, Hoane, & Hsu, 2002), and computers have continued to improve markedly at chess since. Computer performance at chess has even been proposed as a yardstick for the progress of the field of computer science at solving hard problems (Regan, 2019). Useful as they may be as a metric for the progress of AI, in 1967 these predictions had not been fulfilled.

Beyond technological advances, AI's future will impact on society rather broadly. In trying to understand what potential impacts future AI might have on society, Muehlhauser (2016) looked back at previous predictions about AI and compared them against what has eventuated. The particular standard used by Muehlhauser (2016) was Müller and Bostrom's (2016) notion of a 'High Level Machine Intelligence' (HLMI): an AI that can undertake most human professions to a human standard. Muehlhauser (2016) identifies 65 individual forecasts for the future of HLMIs. Many early predictions were almost science-fiction-like in what they promised about the possibilities of HLMIs and how soon they would come. Muehlhauser concludes that, had he been around then, he would likely have been swept up in the hype of early predictions promising rapid development of a HLMI, particularly if he had not spent a couple of days researching the problem. This is all to say that past predictions have not proven to be very accurate – a point worth keeping in mind as a preface to the predictions I will discuss next.

To understand the likely future of AI, Müller and Bostrom (2016) surveyed 170 experts to determine how long they thought it would be before a HLMI was developed. Two of the scenarios provided to the experts were how long they thought it would be before there was a 10% likelihood of a HLMI, and how long until there was a 90% likelihood of a HLMI. The median expert estimate for the 10% scenario was 2040–2050; for the 90% scenario the median estimate was 2075. This suggests that, amongst experts, there is a kind of almost-inevitability to HLMI – in other words, that AI will eventually be able to do everything a professional in a discipline can do. I do not wish to provide any hard predictions about what AI can do, or any rules about what will be impossible for computers to do. Instead, what I will do for the rest of this chapter is explore the possible implications of potential future states of AI, given the supposed inevitability of a HLMI, and what the implications might be for e-cheating and assessment security.
AI for e-cheating A HLMI can, by defnition, undertake the tasks students are preparing to be able to undertake upon completing their degree and entering a profession. Assuming assessment is authentic, in that it represents the tasks, contexts and challenges of
86
E-Cheating, assessment security and artifcial intelligence
real-world professional practice (Ashford-Rowe, Herrington, & Brown, 2014; Gulikers et al., 2004;Villarroel et al., 2018), a HLMI should be able to undertake any assessment task. In this case, HLMIs should be the ultimate e-cheating tool. But even if HLMIs do not eventuate, or if they are for some reason not suited to e-cheating, one of the other sub-aims of the feld of AI is passing the ‘robot college student test’ (Goertzel, Iklé, & Wigmore, 2012).This test involves the development of an AI that can meet the requirements of a degree programme all by itself. A range of eforts are therefore underway to develop AIs that, coincidentally, will be very useful for cheating.What are the implications of these eforts? AI can already meet some learning outcomes for students; do we still need to assess them? In Chapter 1, I identifed several e-cheating afordances enabled by AI that students are already using to cheat.These tools can already satisfy some learning outcomes for students; basic foreign language translation, paraphrasing and algebra can all be done for free by computer over the web. Translation, paraphrasing and algebrasolving software are a far cry from the sort of HLMI proposed by Müller and Bostrom (2016), in that they are much more narrowly scoped tools. However, their existence and use by students mean we are no longer dealing with a hypothetical. What should we do when AI can meet some learning outcomes for students? Earlier in this book I discussed the notion of cognitive ofoading. In a chapter in another book (Dawson, in press) I have discussed the challenges that cognitive ofoading poses for assessment in general. In that chapter I identify fve principles for incorporating cognitive ofoading into assessment, two of which are: 1. Specify learning outcomes and assessment with reference to the cognitive ofoading allowed 2. Programmatically ofoad lower order outcomes once mastered The underlying premise of the frst point is that cognitive ofoading should only be considered cheating if it is disallowed. Rather than remaining silent on the sorts of cognitive ofoading that students are allowed to undertake, I think we should explicitly state what is allowed and what is not allowed.These specifcations of allowable cognitive ofoading should be present on assessment task descriptions, and on the certifcates and statements of the learning outcomes that students have achieved. Has the student met an outcome with (a) no cognitive ofoading allowed (not even pen and paper); (b) some specifc cognitive ofoading allowed; or (c) with access to any cognitive ofoading they desire? If restrictions are to be placed on the sorts of cognitive ofoading that are permissible, this then becomes an assessment security challenge, as we need to have clear controls in place to demonstrate how we know students did not ofoad. Once a student has demonstrated that they have met a learning outcome without the support of cognitive ofoading, should they be free to use cognitive
E-Cheating, assessment security and artifcial intelligence
87
ofoading for that outcome when being assessed against higher-level outcomes? This is the question underpinning my second principle for cognitive ofoading in assessment. The use of calculators in senior school mathematics seems to suggest that we accept cognitive ofoading, to an extent, if the outcomes being ofoaded have been mastered. Students master basic mathematical operations in earlier schooling, and they are allowed to ofoad these to a calculator when undertaking more challenging mathematics. I advocate for a programmatic view of cognitive ofoading, whereby we should think about cognitive ofoading across an entire program of study (e.g. a degree), and consider allowing students to ofoad outcomes they have already mastered. More information about programmatic assessment can be found in Chapter 9. As an alternative view, Bearman and Luckin (in press) also examine what we should assess in a world of AI.They consider what kinds of intelligence are particularly human, and suggest that assessment should focus on these.They propose two clusters of capabilities that they think are uniquely human: meta-knowing or personal epistemologies; and perceived self-efcacy and evaluative judgement. The frst cluster of capabilities relates to understandings about knowledge, which develop over the course of a degree.The second cluster relates to understandings and beliefs about one’s own capability and the ability to accurately judge the quality of work. Rather than focusing assessment on capabilities that AI can easily undertake, Bearman and Luckin (in press) argue that we should instead assess these uniquely human capabilities. Do we need to assess learning outcomes that can be ofoaded to AI? My personal view is that we should, but not nearly as much as we currently do.There are legitimate reasons why we may wish to assess and reassess ofoadable outcomes, such as if students need to develop automaticity with those outcomes.We may also wish to prepare students for a less-than-utopian world where society’s progress is not upwards, towards greater development, but instead backwards, towards less availability of technology and the infrastructure that underpins it.Whatever we do, we need to engage in dialogue with students about what the allowable ofoads are, and we need to be clear when we credential students as having achieved particular outcomes about the particular ofoads that were permitted. What if AI becomes so good at meeting outcomes for students that we can’t spot it? At present, it is possible to satisfactorily meet some learning outcomes using cognitive ofoading tools. However, a trained eye can spot the use of some tools. If you have ever used Google Translate you might know that the translations are sometimes clunky or inelegant. But can you really tell the diference between a professionally translated text and one that was Google Translated? Here are two versions of an excerpt from the EU General Data Protection Regulation: The processing of personal data should be designed to serve mankind. The right to the protection of personal data is not an absolute right; it must be considered in relation to its function in society and be balanced against other fundamental rights, in accordance with the principle of proportionality. This
88 E-Cheating, assessment security and artifcial intelligence Regulation respects all fundamental rights and observes the freedoms and principles recognised in the Charter as enshrined in the Treaties, in particular the respect for private and family life, home and communications, the protection of personal data, freedom of thought, conscience and religion, freedom of expression and information, freedom to conduct a business, the right to an efective remedy and to a fair trial, and cultural, religious and linguistic diversity. The processing of personal data must be designed to serve humanity. The right to the protection of personal data is not an absolute right but must be considered in relation to its function in society and maintain balance with other fundamental rights, in accordance with the principle of proportionality. This Regulation respects all fundamental rights and observes the freedoms and principles recognized in the Charter as enshrined in the Treaties, in particular respect for private and family life, domicile and communications, protection of personal data. personal character, freedom of thought, conscience and religion, freedom of expression and information, freedom of enterprise, the right to efective judicial protection and fair trial, and cultural, religious and linguistic diversity. The frst of these is from the English text, and the second is from the Spanish text, run through Google Translate. This demonstrates that for some texts (likely texts for which Google has matched pairs across multiple languages), automatic translation can be very high quality.The frst sentence in particular strikes me as better in the translated version due to its more inclusive language. However, even for more obscure text the Google Translate version can be good enough to understand; take these two translations from German: The company anticipates that without building additional sections, breakeven can be achieved at year-end or certainly by Q1 of 2002. Cost improvement programs were accordingly initiated in October 2001, to lower the burn rate from the current level of € 200 to € 100. Based on realistic expectations, the company also assumes that the breakeven point can be reached at the end of the year or in the frst quarter of 2002 at the latest (without expanding further routes). In order to achieve this, corresponding cost adjustment programs have already been implemented, which will lead to a reduction in the money-burning rate from € 200 to € 100 (from October 2001). The frst text is a sample translation by a professional translation service (Premier Translations, 2011), and the second is an automatic translation by Google Translate in December 2019. For me, while there are some diferent word choices between the two texts, both texts communicate the same fundamental ideas, and the automatically translated text could be of sufcient quality to pass at an Australian university. There is nothing obviously ‘robotic’ about it. In coming years, translation software may become so adept that even experts will not be able to spot computer translations. What can we do when assessing language translations – or any other ofoadable tasks – if we cannot spot the AI anymore?
Regulation respects all fundamental rights and observes the freedoms and principles recognised in the Charter as enshrined in the Treaties, in particular the respect for private and family life, home and communications, the protection of personal data, freedom of thought, conscience and religion, freedom of expression and information, freedom to conduct a business, the right to an effective remedy and to a fair trial, and cultural, religious and linguistic diversity.

The processing of personal data must be designed to serve humanity. The right to the protection of personal data is not an absolute right but must be considered in relation to its function in society and maintain balance with other fundamental rights, in accordance with the principle of proportionality. This Regulation respects all fundamental rights and observes the freedoms and principles recognized in the Charter as enshrined in the Treaties, in particular respect for private and family life, domicile and communications, protection of personal data. personal character, freedom of thought, conscience and religion, freedom of expression and information, freedom of enterprise, the right to effective judicial protection and fair trial, and cultural, religious and linguistic diversity.

The first of these is from the English text, and the second is from the Spanish text, run through Google Translate. This demonstrates that for some texts (likely texts for which Google has matched pairs across multiple languages), automatic translation can be very high quality. The first sentence in particular strikes me as better in the translated version due to its more inclusive language. However, even for more obscure text the Google Translate version can be good enough to understand; take these two translations from German:

The company anticipates that without building additional sections, breakeven can be achieved at year-end or certainly by Q1 of 2002. Cost improvement programs were accordingly initiated in October 2001, to lower the burn rate from the current level of € 200 to € 100.

Based on realistic expectations, the company also assumes that the breakeven point can be reached at the end of the year or in the first quarter of 2002 at the latest (without expanding further routes). In order to achieve this, corresponding cost adjustment programs have already been implemented, which will lead to a reduction in the money-burning rate from € 200 to € 100 (from October 2001).

The first text is a sample translation by a professional translation service (Premier Translations, 2011), and the second is an automatic translation by Google Translate in December 2019. For me, while there are some different word choices between the two texts, both texts communicate the same fundamental ideas, and the automatically translated text could be of sufficient quality to pass at an Australian university. There is nothing obviously 'robotic' about it. In coming years, translation software may become so adept that even experts will not be able to spot computer translations. What can we do when assessing language translations – or any other offloadable tasks – if we cannot spot the AI anymore?
89
In the previous subsection I argued that we need to assess the capabilities that computers cannot undertake, because those are the capabilities students will need to have. However, in the near future we may need to stop assessing AI-ofoadable capabilities simply because we will no longer be able to tell if they have been ofoaded.Yes, we may still be able to tell if these capabilities have been ofoaded through the use of in-person invigilation/proctoring; but, with the shift towards online education and self-paced take-home tasks, there are fewer opportunities to physically witness a student demonstrating a capability without the use of ofoading. Chapter 9 discusses some ways to achieve a happy medium through assessment design. But in general, if a computer can do something, and we can’t tell if the computer did it, we need a very good reason to continue to assess students without the computer’s help. Where is the boundary between AI help and AI cheating? While it is easy to construct a binary of cheating vs acting with integrity, in reality the boundaries are often blurrier. Private tutors regularly face the challenge of helping students but not helping too much. Some academic support schemes even institute rules around not helping with assessed work, as a way to ensure this line is not crossed (e.g. various programmes under the banner of ‘Supplemental Instruction’, ‘Peer Assisted Learning’, ‘Peer Assisted Study Sessions’ etc., Martin & Arendale, 1993). It is even harder for technology tools to navigate the murky ethical territory between cheating and legitimate assistance. Google’s Socratic app explicitly aims to help high school and university students with their homework. Students can take a photograph of their homework, and Socratic will provide step-by-step solutions as well as other materials to help. Prior to its acquisition by Google, one of Socratic’s developers posted something of a philosophy for the app (Bhansali, 2016). In it, he describes the problem of cheating as one that is driven by students being underserved, in much the same way that media piracy was a problem of consumers being underserved. The media piracy problem was, to an extent, solved by the introduction of streaming services like Netfix and Spotify, and Bhansali (2016) argues that the cheating problem might similarly be solved by providing better and more readily available help for students. The current academic support environment is in his view ‘inequitable’: ‘some students have tutors, some have parents who can help, some have smart friends, many have no one’ (Bhansali, 2016). The problem is, while Socratic clearly view their app not as cheating but as an alternative for students who might otherwise be tempted to cheat, this view is not universally shared. Common Sense Media, who rate various media, games and apps for children, summarise their review of Socratic thus: ‘Instant homework help with resources; watch for cheating’ (McQuillen, 2020). Other online critics share similar concerns about Socratic (e.g. Schirr, 2019), and there have been more sensationalised reports in mainstream media (e.g. ‘The ultimate exam cheat’ in the Daily Mail, MacDonald, 2017). Is using Socratic – or one of a range of similar apps, which make up around one in ten of the top 100 education apps on the Apple App Store (Schirr, 2019) – cheating? Or is it merely help, akin to the private tutor or helpful friend its creators
90
E-Cheating, assessment security and artifcial intelligence
claim it to be? While Socratic has fairly limited capabilities today, as it becomes more sophisticated it may edge closer to passing the robot college student test all by itself. We, as an education community, need to engage in sophisticated debate about the boundaries between help and cheating.
AI for assessment security While AI has grabbed many headlines through its potential to help students cheat, it has also become a prominent player in debates about the future of stopping cheating. Several technology companies have been marketing the AI underpinnings of their new cheating detection software. But do they actually detect cheating, and can they be used to actually prove cases of e-cheating? Can AI detect e-cheating? Given the scope of the use of AI in cheating detection, somebody is probably being monitored by AI for e-cheating right now. Remote proctoring in particular has been a key site for the use of AI in cheating detection. Multiple proctoring companies claim that their AI algorithms are highly efective in detecting suspicious behaviour, such as ProctorU’s UAuto (ProctorU, 2020c), and AIProctor (2018). While I am not aware of any independent peer-reviewed research evidencing the efectiveness of AI proctoring, and I have been prevented from conducting such studies by some vendors, I think vendors’ claims that these technologies catch some amount of cheating are probably true.These tools look for a range of diferent indicators of potential cheating, and use proprietary algorithms to determine what combination of these indicators are worth fagging for human proctors to follow up. Beyond AI proctoring, there is a range of other uses for AI in cheating detection that appear to be efective. My team has investigated Turnitin’s Authorship tool and found that, when using it, markers were more efective at detecting contract cheating (Dawson et al., 2020). That tool uses a range of stylometric approaches, in conjunction with Turnitin’s database of a student’s past work, to determine if new submissions match up with the style of previous submissions. Other vendors also claim to ofer similar AI-based approaches to detecting contract cheating, such as UniCheck’s EMMA.While these approaches are not perfect, I am convinced that they can assist in the detection of contract cheating. Can AI approaches detect e-cheating? I believe that they can – but with two caveats. Firstly, AI lacks a sufciently complex and nuanced social understanding around the nature of cheating to allow it to truly know if something is cheating or not.And secondly, I think a human should always be involved in making a decision to follow up any suspected cheating. Can AI be the decision-maker? I have split ‘detection’ and ‘decision-making’ into two separate aspects of assessment security, as I think AI has diferent roles to play in each. Detection is the
E-Cheating, assessment security and artifcial intelligence
91
process of building a suspicion that e-cheating has occurred.At the detection stage we need to privilege efciency; many academics do not report breaches of academic integrity like contract cheating simply due to their perception that it is too time-consuming (Harper et al., 2018). Decision-making is the process of reviewing the available evidence and reaching a conclusion as to whether e-cheating has occurred, along with determining any potential sanctions.At the decision-making stage we need to privilege accuracy. AI is very efective at learning patterns and identifying deviations from those patterns. However, deviation from a pattern does not necessarily mean cheating. We have learnt this from how people handle Turnitin similarity scores. Blind faith in the score can lead to many ‘false positives’: instances where cheating is thought to be indicated by an algorithm, but is not actually present.The preferred approach at many institutions is to have a human interpret each similarity report and decide how to proceed from there. But what if advancements in AI, whether full-blown HLMI or just specialised AI decision-making tools, allow AI to truly perform this task as well as a human? What would it even mean for an AI to perform academic integrity decisionmaking? Drawing from the literature on AI and social policy, as well as the academic integrity literature, I wish to propose fve principles for the use of AI in academic integrity decision-making: explainability and transparency; accuracy; fairness; accountability; and empathy. Explainability and transparency Some AI tools are ‘black boxes’ in that their internal logics are not able to be understood. It is not possible to tell how a black box AI really works even with access to the program source code, because it infers its rules from large datasets. Looking at the inner workings of a black box AI will not reveal the rules the AI has generated from the data. Unfortunately, black box approaches are often more accurate than ‘white box’ approaches, which are able to explain their reasoning (Kuhn & Johnson, 2013). Despite this, I think we need to push for explainability in AI systems for academic integrity, because students accused of cheating deserve to know the evidence and reasoning behind the accusation. Explainability is essential in systems that assist decision-making in other contexts, such as judicial decisionmaking (Deeks, 2019) and is, I think, no less necessary in academic integrity decision-making. In addition to being able to explain particular decisions, an AI for academic integrity decision-making needs to be transparent about its general workings. Transparency is more than just being able to audit the source code or use the tool; it also demands that the underlying logic or rationale for decision-making needs to be made public (Ananny & Crawford, 2016). These logics need to be understandable and justifable, perhaps using the concept of public reason:‘rules, institutions and decisions need to be justifable by common principles, rather than hinging on controversial propositions which citizens might reasonably reject’ (Binns, 2018, p. 545). There needs to be endorsement from students,
educators, and broader society about the rules that will be used by AI to make decisions about cheating.

Accuracy

Given the very high stakes of academic integrity, the decisions made by an academic integrity AI should be at least as accurate as decisions made by a human. However, accuracy is a complex concept. In our work on contract cheating detection, we have used the following statistics, any of which could be considered as candidates for 'accuracy' (a small worked example follows the list):
• Correct classification rate: the proportion of cases that are correctly identified as cheating or not cheating;
• Sensitivity: the proportion of cheating cases that are correctly flagged as cheating;
• Specificity: the proportion of non-cheating cases that are correctly identified as not being cheating;
• Predictive value of a positive test: the likelihood that any case identified as cheating is actually cheating;
• Predictive value of a negative test: the likelihood that any case identified as not cheating is actually not cheating;
• Incorrect classification rate: the proportion of cases that are incorrectly identified as cheating or not cheating – the inverse of correct classification rate;
• False positive rate: the proportion of non-cheating cases that are incorrectly flagged as cheating – the inverse of specificity; and
• False negative rate: the proportion of cheating cases that are not flagged as cheating – the inverse of sensitivity.
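To make these definitions concrete, here is a minimal, purely illustrative Python sketch that computes each statistic from a confusion matrix. The counts are invented for demonstration and are not drawn from any real detection study; the point is only to show how the eight statistics relate to one another.

```python
# Hypothetical confusion matrix for a cheating-detection system (invented counts).
tp = 40   # cheating cases correctly flagged
fn = 10   # cheating cases missed
tn = 930  # non-cheating cases correctly cleared
fp = 20   # non-cheating cases wrongly flagged

total = tp + fn + tn + fp

correct_classification_rate = (tp + tn) / total
incorrect_classification_rate = 1 - correct_classification_rate
sensitivity = tp / (tp + fn)                 # proportion of cheating that is flagged
specificity = tn / (tn + fp)                 # proportion of non-cheating correctly cleared
false_negative_rate = 1 - sensitivity
false_positive_rate = 1 - specificity
positive_predictive_value = tp / (tp + fp)   # likelihood a flag really is cheating
negative_predictive_value = tn / (tn + fn)   # likelihood a clearance really is not cheating

print(f"Correct classification rate: {correct_classification_rate:.1%}")
print(f"Sensitivity: {sensitivity:.1%}   Specificity: {specificity:.1%}")
print(f"Predictive value of a positive test: {positive_predictive_value:.1%}")
print(f"Predictive value of a negative test: {negative_predictive_value:.1%}")
```

With these invented numbers the system looks impressive on correct classification rate (97%) while only about two-thirds of its flags are genuine cheating, which is exactly why no single statistic tells the full story.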
No single instance of the above statistics can tell the full story about accuracy. For example, the correct classification rate can tell us if the decisions made by a system were correct overall, but it can't tell us if the mistakes made were in the form of false positives or false negatives. This matters, because we usually tend to weight false positives and false negatives differently in different contexts. I have a 6 cm scar on my back thanks to a false positive from a doctor who thought I had a melanoma; I am happy to have the scar, because the consequences of a false negative (there being a melanoma but me not knowing about it) are much worse. In contrast, I would be very annoyed if a false positive on my university's spam email detection system led to a single legitimate email being marked as spam, but I am not particularly worried about the dozen or so spam emails I have to process each day. Decision-making about cheating is probably closer to spam emails than melanoma: a single case of a non-cheating student being punished as a cheater is worse than multiple cheating students going unpunished. But how much worse is it? This is something which we will need to engage in deep soul-searching about if AI is to ever take on decision-making responsibilities in academic integrity.
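One way to make that soul-searching explicit is to attach a weight to each kind of error and compare systems on their weighted harm. The sketch below is purely illustrative; the error counts and the 20:1 harm weight are assumptions invented for the example, not figures from any study.

```python
# Illustrative weighted comparison of two hypothetical detection systems.
def expected_harm(false_positives, false_negatives, fp_weight, fn_weight):
    """Total weighted harm of a system's mistakes."""
    return false_positives * fp_weight + false_negatives * fn_weight

# System X misses more cheating; System Y wrongly accuses more students.
system_x = {"false_positives": 2, "false_negatives": 30}
system_y = {"false_positives": 20, "false_negatives": 5}

# Assume wrongly accusing a student is judged 20 times worse than missing a cheater.
for name, errors in (("X", system_x), ("Y", system_y)):
    harm = expected_harm(errors["false_positives"], errors["false_negatives"],
                         fp_weight=20, fn_weight=1)
    print(name, harm)
# X: 2*20 + 30*1 = 70; Y: 20*20 + 5*1 = 405, so X is preferable despite missing more cheating.
```

Change the assumed weights and the preferred system can flip, which is precisely the value judgement that would need broad endorsement before being delegated to software.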
As a cautionary tale of the problem of inaccurate AI punishing people suspected of having defrauded a different type of system, we can look to the 'Robodebt' incident, in which 900,000 Australians had judgements made about them owing money to the government welfare agency Centrelink (Carney, 2018). These judgements were made automatically by an AI tool that had been promised to deliver AU$2.1 billion (roughly US$1.4 billion) in budget savings by recovering debts. However, the system was widely reported to be inaccurate and, as of February 2019, more than 2,000 people had died since receiving a Robodebt letter (Whyte, 2019). Some of these deaths have been attributed to suicide, and the advocacy site #NotMyDebt maintains a database of stories documenting the tragic consequences of incorrect AI decision-making on individuals' lives (#NotMyDebt, 2020). Incorrect judgements about cheating would likely have similarly destructive consequences, so it is essential that any AI systems in academic integrity have high accuracy and, in particular, high specificity.

In addition to being accurate when first deployed, an AI for making decisions about cheating would also need to stay accurate over time and across the contexts in which it is used. As in video games (Consalvo, 2007), understandings about cheating change, and there is a danger that if they are computerised they will become inflexible. Accurate decisions are those which represent the best efforts to synthesise the views of the field of academic integrity; accuracy does not mean consistency with a view of cheating developed by a technology company.

Fairness

A system can be more accurate than a human but still be less fair. Imagine a hypothetical AI for contract cheating decision-making. This AI can perfectly prove all cases of contract cheating in L2 writers, such as people with English as an additional language. However, it wrongly accuses 5% of legitimate L2 writers of contract cheating, and only spots 70% of contract cheating in L1 writers with no false positives. Depending on the proportion of L1 and L2 writers in a given cohort, this system would be much more accurate than a human in terms of correct classification rate, sensitivity and specificity. However, it might be less fair, as it is more likely to have true and false positives for L2 writers than L1 writers.

An AI system for academic integrity decision-making should be fair. Unfortunately, there are numerous examples of AI systems that have demonstrated themselves to be unfair. From beauty competitions (Levin, 2016) to criminal detection systems (Wu & Zhang, 2016) and chatbots (Neff & Nagy, 2016), AI has repeatedly demonstrated its ability to learn biased and even bigoted logic from data. If AI is trained on data without careful human oversight, it may learn biases from that data. Existing approaches to AI often use a What You See Is What You Get (WYSIWYG) view of data: that what is in the data is what is in reality (Friedler, Scheidegger, & Venkatasubramanian, 2016). This means that biases in the data become facts to the AI. If one subset of students cheated more in a dataset used for training the AI, the AI might view that subset of students as more likely to cheat when looking at real cases. To address this problem, AI researchers are working on alternative perspectives, such as We're All Equal (Friedler et al., 2016), which is a set
of mathematical approaches to addressing the inherent biases in datasets. Careful consideration needs to be given to matters of representation and bias in any potential AI for academic integrity decision-making.

Accountability

The notion of 'algorithmic accountability' has gained traction recently. It has a variety of meanings, and it could be interpreted as covering all five principles discussed here. But I would like to draw particular attention to algorithmic accountability in terms of risk management, regulation, and handling of critical incidents. This is the focus of the proposed US Algorithmic Accountability Act of 2019 (Clarke, 2019). Under that proposed legislation, an AI system for academic integrity decision-making would be regarded as a 'high risk' system. If the Bill were passed into law, vendors of such an academic integrity AI would be required to conduct an 'automated decision system impact assessment':

a study evaluating an automated decision system and the automated decision system's development process, including the design and training data of the automated decision system, for impacts on accuracy, fairness, bias, discrimination, privacy, and security that includes, at a minimum—
(A) a detailed description of the automated decision system, its design, its training, data, and its purpose;
(B) an assessment of the relative benefits and costs of the automated decision system in light of its purpose, taking into account relevant factors, including—
(i) data minimization practices;
(ii) the duration for which personal information and the results of the automated decision system are stored;
(iii) what information about the automated decision system is available to consumers;
(iv) the extent to which consumers have access to the results of the automated decision system and may correct or object to its results; and
(v) the recipients of the results of the automated decision system;
(C) an assessment of the risks posed by the automated decision system to the privacy or security of personal information of consumers and the risks that the automated decision system may result in or contribute to inaccurate, unfair, biased, or discriminatory decisions impacting consumers; and
(D) the measures the covered entity will employ to minimize the risks described in subparagraph (C), including technological and physical safeguards. (Clarke, 2019)
Even if this legislation does not pass, the framework it puts forth for impact assessments could still be useful to technology companies for maintaining accountability for AI academic integrity systems. Educational institutions should also demand such impact assessments from any potential vendors of this sort of product.

Empathy

In 2020, a letter from a New Zealand university to an international student was shared widely on Twitter and reported in the media (Hutt, 2020). An excerpt from the letter reads:

I understand that you were recently admitted to [a local mental health facility] following life threatening overdoses which required life-saving medical intervention in [a local hospital]. It is clear there has been a change to the state of your mental health leading up to these events, and you did not promptly inform the International Office about this change as you were required to do under your Enrolment Conditions (attached)

The letter goes on to inform the student her enrolment has been cancelled for breach of her enrolment conditions. The letter has been criticised broadly for lacking empathy, both in the decision and the communication of the decision. This letter was sent by a human.

As the letter above demonstrates, through its absence, empathy is crucial in communication and decision-making about serious student matters. While guidelines can be helpful in improving the consistency of decisions and sanctions, they should not impinge on the ability to show empathy towards students in matters of cheating. Any AI system designed to make and communicate decisions about academic integrity would need to have empathy for students. Fortunately, the field of AI has been engaged in work around empathy and, in particular, empathy in teaching and learning contexts (e.g. Cooper, Brna, & Martins, 2000; Potapov & Rodionov, 2014), so it is not so far-fetched to imagine a system that might one day consider the real people it makes decisions about.

In my opinion, we are not yet at the point where AI can uphold the five principles put forth here, and because of that, I do not think AI is ready to be the decision-maker in cases of suspected cheating. However, as time goes on, AI – and our processes surrounding it – are likely to improve. If you find yourself in a situation where you are asked to consider offloading academic integrity decision-making to any system, I encourage you to consider the five principles of explainability and transparency, accuracy, fairness, accountability, and empathy.

AI is imperfect but it scales well

We should not be too quick to dismiss the potential roles for AI in improving assessment security. While AI might not yet be ready to hold decision-making authority, it can nevertheless support decision-making; this is an anticipated role
for AI in other contexts like judicial decision-making (Deeks, 2019). In the context of cheating detection and decision-making, the potential for workload savings is a key attraction for AI. We already know that academics frequently do not investigate suspected contract cheating or follow their institutional procedures around reporting it, often due to workload (Harper et al., 2018) – it just takes too much time to gather the required evidence, and the related committee processes are enormously time-consuming. AI can scale well in terms of gathering evidence for these processes, and it also holds potential for providing logistical and data-processing support to make these processes more efficient.

One criticism of AI approaches is that they are not perfect at detecting cheating. While I share these concerns, especially where they relate to proprietary black box AI that has not been independently tested, I think we should be wary of setting the bar too high. In our work with humans on the detection of contract cheating we have found that, with training, our markers could spot 82% of contract cheating (Dawson & Sutherland-Smith, 2019). When we have proposed training as a potential approach to improve detection, we have faced criticism that this sensitivity was too low. In my opinion, anything with a sensitivity over 50% should be considered highly effective when used across multiple assessments. Would you persist in contract cheating if, every time you did so, there was a 50% chance you would be found out and expelled from your institution? We should be similarly generous when it comes to our aspirations for the proportion of cheating that AI should be able to detect and/or prove. However, we should probably not be so generous with regards to specificity. False positives can be damning; and past instances of misinterpreted Turnitin similarity scores being considered incontrovertible evidence of cheating suggest that even careful messaging might not override reporting that lacks specificity.
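The arithmetic behind that rhetorical question is worth making explicit. The sketch below assumes, purely for illustration, that each assessment is an independent detection opportunity with the 50% per-task detection rate used above; real detection events are unlikely to be fully independent, so treat the figures as an upper-bound intuition rather than a prediction.

```python
# Illustrative only: cumulative chance of being caught across repeated assessments,
# assuming independent detection opportunities (an assumption, not a finding).
def chance_of_being_caught(per_task_detection_rate: float, tasks: int) -> float:
    """Probability of being detected at least once across independent tasks."""
    return 1 - (1 - per_task_detection_rate) ** tasks

for tasks in (1, 2, 4, 8):
    print(tasks, round(chance_of_being_caught(0.5, tasks), 3))
# 1 -> 0.5, 2 -> 0.75, 4 -> 0.938, 8 -> 0.996
```

Even a modest per-task detection rate compounds quickly across a programme of assessment, which is why a sensitivity well below 100% can still act as a strong deterrent.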
What will humans and machines need to do?

Returning to the question posed at the start of this chapter, what should humans and machines do? I am partial to the argument put forth by Bearman and Luckin (in press): that we should base this decision around what each one does best. However – and at the risk of falling into the hype that Dreyfus (1965, 1972) criticised – I think AI is eventually going to be able to fulfil much of what currently falls on the human side of the ledger. Cheating is ultimately about ethics, and while that is something we may be able to outsource to machines, we should not do so. I think we should allow students to use AI, but we should require them to do so in a transparent way so we can clearly assess what they have personally done. We should develop AI systems to detect and evidence cases of cheating, but we need humans to decide what cheating actually is and what indicators of cheating we are comfortable with considering. Ultimately, we need to persist with a 'human in the loop' (Zanzotto, 2019) approach, whereby humans and AI work together, rather than believing AI can replace people. In assessment security, artificial intelligence should support, not supplant, human intelligence.
Things to do

• Find some AI tools for cheating, download them and try to use them. At the time of writing there are a few lists of cheating apps, such as Schirr's (2019) list which focuses on the Apple App Store, and Common Sense Media's list (Elgersma, 2019) which is more targeted towards parents of school children. Try these apps out on your assessments. Are there some assessments you have which are more challenging to cheat on with apps?
• Sign up to free trials of some assessment security tools. At various points in time some businesses offer opportunities to test out their various apps; at the moment some vendors of stylometric and AI proctoring tools offer free trials.
• Find a piece of text that has been professionally translated into your language. Run the original language text through a translation tool, like Google Translate, into your language. What is the difference between the two?
Note
1 Machine Learning.
7 Surveillance and the weaponisation of academic integrity
In 2013, Edward Snowden's revelations about the scope of government surveillance drew headlines internationally. Snowden spoke of 'suspicionless surveillance': the widespread and routine application of surveillance to people who were not considered to be doing anything wrong (Snowden & MacAskill, 2013). This concept has been taken up by surveillance scholars, most notably Lyon (2017, 2018), and it is closely related to dataveillance, which is 'the continuous tracking of (meta)data for unstated preset purposes' (van Dijck, 2014). Even before these specific terms had been used by Snowden or van Dijck, analogous types of surveillance had been raised as a problem in the context of academic integrity; for example, in Zwagerman's (2008) deliberations on plagiarism detection:

Call surveillance technology what we will – deterring dishonesty, promoting originality, leveling the playing field – there is no way to honestly call it anything other than forcing students, most of whom we have no reason to suspect, to prove their innocence (Zwagerman, 2008, p. 694).

Since the advent of text-matching software, suspicionless surveillance has become a routine part of much educational assessment practice; however, it could probably also be argued that suspicionless surveillance dates back to the inception of the invigilated/proctored examination.

This chapter connects work about surveillance, and surveillance in academic integrity, with some of the ideas discussed in this book. It then raises future-oriented problems, such as the problems that may occur when we know everybody's integrity missteps (including yours and mine) and the way that this can be weaponised against individuals. The chapter also looks at the alternatives put forward by critics of surveillance approaches, and asks what evidence would be required to make surveillance acceptable given its problems in an academic integrity context. Throughout, I'll attempt to faithfully represent arguments I disagree with – however, if you've gotten this far through the book, you'll have noticed I'm OK with a degree of surveillance, as long as there is evidence that it actually works.
Academic integrity, e-cheating, assessment security and surveillance culture

Lyon gives a working definition for surveillance as 'the operations and experiences of gathering and analysing personal data for influence, entitlement and management' (Lyon, 2018, Introduction, Section 2, para. 2). In an academic integrity context, tools like stylometrics, content-matching and proctoring technologies fit within this understanding of surveillance. Each is primarily a tool for gathering and analysing personal data, such as student work samples, biometric information and behavioural data. These data are then used to influence what students do, make decisions about accreditation, and discipline students. Assessment security technologies are often used for suspicionless surveillance, as they are used in a routine manner on entire cohorts of students, rather than being targeted at particular students suspected of cheating. For example, many institutions engage in suspicionless surveillance by checking all student work against a content-matching tool (e.g. Turnitin) or requiring that all online examinations be remote proctored. Alternatives to suspicionless surveillance include targeted surveillance of particular students who are thought to have engaged in cheating, or avoiding surveillance entirely.

Over recent years, surveillance has shifted into what Lyon calls 'surveillance culture' (2017, 2018). No longer is surveillance simply something done by security agencies for suspected criminals or terrorists; surveillance is now carried out by a range of government and commercial entities. Citizens' roles in surveillance have changed, from having surveillance done 'to' them, to being active participants in surveillance, such as when we enthusiastically share a picture on social media or virtually check in to a location. Surveillance is sold to us, the people being surveilled, as 'for our own good' (Lyon, 2017, p. 827); not only will surveillance help in addressing threats like terrorism (Cayford & Pieters, 2018), it will also provide us with insights into our own lives, such as our level of physical activity. The boundaries between who uses and owns this data are now leaky, with corporations, universities and governments sharing data. In surveillance culture, surveillance is part of the status quo, and questioning surveillance or proposing to regulate it is almost a subversive act. With this shift into surveillance culture, Lyon (2017, 2018) argues we need to broaden our focus, in thinking about surveillance, from the acts and apparatus of surveillance to the people and cultures of surveillance.

Is higher education academic integrity a surveillance culture? Surveillance has been part of education since the first invigilated examination, but the shift to surveillance culture requires a sort of ownership, engagement or active participation by those surveilled. The rise of student-initiated voluntary surveillance – such as use of Grammarly, which records all the typing a student does, or the now-defunct WriteCheck (Turnitin, 2017) service that allowed students to pay for a formative Turnitin report – suggests this criterion is met. And there are other similarities to society's broader culture of surveillance. Surveillance in education is also sold to students as being 'for their own good': without surveillance, all those other students who cheat will pass and your degree will be worthless; and with surveillance, you will gain useful insights into your own work.
The borders between the actors of surveillance are blurred, as in society in general, with institutions, governments and
billion-dollar companies sharing data across international boundaries; as an aside, I was recently cold-called by a vendor offering to demonstrate how the use of video surveillance and international passport databases could secure the authentication of in-person examinations. All of this surveillance takes place in a routine, mass and suspicionless manner. Higher education academic integrity does appear to have the hallmarks of a surveillance culture.

I am not the first to prosecute a case for higher education having a surveillance culture when it comes to cheating and academic integrity. For example, Zwagerman (2008) refers to plagiarism detection, even when done manually, as a type of surveillance. Referring specifically to Turnitin, Penketh and Beaumont (2014) note that students associate Turnitin with the idea of surveillance. Ross and Macleod (2018) observe that the thinking and justification underlying anti-plagiarism technology is one of surveillance:

Logics of surveillance are strongly at work in practices which attempt to regulate student behaviour through the exposure of their writing to algorithmic scanning and monitoring. These logics frame students as in need of careful monitoring to ensure learning and teaching runs smoothly, and framing academic writing as a space of dishonesty which is both rampant and solvable through technology. (Ross & Macleod, 2018, p. 235)

Ross and Macleod (2018) build on Lyon's (2017) work, and come to the same conclusion that higher education academic integrity has a surveillance culture. They go on to propose that students and staff should resist surveillance culture: if one views surveillance culture as negative then it makes sense to resist it, as surveillance culture cannot function without the actions of the individuals within it.

Just as remote proctored examinations might induct students into unsafe cybersecurity practices, as discussed in Chapter 4, higher education's surveillance culture might also induct students into broader society's surveillance cultures; as Lyon notes, 'becoming familiar with objects and technologies in one domain may normalize those in the other' (Lyon, 2018, Introduction, Section 1, para. 12). Academics who are willing to tolerate or support higher education's surveillance culture, but who are uncomfortable with broader societal surveillance cultures, may wish to think carefully about the effects of university surveillance on students long after they have graduated. What are we teaching through our surveillance culture?
Who will surveil students and to what end?

In a past age, where the fullest extent of surveillance was exam invigilation, it was easy to identify who was doing the surveillance. Specially-employed trained invigilators surveilled, and this surveillance was reported back to the institution. Universities and their employees are still major surveillers, but a range of other parties have joined as well. The biggest entrants are technology companies, and these are not just the assessment security vendors mentioned in this book so far. Other technology vendors – such as learning management system providers,
publishers, Internet providers and email platforms – also surveil students on behalf of institutions. Anti-cheating rationales for this surveillance are easy to come up with, and I have heard arguments that the scope-creep of surveillance on these platforms is justified by identifying and stopping cheating.

Technology companies do much of the operationalising of surveillance in the name of academic integrity. But might they have ulterior motives beyond profiting from providing surveillance services? Content-matching platforms, for instance, depend on amassing large databases of content. Surveillance is a way to build and refine these models. When Turnitin sold for $1.735 billion in 2019, concerns were raised that their massive database of student work could be used for purposes unrelated to improving their academic integrity products, such as for developing marketing campaigns targeted to students (McMurtrie, 2019). One critic described the Turnitin sale as part of a

pretty common end game for tech companies, especially ones that traffic in human data: create a large base of users, collect their data, monetize that data in ways that help assess its value, leverage that valuation in an acquisition deal (Jesse Stommel, executive director of the Division of Teaching and Learning Technologies at University of Mary Washington, as quoted in Johnson, 2019).

Critics argued that Turnitin's student database, rather than their content-matching services, was the real 'product' the company was developing and selling.

If academic integrity surveillance ends up being captured within society's broader surveillance missions, we may also see greater partnerships with other surveillance entities. The CIA has openly stated its desire to 'collect everything and hang onto it forever' (Hunt, 2013), and this would include data of interest to institutions seeking to detect cheating. Conversely, much of the data collected by educational institutions when stopping cheating could also be of interest to intelligence agencies. This extends beyond traditional surveillance approaches, like monitoring which library books a student borrows or the websites they click on; it includes much more personal information like biometrics, keystroke habits and stylometric information that could be used to identify a current or former student. For example, there is a history of 'anti-radicalisation' efforts to profile and identify students who may be potential future extremists (Kundnani, 2012) and the education sector's efforts to surveil students may support that mission. The ostensibly palatable and minimal scope for academic integrity surveillance may be unlikely to survive when a security agency presents a warrant to access student data. And the separateness of assessment security surveillance and society's broader surveillance approaches is not necessarily assured in the long term.

Finally, it is also worth considering cheating surveillance from the other side: how might cheating providers benefit from surveilling students? As with many online businesses, cheating services use sophisticated online shopfronts that likely provide a range of analytics (Ellis et al., 2018; Rowland et al., 2017). This most basic form of surveillance for service improvement is probably happening, as services vie for competitive advantages in attracting more customers. But students should be particularly wary of those sites that demand they hand over extensive personal
information. Many sites demand 'passports, driver's licences, visas, names, addresses, phone numbers and contact details' (Sutherland-Smith & Dullaghan, 2019, p. 1160), which can be joined with other data the site collects and used with cookies to create highly specific and valuable tracking technologies. Some sites even demand students' institutional usernames and passwords, which allows them to set up all manner of surveillance technologies inside the student's institutional account that can persist long after the student has changed their password. Such methods are all part of a broader academic integrity culture of surveillance that extends beyond trusted entities like universities.
Routine surveillance will catch too many

Surveillance technologies are getting bigger, better, and more all-encompassing. I am glad I'm an academic right now and not a student; while I think the essays and computer programs I wrote in my undergraduate degree are plagiarism-free, I was a student in the early 2000s, prior to widespread surveillance in the form of content-matching tools. As I did not have access to these tools in a formative capacity, I think there is every possibility that retrospective use of content-matching or other assessment security tools would find unintentional plagiarism in my undergraduate work. The closer we get to perfect surveillance that can identify any breaches of academic integrity, the more cases of suspected cheating we will find. This will bring with it a range of problems.

The first challenge that increased and more accurate surveillance will bring, I think, is that much more inadvertent, accidental cheating will be caught. We have already seen this happen with content-matching tools, and future technologies have the potential to create even greater workloads for all involved. Students will have to run through their work with a fine-toothed comb and write in an incredibly defensive manner to ensure they do not accidentally plagiarise sources they may not have even read. Academics will see more cases, by some orders of magnitude, than they have seen in the past. Disciplinary panels will need to find ways to process the increased caseload.

Given the massive workload that routine surveillance will create, it may be tempting to offload this task to a computer system. Chapter 6 provides a set of requirements for such an AI, which I think we are far from meeting. But, in addition, such a depersonalised and routinised system could make cheating feel more like a speeding ticket and less like a moral or ethical problem. Given that the penalties would have to be greatly reduced if we were to catch all the cheating that actually happens – lest we end up expelling a significant portion of our student body – cheating could lose the stigma it currently holds. If we are to continue along the path of ever-improving suspicionless surveillance capabilities, we will need to reconsider workload and penalties.
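A rough back-of-the-envelope calculation shows why routine, suspicionless screening generates so much work. All of the figures in the sketch below are invented for illustration; the prevalence, sensitivity and false positive rate are assumptions, not estimates from any study.

```python
# Illustrative only: how many flags routine screening produces at scale (invented figures).
students = 50_000           # submissions screened in a year
cheating_prevalence = 0.03  # assumed proportion actually cheating
sensitivity = 0.80          # assumed proportion of cheating that gets flagged
false_positive_rate = 0.02  # assumed proportion of honest work wrongly flagged

cheaters = students * cheating_prevalence
honest = students - cheaters

true_flags = cheaters * sensitivity
false_flags = honest * false_positive_rate

print(f"Flags to investigate: {true_flags + false_flags:.0f}")
print(f"...of which wrongly flagged honest students: {false_flags:.0f}")
# 1200 + 970 = 2170 flags, 970 of them honest students.
```

Under these assumed figures, nearly half of the flagged cases would involve honest students, and every one of them would still need careful human follow-up.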
Trust, surveillance and gamification

One of the biggest criticisms targeted at surveillance approaches is that they create a culture of distrust. Carless (2009) observes that there are multiple complex trust
relationships at play in assessment, and that different acts can strengthen or weaken trust. These trust relationships are not just between teachers and students; they include management's (mis)trust of staff, and students' (mis)trust of each other. (Mis)trust guides a variety of assessment decisions, such as when examinations are used instead of take-home tasks because the assessor does not trust students not to cheat. Building on Carless' (2009) work on trust and distrust in assessment, Ross and Macleod (2018) argue that surveillance technologies make teacher-student relationships 'risk-averse and mutually suspicious' (p. 235). Going even further, Williamson (2019) contends that not only do these technologies 'automate mistrust', they are also invested in 'manufacturing and maintaining mistrust' given their profitability depends on the very problem they exist to solve. Introna (2015) argues that distrust even extends to tutors, who begin to distrust their own capabilities and instead place their trust in surveillance technologies when it comes to detecting plagiarism. McArthur (2018) calls out Turnitin and similar tools as being the 'plagiarism industry', whose influence we should fear more than any supposed plagiarism pandemic.

The argument that cheating surveillance builds, and is built upon, a culture of mistrust has many proponents and compelling contentions. To me, however, the larger issue is that suspicionless surveillance 'gamifies' academic integrity. Surveillance technologies turn the issues of cheating and assessment security into an algorithmic problem with hard technical rules, rather than something grounded in trust and context (Introna, 2015). Despite best efforts to use these tools as change agents for promoting academic integrity, they end up being the arbiters of whether or not cheating has occurred. This has consequences that go beyond just problems of cheating; as one student in Penketh and Beaumont's (2014) study noted, when asked if Turnitin promoted redrafting of their work:

I think it did but not always for the better I think because I would change something that I was quite happy with because Turnitin said it wasn't happy with it. (Penketh & Beaumont, 2014, p. 100)

For this student, the 'game' of beating Turnitin was unfortunately more important than the quality of the work they were producing. Routine suspicionless surveillance means that losing this game, even once, is enough to be branded a cheat. Mistrust is at the heart of this game; rather than being like other aspects of assessment, where we expect students to be able to consistently do something once they have demonstrated their capability to do so, integrity becomes something that needs to be verified every single time.
The future of surveillance is weaponisation

Cheating detection has become a weapon, to be used selectively against one's enemies – and we're only now beginning to see the damage it can cause. Some figures in public life can weather a plagiarism scandal, such as Vladimir Putin, whose economics thesis was found to have copied more than 16 pages almost verbatim from a textbook (Danchenko & Gaddy, 2006). However, for others it can be
career-ending. In 2013, the then–German education minister Annette Schavan resigned when her doctorate was revoked by her alma mater, after an anonymous person reported it for containing plagiarism; this followed the resignation of Karl-Theodor zu Guttenberg, then–German Minister for Defence, who was nicknamed 'Dr Cut-and-Paste' and 'Dr zu Googleberg' ("German minister Annette Schavan quits over 'plagiarism'," 2013). All of these examples are allegedly cases of extensive, intentional plagiarism. But if we are to live in a world where suspicionless surveillance is the norm, and substantially more data is collected about students, personally- or politically-motivated cheating accusations may become more common.

Chapter 2 introduces the concept of 'stealth assessment security' as a way to identify cheating, building on Shute and Ventura's (2013) work on stealth assessment. Under this approach, students would be surveilled constantly and any potential cheating would be flagged. While this is already troubling from a surveillance perspective, the potential for future academic integrity technologies to use past surveillance data makes it even more concerning. New technologies can identify cheating in ways that older technologies could not. Running pre-2000s student assignments through content-matching tools would probably find significant copy-paste plagiarism. Doing the same with pre-2020s assignments and the sorts of stylometric tools that might become available in the coming years will likely identify a significant contract cheating problem. If, in the 2030s, content-matching tools advance to the point of spotting plagiarism of ideas (rather than just copy-paste plagiarism), we will probably find a significant plagiarism of ideas problem. These advanced technologies are unlikely to be applied routinely to past student work; however, a motivated person with a vendetta and access to the data might be able to find enough evidence to brand even the most pious person a cheater for their accidental misdeeds.

If standards around academic integrity continue to change, as they have over recent decades, student behaviour deemed acceptable at the point of data collection might be regarded as cheating at some later date. Taking Dr Martin Luther King Jr.'s work as an example, under current strict definitions, King's doctoral thesis contains significant plagiarism (Martin Luther King Jr. Papers Project, 1991). When this was discovered, King's alma mater, Boston University, exercised contextualised judgement around cheating in deciding not to revoke his doctorate, by considering the academic and cultural traditions in which the work was undertaken. We cannot, however, expect that this weaponisation of academic integrity will be systematically rebuffed in the future; so we need to think carefully about the boundaries that should be in place around retrospective cheating accusations.
Alternatives to surveillance

This chapter has mostly given voice to the critics of surveillance approaches. Common amongst most of these authors is a preference for a different, more trusting and more positive approach to addressing cheating. While that approach is explicitly out of scope in this book – we are interested in securing assessment against people who have already committed to cheating – it's still worth exploring the proposed alternatives to surveillance.
The first cluster of approaches takes the form of positive academic integrity approaches. Such approaches include honour codes, which outline expected student behaviour and are accompanied by a range of privileges, like the opportunity to sit unproctored examinations (McCabe et al., 1999). Honour codes are an attempt to shift responsibility for academic integrity from educators to students. Studies have shown that honour codes are associated with significantly lower rates of self-reported cheating; however, the magnitude of the difference is quite low. In one study by honour code proponents McCabe et al. (2002), the presence or absence of different types of honour code only explained 20% of the variance in cheating. There is reason to be sceptical of these results due to social desirability biases and issues with non-representative samples. There is the additional possibility that students who sign an honour code may also be less likely to admit to cheating, out of a desire to present a consistent image of themselves. While honour codes have support at several US institutions, they have faced criticism elsewhere – especially in the UK, where staff and students proved resistant to the moral element of honour codes (Yakovchuk, Badge, & Scott, 2011). Given their context-dependence and limited effectiveness, on their own honour codes would be a poor replacement for surveillance in terms of ensuring assessment security.

Another alternative proposed by anti-surveillance authors is to target the micro or macro causes of cheating. Potential measures to target these causes include: better relationships with students; less boring assessment; and developing students' capabilities to undertake tasks without cheating (McArthur, 2018; Ross & Macleod, 2018; Zwagerman, 2008). I am sympathetic to these calls because, in addition to potentially reducing cheating, they are also largely positive changes to teaching and learning. Other, even more macro-level changes could be undertaken by targeting systemic and societal matters, like neoliberalism and other supposedly undesirable changes to the nature of teaching and learning (Introna, 2015; McArthur, 2018; Williamson, 2019; Zwagerman, 2008). I'm supportive of these approaches too; however, I think these macro-level issues are much larger and more intractable problems than even cheating and, as a pragmatist, I recognise that I have limited influence over them.
Living with surveillance

I take the view that surveillance in an assessment security context is something we should neither outright reject nor uncritically accept. There are obvious problems with surveillance but, at the same time, there are not really any alternatives that have proven robust at addressing the problems of cheating. We owe it to our communities to only graduate students who have met the required learning outcomes without cheating, and surveillance appears to be one of the more effective approaches for ensuring this. Embracing a degree of surveillance does not mean we should abandon work on improvements to teaching and learning that might obviate cheating, or that we should stop working on broader social projects to address the underlying causes of cheating. But when students, governments, employers and the community at large ask us how we know our students have not cheated, we need high-quality evidence. However, as Zwagerman (2008) notes, we are also
responsible for acting with integrity throughout our pursuit of the protection of academic integrity, and this demands that we deeply consider the negative consequences of surveillance. So, with each new wave of anti-cheating surveillance technologies, we need to ask: are the anti-cheating benefits really worth the potential harms of all this surveillance?

To answer this question, we need evidence from a variety of sources. Firstly, consistent with Lyon's (2018) notion of surveillance culture, we need to see the surveillance from the perspective of those being surveilled. What are the experiences of students inside higher education's new surveillance culture? Work in this area has found some troubling impacts. For example, in one study students with high test anxiety performed much more poorly in remote proctored examinations when compared with in-person examinations (Woldeab & Brothen, 2019). Both assessment and surveillance have long histories of generating anxiety (Lyon, 2018; Nelson & Dawson, 2014), so it is unsurprising that the nexus between the two would be a hotspot for anxiety-related underperformance. There are many other stories of surveillance that we need to understand from the student's perspective, especially instances where students voluntarily expose themselves to surveillance. I expect there are tales of anxiety, complicity, powerlessness, apathy, fear and rebellion that all need to be told.

We also need evidence in the form of assurances, made and tested, that any proposed surveillance systems and data will be used only for specific purposes. Given the potential for corporate entities to be sold and their data repurposed, there may be a significant legal component to this sort of work. The European Union General Data Protection Regulation (GDPR) may be helpful here, as it requires organisations operating in Europe to conform to particular standards around data and privacy.

Finally, and most importantly, we need concrete evidence that surveillance approaches actually improve assessment security. How hard is it to cheat or defy the surveillance? How much cheating does the surveillance actually catch? Is the surveillance equally effective against all groups of potential cheaters? At present, there is limited evidence in favour of the effectiveness of surveillance. The decline in copy-paste plagiarism since the late 1990s could possibly be attributed to surveillance from content-matching tools, but alternative explanations are also possible.

Living with anti-cheating surveillance is the likely future for most of us. Despite the criticisms of surveillance in this chapter, the adoption of surveillance aligns with broader social trends, and to decision-makers it may appear to address the existential crises of cheating in education. To make the best of this situation – whether you think it's a good or a bad one – we need to work to establish best practices for institutions, corporations and government around surveillance. And we need to explicitly frame anti-cheating detection as surveillance. This framing brings with it a different mindset, and allows us to ask different questions. Students need to be engaged in critical conversations about surveillance as well. They know that the technologies educational institutions use to assess them can also surveil them, and they are not universally happy with this.
For example, the arguments made by students against Cadmus’ now-abandoned assessment security features echo many of the anti-surveillance arguments in this chapter (Parkville
Station, 2016; Stojanovic-Hill, 2017). The surveillance approaches we use in education come with an implicit endorsement; and if surveillance is acceptable at university, then why should it be challenged in society more broadly?
Things to do

• Ask your alma mater for a copy of all of the data that they have on you, including any copies of work you have completed there as a student. Depending on the country you studied in, this may be a simple, routine request or it may be something that requires some pushing. I also recommend you ask for that data to be destroyed. There is no benefit to you in the institution continuing to hold onto your data, and the risk of retrospective, weaponised anti-cheating surveillance being applied to you should be worth the few minutes it takes to ask them to delete it.
• Read David Lyon's book The Culture of Surveillance: Watching as a Way of Life (Lyon, 2018) or, if you would prefer something more concise, the open-access paper that covers similar content (Lyon, 2017). Do you agree with my argument here that higher education has a culture of surveillance around cheating?
• Study the Electronic Frontier Foundation's Surveillance Self-Defense guides, available at https://ssd.eff.org/ (Electronic Frontier Foundation, n.d.). These guides outline how to evade surveillance in broader society, and many of the lessons transfer directly to academic integrity surveillance. They have playlists of the guides set up for people in different roles, such as academic researchers and journalism students. Sharing these resources with your students might be an effective way to help them resist surveillance.
8 Metrics and standards for assessment security
As Carless (2009) notes, we tend to trust certain forms of assessment over others, often for reasons of tradition. Examinations are still seen as the default mode of assessment in some contexts, and to suggest any alternative is seen to pose a significant assessment security risk. Assessment security vendors make claims about the effectiveness of their products, but many are wary of outside researchers' attempts to verify these claims; they are essentially asking us to trust them. Cheating vendors sell essays they claim are undetectable, which leads to responses such as suggestions to ban the essay (Race, 2018). But how can we move beyond simply trusting these assertions? How can we tell if one approach to assessment security is better than another approach? What is the minimum standard for acceptable assessment security? These are difficult, multi-criteria problems. In the absence of approaches to answer these questions, we seem to have fallen back on tradition and unevidenced assumptions about the relative security of one approach over another. These questions matter not just to educational institutions, but to the community overall; for example, employers need to know that sufficient assessment security standards are upheld at the institutions from which they recruit graduates. This chapter proposes a range of metrics and standards for assessment security that might help individuals and institutions in making more informed choices about the assessment security approaches they will take.
The problems of metricless assessment security

There are three key problems with our current metricless approach to comparing assessment security approaches. The first is that we have fallen for the 'availability heuristic' (Kahneman, 2011): because it is easy to think of stories where cheating has occurred in particular modes, we tend to think that this cheating is more common. Compelling and engaging stories about commercial contract cheaters are easy to conjure up, and are often reported in the media; however, in reality less interesting types of cheating are more common, like students outsourcing their work to friends and family members (Bretag et al., 2018). The availability heuristic applies to detecting and proving cases of cheating as well. Since it is easy to come up with particularly egregious cases of cheating that have been caught and proven, we are likely to believe students face a significant risk when they cheat. In reality,
without hard evidence about the prevalence of cheating and the proportion that gets caught and proven, we are really just speculating.

The second problem metricless assessment security has caused is a culture where vendor promises are taken uncritically. Educational institutions should be very sceptical of vendors that claim to secure assessments. But, in the absence of the routine use of metrics to ascertain assessment security within the institution, there is no precedent that metrics should be expected of vendors. We don't even know what an acceptable level of performance for assessment security products might be. This means we just have to trust vendors of assessment security products; as Chapter 7 argues, we might even trust them more than we trust our own staff and students.

The third problem is that, in the absence of fine-grained metrics, binary thinking has taken over. Assessment approaches are seen as either completely vulnerable or practically impenetrable, whereas neither is likely to be an accurate label. This all-or-nothing thinking means that cheating scandals can lead to radical shifts towards what I have labelled 'assessment conservatism': any approaches with features that might appear to militate against cheating are preferred to approaches that don't attempt to provide security. But if these features are untested, then we may simply be engaging in what Schneier (2018) terms 'security theatre': looking like we are doing something to improve security which actually has no benefit, and often comes with significant cost and inconvenience.

If we really want to be systematic and evidence-based in the design of assessment security, we need ways of comparing different assessment security approaches. The development of metrics is one such approach. Metrics for assessment security might, for example, compare how difficult it is to cheat in a particular type of task, or how well a particular technology detects a certain type of cheating. Metrics might also tell us what the rate of undetected and unproven cheating is in a particular cohort. In an ideal world, metrics would allow us to shift from hunches and vendor promises towards something calculable, comparable and externally verifiable. This chapter proposes four types of metrics for assessment security: difficulty to cheat; detection accuracy; proof; and prevalence. Taken together, an assessment approach that scores highly across all of these metrics would be reasonably secure. Such an approach could, however, come with significant negative consequences looking beyond assessment security, so the chapter also advocates for metrics addressing impacts on learning, teaching, assessment and the student experience. The chapter concludes by calling for assessment security standards, and proposes a set of hypothetical standards.
Difficulty to cheat metrics

The extent to which it is difficult to cheat is one potential metric for assessment security: all else being equal, if it is more difficult to cheat in assessment A than assessment B, then assessment A is more secure. Cheating difficulty metrics could apply across different units of analysis, such as individual assessment tasks, assessment security technologies, or entire degree programmes. Consider a lockdown
browser approach of the type discussed in Chapter 2. If the lockdown browser makes particular approaches to cheating more challenging than when conducting the same task in a normal web browser, this could contribute to a hypothetical 'cheating difficulty' metric. However, if the lockdown browser could be easily bypassed, then it would not score so highly on a cheating difficulty metric. Taking a content-matching tool that finds copy-paste plagiarism as another example, an assessment task that used such a tool might score more highly on a cheating difficulty metric than if that same task did not use that tool. However, if the tool only works on exact matches rather than paraphrased matches, the difference would be less, as students who use auto-paraphrase tools can easily defeat it. Similarly, given two examination designs, one being open-book and the other being closed-book, the open-book examination might make cheating more difficult by permitting students to bring in outside materials; this would be considered cheating in the closed-book examination, and the range of approaches that provide access to outside materials discussed in Chapter 1 would need to be stopped. In this sense an open-book test might be considered more secure against some hypothetical cheating difficulty metric.

Cheating difficulty metrics are easy to consider in abstract terms, but much harder to convert into tangible criteria or measures. How can the difficulty of two approaches to cheating really be compared, such that one scores better than another? How can the ease with which someone can smuggle notes into an examination be located on the same scale as the difficulty of plagiarising using an auto-paraphrase tool? There are no easy answers here; however, I can offer one metric that is readily comparable: the financial cost of undertaking different approaches to cheating.

Cost to cheat

Chapter 1 discussed several approaches to cheating that involve the student spending money. For example, a student might buy a cheating calculator for an examination or hire a contract cheating writer for an assignment. A cost-based metric would consider the cost of different cheating approaches as a proxy for cheating difficulty. Cost-based approaches are common in computer science when considering the difficulty of cybersecurity breaches. The computational time required to decrypt a secret message without knowing the secret key is one way that the security of different algorithms is compared. Similarly, the arduous password requirements we have to meet when changing or setting passwords are another way to make password-guessing more expensive, in terms of time and resources, and therefore more difficult.

To provide an example of what a cost metric might look like, I have researched the prices of different types of cheating strategies as applied to different types of assessment task. I have split cheating into two different types: the cost to do general cheating that provides an unfair advantage, or 'cost to cheat'; and the cost to pay someone else to do the substantive part of a particular assessment, or 'cost to subcontract'. These two metrics indicate how financially expensive it is for a student
to partially or completely invalidate an assessor's judgement. Before going into the details of how much each type of cheating costs, it is worth considering how different factors influence the costs, and therefore the difficulty, of cheating.

Cost factors

There are many circumstances where 'cost to cheat' is effectively zero, such as when a friend who has taken a test tells you what the questions were just before you sit the same test. Copy-pasting from the Internet into an essay assignment is similarly free, as is passing notes in class. However, these sorts of approaches can be easily countered through obvious good practices like not reusing tests, using content-matching software, and invigilating in-class exams. To avoid creating a straw man situation where cheating in every situation is effectively free, it is useful to exclude cheating approaches that have obvious solutions which are already broadly used. Having a very generous friend write your essay for you may be free, making the 'cost to subcontract' zero. Similarly, a hypothetical identical twin who is an expert in the field could likely fool all but the best biometric approaches to exam identity verification. For the purposes of this exercise, however, I will exclude this sort of charitable cheating, as these volunteer labours are a special case. As a parallel, we could propose addressing the problem of contract cheating by having an army of volunteer writing supervisors monitor all of our students one-on-one for as long as it takes them to write their essays; although this edge case may work in some contexts, it is not really a 'free' solution to the problem at hand. Cost to subcontract is therefore calculated based on an honest day's pay for a dishonest day's work.

The costs of cheating and subcontracting are influenced by globalisation and international labour markets. Students in affluent nations may have a higher purchasing power for online labour than their peers in the developing world, making online exam impersonation approaches cheaper. Similarly, costs to hire in-person exam impersonators may vary based on factors such as income inequality and the education levels in a local area. Specialist cheating hardware may be more available in large high-tech locations, and less common in smaller agricultural settings. Costs to cheat and subcontract are therefore influenced by a range of contextual factors; what is more secure in one context may be less secure in another.

Cost will also vary over time, as the cheating workforce changes and as cheating technology advances. Contract cheating, for example, is an approach that is not new; ghostwriting is at least as old as biblical texts (Weinberg, 2003). However, the Internet has enabled the logistics of contract cheating, and provided layers of anonymity. Using advanced privacy-enhancing technologies such as Tor and Bitcoin, it is possible for both parties in a contract cheating exchange to maintain almost absolute anonymity. Advances in technology thus change the risk and logistics of cheating, which may influence how costly it is for students. Calculations of costs to cheat and subcontract therefore need to be revisited regularly.

Some approaches to cheating or subcontracting consist of largely one-off costs, whereas others include a capital investment and then per-task costs. Contract cheating services are largely one-off purchases, although some services do provide
loyalty discounts. As a comparison, some approaches to cheating in online remote proctored tests involve largely capital costs to set up a cheating workstation environment, which can be reused for many tests. It is even feasible that specialist cheating venues could be set up for remote proctored tests, and hired by students on a per-test basis. Cost to cheat and subcontract therefore need to be considered in the context of a programmatic view of assessment; if teachers reuse the same sort of assignment extensively, it may make capital investment in cheating tools better value.

Taking the above into consideration, the figures presented here are from my context of Melbourne, Australia, in Australian dollars, and contextualised within the higher education sector. Figures are sourced from the open web where possible. Multiple prices have been sourced, and the median is provided.

Cost to cheat and cost to subcontract on five task types

Table 8.1 below provides a summary of five task types, and the full cost to cheat or subcontract using particular approaches. Each task is then briefly explained, along with relevant approaches to cheating and subcontracting.

Table 8.1 Cost to cheat and cost to subcontract in five assessments.

Traditional closed-book exam
  Cheat: Use authorised material to bring unauthorised notes – free or very cheap
  Subcontract: Exam impersonator – allegedly $3,500
Remote proctored exam
  Cheat: Hide unauthorised materials in room – free or very cheap
  Subcontract: Online exam-taking service – $140
Unsupervised online MCQ
  Cheat: Access past questions – free or very cheap
  Subcontract: Collude with friends in same room – free or very cheap
Authentic take-home video task
  Cheat: Copy existing materials – free
  Subcontract: Pay professional to undertake task – hourly rate of professional somewhere in the world
Traditional essay (2000 words)
  Cheat: Back-translation – free
  Subcontract: Contract cheating service – $100

TRADITIONAL CLOSED-BOOK EXAM

Cost to cheat: free or very cheap

Assessment judgements made about student performance in closed-book exams rely upon restricting the information available to
students undertaking the exam. Although there is much debate about the relative merits of open-book vs closed-book exams (J. B. Williams & Wong, 2009), in circumstances where a judgement needs to be made about low-level learning outcomes that rely on memorisation, closed-book may be appropriate. However, if students bring with them the information that they are supposed to have memorised, they can invalidate the assessor's judgement. There are a range of creative approaches to cheating that are free or very cheap. For example, water bottle labels can be printed that appear normal but actually contain cheat notes, and students can hide notes in the bathroom for themselves or others. One popular article on how-to site WikiHow (2016) provides step-by-step pictorial guides for 28 different exam cheating techniques which only require stationery; this article has been viewed over a million times. Assuming students have access to basic study supplies, the cost to cheat with these methods is negligible.

Cost to subcontract: allegedly $3,500

In response to a recent set of media stories, multiple Australian universities have stated that they have identified students using exam impersonators (The Feed, 2015). One of these stories, which employed an undercover hidden-camera approach, found that $3,500 was the going rate to sit an exam. This fee is substantial, and can cover a range of costs on behalf of the impersonator, including constructing a fake student identification card. In my research, I have not been able to find exam impersonation services in my area advertising publicly on the web or the dark web, so I am unable to verify these costs; for reasons of research ethics, I am not comfortable employing the same techniques used by investigative journalists.

A cheaper approach than full-blown in-person impersonation could involve technology mediation. Although many forms of technology are now banned by some institutions (e.g. wristwatches are banned due to the threat of smartwatches), cheaters' earpieces can be bought which hide inside the ear canal and are very difficult to detect. These earpieces can be configured to connect to an outside expert who, under some conditions, may be able to dictate answers to the test. At the more expensive end, the Monorean Pro costs US$600 (AU$885; Monorean, 2019), but cheaper devices can be found for around $20 that claim to provide the same functionality. It is likely that in the future, exam cheating hardware will become smaller, cheaper and easier to hide, making cost to subcontract cheaper as well.

REMOTE PROCTORED EXAM
Cost to cheat: nil to $50

As discussed in Chapter 1, USB key injectors are a proven cheating method that is very hard to detect, even in face-to-face invigilated computer-based exams. This approach to cheating uses a device that looks roughly the same as a USB drive, but tells the computer that it is a keyboard. USB key injectors can be programmed to type any text at any time, and are an effective way to bring study notes or pre-written responses to questions into a computer-based exam. They can be plugged in to hidden USB ports inside a computer case, and to the
exam software appear indistinguishable from an actual keyboard. A typical model costs around $50 (HakShop, 2014). A range of cheaper, low-tech options are also available to bring unauthorised material into a remote proctored exam. The most obvious of these involves placing study notes in a location that cannot be seen by the webcam, even when a mirror is used.

Cost to subcontract: $140

As discussed in Chapter 1, some security researchers claim that some remote proctored examinations can be run inside a virtual machine, allowing students to run other software that cannot be detected by the proctoring or lockdown tools. This can allow a cheating student to hand over control of their computer, and the exam, to an external third party. This would allow the use of online services that can take exams for around $140, or even services that can take entire courses for $1,000. Lower-tech solutions also exist for subcontracting remote proctored exams. A subcontractor can view the exam questions by watching the student's computer monitor through one-way glass. The subcontractor can then communicate through means as low-tech as tugging on fishing line tied to the student's toe. While this is an ineffective strategy against essay tests, multiple-choice exams can be completed through this method. These methods still require input from the subcontractor, who would also need to be paid.

UNSUPERVISED COMPUTER-BASED MULTIPLE-CHOICE TEST
Completely unsupervised tests carry with them an obvious range of free cheating methods, particularly if the test largely focuses on lower-level learning outcomes. All of the cheating options available to remote proctored exams are available. Easier and cheaper options include Googling the questions, and colluding with peers in the same room. These options are free, or very cheap. However, it is important to note that assessment designers are often aware of the ease of cheating in these tasks, and as a result may give the tasks a lower weighting in a course (Bennett, Dawson, Bearman, Molloy, & Boud, 2017). Of the task types considered in this chapter, unsupervised computer-based multiple-choice tests are the cheapest to cheat or impersonate in, and would likely be regarded as the least secure assessment type.

AUTHENTIC TAKE-HOME VIDEO TASK
This task involves creating a ten-minute video-recorded demonstration of professional-level skills. It might be demonstrating a particular skill a physiotherapist would undertake, or an on-camera group role-play task around an accounting problem.

Cost to cheat: free

Take-home tasks are vulnerable to a range of copy-paste cheating. Text-based tasks often handle this through text-matching tools like Turnitin; however, for other tasks there are limited options. For video in particular, there are few options available to educational institutions to automatically detect copy-paste.
In this particular video task, students may cheat by doing a 'mash up' of existing video material available on YouTube with their own skills demonstration. However, if the task is designed to require on-camera demonstration of skills by the student, there is a limit to how much of the assignment can come from other sources.

Cost to subcontract: very task-dependent; hourly rate of professional as an estimate

Despite extensive web searching, services specifically advertising contract cheating for this type of assignment could not be found – however, from what is already known about the contract cheating industry, many sites would likely attempt this task if the price is right. If this task is well-designed, and standards are enforced in marking such that only someone with professional competence in the task could pass, it would require the student to pay for practitioners in their chosen field to spend time on-camera demonstrating on behalf of students, as well as for video recording, directing, and editing. This would be made more or less expensive based on the degree to which a student's appearance is familiar to the marker, combined with global labour market forces around people who look like the student and have the expertise.

TRADITIONAL ESSAY
Cost to cheat: free

Existing content-matching tools do a good job at detecting verbatim copying; however, to an expert cheating student they only add steps to the process. The use of free auto-paraphrase tools allows a student to copy-paste existing work and claim it as their own, effectively making cheating free.

Cost to subcontract: $100

A range of different papers have provided different prices for contract cheating work. The dataset our group bought cost an average of $156, and the typical assignment in that dataset was around 2,000 words (Sutherland-Smith & Dullaghan, 2019). That data was drawn from commercial contract cheating sites. By contrast, analysis of Twitter data has found students are willing to pay even less, only US$33.32 (AU$48.20) per thousand words (Amigud & Lancaster, 2019b). Prices on gig economy site Fiverr were similar, at US$31.73 (AU$45.88) per thousand words (Lancaster, 2019b). While one study on purchased history essays found that more expensive services tended to produce better quality work (Lines, 2016), Sutherland-Smith and Dullaghan (2019) found there was not a strong relationship between price and quality in the assignments our group purchased.
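To make the arithmetic behind a figure like the $100 above concrete, here is a minimal sketch (an illustration only, not the author's actual costing workflow) that converts the per-thousand-word prices quoted in this paragraph into a cost for a 2,000-word essay in Australian dollars and takes the median, following the methodology described earlier in this chapter. Only numbers given in the text are used; the per-task conversion itself is an assumption for illustration.

```python
from statistics import median

# Published price points for subcontracting a ~2,000-word essay, in AUD.
# Sources as cited in the text; the per-task conversion is illustrative.
ESSAY_LENGTH_WORDS = 2000

price_points_aud = {
    "Sutherland-Smith & Dullaghan (2019) purchased dataset": 156.00,  # average per assignment
    "Amigud & Lancaster (2019b) Twitter offers": 48.20 * (ESSAY_LENGTH_WORDS / 1000),  # AU$48.20 per 1,000 words
    "Lancaster (2019b) Fiverr listings": 45.88 * (ESSAY_LENGTH_WORDS / 1000),          # AU$45.88 per 1,000 words
}

for source, cost in price_points_aud.items():
    print(f"{source}: AU${cost:.2f}")

# The median of (156.00, 96.40, 91.76) is AU$96.40, i.e. roughly the $100
# 'cost to subcontract' figure used for the traditional essay in Table 8.1.
print(f"Median cost to subcontract: AU${median(price_points_aud.values()):.2f}")
```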
Limitations of cheating difficulty metrics

While cheating difficulty is a useful metric for understanding the relative security of different assessment approaches or technologies, it is not without challenges. Cheating difficulty is highly contextual and very difficult to generalise; taking the cost-based metrics proposed here, the nuances of the particular assessment tasks would greatly influence how expensive each is to cheat or subcontract. But most importantly, difficulty to cheat needs to be considered alongside the next type of metric I propose, detection accuracy – because no matter how easy a cheating approach is, it is worthless if it is easily detectable.
Detection accuracy metrics

Another way to consider the relative security of different assessment approaches, or different assessment security technologies, is how well they detect cheating. In Chapter 6 there was some discussion of different potential measures for cheating detection, including correct classification rate, sensitivity, specificity, rates of false positives and false negatives, and the predictive value of positive or negative tests. Of these, I think sensitivity and specificity are the most useful when it comes to metrics for assessment security. From a detection accuracy perspective, a highly secure assessment approach or tool needs to spot as much actual cheating as possible, while making as few mistakes as possible. This means high sensitivity and high specificity will be the key to detection accuracy metrics.

Delving deeper into sensitivity and specificity, it is important to consider how precise these measures are. In our team's work on contract cheating, we provide confidence intervals for our studies into marker detection accuracy. For instance, in our study Can training improve marker accuracy at detecting contract cheating? A multidisciplinary pre-post study (Dawson & Sutherland-Smith, 2019) we provide the following Table 8.2:

Table 8.2 Comparing pre- and post-workshop sensitivity and specificity scores. Numbers in brackets are 95% CIs.

              Pre-workshop        Post-workshop       Difference
Sensitivity   0.58 [0.47, 0.68]   0.82 [0.73, 0.89]   0.24 [0.11, 0.37]
Specificity   0.83 [0.77, 0.88]   0.87 [0.82, 0.91]   0.04 [−0.03, 0.11]

From Dawson and Sutherland-Smith (2019).
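To show where figures like these come from, here is a minimal sketch that computes sensitivity, specificity and approximate 95% confidence intervals from raw counts. The pre-workshop counts are the totals reported in Table 8.3 below; the post-workshop counts are assumed purely for illustration. The published study used the DAG_Stat spreadsheet rather than this code, and its exact interval method may differ slightly, although Wilson score intervals reproduce the reported values closely.

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score 95% confidence interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def report(label: str, hits: int, n: int) -> None:
    lo, hi = wilson_ci(hits, n)
    print(f"{label}: {hits / n:.2f} [{lo:.2f}, {hi:.2f}]")

# Pre-workshop totals from Table 8.3: TP = 52, FN = 38, TN = 174, FP = 36.
report("Pre-workshop sensitivity", 52, 52 + 38)    # close to 0.58 [0.47, 0.68] in Table 8.2
report("Pre-workshop specificity", 174, 174 + 36)  # close to 0.83 [0.77, 0.88] in Table 8.2

# Post-workshop counts are NOT given in this chapter; 74/90 and 183/210 are
# assumed here purely to reproduce the reported 0.82 and 0.87.
report("Post-workshop sensitivity", 74, 90)        # close to 0.82 [0.73, 0.89]
report("Post-workshop specificity", 183, 210)      # close to 0.87 [0.82, 0.91]
```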
The pre-workshop column contains accuracy figures for 15 markers from four disciplines who made a total of 300 decisions about the presence or absence of contract cheating, each considering a dataset of 20 assignments from their discipline. The sensitivity row shows that their accuracy at detecting actual contract cheated work was 58%. This is accompanied by a 95% confidence interval, which is a way of expressing the potential for the result to have been influenced by the sample size. A simple interpretation of the meaning of this confidence interval is: we are 95% confident that a much larger study of this type would lead to a sensitivity score between 47% and 68%. That is a wide range, and it indicates that, adjusting for sampling error, sensitivity could really have been as low as less than half or as high as more than two thirds. Similar interpretations are possible for the post-workshop column as well, for example: the specificity of markers after a training workshop on detecting contract cheating was 87%, and we are 95% confident that a much larger study of the same type would find results between 82% and 91%. This might not sound particularly dramatic expressed as specificity, but consider the inverse:
the false positive rate. These confidence intervals suggest that a false positive rate somewhere between 9% and 18% would be the finding in a much larger study; as these figures differ by a factor of two, we should treat them with caution. Confidence intervals are therefore highly important, as they give a sense of how much of the result of the study might be due to random chance arising from the sample size.

In the same study, we also calculated how much better or worse markers were after the workshop compared to before the workshop. This is presented in the Difference column. You can see that markers spotted an additional 24% of contract cheating correctly after the workshop, improving from 58% to 82% in sensitivity. Confidence intervals are also provided here, spanning from 11% to 37%. These show a 95% confidence level that a much larger study would produce a difference of between 11% and 37% better detection after the workshop. This is regarded as a significant difference because these figures do not span 0%. In contrast, the difference for specificity of 4% is not significant because its confidence interval of −3% to 11% does span zero; this means that while a much larger study might find an even greater increase in specificity of 11%, it could also find a decrease in specificity after the workshop of 3%. Where possible, confidence intervals should be provided when comparing different approaches, as they allow us to say if the difference between approaches is significant or just an artefact of having a small sample size. If confidence intervals are not provided, we should be extremely sceptical of any claims that one approach to detecting cheating is better than another.

An ideal detection accuracy metric would therefore include sensitivity and specificity, as well as confidence intervals for both measures. Any comparisons between approaches should include confidence intervals as well. Comparability raises two challenges. Firstly, there is a need for greater detail in the results provided alongside statistics like sensitivity and specificity, to enable the calculation of differences between approaches. And secondly, there is a need for shared datasets, in order to be able to conduct these comparisons.

The provision of sensitivity and specificity scores with confidence intervals should be accompanied by a table showing the number of true positives (cheating accurately detected), true negatives (legitimate work accurately detected), false positives (legitimate work identified as cheating), and false negatives (cheating work identified as legitimate). One such table is reproduced from Dawson and Sutherland-Smith (2019), showing the accuracy of different groups of markers at detecting contract cheating before a training workshop (Table 8.3). Providing this sort of table allows other researchers to conduct their own statistical tests, which enables them to verify the results in the paper, and also to reuse them in other studies. When there are multiple similar studies presenting this sort of data, they can be meta-analysed, providing a single overall estimate for measures like sensitivity and specificity that draws on all the published data.

At present, when researchers want to study detection rates for different types of cheating, they need to gather their own datasets. For example, when we studied contract cheating, we needed to get permission from students to use their legitimate assignments and we also needed to purchase contract cheating assignments from a range of websites.
Table 8.3 Example of the sort of data necessary for calculating sensitivity and specificity from detection studies.

                   Psychology   Nutrition   Marketing   Biology   Total
True positives         21           14           4          13      52
True negatives         44           48          24          58     174
False positives        12            8           4          12      36
False negatives         3           10           8          17      38
Total                  80           80          40         100     300

Reproduced from Dawson and Sutherland-Smith (2019).

Not only is this process expensive and time-consuming, it is also ethically challenging: as contract cheating researchers, the last thing we
want to do is fund an industry we despise. However, in the absence of a freely available corpus of contract cheated work, we found it necessary and ethically justified to purchase assignments for our studies. In the future, it would be helpful if a group of researchers were to pool their individual contract cheating assignment datasets into one larger dataset, and share it as a public dataset. This would connect with broader movements in the sciences and social sciences towards 'open data', which argue that research data should be shared to allow reuse and replication of results (Gewin, 2016).

An open dataset of contract cheating assignments could be used in a number of ways to study detection accuracy. Firstly, researchers studying the detection of contract cheating could use the dataset to conduct studies similar to ours and see if their results are similar. In educational research there is something of a 'replication crisis' (Makel & Plucker, 2014) – very few replication studies are conducted. Makel and Plucker (2014) examined the entire publication history of the top 100 education journals, and found that only 0.13% of articles were replication studies. While those replication studies largely succeeded in reproducing the results of the original studies, where there was no overlap in authorship with the original study the likelihood of a successful replication was significantly lower. Open data can help address this crisis by allowing researchers to use the same datasets. Knowing that findings from one dataset are not peculiar to an institution or a particular group of markers would help researchers and practitioners assess the generalisability of sensitivity and specificity scores.

One important caveat to open data is its potential misuse by technology vendors, whether intentional or unintentional. If a dataset is shared on a public webpage or as supplementary materials to a journal article, it is likely to quickly become indexed by search engines and content-matching vendors. This would mean those tools would behave in an unexpected way when tested on the dataset. For example, a public corpus of contract cheating assignments would likely yield a 100% similarity score on Turnitin, simply because Turnitin had indexed that corpus. This means that vendors would need to be alerted to make special exceptions for this corpus. However, there is also the possibility that some vendors could specifically train or optimise their tools to perform well against such a public corpus of assignments. Vendors would need to agree upfront not to use this dataset for
development of their products, and to only use it for testing. Or, alternatively, the dataset could be partitioned into a training dataset and a testing dataset, which is a common approach used to ensure machine learning algorithms do not overfit to a particular dataset.

Limitations of detection accuracy metrics

Detection accuracy metrics measure an important component of assessment security: how well can we spot cheating, and can we do so without false positives? But they are not a panacea. Exclusively relying on detection accuracy would also mean ceasing to engineer assessments and tools to make cheating very difficult. But more importantly, no matter how sure we are that we have detected cheating, evidencing and prosecuting it is something else entirely. This means a detection metric needs to be considered alongside the next type of metric I will discuss: proof metrics.
Proof metrics

Let's say that we have an assessment security approach with a 95% detection specificity and a 95% detection sensitivity for contract cheating, and we have just detected that a student has contract cheated. What happens next varies greatly between contexts. In most Australian universities, there are formal procedures to follow that, in this case, would usually involve the allegation being put to the student and then heard at a formal academic integrity panel. At such a panel, clear evidence needs to be put forward that satisfies a burden of proof, usually the 'balance of probabilities' – that is, it is more likely that the student has cheated than that they have not cheated. This is a lower burden of proof than 'beyond reasonable doubt'. The high specificity of the detection approach might be enough to sway some academic integrity panel members; however, others would likely point out that one in 20 legitimate assignments would be flagged as cheating under such an approach. This might lead to many cases of contract cheating being thrown out, for fear of wrongly finding against non-cheating students. It might also lead to fewer academics pursuing cases of contract cheating, which already have a reputation for being too time-consuming and too difficult to prove (Harper et al., 2018). A robust assessment security approach must therefore also be measured by the proportion of cases it can correctly prove as cheating, and the proportion of suspected cheating cases that are correctly cleared as not cheating; this could again be measured by sensitivity and specificity. While in detection we may be happy to trade off a little specificity for a lot of sensitivity, proof metrics are the inverse: the consequences of wrongly finding against a student are probably worse than the consequences of wrongly dismissing allegations against someone who has cheated.

Proof metrics have the potential to be both a powerful deterrent and a powerful temptation. Taking contract cheating as an example, the rate at which it is claimed UK universities successfully detect and prove contract cheating – less than one per cent, as reported in the media (Lee, 2019) – is unlikely to be much of a
deterrent for potential cheating students. If anything, such a low proof rate may make contract cheating seem like a safe bet. However, if there was robust and transparent evidence that there actually was a high likelihood of cheating not only being detected but also being proven, those same students may be less likely to cheat.

Developing this sort of robust evidence for proof metrics would require research in a similar vein to what is required for detection accuracy metrics. The key difference is that the decision-makers would need to feel the weight of their decisions as if they were passing judgement on real people. This would mean that artificial environments would be less useful for proof metrics than real decision-making scenarios. However, experimenting with live proof scenarios involving real cases is also challenging, both for matters of research ethics and because we cannot know in any given real instance if cheating has actually occurred. One possible research design for gathering data towards proof metrics could be gaining permission from a decision-making body to insert occasional test cases, for which the presence or absence of cheating is already known, and have these judged alongside real cases without flagging them as test cases. This would mean that decision-makers would need to treat the cases with the gravity of a real case, and it would make the context of the decision-making as natural as possible.

Limitations of proof metrics

Proof metrics are essential to assessment security as they allow us to demonstrate the likelihood that any given instance of cheating that is accurately detected will be proven. However, on their own proof metrics are an incomplete measure of assessment security, as they rely on detection being sensitive enough to find cases to prove. In practice, there will likely be a preference towards specificity rather than sensitivity for proof metrics and, while I agree with this preference, it will likely result in a significant number of situations where cheating goes unproven. But on their own, proof metrics are unable to tell us the proportion of unknown cheating that goes unproven; for that, we would also need to know about the prevalence of cheating in a given population.
Prevalence metrics

In terms of harm to broader society beyond the education sector, what ultimately matters most in assessment security is the prevalence of unproven cheating. This is because unproven cheating leads to graduates who have fraudulently been certified as having met particular learning outcomes, unlike cases of proven cheating. A hypothetical situation where there is an epidemic of cheating but all cases are detected and proven is, therefore, better from an assessment security perspective than another hypothetical situation where there is a moderate amount of cheating that goes entirely undetected. That is not to say that the cheating epidemic situation is at all acceptable; it is probably a sign that things are going wrong well outside the scope of assessment security. Prevalence metrics will therefore be key in establishing the relative assessment security of different approaches or tools.
The prevalence of undetected cheating is a function of the prevalence of cheating, the proportion of cheating that is detected, and the proportion of that cheating that is ultimately proven. This can be expressed as:

Prevalence of undetected cheating = Prevalence of cheating × (1 − (detection sensitivity × proof sensitivity))

For example, a scenario with 5% cheating, a detection sensitivity of 50% and a proof sensitivity of 50% would have the following prevalence of undetected cheating:

Prevalence of undetected cheating = 5% × (1 − (50% × 50%)) = 3.75%

In other words, in a cohort of 1,000 students, we might estimate 38 students will likely cheat and still be eligible for credit.
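A minimal sketch of this calculation (illustrative only; the percentages are those of the worked example above, not empirical estimates):

```python
def undetected_cheating_prevalence(prevalence: float,
                                   detection_sensitivity: float,
                                   proof_sensitivity: float) -> float:
    """Prevalence of cheating that is never both detected and proven."""
    return prevalence * (1 - detection_sensitivity * proof_sensitivity)

rate = undetected_cheating_prevalence(prevalence=0.05,
                                      detection_sensitivity=0.50,
                                      proof_sensitivity=0.50)
print(f"{rate:.2%}")                                           # 3.75%
print(f"{round(rate * 1000)} students in a cohort of 1,000")   # 38 students
```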
However, in order to make such a calculation, we would need evidence about the prevalence of cheating in a particular cohort. This sort of evidence is difficult to obtain and usually imperfect. Self-report surveys are the dominant approach in the literature to calculating the prevalence of cheating. These have been conducted across many contexts and over several decades (Bretag et al., 2018; Curtis & Clare, 2017; Newton, 2018). For contract cheating, there have been ten studies showing prevalence rates of more than 20%; however, almost double that number have found prevalence figures of 1% or less (Newton, 2018). Cheating prevalence is likely to vary greatly from context to context, and different self-report designs yield different rates of honest self-disclosure. Newton (2018) notes that, in the broader survey literature, people are not completely honest when self-disclosing deviant behaviours like cheating, and fear of data misuse often drives poorer response rates on sensitive topics. He also observes that the people who tend to fill in surveys are largely 'older, female, well-educated, and from a higher socioeconomic background' – precisely the sort of people who are less likely to engage in cheating (Newton, 2018). Self-disclosure surveys are therefore limited, in terms of their accuracy, in helping us to understand the true prevalence rates of various types of cheating.

There are some potential alternatives to self-disclosure surveys for estimating the prevalence of cheating; however, they are also imperfect. Cheating market research can give some indication of the scope of the cheating provision available and, in particular circumstances, even the number of cheating contracts that have been made (Amigud, 2019; Amigud & Lancaster, 2019b; Lancaster, 2019b, 2019c). At best, however, this market research only provides fairly narrow data, in that it focuses on specific cheating methods and marketplaces; it also tends not to be very useful for identifying non-commercial cheating. Data security breaches of cheating providers can also provide some data about specific institutions, as occurred in the 2014 breach of the MyMaster contract cheating site (Visentin, 2015), which resulted in figures being published in Australian national newspapers about the rates of contract cheating across more than ten Australian universities. However, although these numbers may look big – some universities had more than 100 orders for cheated work – they still represent a narrow view of prevalence, as they are only from a single cheating provider.

Limitations of prevalence metrics

Knowing the prevalence of unproven cheating lets us know the proportion of students who are receiving credit for learning outcomes they have cheated rather than achieved. One key limitation of this type of metric is that it requires data we have not yet figured out how to obtain: the true rate of cheating in a population. However, another key limitation of prevalence metrics – as well as difficulty, detection and proof metrics – is that they have no concern for impacts beyond assessment security. There is also a need for metrics that consider how assessment security impacts on learning, teaching, assessment, and the student experience.
Learning, teaching, assessment and student experience metrics

In the context of cybersecurity, Gene Spafford is often quoted as saying, 'The only truly secure system is one that is powered off, cast in a block of concrete and sealed in a lead-lined room with armed guards – and even then I have my doubts' (as quoted by Dewdney, 1989). It is possible to concoct a similarly secure assessment system – and it would be similarly unusable. Like cybersecurity, assessment security involves compromises between security and usability, which in our context includes impacts on learning, teaching, assessment, and the student experience. While not strictly 'assessment security' metrics (as they do not measure the security of assessment), learning, teaching, assessment and student experience metrics are an important counterbalance to the metrics discussed earlier.

Assessment plays an essential role in guiding student learning, and there is an entire field of assessment research and practice that privileges assessment's learning purposes, usually under the banner of terms like 'formative assessment', 'assessment for learning', or 'learning oriented assessment' (Carless, 2017). If assessment security leads to poorer learning, then we must carefully consider if such a trade-off is acceptable. Boud (1995) argues that we must consider the 'consequential validity' of assessment: the impacts beyond the immediate act of assessment on students and student learning. This concept transfers directly to assessment security. For example, if we are afraid of cheating in a take-home task so we shift to a closed-book remote proctored examination, the concept of consequential validity might prompt us to consider the learning activities that will surround the examination. If the exam leads to students adopting short-term cramming study strategies, rather than deeper learning strategies, the exam might have poorer consequential validity, as students are likely to forget what they have crammed. We therefore need to balance assessment security metrics against the impacts of assessment security on learning, which may be much harder to quantify.

Teachers and teaching are also an important consideration in assessment security. Some of the assessment security approaches that we might consider potentially effective, such as conducting mini-vivas or oral examinations for every assessment task,
might be infeasible from the perspective of teacher workload. Teachers also have their own beliefs about assessment and operate within disciplinary and departmental contexts with their own norms, the combination of which can be more powerful than any research or guideline in shaping what teachers do (Bearman et al., 2016). Teacher capabilities must also be considered: teachers may not have the skills required to implement particular approaches, nor the inclination to learn particular technologies. Without teacher support, even the most secure assessment approach is doomed to failure, as it is ultimately teachers who implement assessment. As with learning metrics, the impact of assessment security on educators, along with its congruence with their beliefs and their ability to implement it, is challenging to quantify through metrics, and may be more effectively investigated qualitatively.

Just as a focus on assessment security could lead to poorer assessment for learning, assessment security can also lead to poorer assessment of learning. Restricting students from bringing any outside materials into an examination in the name of assessment security might stop some types of cheating, but if the examination is assessing anything other than memorisation of unistructural or multistructural outcomes (Biggs, 1999), this may render the student less able to demonstrate their actual capabilities. Standard metrics used to measure the quality of assessment of learning, such as reliability and validity, should be applied when considering the impacts of assessment security on assessment of learning.

Finally, it is worth considering the sacrifices made for assessment security in terms of the student experience. This includes high-level concerns around surveillance, as discussed in Chapter 7, but also some lower-level practical concerns. Some assessment security technologies are simply really annoying. For example, in most ProctorU examinations over the past six months students have had to remove unauthorised materials from their room before the exam, and proctors have stopped the exam to intervene in 7.3% of cases (ProctorU, 2020b). Most of these students likely had no desire to cheat, and if it were me sitting the exam, I would find such disruptions annoying. However, when considering the student experience, it is also important to consider the alternatives; for example, if I had the choice between a remote proctored examination and a two-hour drive to campus, I would probably choose the remote proctored option. Assessment security has the potential to both hurt and help the student experience. Existing student experience instruments, such as end-of-semester surveys, could easily incorporate metrics to measure student experience of assessment security. However, in-depth qualitative research into the experiences of students while undertaking assessments is also needed.

Limitations of learning, teaching, assessment and student experience metrics

It is crucial that we consider the impacts of assessment security on broader educational matters like learning, teaching, assessment and the student experience, and metrics may be one way to quantify such impacts. However, it is important to note that, while necessary, these metrics are not assessment security metrics. They tell us nothing about the extent to which an assessment is resistant to cheating, or the likelihood of us successfully identifying cheating and not awarding credit for
cheated work. Metrics for learning, teaching, assessment and the student experience should instead be considered an important counterweight to assessment security, helping us to understand the potential harms (as well as any serendipitous benefits) that may come alongside our attempts to stop cheating.
Towards standards for assessment security

Metrics are a valuable way of thinking about how we might compare different assessment security approaches. However, they do little to communicate if an approach is of sufficient quality – just as, when we assess students, knowing one student scored better than another student does not tell us if either of the students is competent. In addition to metrics, we also need standards that outline the minimum and aspirational levels of assessment security – just as, when assessing students, we need to know if they have met the required or aspirational levels of competence with respect to particular learning outcomes.

There has already been some work done towards assessment security standards in the context of micro-credentials (short, often online courses, such as Massive Open Online Courses or MOOCs). In her proposal for a 'critical information summary' for micro-credentials, Oliver (2019) identifies six levels of assessment security under the heading of 'Supervision and identity verification' (Table 8.4).

Table 8.4 Six levels of supervision and identity verification, from Oliver (2019).

Supervision and identity verification – choose one (best fit):
  Unsupervised, no identity verification
  Supervised online or on-site, identity not verified
  Supervised online, identity verified (one factor)
  Supervised online, identity verified (two factors)
  Supervised on-site, identity verified (one factor)
  Supervised on-site, identity verified (two factors)

Oliver's (2019) hierarchy employs two concepts: supervision, which is similar to the assessment security concept of control of circumstances; and identity verification, which is similar to the assessment security concept of authentication. Typical take-home assessments, like essays, would sit at the lowest level of Oliver's hierarchy, as no attempts are made to verify that the student is who they say they are, or to supervise students while they produce their work. A traditional examination, viva or in-person simulation would sit at the highest level of the list, because attempts are made to control for both identity and circumstances. Oliver argues that this sort of information is important not only to potential students but also to potential employers, as it gives them a degree of trust in the integrity of the qualification.

Extending the idea of a critical information summary for supervision and identity verification, there is a need for standards that list not only the approaches used, but also how broadly and strategically they are used, and how effective they are. Table 8.5 demonstrates how such standards could look; however, these are hypothetical and have not been validated.
Table 8.5 Hypothetical standards for assessment security. Levels: Unacceptable (any of the following); Minimum (all of the following, as appropriate, and no unacceptable); Aspirational (minimum, plus); Excessive/overkill (aspirational, plus).

Coverage across a programme
  Unacceptable: No rationale for which tasks are secure
  Minimum: The major task in every unit/course is secure as below
  Aspirational: At a degree level, all major outcomes have been assessed in a secure manner as below
  Excessive/overkill: All tasks are secure as below

Authentication
  Unacceptable: No attempt to verify identity
  Minimum: Photo ID; stylometrics; biometrics
  Aspirational: Dialogue with students on completion of tasks
  Excessive/overkill: DNA tests

Control of circumstances
  Unacceptable: No attempt to control circumstances
  Minimum: Proctoring (remote or in person); lockdown; text matching
  Aspirational: Content matching
  Excessive/overkill: Constant surveillance

Difficulty to cheat metrics
  Unacceptable: No attempt to verify difficulty to cheat
  Minimum: Educators attempt to cheat to test difficulty
  Aspirational: Audit by specialists shows high difficulty
  Excessive/overkill: Audit by specialists shows impossible to cheat

Detection accuracy metrics
  Unacceptable: No attempt to verify detection accuracy
  Minimum: Detection methods based on research
  Aspirational: Local detection methods verified by research
  Excessive/overkill: Perfect detection accuracy

Proof metrics
  Unacceptable: No attempt to verify proof rates; proven cases of cheating are rare
  Minimum: Record keeping of proof rates; proven cases of cheating are common
  Aspirational: External audit of proof rates; benchmarking with other similar programmes
  Excessive/overkill: Experimental studies conducted to validate proof rates

Prevalence metrics
  Unacceptable: No attempt to compare rates of proven cases with expected rates
  Minimum: Rates of proven cases benchmarked against prevalence data and similar programmes
  Aspirational: Existing research used to calculate prevalence of unproven cheating, which is less than 1% of student body
  Excessive/overkill: Local prevalence data is used to calculate prevalence of unproven cheating
This set of hypothetical standards could be applied across an institution as a whole or within a more local setting, such as a department or a degree programme. To meet the overall standard for a level, every indicator for that level must be met. For example, to meet the minimum standard, all indicators in the minimum column must be met: the major task in every unit/course is secured, through approaches like photo ID checking, stylometrics or biometrics (as appropriate), with either remote or in-person proctoring, lockdown or text matching (as appropriate). The minimum standard is also contingent on using detection methods based in research and evidence, with records kept on proof rates that show cases of proven cheating are common, and that are compared against prevalence data from the literature and other comparable programmes.

There is a need for institutions, national regulators or the academic integrity community to outline clearly what the acceptable standards for assessment security should be. Table 8.5 is my attempt; however, there is a need for more hard data to validate these hypothetical standards. Validation would best be achieved through consultation with different stakeholder groups about what they think is enforceable, practical, necessary and ethical, and where they think the line should be. Whatever the result of such a process, it is important that standards are transparent and easily understood by potential and current students, accrediting bodies, employers and other institutions. Until we have clear standards that are widely shared and based on sound evidence, the security of assessment is based largely on patchy evidence at best, and hunches at worst. Given assessment security is the final arbiter of accreditation for life-and-death professions, the stakes are too high to leave it up to intuition.
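To illustrate the 'every indicator for a level must be met' logic, here is a minimal sketch of how the hypothetical standards in Table 8.5 might be encoded and applied. The programme description and the reduction of the table to yes/no indicators are invented for the example; the standards themselves remain unvalidated.

```python
# Hypothetical minimum-level criteria from Table 8.5, reduced to yes/no indicators.
MINIMUM_INDICATORS = [
    "major task in every unit/course is secure",
    "authentication for major tasks (photo ID, stylometrics or biometrics)",
    "control of circumstances for major tasks (proctoring, lockdown or text matching)",
    "educators attempt to cheat to test difficulty",
    "detection methods based on research",
    "proof rates recorded and proven cases of cheating are common",
    "proven cases benchmarked against prevalence data and similar programmes",
]

def meets_minimum(programme: dict[str, bool]) -> bool:
    """A level is met only if every indicator for that level is met."""
    return all(programme.get(indicator, False) for indicator in MINIMUM_INDICATORS)

# An invented example programme: strong on task security, but no proof-rate records.
example_programme = {indicator: True for indicator in MINIMUM_INDICATORS}
example_programme["proof rates recorded and proven cases of cheating are common"] = False

print(meets_minimum(example_programme))  # False: one unmet indicator fails the level
```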
Things to do

• Use the standards in Table 8.5 to evaluate the assessment security of your programme. Does it meet the minimum standards? What are its strengths and weaknesses?
• Develop your own version of Table 8.5, with the standards you think are most appropriate.
• Ask the vendors of assessment security technologies you use for data against the metrics supplied in this chapter. Are they able to provide these metrics? What evidence can they share about the metrics? Have their metrics been peer-reviewed?
• Find data about the prevalence of different types of cheating in your context. At the time of writing there are numerous studies about prevalence; however, a scan of these studies suggests that prevalence is highly context- and time-dependent. Try to get prevalence data that is as recent and specific to your context as possible.
• Consider conducting a study on some of the metrics in this chapter. Ideas include: a survey about prevalence; a study into the detection rates of approaches you use; gathering cost estimates for different types of cheating; an evaluation of the efficacy of the investigation and proof processes you use; or something entirely different that helps you understand if your processes work. Try to use existing methodologies and instruments wherever you can.
• If you are statistically inclined, you may wish to experiment with some of the statistics that we use to evaluate the effectiveness of different approaches to assessment security. The main statistical package we use is DAG_Stat (Mackinnon, 2000), which takes the form of an Excel spreadsheet that is surprisingly easy to use (available at https://www.biostats.com.au/DAG_Stat/). If you want to compare the effectiveness of two different approaches and determine if one is significantly more accurate in terms of sensitivity or specificity, you can use Newcombe's (2001) method, which has also been turned into an easy-to-use Excel spreadsheet (available at http://profrobertnewcomberesources.yolasite.com/). A rough code sketch of this comparison follows below.
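For readers who want to try the comparison described in the last point without a spreadsheet, here is a minimal sketch of a Newcombe-style hybrid score interval, built from Wilson score intervals, for the difference between two independent proportions. Applied to the pre-workshop sensitivity counts used earlier and an assumed post-workshop count, it reproduces a difference of roughly 0.24 [0.11, 0.37]. This sketch treats the two groups as independent; Newcombe also describes variants for paired designs, which may be more appropriate for pre-post data on the same markers.

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    # Same Wilson score helper as in the earlier sketch.
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def newcombe_diff_ci(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float, float]:
    """Difference (p1 - p2) with a hybrid score interval, assuming independence."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_ci(x1, n1)
    l2, u2 = wilson_ci(x2, n2)
    diff = p1 - p2
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper

# Post-workshop sensitivity (assumed 74/90) vs pre-workshop sensitivity (52/90).
diff, lo, hi = newcombe_diff_ci(74, 90, 52, 90)
print(f"Difference: {diff:.2f} [{lo:.2f}, {hi:.2f}]")  # about 0.24 [0.11, 0.37]
```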
9
Structuring and designing assessment for security
Assessment is one of the most powerful influences on student learning and behaviour. Assessment is a gatekeeper and an accreditor. Assessment is official and it has stakes. Taking Ramsden's (1992) influential statement that assessment 'always defines the actual curriculum' (p. 187) for students, it is easy to infer that changes to assessment might change student behaviour around cheating. If we set tasks that students actually want to do, teach them how to do those tasks well, and don't schedule them all at once, fewer students might be tempted to cheat. But what about determined cheaters – can assessment design also improve assessment security? There's a lot of advice on this topic, mostly based on thin evidence. This chapter considers approaches to improving assessment security through assessment design that have been disproven. It then covers some obvious mistakes in assessment design that can make assessment less secure. Approaches that appear to have a stronger evidence base are then discussed – some of these build on empirical evidence, and some on the logics of what assessment needs to do. But first, the chapter examines why assessment design fundamentally matters to assessment security, and how assessment design works.
Assessment design trumps assessment security

Assessment security's main purpose is ensuring that cheating does not stop assessment from doing what it is meant to do. But what is assessment meant to do? Each act of assessment can serve a multitude of purposes, which can be summarised as summative, formative, and sustainable (Boud, 2000, 2007; Boud & Soler, 2015). Assessment's summative purpose involves certifying and credentialing that students have achieved particular learning outcomes; marks, grades, standards and criteria are common hallmarks of assessment that serves this purpose. Assessment that privileges the summative purpose over other purposes is often called assessment of learning or summative assessment. Assessment's formative purpose involves guiding students through tasks that are worthwhile learning experiences; feedback, active learning and time on task are common considerations in assessment that serves this purpose. Assessment that privileges the formative purpose over other purposes is often called formative assessment, assessment for learning or assessment as learning (Carless, 2017). Finally, assessment's sustainable purpose is concerned with
preparing learners to function without formal assessment processes. Assessment that privileges the sustainable purpose tends to engage students in making 'evaluative judgements' about their own work and the work of others (Tai et al., 2018). Assessment that focuses on sustainability is termed sustainable assessment. Many tasks serve multiple purposes.

Most or all of our attention in assessment security should be paid to tasks that privilege assessment's summative purpose. Summative assessment is by its very nature high stakes, as it leads to marks and grades that qualify students for degrees and professions. When students cheat in summative assessment, it creates a dangerous situation where qualified but incompetent professionals enter the workplace. It also creates an unfair situation where less capable but cheating students gain opportunities that should have gone to their more capable and more honest peers. The argument for securing summative assessment is clear-cut and has been made throughout this book. But why shouldn't we care as much about securing assessment for learning and sustainable assessment as we should for summative assessment?

Assessment for learning, even if grade-bearing, is primarily intended to support student learning (Carless, 2017). In a properly designed assessment system, where marks in formative tasks have very limited impact on final grades, GPAs, degree classifications and progression, there is little to be gained from securing assessment for learning – and a lot to be lost. As previously discussed in this book, the approaches and technologies of assessment security can hurt learning and the student experience. Since learning comes first in assessment for learning, assessment security needs to take the back seat. The same argument applies to sustainable assessment. In developing their capability to see quality in their own work and the work of others – their evaluative judgement – learners need opportunities to rate their own work and look at what their peers are doing. An obsession with securing assessment – such as prohibitions on students looking at each other's work, or not giving students exemplars for fear they will copy them – can run counter to sustainable assessment.

This is not to say that formative and sustainable assessment should be unconcerned with cheating. The difference is that, while summative assessment needs to focus on assessment security, formative and sustainable assessment need to focus on academic integrity. This may include familiarising students with the approaches and technologies of assessment security, so that they are prepared to undertake summative tasks.

When summative assessment does not meet its summative purpose well enough, there is little point in securing it. For example, when an assessment task does not assess the outcomes it should be assessing, security will do little to ensure that the judgements made are valid for their purpose. This problem of poor alignment (Biggs, 1999) can be addressed through existing course design approaches (Wang, Su, Cheung, Wong, & Kwong, 2012). Similarly, if assessment is norm-referenced (Turnbull, 1989) – for instance, if it is marked to a curve or if there are quotas on how many of each grade can be awarded – a focus on assessment security is likewise misguided. Grades awarded in norm-referenced systems are only capable of
communicating how well a student performed relative to their peers, and they are not able to communicate if students met particular standards against predetermined criteria. Perfect assessment security would only protect that ranking system, and it would not enable us to make valid inferences about whether students had met the course or programme outcomes. Assessment design therefore trumps assessment security. The next section explores how assessment design happens as a complex social process.
Assessment design decisions

There is a lot of advice available about how to design assessment, building on a research literature spanning decades. The same could be argued for academic integrity, and to a much lesser extent for assessment security; however, in assessment, this advice is not routinely actioned. In 2012, I worked with a team of assessment researchers to explore why this might be the case, and we developed a rich understanding of the everyday assessment design practices that academics use (Bearman et al., 2016; Bennett et al., 2017; Boud et al., 2018b; Dawson et al., 2013; Dawson et al., 2014). It is this practical view of assessment design – rather than an idealised one – that I will use to frame my discussion of assessment design here, as a precursor to discussing how assessment design can support assessment security. We conducted 31 interviews with Australian academics, from across different academic disciplines and types of university, to understand how they actually designed and implemented assessment (Bearman et al., 2016). From those interviews and some related literature, we identified six key elements which we developed into the Assessment Design Decisions Framework:
Box 9.1: Excerpt from the Assessment Design Decisions Framework, adapted from Bearman et al. (2014).

Purposes of assessment
How can assessment: (1) support student learning; (2) generate grades that will form part of subsequent certification; and (3) equip learners in making future judgements?

Contexts of assessment
Which of the following attributes needs to be considered in assessment design? What specifically about each can be taken into account? How can tensions between different needs be reconciled?
• Characteristics of learners/students
• Institutional assessment principles and policies
• Professional, vocational or employment-related requirements
• Departmental, disciplinary and personal norms, expectations and ideas
• Overall program and role of the unit/module
• Learning environment, e.g. class size or mode (online/face-to-face/blended)

Learner outcomes
How does assessment align with, and promote, desired student outcomes, including: (1) unit/module learning outcomes; (2) overall program learning outcomes; (3) professional requirements; and (4) students' general professional or intellectual development.

Tasks
Students need to engage with a range of tasks to: (1) develop and (2) demonstrate their learning.
• What is the rationale for each task? How do the tasks drive learning?
• What do the tasks specifically require learners to do?
• How will successful completion be judged?
• How are tasks best distributed across the semester?
• How will students contribute?
• Which tasks will be graded?

Feedback processes
• How are multiple feedback opportunities achieved through the distribution and relationship of tasks across the unit/module/overall program?
• What types of feedback information will be provided and by whom?
• How will learner performance be used to influence the (re)design of later tasks?

Interactions
• How will resistance or engagement from learners or colleagues influence assessment processes?
• How will learners understand what is required in the assessment task(s)?
• What information will be needed to improve this assessment for subsequent occasions?
• What associated changes in teaching and learning activities will be required?
What doesn't work in designing assessment for security

Several researchers and commentators have offered advice on how to design assessment that can stop cheating. Some have even argued that their proposed changes will make cheating impossible. Unfortunately, some of these claims have been thoroughly debunked, and I am not aware of any that are supported by robust evidence. At this stage, we know more about what doesn't work in designing assessment for security than what does work. One proposal that I am guilty of having supported is a shift towards authentic assessment (Dawson, 2015; Race, 2018). The cost metrics in the previous chapter could be interpreted as providing some support for the idea that assessment which requires the capabilities of an industry professional should be more expensive to procure than generic university essays. However, there is evidence from staff and students (Bretag et al., 2019), as well as contract cheating sites (Ellis et al., 2020), that students are successfully purchasing authentic assessments; this strongly suggests that authentic assessment is not a robust approach to securing assessment against contract cheating. This should not be interpreted as a criticism of authentic assessment, just a realisation that it is no panacea for assessment security. This might not matter much if, returning to the discussion about purposes of assessment, authentic assessment is being used primarily as part of assessment for learning – securing authentic tasks then becomes less important. There has been similar debate around time restrictions. While it might sound reasonable to propose that limiting the time students have to complete an assignment will also limit the amount of time they have to obtain a contract cheated assignment, contract cheating writers are adept at handling short turnaround times (Wallace & Newton, 2014). But whereas authentic assessment is generally seen as a positive force in assessment design (Ashford-Rowe et al., 2014; Gulikers et al., 2004; Villarroel et al., 2018), and therefore an acceptable byproduct of misguided attempts to improve assessment security, time restrictions can be seen as more negative due to their potential impacts on stress, workload and equity. These are just two now-disproven examples from recent history of well-meaning suggestions to stop cheating. But while we know that some claimed approaches to improving assessment security through assessment design don't work, we also know that there are some easily avoided assessment design mistakes which weaken assessment security. These are discussed in the next section.
Obvious mistakes that weaken assessment security

In some contexts, assessment security is being undermined by obvious assessment design mistakes – but conversely, avoiding these mistakes may improve assessment security in some contexts. This section reviews four types of common mistakes.

Reusing the same assessment task
Setting the same task (e.g. the same essay question) year after year leads to an accumulation of answers for that task, which students can submit as their own work. While content-matching tools can address this problem to an extent, they are limited. In some contexts, content-matching tools are not configured to save student submissions to a repository; in these contexts, the tools will never detect students resubmitting assignments from previous years. Also, current content-matching tools cannot detect paraphrased work; this means students can resubmit their friend's assignment from last year as long as they paraphrase it manually or use an auto-paraphrase tool. Content-matching tools have not, therefore, solved the problem of using the same task year after year.

Unsupervised online tests
Another obvious assessment design mistake is setting unsupervised online tests. As with all of these mistakes, my criticism here only applies to summative tasks; whilst cheating in online formative tests is a significant problem and can lead to poorer performance in summative tasks (Arnold, 2016), it is not really an assessment security concern. Indeed, when discussing the design of these tasks (Dawson et al., 2014), we heard arguments in favour of their formative use; students working together, in what could be termed collusion, was instead characterised as 'peer learning'. Unsupervised tests can provide students with opportunities to cheat and, whilst increased performance due to cheating can be offset by challenges inherent in taking tests online (Fask, Englander, & Wang, 2014), this sort of compensatory thinking runs counter to ideas of reliability and validity in assessment. Some disciplines appear particularly ill-suited to summative unsupervised online tests, especially mathematical disciplines where there is only one right answer that can easily be copied (Trenholm, 2007). Multiple choice questions are particularly insecure in unsupervised contexts, especially when a single set of questions is presented in the same order. When considering using unsupervised online tests, imagine the worst-case scenario: a group of students in the same room being coached through the test by each other or an outside expert. In recognition of the challenges posed by unsupervised online tests, my university's assessment procedure states:
13.a: the total maximum weighting of any unsupervised tests which are administered online and automatically computer-marked is 20% of the mark
for the unit; administration of those tests is carried out in accordance with the standards set out in Schedule C: Standards for Administering Unsupervised Online Tests which are Automatically Computer-Marked. (Deakin University, 2018)
The "Standards for Administering Unsupervised Online Tests which are Automatically Computer-Marked" document specifically addresses some of the weaknesses of unsupervised online tests that have been discussed here.

Take-home 'one right answer' or lower-level tasks
One of the most obvious assessment design mistakes to impact assessment security is the use of summative take-home tasks for which there is only one right answer. Such tasks clearly invite copying. Similarly, tasks which exclusively assess lower-level outcomes of various learning outcomes taxonomies – e.g. levels one or two in Biggs' (1999) SOLO taxonomy – can be copied. If these outcomes must be assessed summatively, some type of supervision should be mandatory. For higher-level 'one right answer' tasks, the risk of copying might be reduced through individualisation. For example, in an undergraduate cryptanalysis course I studied, every student was provided with a unique piece of encrypted text to decrypt. This ensured that copying was not an option.
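To make the individualisation idea concrete, here is a minimal illustrative sketch in Python. It is a simplification for this book only, not the mechanism used in the course described above, and the plaintexts, seeding scheme and toy substitution cipher are all hypothetical:

```python
# Hypothetical sketch: individualising a 'one right answer' task so that every
# student receives a different instance, derived deterministically from their ID.
import hashlib
import random
import string

PLAINTEXTS = [
    "the quick brown fox jumps over the lazy dog",
    "assessment security is never perfect but it can improve",
    "validity matters more than tradition in assessment design",
]

def substitution_cipher(text: str, seed: int) -> str:
    """Encrypt text with a monoalphabetic substitution derived from the seed."""
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(letters, shuffled))
    return "".join(mapping.get(ch, ch) for ch in text)

def task_for_student(student_id: str) -> dict:
    """Derive a per-student task instance from a hash of the student ID."""
    seed = int(hashlib.sha256(student_id.encode()).hexdigest()[:8], 16)
    plaintext = PLAINTEXTS[seed % len(PLAINTEXTS)]
    return {
        "student_id": student_id,
        "ciphertext": substitution_cipher(plaintext, seed),
        "answer": plaintext,  # held by the assessor only, for marking
    }

if __name__ == "__main__":
    for sid in ("s1234567", "s7654321"):
        print(sid, "->", task_for_student(sid)["ciphertext"])
```

Because each task instance is derived deterministically from the student ID, markers can regenerate the expected answer without keeping a separate answer key for every student.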
Poor examination practices
There is a range of research addressing obvious assessment security mistakes that happen in examinations. McCabe's (2005) influential paper on cheating in North American higher education found that, out of more than 64,000 undergraduates, at least a third had engaged in some type of exam cheating, much of which I would regard as exploiting obvious assessment security mistakes. Of the 64,000 students, one third had learnt what was on a test from someone who had already taken it; this exposes the magnitude of the poor practice of allowing different students to sit the same exam paper at different times. Exam papers should be considered one-use items, and supplementary papers should be prepared for any future sittings of the exam. At least 11% of undergraduates admitted to copying from other students in some way, revealing the prevalence of copyable exam papers. The research is clear on the need to provide alternative but equivalent test papers to students sitting next to each other; in fact, there is so much research on this topic that researchers have been able to explore seemingly niche impacts of the approach, such as the effects of different-coloured exam paper (Tal, Akers, & Hodge, 2008). Returning to the 64,000 undergraduates, 8% admitted to using unauthorised notes, and 5% admitted to using unauthorised devices, exposing weaknesses in proctoring approaches. Such a high prevalence of unauthorised materials calls into question any supposed advantage examinations have over unproctored forms of assessment. Finally, the poor examination practice of verifying student identity by matching the appearance of the person in the exam to the person on the student card they have brought with them has also been called into question. An investigation into exam impersonation by Australian journalists (Potaka & Huang, 2018) found that university student identification cards were easily forged. Relying on these cards, rather than official photographs stored on university systems, is another obvious mistake that weakens assessment security. There is a wealth of other poor exam practices that should be eliminated as well, and their persistence should call into question any default superiority exams are supposed to have over other forms of assessment when it comes to security. Many of these apply equally to high-stakes tests in an exam hall and to in-class quizzes in a classroom. The following sections discuss different assessment design considerations and their potential impacts on assessment security.
Invigilation and assessment security

The extent to which a task will be invigilated, proctored or supervised is a key choice when designing an assessment. Invigilation can be in person or online, and it is usually chosen in an attempt to control circumstances and/or authentication. But even the best invigilation practices do not solve the assessment security problem – they just change what we need to be worried about. One large-scale Australian survey actually found that undetected third-party cheating was more common in exams than assignments (Harper, Bretag, & Rundle, 2020), although it is worth noting that this finding relies on comparing self-reported cheating rates from students against rates of detected cheating reported by staff. In the absence of evidence to the contrary, however, the results call into question the perceived superiority of exams in terms of being more resistant to contract cheating. Invigilated tasks come with a set of assessment security threats that are discussed in Chapter 1. In addition to the metrics and standards discussed in Chapter 8, the key assessment design considerations from an assessment security perspective are:
• Which of these threats is more damaging to the validity of assessor judgements about a particular task?
• Can the task be redesigned so access to outside materials is less of a problem?
• Can students be allowed to use whatever cognitive offloading they want?
• What conditions are absolutely necessary to enforce for assessors to make a valid judgement about these particular outcomes?
Group work and assessment security

Group assessment is very effective at finding out what a group of students can collectively do – but finding out what each individual student in a group is capable of is much more difficult. Group work situations where one person does a disproportionately large amount of work but all students get the same mark are obviously unfair, and they are likely also invalid. It is important to distinguish between the concepts of fairness and validity. Systems are often put in place to reward effort or volume of work in group tasks. These are a reasonable idea from the perspective of fairness. However, they do not necessarily solve the validity problem of accrediting the individuals in a group as having met particular outcomes. A student who does more work has not necessarily demonstrated that they have met the outcomes better than their peers. Consider a task that is ostensibly about assessing students' ability to use particular statistical methods to analyse survey data. One student in the group does the majority of the work, which involves 20 hours spent gathering the data; however, they undertake none of the analysis. The other three students conduct the analysis, which takes them an hour each. The student who gathered the data did most of the work, but they have not demonstrated they have met the outcome being assessed. It might be considered fair to give them a higher mark than their peers, but it would not be valid. It might be considered valid to give them a fail grade, as they have not demonstrated the assessed outcome at all, but this would not be fair. At this stage, the problem is not yet one of assessment security – but what if the group members influence the grades awarded to their peers, as is common in some group tasks (Sridharan, Muttakin, & Mihret, 2018)? Whilst in formative tasks students are often open with their perceptions of their group members' relative contributions, they can be much less forthcoming in summative tasks (Sridharan, Tai, & Boud, 2019); this may be explained, to a small extent, by a bias towards reciprocity in peer ratings (Magin, 2001). Group work peer assessment tasks present opportunities for cheating, and so should be considered from an assessment security perspective as well. If using peer group work assessment, where students will have direct input into their peers' grades, assessors need to have approaches in place to monitor attempts to cheat the process.
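One possible monitoring approach – sketched below purely as an illustration, using hypothetical ratings and an arbitrary threshold, and not as a validated detection method – is to flag pairs of students whose mutual ratings sit well above what the rest of the group gives each of them, so that a human can review the case:

```python
# Hypothetical sketch: flagging possible reciprocity bias in summative peer
# ratings. For each pair, compare the ratings they give each other with the
# ratings the rest of the group gives them; large mutual inflation is flagged.
from itertools import combinations

# ratings[rater][ratee] = score out of 10 given by rater to ratee (invented data)
ratings = {
    "alice": {"bob": 9, "carol": 6, "dan": 6},
    "bob":   {"alice": 9, "carol": 6, "dan": 5},
    "carol": {"alice": 6, "bob": 5, "dan": 7},
    "dan":   {"alice": 6, "bob": 5, "carol": 7},
}

def mean_rating_from_others(ratee: str, exclude: str) -> float:
    """Average score given to ratee by everyone except `exclude` (and themselves)."""
    scores = [given[ratee] for rater, given in ratings.items()
              if rater not in (ratee, exclude) and ratee in given]
    return sum(scores) / len(scores)

def flag_reciprocal_pairs(threshold: float = 2.0) -> list:
    """Return pairs who rate each other well above what the rest of the group gives them."""
    flagged = []
    for a, b in combinations(ratings, 2):
        inflation_a = ratings[a][b] - mean_rating_from_others(b, exclude=a)
        inflation_b = ratings[b][a] - mean_rating_from_others(a, exclude=b)
        if inflation_a >= threshold and inflation_b >= threshold:
            flagged.append((a, b, round(inflation_a, 1), round(inflation_b, 1)))
    return flagged

print(flag_reciprocal_pairs())  # [('alice', 'bob', 4.0, 3.0)]
```

A flag like this is a prompt for a conversation with the group, not evidence of wrongdoing in itself.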
Authentic restrictions and assessment security

Whilst authentic assessment might not, in and of itself, stop particular types of cheating (Ellis et al., 2020), the idea of authenticity may be a useful way to rethink the restrictions placed on students while completing tasks. If a task is redefined to allow students a resource or service that was previously considered cheating, this can eliminate that form of cheating entirely. An 'authentic restrictions' approach to assessment security asks: does this restriction apply to professionals in the discipline as they complete this task? Examples of restrictions that may not be authentic include:
• A prohibition on calculators in an engineering exam, if professional engineers would use a calculator for that task;
• Not allowing students completing a systems analysis assignment to outsource the writing of computer code to somebody else; and
• Banning the use of medical books during a diagnosis task.
It is arguable that students in the above scenarios need to be able to do all of these tasks with these restrictions in place. A case could also be made that each of these restrictions helps promote equity, by not allowing students to buy better materials or outside help. However, each restriction above introduces a degree of inauthenticity to the task. And although these restrictions might appeal to some 'back to basics' sensibilities, they also make the tasks more prone to cheating, because they introduce new conditions that must be upheld for the assessment to work. It may be counterintuitive, but inauthentic restrictions make assessment less secure. When considering restrictions on a task, assessment designers should consider what Gulikers et al. (2004) term the physical and social contexts of a task: what resources, people and settings do professionals use when they do the task?
Programmatic assessment security

The term programmatic assessment refers to assessment that is conceptualised not as a series of one-off, disconnected and haphazard tasks but as a continuous, connected and planned programme (Schuwirth & Van der Vleuten, 2011). Programmatic assessment addresses the concern that everyday assessment practices in postsecondary education might fail to ensure that graduates have actually met the learning outcomes specified at a programme level, even if high grades were achieved in individual tasks and courses. Under a non-programmatic assessment approach, scoring highly in a task that assesses one outcome can offset poor performance in a task assessing another outcome, leading to students who have not met all required outcomes completing the course. However, under a programmatic approach all degree outcomes must be met to pass the programme. While programmatic approaches are not yet widespread in higher education – possibly due to the challenges of implementation (Lawson et al., 2015) – the legislation that governs higher education in Australia can be interpreted as effectively mandating programmatic assessment, because it requires that all degree-level outcomes have been met before an institution may award a degree (Australian Government, 2015). Programmatic assessment may allow us to change the assessment security unit of analysis, shifting from a focus on securing every individual assessment task to a focus on securing the program. To use an analogy: programmatic approaches try to win the war, whereas non-programmatic approaches try to win every battle. By identifying the tasks that assess a particular outcome, we can see which are most important to secure. A greater proportion of assessment security resourcing could be put towards these high-importance tasks, and a lesser proportion towards tasks that do not influence degree-level judgements for that outcome. Tasks that have less influence over programmatic judgements could potentially be recast as formative tasks, and include a greater emphasis on development of academic integrity. The suggestion of a programmatic view of assessment security should not be interpreted as a call for more examinations. If anything, in most contexts it is a call for fewer examinations. The current practice of individual courses each having their own end-of-semester examinations by default is a product of non-programmatic thinking about assessment. In the absence of clarity around which assessments constitute key assessment moments for particular program outcomes, institutions have adopted end-of-course exams in an attempt to secure each course, with the intention that doing so will secure the program. However, under a programmatic approach, assessment security resources would be focused only on key degree-level assessment moments, rather than arbitrarily at the end of each semester. This would mean an end to exams as a default mode of assessment in most circumstances.
But what about key summative, programmatic moments of assessment? Do they need to be examinations? As has been discussed in Chapter 8, we lack hard data on which types of assessment are more secure than other types. Bretag et al. (2019) express concern that a shift to programmatic assessment would mean a shift towards high-stakes examinations, which their data also finds to be the largest site of undetected third-party cheating according to student and teacher self-report data (Harper et al., 2020). In contrast to my recommendation to focus on securing key summative acts of assessment, Bretag et al.'s (2019) preferred approach is the securing of all assessment types. While I share the opinion that examinations should not be trusted as a default secure task type, I think assessment security resources are best invested in trying to secure those task types that (a) are used in assessments that are key decision-making moments from a programmatic perspective or for accreditation purposes, and (b) appear to hold promise as being more secure than other task types. This may include examinations in some contexts; however, in others it may not. What would programmatic assessment security look like? In most instances of assessment, it would look less like an emphasis on assessment security and more like an emphasis on learning. For instance, in a three-year degree the first and second years would probably contain less invigilated assessment, and instead would integrate significant work around developing academic integrity capabilities in students. This reduction in assessment security efforts across most of a programme would result in significant savings, as invigilation and other security measures are costly on a per-task basis. Savings made on reduced security costs could be put toward more secure tasks that matter from a programmatic perspective. This might include investing in more secure examinations, but it could also include other task types that are often deemed prohibitively expensive, such as oral assessments, simulations, observations of workplace learning, or more intensive stylometric approaches. The savings made by not securing every act of assessment would permit multiple types of more secure assessment to be used, to assure each programme-level outcome. This is necessary because, from a programmatic assessment perspective, no single assessment can be considered robust; instead, high-stakes assessment judgements need to be made from multiple observations using different methods (van der Vleuten et al., 2012).
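To illustrate how that prioritisation might be operationalised, here is a minimal sketch; the programme learning outcome (PLO) labels, task names and weights are entirely hypothetical, and ranking tasks by their maximum influence on any outcome is just one possible heuristic:

```python
# Hypothetical sketch: directing assessment security resources towards the tasks
# that carry the most weight in programme-level outcome judgements.
task_outcome_weights = {
    # task: {programme learning outcome: weight in the final judgement of that outcome}
    "year1_quiz":       {"PLO1": 0.1},
    "year2_essay":      {"PLO1": 0.2, "PLO2": 0.2},
    "capstone_project": {"PLO1": 0.7, "PLO2": 0.5, "PLO3": 0.6},
    "final_viva":       {"PLO2": 0.3, "PLO3": 0.4},
}

def security_priority(tasks: dict) -> list:
    """Rank tasks by their greatest influence on any programme-level outcome."""
    ranked = [(max(weights.values()), task) for task, weights in tasks.items()]
    return sorted(ranked, reverse=True)

for weight, task in security_priority(task_outcome_weights):
    tier = "invest in stronger security" if weight >= 0.5 else "emphasise learning and integrity"
    print(f"{task:16} influence={weight:.1f} -> {tier}")
```

In a real programme the mapping would come from curriculum mapping and accreditation requirements rather than a hard-coded dictionary, but the principle is the same: security effort follows programmatic importance.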
Dialogic feedback and assessment security

Recent work has sought to reposition feedback from something teachers do to students into a process that students themselves drive (Boud & Molloy, 2013a; Winstone & Carless, 2019). This view of feedback shifts our emphasis away from feedback information and towards feedback processes. One approach that is particularly compatible with a process view of feedback is a view of feedback as a dialogue (Carless, 2016): that is, an ongoing conversation between the student and other people, such as teachers or peers, towards improvement of the student's work. In addition to its potential to improve learning, further research into dialogic feedback is needed to determine its assessment security potential – the logic being that feedback conversations may be a way for teachers to identify potential cheating,
and also that contract cheating may be more difficult to undertake as the cheating provider would not be present in the feedback interaction. While I think this approach holds promise, I also have some significant concerns. Firstly, I am worried about polluting feedback interactions with concerns about cheating. Feedback is already stymied to an extent by its close connection with marking, sitting in an 'uneasy alliance' (Watling, 2016); adding assessment security to this mix might make it downright adversarial. Expecting feedback to serve an additional assessment security purpose might interfere with feedback's ability to achieve its primary purpose of supporting learning. Secondly, there is as yet no evidence I am aware of that supports the idea that feedback dialogues will actually make cheating more difficult to undertake or easier to detect and, since some cheating providers already offer to engage in feedback processes, a student who took notes or recorded a feedback interaction could feasibly outsource the actions required by dialogic feedback. None of this is a criticism of dialogic feedback; it is merely concern that introducing assessment security to dialogic feedback may come at a cost, and we do not yet know if it would be worth it.
Random assessment security checks

Most of the discussion of assessment security in this book has assumed that, when securing an assessment task, all students will experience the same security measures. In practice, this approach often means the assessment security resourcing invested into each student's assessment is relatively small. An alternative approach could be to randomly select a smaller percentage of student work and subject this selection to more in-depth investigation. Similar approaches are taken to detect tax evasion in many countries – most taxpayers' filings are not investigated thoroughly, but a random subset is selected for audit. The threat of random audit appears to encourage compliance in the tax system; however, the relationship between audit rate and compliance may follow an upside-down U shape, whereby compliance increases with the audit rate up to a point, but then decreases (Mendoza, Wielhouwer, & Kirchler, 2017). One possible explanation for poorer compliance at the high extreme is that very high audit rates communicate distrust to taxpayers, or even an expectation that noncompliance is normal (Mendoza et al., 2017). Random assessment security audits would likely face similar challenges; however, with careful selection of the right rate of audit, they may also support higher compliance rates while reducing the overall level of surveillance conducted on the student body.
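As a minimal sketch of what random selection for audit might look like – the audit rate, seed and submission identifiers are hypothetical, and any real scheme would also need transparent governance around what an in-depth investigation involves – a sample could be drawn as follows:

```python
# Hypothetical sketch: randomly selecting a subset of submissions for in-depth
# assessment security auditing, analogous to random tax audits.
import random

def select_for_audit(submission_ids, audit_rate, seed=None):
    """Return a random sample of submissions to investigate in depth."""
    rng = random.Random(seed)  # a recorded seed keeps the selection reproducible for appeals
    sample_size = max(1, round(len(submission_ids) * audit_rate))
    return rng.sample(submission_ids, sample_size)

submissions = [f"sub-{i:03d}" for i in range(1, 201)]
audited = select_for_audit(submissions, audit_rate=0.05, seed=2021)
print(f"{len(audited)} of {len(submissions)} submissions selected for audit")
```

The choice of audit rate is the substantive decision here; as the tax compliance literature above suggests, more auditing is not always better.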
Assessment design and quality control processes

In some educational contexts, there are significant administrative and quality control processes in place around the design and implementation of assessment. In Australia, most universities have in place committee processes and forms that are required before any changes can be made to an assessment. While there is substantial variability in the level of oversight provided by these processes, and the extent to which they are viewed as helpful or a hindrance (Bearman et al., 2016), as these processes are gatekeepers to the assessment that is enacted they may also be important arbiters of assessment security. Having reviewed the paperwork that accompanies these processes from several Australian universities, I am yet to see any that consider assessment security. Whilst these committees may sometimes choose to discuss assessment security, its absence from the paperwork means that it is not a routinely considered part of assessment change. At a minimum, these processes should be amended to explicitly consider assessment security, especially for assessments that constitute important programmatic moments of assessment. Placing emphasis on these proactive processes, rather than on reactive anti-cheating approaches, may help stop problems before they occur. More broadly, the various quality assurance processes surrounding assessment should be improved to specifically focus on assessment security. Whilst external quality assurance agencies such as UK and Australian regulators already audit for assessment security, other processes like external examining, internal programme reviews and student surveys could fruitfully be amended to include a focus on assessment security. If the approaches detailed in this book are left up to individuals to initiate, or alternatively are prescribed in policy but not integrated into quality processes, it is likely that little systemic improvement to assessment security will occur.
Things to do
• Tackling assessment security from an assessment design perspective requires a good grounding in assessment design. The Assessment Design Decisions project, which I co-led with my colleague Margaret Bearman, has extensive resources available to support assessment design and is grounded in best practices from the literature and the actual experiences of educators. Visit our website: http://assessmentdecisions.org
• If the programmatic assessment suggestion was appealing, it may be valuable to read more about it, as there is a lot more to programmatic assessment than has been presented here. I have focused on the potential benefits of programmatic assessment for assessment security, but it also has great opportunities for improving learning through assessment. Cees van der Vleuten is regarded as a leading scholar in this field, and his website contains several papers and videos on programmatic assessment: https://www.ceesvandervleuten.com/publications/Programmatic-assessment. The UK Transforming the Experience of Students through Assessment project https://www.testa.ac.uk/ and the Australian Assuring Learning project http://assuringlearning.com/ both also have practical examples and guides for programmatic assessment.
• Audit your assessment designs for some of the features discussed in this chapter. Do they contain any of the obvious mistakes? Are the restrictions imposed on students authentic? Are the assessment security approaches used fit for achieving the purposes the program requires of the task?
10 Conclusion
Securing assessment against e-cheating
E-cheating is here to stay, so what can we do about it? This chapter addresses the key concern raised at the beginning of the book: that e-cheating is threatening assessment security. It draws together five key messages from across the book to identify an agenda for securing assessment against e-cheating. The five messages are summarised below:
1. Focus on assessment security alongside academic integrity;
2. Take an affordance-based approach to understanding e-cheating, and a principles-based approach to stopping it;
3. Perfect is the enemy of good, so aim for incremental improvements;
4. Build a culture of evidence and disclosure; and
5. Resist assessment conservatism and assessment security theatre.
You might notice the tone in this chapter is somewhat different to the preceding nine chapters. This is intentional. What remains of this book is not just a research text; it's a plea for action on a topic I believe is a matter of life and death. Without further preamble, here's what I'd really like you to do.
Focus on assessment security alongside academic integrity

The key motivating premise of this book is that educational institutions have a responsibility to society to ensure that graduating students have achieved the outcomes we say they have achieved. Academic integrity is an essential component in achieving this goal, as it develops the cultures and capabilities necessary for students to act ethically. But we also need a sharper and, unfortunately, more negative and adversarial focus. We need to focus on making cheating very difficult, on detecting cases of cheating, and on proving when cheating has happened. We need to focus on assessment security. A failure to focus on assessment security has led to the accreditation of an unknown number of incompetent professionals who have feigned competence through cheating. These doctors, teachers, lawyers, engineers, academics, journalists and accountants walk among us. At the risk of melodrama, our lack of focus on
assessment security has likely led to poorer productivity, an unfair job market, and deaths due to negligence. Some within academic integrity view a focus on assessment security as anathema to their positive mission. I know this because I have workshopped these ideas across institutions and continents while developing this book. I've heard the problem of cheating blamed on neoliberalism, assessment design, poor teaching, distrust, non-traditional students, international students, credentialism, online learning, and a host of other factors. Unless we fix these problems, the argument goes, we won't be able to fix the problem of cheating. Whilst I agree that we need to fix the macro- and micro-level causes of cheating and equip students to act with integrity, there are two reasons why I think this approach is not enough in and of itself. Firstly, some of these problems are intractable social issues without clear solutions. Secondly, there is little evidence to support the claims that solving these problems would stop cheating. But this isn't an either/or proposition. We can work towards discouraging cheating by making the world a better place, and at the same time keep an eye out for cheating. Assessment security and academic integrity can, and must, coexist. We need to move beyond all-or-nothing and dichotomous thinking, and carefully balance assessment security and academic integrity. There is both tension and complementarity between these two foci. But society is full of similar situations, from the balance between policing and crime prevention to the struggle for resources between public health and frontline medical care provision. There are no easy answers, except to say that we need both the positive (academic integrity) and the adversarial (assessment security).
Take an affordance-based approach to understanding e-cheating, and a principles-based approach to stopping it

Chapter 1 introduced the concept of e-cheating. The technologisation of cheating brings with it many challenges, but the one that I am most worried about is the pace of change. We have had centuries to respond to the threat of students bringing notes into examinations, and we have developed strategies we think work reasonably well to minimise this type of cheating. We have had decades to respond to the threat of students copy–pasting from the Internet, and we have also developed technologies that counter this form of cheating. But the shift to e-cheating has been accompanied by the rapid proliferation of new ways to cheat, to which we have not yet responded. Hacking, cheating earpieces, auto-paraphrase tools, automatic content generation and homework-completing apps are just a few of the unsolved problems that have come to the fore over the past decade. Where cheating may have once been developed in isolated pockets by expert cheaters, now cheating can be bought and sold anonymously, shared and pushed through targeted ads and social media influencers, and coded into an app so it's easy for anyone to do. Things are only going to get worse as we progress towards the sort of high-level machine intelligence discussed in Chapter 6, which experts believe has a 90% chance of happening within our students' lifetimes. We cannot continue our reactive approach of responding to new cheating strategies only once they become a problem.
Rather than responding to each new form of e-cheating as it arises, in this book I have advocated trying to understand the potential affordances of new approaches to cheating. The taxonomy in Chapter 1 is a starting point for thinking about these affordances. 'Rules' (policies, procedures, etc.) around cheating need to be written with affordances in mind, rather than focusing on the specifics of every possible way to cheat. Institutional policy need not explicitly ban the use of auto-paraphrase tools or specific homework apps, but it should specify which types of assistance are allowed. Here we can borrow from the field of academic integrity. Rather than proscribe a 'ban-list' of forbidden activities, we may instead wish to prescribe an 'allow-list' of what we consider ethical scholarship, and the specific affordances that we will allow students to make use of.
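As a very small illustration of what an allow-list orientation could look like in practice – the categories and tools named here are hypothetical, and this is a sketch of the concept rather than a recommendation about any particular tool – a task's permitted affordances might be expressed positively and checked against:

```python
# Hypothetical sketch: a per-task allow-list of permitted affordances, expressed
# positively rather than as an ever-growing list of banned tools.
ALLOWED_AFFORDANCES = {
    "information_access": ["course textbook", "lecture notes", "library databases"],
    "processing_support": ["calculator", "spreadsheet software"],
    "language_support":   ["spell checker", "grammar checker"],
}

def is_permitted(tool: str) -> bool:
    """A tool is permitted only if it appears on the allow-list for this task."""
    return any(tool in tools for tools in ALLOWED_AFFORDANCES.values())

for tool in ("calculator", "auto-paraphrase tool", "grammar checker"):
    print(f"{tool}: {'allowed' if is_permitted(tool) else 'not on the allow-list'}")
```

The point is not the code itself but the orientation: anything not explicitly allowed falls outside the task's conditions, so the rules do not need rewriting every time a new tool appears.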
Perfect is the enemy of good, so aim for incremental improvements

As we have seen in cybersecurity, video games and gambling, perfect security is not really practical and might not even be attainable. We should be extremely suspicious of any assessment approaches or technologies that are marketed as being cheat-proof. As the discussion of penetration testing in Chapter 4 explained, we can at best identify cheating approaches that might exist. We can't ever claim to have found all the possible ways to cheat, or that there are no ways to cheat. This means we can never claim to have stopped or detected all possible ways to cheat. Please do not despair or let the unattainability of perfection lead you towards apathy or inaction. Whilst we cannot reach perfect assessment security, we still have a responsibility to try hard to make things better. Rather than setting perfection as a goal, instead consider setting incremental milestones for improvement of assessment security. Conduct an audit of your current assessment security practices, perhaps against the standards in Chapter 8, and identify specific improvements to undertake within a set timeline. Any improvement is better than paralysis due to the unattainability of perfect assessment security.
Build a culture of evidence and disclosure

Academic integrity is built on a culture of trust, and I support that culture of trust as far as it extends to trust between students and educators, tempered by the necessity of assuring assessment security. Whilst we can do more to cultivate that culture of trust, there is one culture of trust I think we should disrupt: the culture of blind trust in the assessment security benefits of assessment traditions, assessment security vendors and their products, anti-cheating legislation, and our own assessment designs. I want to replace this with a culture of evidence and disclosure. As Chapter 8 discussed, we don't currently have a culture of evaluating the assessment security benefits of different assessment approaches. Where there is a lack of evidence, we have seen a proliferation of suggestions and products that, on face value, appear to work, but when held up to the scrutiny of peer-reviewed research their assessment security benefits are often thin. We need to work towards an assessment security culture that not only produces this sort of evidence, but
demands it. This particularly matters when dealing with the multi-billion-dollar assessment security industry. Vendors need to not only produce evidence, but also put that evidence before the scrutiny of peer review and allow external parties to conduct their own studies. If a vendor will not allow us to verify claims of their product's effectiveness, we should not trust that vendor. Similarly, if legal approaches to improving assessment security are proposed – such as criminal sanctions for cheating providers – we should also demand evidence. The sorts of laws that are currently being considered in several countries have been in existence in some other jurisdictions for decades, with no evidence they work and some evidence they do not work. Rather than legal prohibitions on types of cheating, I would prefer to see legal protection for attempts by researchers to test the security of assessment. What should we do if we get evidence about assessment security and it is bad news? In Chapter 4 I argued for a culture of disclosure, whereby researchers are allowed to go public with assessment security vulnerabilities. As a field, we need to have a conversation about the type of disclosure we would prefer. I am partial to responsible disclosure, which would involve researchers informing the relevant vendors or institutions about a vulnerability and negotiating a timeline for full public disclosure. This creates a sense of urgency to fix the vulnerability, but also gives time to fix it. However, for researchers – including students who discover flaws or cheating hotspots – to be able to make such disclosures, they need assurance there will be no reprisals against them. News about an assessment security vulnerability in a vendor's product, or the prevalence of a mode of cheating at an institution, could have disastrous commercial consequences. Within the university context, it may be possible to work within existing cultures and policies of academic freedom to accommodate these – however, any protections would need to be made explicit rather than left up to interpretation. A more robust alternative could be to amend whistleblower legislation that protects employees who speak out publicly about serious matters. If we are serious about securing assessment, as a sector we need to develop a culture of gathering evidence, publicly disclosing problems, and working together to fix them.
Resist assessment conservatism and assessment security theatre

This book has, to an extent, raised the alarm about cheating and called for improvements to assessment security. I hope this leads to a measured response that carefully balances the need for assessment security against the many other competing concerns of education, especially learning and the student experience. However, there is a small part of me that is afraid that what has been presented might be used to justify a more draconian response, characterised by more intensive use of assessment security approaches that restrict and surveil because they appear to stop cheating. Therefore, my last plea in this book is that you work to resist assessment conservatism and assessment security theatre.
Assessment conservatism is a preference towards traditional modes of assessment due to the unevidenced belief that they will stop cheating. This might manifest as more progressive approaches like authentic assessments being replaced by high-stakes examinations. As a researcher who is predominantly from the field of assessment for learning, assessment conservatism worries me, as it brings with it the loss of many advances that have been made in teaching, learning, assessment and the student experience since the 1980s. Assessment security theatre builds on the security theatre notion discussed in Chapter 4. Assessment security theatre is the use of approaches that make us look like we are doing something about cheating, but which actually have no proven anti-cheating benefits. If, for example, a remote proctoring tool does not meaningfully protect against cheating, then it is just an expensive and annoying piece of assessment security theatre. Educational institutions are under pressure from regulators and the media to appear to do something about the various new types of cheating, so it is understandable that they wish to appear to be acting in response. However, being seen to be addressing the problem of cheating and doing something that works are two very different things. Until we have evidence that an approach works, we should remain sceptical, as it may just be assessment security theatre. How can we resist assessment conservatism and assessment security theatre? Here are three approaches. Firstly, we can minimise the impact of assessment conservatism and assessment security theatre by pushing for a programmatic approach to assessment security, as presented in Chapter 9. Under a programmatic view of assessment security, we need to secure far fewer acts of assessment, because we secure the assessments that matter. Secondly, for those assessments we must secure, we can make the argument that restrictions actually make assessment security harder. Whatever we disallow students access to – be it the Internet, a calculator, friends or a textbook – we need to enforce that restriction. An unenforced restriction is much more damaging to assessment security than a task that has been redesigned to be less restrictive. Thirdly, we need to continue to collect and use data about the harms of assessment security. What is the impact of assessment security approaches on student learning? On the student experience? On the reliability and validity of assessment? Only with this data, and accompanying data about the anti-cheating benefits of proposed approaches, can we have a meaningful conversation about assessment security.
References
24Kupi. (2019). Best student watches, calculators and everything you need to pass exam. Retrieved from https://www.24kupi.com/ #NotMyDebt. (2020). NotMyDebt Stories. Retrieved from https://www.notmydebt.com. au/stories/notmydebt-stories AIProctor. (2018). AIProctor | Bleeding edge artifcially intelligent online proctoring like never before. Retrieved from https://www.aiproctor.com/ Airtasker. (2019).Are there any task posting guidelines? Airtasker Help Centre. Retrieved from https://support.airtasker.com/hc/en-au/articles/225001588-Are-there-any-task-postingguidelines Al Jazeera News. (2018, June 21). Algeria and Iraq shut down internet to prevent exam cheating.Al Jazeera. Retrieved from https://www.aljazeera.com/news/2018/06/algeriairaq-shut-internet-prevent-exam-cheating-180621074343644.html Altman, B. (2013). Hiding behind cards: identifying bots and humans in online poker (MSc, University of Guelph). Retrieved from https://atrium.lib.uoguelph.ca/xmlui/ handle/10214/6645 Amazon. (2019). NENRENT S570 bluetooth earbud, smallest mini invisible V4.1 wireless bluetooth earpiece headset headphone earphone with Mic hands-free calls for iPhone iPad Samsung Galaxy LG HTC smartphones 1pcs (Nude). Retrieved from https://www. amazon.com/NENRENT-Bluetooth-Invisible-Hands-Free-Smartphones/dp/ B01MQVMFUN/ref=sr_1_6?keywords=cheating%2Bearpiece&qid=1576458408 &sr=8-6&th=1 Amigud, A. (2019). Cheaters on Twitter: an analysis of engagement approaches of contract cheating services. Studies in Higher Education, 45(3), 692–705. doi:10.1080/03075079. 2018.1564258 Amigud,A.,Arnedo-Moreno, J., Daradoumis,T., & Guerrero-Roldan,A.-E. (2018).An integrative review of security and integrity strategies in an academic environment: current understanding and emerging perspectives. Computers & Security, 76, 50–70. doi:https:// doi.org/10.1016/j.cose.2018.02.021 Amigud,A., & Dawson, P. (2019).The law and the outlaw: is legal prohibition a viable solution to the contract cheating problem? Assessment & Evaluation in Higher Education, 45(1), 98–108. doi:10.1080/02602938.2019.1612851 Amigud,A., & Lancaster,T. (2019a). 246 reasons to cheat: an analysis of students’ reasons for seeking to outsource academic work. Computers & Education, 134, 98–107. doi:https:// doi.org/10.1016/j.compedu.2019.01.017 Amigud,A., & Lancaster,T. (2019b). I will pay someone to do my assignment: an analysis of market demand for contract cheating services on twitter. Assessment & Evaluation in Higher Education, 45(4), 541–553. doi:10.1080/02602938.2019.1670780
Ananny, M., & Crawford, K. (2016). Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973–989. doi:10.1177/1461444816676645 Anderson, D. (2019). Ultimate list of fake essay writing service reviews websites. Retrieved from https://www.ihatewritingessays.com/fake-review-sites Andrews, T. (2019a). Here’s how likely it is your partner will cheat, based on their zodiac sign. Yahoo! Finance. Retrieved from https://fnance.yahoo.com/news/likely-partnercheat-based-zodiac-212700365.html Andrews, T. (2019b). Here’s how likely it is your partner will cheat, based on their zodiac sign. Cosmopolitan. Retrieved from https://www.cosmopolitan.com/sex-love/ a28340080/zodiac-signs-likely-to-cheat/ Andrews, T. (2019c). Here’s how likely it is your partner will cheat, based on their zodiac sign. MSN Lifestyle. Retrieved from https://www.msn.com/en-us/lifestyle/lifestylebuzz/here%E2%80%99s-how-likely-it-is-your-partner-will-cheat-based-on-theirzodiac-sign/ar-AAE5wjg Apperley, T. (2011). Gaming rhythms: play and counterplay from the situated to the global. Amsterdam: Institute of Network Cultures. ARC UNSW Student Life. (2019). Contract cheating | ARC UNSW student life. Retrieved from https://www.arc.unsw.edu.au/help/university-advice/contractcheating#If-I-am-being-blackmailed Arkin, B., Stender, S., & McGraw, G. (2005). Software penetration testing. IEEE Security & Privacy, 3(1), 84–87. doi:10.1109/MSP.2005.23 Arnold, I. J. M. (2016). Cheating at online formative tests: does it pay of? The Internet and Higher Education, 29, 98–106. doi:https://doi.org/10.1016/j.iheduc.2016.02.001 Ashford-Rowe, K., Herrington, J., & Brown, C. (2014). Establishing the critical elements that determine authentic assessment. Assessment & Evaluation in Higher Education, 39(2), 205–222. doi:10.1080/02602938.2013.819566 Ashton, G. (2019, May 27). Cheating in Esports—how is it done, and how is it dealt with? The Esports Observer. Retrieved from https://esportsobserver.com/cheating-in-esports/ Aslam, N. (2019, February 5). Banks banning cryptocurrency purchase on credit cards, why? Forbes. Retrieved from https://www.forbes.com/sites/naeemaslam/2018/02/05/ banks-banning-cryptocurrency-purchase-on-credit-cards-why/#490351c13cf9 Australian Government (2015). Higher education standards framework (threshold standards) 2015, 5.2 CFR. Awdry, R., & Newton, P. M. (2019). Staf views on commercial contract cheating in higher education: a survey study in Australia and the UK. Higher Education, 78(4), 593–610. doi:10.1007/s10734-019-00360-0 Bacik, S. (2016).Whitelisting. In Information security management handbook (Vol. 5, pp. 29–34). Boca Raton, FL:Auerbach Publications. Badley, G. (2003). The crisis in educational research: a pragmatic approach. European Educational Research Journal, 2(2), 296–308. Retrieved from http://www.ncbi.nlm.nih. gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=776 7016247578699752related:6KedMSD8yWsJ Bailey, J. (2013).Viper plagiarism checker: posting your essays on essay mills. Retrieved from https://www.plagiarismtoday.com/2013/11/18/viper-plagiarism-scanner-stealing-essays/ Bakos,Y., Marotta-Wurgler, F., & Trossen, D. R. (2014). Does anyone read the fne print? Consumer attention to standard-form contracts. The Journal of Legal Studies, 43(1), 1–35. doi:10.1086/674424 Barkley, J. (1994). Security in open systems. Washington, DC: US Government Printing Ofce.
148 References Batchelor, J. (2018,April 9). GTA V is the most proftable entertainment product of all time. gamesindustry.biz. Retrieved from https://www.gamesindustry.biz/articles/2018-04-09gta-v-is-the-most-proftable-entertainment-product-of-all-time BBC News. (2013, February 9). German minister Annette Schavan quits over “plagiarism”. BBC News. Retrieved from https://www.bbc.com/news/world-europe-21395102 BBC News. (2019a, December 31). Apple accused of crackdown on jailbreaking. BBC News. Retrieved from https://www.bbc.com/news/technology-50956680 BBC News. (2019b, October 22).The Kenyan ghost writers doing “lazy”Western students’ work. BBC News. Retrieved from https://www.bbc.com/news/av/worldafrica-50126963/the-kenyan-ghost-writers-doing-lazy-western-students-work Bearman, M., Dawson, P., Bennett, S., Hall, M., Molloy, E., Boud, D., & Joughin, G. (2017). How university teachers design assessments: a cross-disciplinary study. Higher Education, 74(1), 49–64. doi:10.1007/s10734-016-0027-7 Bearman, M., Dawson, P., Boud, D., Bennett, S., Hall, M., & Molloy, E. (2016). Support for assessment practice: developing the assessment design decisions framework. Teaching in Higher Education, 21(5), 545–556. doi:10.1080/13562517.2016.1160217 Bearman, M., Dawson, P., Boud, D., Hall, M., Bennett, S., Molloy, E., & Joughin, G. (2014). Guide to the assessment design decisions framework. Retrieved from http://www.assessmentdecisions.org/wp-content/uploads/2014/09/Guide-to-the-Assessment-DesignDecisions-Framework.pdf Bearman, M., & Luckin, R. (in press). Preparing university assessment for a world with AI: tasks for human intelligence. In M. Bearman, P. Dawson, R. Ajjawi, J. Tai, & D. Boud (Eds.), Re-imagining university assessment in a digital world. New York: Springer. Belleza, F. S., & Belleza, S. F. (1989). Detection of cheating on multiple-choice tests by using error-similarity analysis. Teaching of Psychology, 16(3), 151–155. doi:10.1207/ s15328023top1603_15 Bellovin, S. M. (2011). Frank Miller: inventor of the one-time pad. Cryptologia, 35(3), 203– 222. doi:10.1080/01611194.2011.583711 Bennett, S., Dawson, P., Bearman, M., Molloy, E., & Boud, D. (2017). How technology shapes assessment design: fndings from a study of university teachers. British Journal of Educational Technology, 48(2), 672–682. doi:10.1111/bjet.12439 Bergman, K. M. (2015).A target to the heart of the frst amendment: Government endorsement of responsible disclosure as unconstitutional. Northwestern Journal of Technology and Intellectual Property, 13(2), 117. Berners-Lee,T., Hendler, J., & Lassila, O. (2001).The semantic web. Scientifc American, 284(5), 28–37. Bhansali, S. (2016, August 2). How to make learning easier than cheating. Retrieved from https://blog.socratic.org/how-to-make-learning-easier-than-cheating-4dd81d05bbbd Biddle, P. (2017, July 5). AI is making it extremely easy for students to cheat. Wired. Retrieved from https://www.wired.com/story/ai-is-making-it-extremely-easy-forstudents-to-cheat/ Biggs, J. (1999). What the student does: teaching for enhanced learning. Higher Education Research & Development, 18(1), 57–75. doi:10.1080/0729436990180105 Biggs, J., & Tang, C. (2007). Teaching for quality learning at university. Maidenhead: Open University Press. Binns, R. (2018). Algorithmic accountability and public reason. Philosophy & Technology, 31(4), 543–556. doi:10.1007/s13347-017-0263-5 Binstein, J. (2015). On knuckle scanners and cheating—how to bypass proctortrack, examity, and the rest. 
Retrieved from https://jakebinstein.com/blog/on-knuckle-scanners-and-cheating-how-to-bypass-proctortrack-examity-and-the-rest/
Bishop, M. (2007). About penetration testing. IEEE Security & Privacy, 5(6), 84–87. doi:10.1109/MSP.2007.159 Boud, D. (1995). Assessment and learning: contradictory or complementary. In P. Knight (Ed.), Assessment for learning in higher education (pp. 35–48). London: Kogan Page. Boud, D. (2000). Sustainable assessment: rethinking assessment for the learning society. Studies in Continuing Education, 22(2), 151–167. doi:10.1080/713695728 Boud, D. (2007). Reframing assessment as if learning were important. In D. Boud & N. Falchikov (Eds.), Rethinking assessment in higher education: learning for the longer term (pp. 14–28). London: Routledge. Boud, D., Ajjawi, R., Dawson, P., & Tai, J. (Eds.). (2018a). Developing evaluative judgement in higher education: assessment for knowing and producing quality work. London: Routledge. Boud, D., Dawson, P., Bearman, M., Bennett, S., Joughin, G., & Molloy, E. (2018b). Reframing assessment research: through a practice perspective. Studies in Higher Education, 43(7), 1107–1118. doi:10.1080/03075079.2016.1202913 Boud, D., & Molloy, E. (2013a). Rethinking models of feedback for learning: the challenge of design. Assessment & Evaluation in Higher Education, 38(6), 698–712. doi:10.1080/0260 2938.2012.691462 Boud, D., & Molloy, E. (2013b). What is the problem with feedback? In D. Boud & E. Molloy (Eds.), Feedback in higher and professional education: understanding it and doing it well (pp. 1–10). London: Routledge. Boud, D., & Soler, R. (2015). Sustainable assessment revisited. Assessment & Evaluation in Higher Education, 41(3), 400–413. doi:10.1080/02602938.2015.1018133 Bouville, M. (2009). Why is cheating wrong? Studies in Philosophy and Education, 29(1), 67. doi:10.1007/s11217-009-9148-0 BrainDumps. (2019). Retrieved from https://www.braindumps.com/ Bretag,T., Harper, R., Burton, M., Ellis, C., Newton, P., Rozenberg, P., … van Haeringen, K. (2018). Contract cheating: a survey of Australian university students. Studies in Higher Education, 44(11), 1837–1856. doi:10.1080/03075079.2018.1462788 Bretag, T., Harper, R., Burton, M., Ellis, C., Newton, P., van Haeringen, K., … Rozenberg, P. (2019). Contract cheating and assessment design: exploring the relationship. Assessment & Evaluation in Higher Education, 44(5), 676–691. doi:10.1080/02 602938.2018.1527892 Broséus, J., Rhumorbarbe, D., Mireault, C., Ouellette,V., Crispino, F., & Décary-Hétu, D. (2016). Studying illicit drug trafcking on Darknet markets: structure and organisation from a Canadian perspective. Forensic Science International, 264, 7–14. doi:https://doi. org/10.1016/j.forsciint.2016.02.045 Brothen,T., & Peterson, G. (2012). Online exam cheating: a natural experiment. International Journal of Instructional Technology and Distance Learning, 9(2), 15–20. Bursztein, E. (2016, October). Full(er) House:exposing high-end poker cheating devices. Retrieved from https://elie.net/blog/security/fuller-house-exposing-high-end-poker-cheating-devices/ Cadmus. (2019). Retrieved from https://cadmus.io/ Campbell, M., Hoane,A. J., & Hsu, F.-h. (2002). Deep blue. Artifcial Intelligence, 134(1), 57– 83. doi:https://doi.org/10.1016/S0004-3702(01)00129-1 Carless, D. (2009). Trust, distrust and their impact on assessment reform. Assessment & Evaluation in Higher Education, 34(1), 79–89. doi:10.1080/02602930801895786 Carless, D. (2016). Feedback as dialogue. In Encyclopedia of educational philosophy and theory (pp. 1–6). Singapore: Springer Singapore. Carless, D. (2017). 
Index
academic integrity 3, 7, 18, 19–20, 32, 40, 61, 98–107, 129–130, 137–138, 141–143
advertising 39–41
algorithmic accountability 94
apps for cheating 89
artificial intelligence 27–28, 83–97
assessment 20–21, 128–132
assessment design 32–35, 128–141
Assessment Design Decisions Framework 130–131
assessment security 19–21; criticisms of 36; definition 19; key features 21–24; metrics 108–127; standards 125
authentic assessment 32, 132, 136
authentic restrictions 136
authentication 21–22, 27, 60–62, 124–125, 135
auto paraphrase tools 11–12, 15, 51, 110, 115, 133
back translation 11–12; see also auto paraphrase tools
blackmail 35, 55, 56–57
braindumps 10
burden of proof 26, 119
calculators for cheating 9, 11, 14, 23, 45–46, 87, 136
cognitive offloading 11–13, 23, 33–34, 75–76, 86–89
computer algebra solvers 13
computer-based exams 6, 9–10, 14, 16, 27, 62–64, 66, 79; see also remote proctored exams
consequential validity 122
content matching 27, 30–31, 51–52, 99–102, 104, 106, 125, 133
contract cheating 6, 14, 25–26, 39–42, 44–45, 48–51, 52–57, 68, 90–93, 96, 110–112, 115–120, 132; prevalence of 121
control of circumstances 22–24, 27–29, 31, 124–125
cost of different cheating approaches 112
dark web 47
denial of service attack 16–17
detection vs. decision making 90–91
dialogic feedback 56, 138–139
disclosure 66–68, 143–144
e-cheating 1–18; definition 4; lifecycle 38–58; taxonomy 8–9
e-sports 73–75
earpieces for cheating 14, 42, 45, 47, 113
essay spinning see auto paraphrase tools
group work 135–136
hacking 7–8, 15, 59–69
High Level Machine Intelligence (HLMI) 85–86, 142
honour codes 105
learning outcomes 33–34
legal approaches to stopping cheating 35–36
lockdown 27, 29, 77–78, 109–110
online exams see computer-based exams
online gambling 75–77
online games 70–75
penetration testing 65–66
programmatic assessment security 137–138, 140
remote proctored exams 14–15, 26–28, 51–52, 60, 82, 90, 99, 112–114, 122–123, 125–126; see also computer-based exams
sensitivity and specificity 92–93, 96, 116–120
stylometrics 27, 30–31, 56, 90, 99, 101, 125–126
surveillance 26–29, 30, 36, 52, 78, 98–107
surveillance culture 99–100
trust 102–103, 108–109, 143
USB key injector 10, 62, 72, 113
virtual machine 9–10, 63, 79, 114
weaponisation of assessment security 103–104