Ethics of Artificial Intelligence (The International Library of Ethics, Law and Technology, 41) 3031481348, 9783031481345

This book presents the reader with a comprehensive and structured understanding of the ethics of Artificial Intelligence


English Pages 263 [254] Year 2024


Table of contents :
Acknowledgments
Contents
Chapter 1: Introduction
References
Part I: Can an AI System Be Ethical?
Chapter 2: Bias and Discrimination in Machine Decision-Making Systems
2.1 Introduction
2.2 Why Machine Failure Is More Serious
2.3 How Machine Learning Works
2.4 What Is Meant by Machine Discrimination
2.4.1 Fairness Through Unawareness
2.4.2 Individual Fairness
2.4.3 Counterfactual Fairness
2.4.4 Group Fairness
2.4.5 Impossibility of Fairness
2.5 What We Are Talking About: Example of Machine Discrimination
2.6 Why Machine Learning Can Discriminate
2.7 How Machine Discrimination Can Be Overcome
2.7.1 Pre-processing for Fairness
2.7.2 In-training for Fairness
2.7.3 Post-processing for Fairness
2.8 Conclusion
References
Chapter 3: Opacity, Machine Learning and Explainable AI
3.1 Introduction
3.2 Fundamentals of Trustworthy and Explainable Artificial Intelligence
3.3 Dimensions and Strategies for Promoting Explainability and Interpretability
3.3.1 Dimensions of Explainability and Interpretability
3.3.2 Interpretability Strategies
3.4 Digging Deeper on Counterfactual Explanations
3.4.1 Basics of Counterfactual Explanations
3.4.2 Overview on Techniques for Counterfactual Explanations
3.5 Future Challenges for Achieving Explainable Artificial Intelligence
3.5.1 Multimodal Data Fusion for Improved Explainability
3.5.2 Reliable and Auditable Machine Learning Systems
3.5.3 GPAI Algorithms to Learn to Explain
3.6 Concluding Remarks
References
Chapter 4: The Moral Status of AI Entities
4.1 Introduction
4.2 Can Machines Be Moral Agents?
4.3 Do We Need a Mind to Attribute Moral Agency?
4.4 The Challenge of Responsibility
4.5 Artificial Moral Patients and Rights
4.6 Relationalist Proposals
4.7 Conclusion
References
Part II: Ethical Controversies About AI Applications
Chapter 5: Ethics of Virtual Assistants
5.1 Introduction
5.2 What Are Virtual Assistants?
5.3 What Ethical Issues Do Virtual Assistants Raise?
5.3.1 Human Agency and Autonomy
5.3.1.1 Manipulation and Undue Influences
5.3.1.2 Cognitive Degeneration and Dependency
5.3.2 Human Obsolescence
5.3.3 Privacy and Data Collection
5.4 Should We Use Virtual Assistants to Improve Ethical Decisions?
5.5 Concluding Remarks
References
Chapter 6: Ethics of Virtual Reality
6.1 Introduction
6.2 Preliminaries
6.2.1 Prehistory and History
6.2.2 Is It Real?
6.3 My Avatar
6.3.1 What They Reveal About the User
6.3.2 How They Influence the Behaviour of Other Users
6.3.3 How They Influence the User’s Behaviour
6.4 What Is Good
6.5 What Is Bad
6.5.1 Personal Risks
6.5.2 Social Risks
6.6 What Is Weird
6.7 Ethical Issues
6.7.1 Privacy
6.7.2 Ethical Behaviour
6.8 Conclusions
References
Chapter 7: Ethical Problems of the Use of Deepfakes in the Arts and Culture
7.1 Introduction: What Is a Deepfake? Why Could It Be Dangerous?
7.2 Are Deepfakes Applied to Arts and Culture Harmful?
7.2.1 Encoding-Decoding and GAN Deepfakes
7.2.2 The Moral Limit of Artistic Illusion
7.2.3 Resurrecting Authors
7.2.4 Falsifying Style
7.3 The Limits of Authorship
7.4 Conclusion
References
Chapter 8: Exploring the Ethics of Interaction with Care Robots
8.1 Introduction
8.2 State of Art
8.3 What Are Care Robots?
8.3.1 Definition
8.3.2 A Bit of History
8.3.3 Taxonomy
8.3.4 Some More Examples of Existing Robots
8.4 Design
8.5 An Ethical Framework for Care Technologies
8.6 Conclusion
References
Chapter 9: Ethics of Autonomous Weapon Systems
9.1 Introduction
9.2 Autonomous Weapon Systems
9.2.1 Definitions
9.2.2 Examples
9.2.2.1 Sentry Robots: SGR—A1
9.2.2.2 Loitering Munitions with Human in the Loop: Switchblade and Shahed-136
9.2.2.3 Autonomous Loitering Munitions: HARPY
9.2.2.4 Autonomous Cluster Bomb: Sensor Fuzed Weapon (SFW)
9.2.2.5 Hypothetical AWS: SFW + Quadcopter + Image Recognition Capabilities
9.3 Legal Basis
9.4 Main Issues Posed by AWS
9.4.1 Low Bar to Start a Conflict: Jus Ad Bellum
9.4.2 Availability of Enabling Technologies and the Dual Use Problem
9.4.3 Meaningful Human Control
9.4.4 Unpredictability of AWS
9.4.5 Accountability
9.4.6 Human Dignity—Dehumanization of Targets
9.5 Conclusions
References
Part III: The Need for AI Boundaries
Chapter 10: Ethical Principles and Governance for AI
10.1 Intro: Risk and Governance
10.2 AI Risks, Responsibility and Ethical Principles
10.3 Ethical Guidelines and the European Option for AI Governance
10.4 The Artificial Intelligence Regulation in Europe
10.5 AI Governance: Open Questions, Future Paths
References
EU Legislation and Official Documents Cited
Other Resources Mentioned
Chapter 11: AI, Sustainability, and Environmental Ethics
11.1 Introduction
11.2 Energy Demands and Environmental Impacts of AI Applications
11.3 What Is Sustainability?
11.4 A Path to Make AI More Sustainable from Environmental Ethics
11.4.1 The Anthropocentric Concern for the Environmental Costs of AI
11.4.2 The Biocentric Concern for the Environmental Costs of AI
11.4.3 The Ecocentric Concern for the Environmental Costs of AI
11.5 Ethical Values for a Sustainable AI
11.6 Conclusions
References
Chapter 12: The Singularity, Superintelligent Machines, and Mind Uploading: The Technological Future?
12.1 Introduction
12.2 The Advent of the Singularity: Raymond Kurzweil’s Predictions
12.2.1 Is the Singularity Near?
12.2.2 From Moore’s Law to Law of Accelerating Returns
12.3 The Roadmap to Superintelligent Machines
12.3.1 Concerns and Uncertainties
12.3.2 The Future of Superintelligence by Nick Bostrom
12.4 What if We Can Live Forever? Dreams of Digital Immortality
12.4.1 Types of MU: The Analysis of David Chalmers
12.4.2 Will I Still Be Myself in a Virtual World? Problems with Personal Identity
12.5 Conclusions
References

The International Library of Ethics, Law and Technology  41

Francisco Lara and Jan Deckers, Editors

Ethics of Artificial Intelligence

The International Library of Ethics, Law and Technology Volume 41

Series Editors
Bert Gordijn, Ethics Institute, Dublin City University, Dublin, Ireland
Sabine Roeser, Philosophy Department, Delft University of Technology, Delft, The Netherlands

Editorial Board Members
Dieter Birnbacher, Institute of Philosophy, Heinrich-Heine-Universität, Düsseldorf, Nordrhein-Westfalen, Germany
Roger Brownsword, Law, Kings College London, London, UK
Paul Stephen Dempsey, University of Montreal, Institute of Air & Space Law, Montreal, Canada
Michael Froomkin, Miami Law, University of Miami, Coral Gables, FL, USA
Serge Gutwirth, Campus Etterbeek, Vrije Universiteit Brussel, Elsene, Belgium
Bartha Knoppers, Université de Montréal, Montreal, QC, Canada
Graeme Laurie, AHRC Centre for Intellectual Property and Technology Law, Edinburgh, UK
John Weckert, Charles Sturt University, North Wagga Wagga, Australia
Bernice Bovenkerk, Wageningen University and Research, Wageningen, The Netherlands
Samantha Copeland, Technology, Policy and Management, Delft University of Technology, Delft, Zuid-Holland, The Netherlands
J. Adam Carter, Department of Philosophy, University of Glasgow, Glasgow, UK
Stephen M. Gardiner, Department of Philosophy, University of Washington, Seattle, WA, USA
Richard Heersmink, Philosophy, Macquarie University, Sydney, NSW, Australia
Rafaela Hillerbrand, Karlsruhe Institute of Technology, Karlsruhe, Baden-Württemberg, Germany
Niklas Möller, Stockholm University, Stockholm, Sweden
Jessica Nihlén Fahlquist, Centre for Research Ethics and Bioethics, Uppsala University, Uppsala, Sweden
Sven Nyholm, Philosophy and Ethics, Eindhoven University of Technology, Eindhoven, The Netherlands
Yashar Saghai, University of Twente, Enschede, The Netherlands
Shannon Vallor, Department of Philosophy, Santa Clara University, Santa Clara, CA, USA
Catriona McKinnon, Exeter, UK
Jathan Sadowski, Monash University, Caulfield South, VIC, Australia

Technologies are developing faster and their impact is bigger than ever before. Synergies emerge between formerly independent technologies that trigger accelerated and unpredicted effects. Alongside these technological advances new ethical ideas and powerful moral ideologies have appeared which force us to consider the application of these emerging technologies. In attempting to navigate utopian and dystopian visions of the future, it becomes clear that technological progress and its moral quandaries call for new policies and legislative responses. Against this backdrop, this book series from Springer provides a forum for interdisciplinary discussion and normative analysis of emerging technologies that are likely to have a significant impact on the environment, society and/or humanity. These include, but are by no means limited to, nanotechnology, neurotechnology, information technology, biotechnology, weapons and security technology, energy technology, and space-based technologies.

Francisco Lara  •  Jan Deckers Editors

Ethics of Artificial Intelligence

Editors Francisco Lara Department of Philosophy I University of Granada Granada, Spain

Jan Deckers School of Medicine Newcastle University Newcastle-upon-Tyne, UK

ISSN 1875-0044     ISSN 1875-0036 (electronic) The International Library of Ethics, Law and Technology ISBN 978-3-031-48134-5    ISBN 978-3-031-48135-2 (eBook) https://doi.org/10.1007/978-3-031-48135-2 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.

Acknowledgments

The editors are grateful for the support provided by the project ‘Digital Ethics. Moral Enhancement through an Interactive Use of Artificial Intelligence’ (PID2019-104943RB-I00), funded by the State Research Agency of the Spanish Government; the project ‘Moral Enhancement and Artificial Intelligence. Ethical Aspects of a Socratic Virtual Assistant’ (B-HUM-64-UGR20), funded by FEDER/ Junta de Andalucía  – Consejería de Transformación Económica, Industria, Conocimiento y Universidades; and the project ‘Ethics of Artificial Intelligence in Relation to Medicine and Health’ (B140), funded by the Great Britain Sasakawa Foundation. The editors are also very grateful to all the authors who have contributed to this book, to Springer Nature, and to those who have taken the time to comment and review the ideas presented in this book.


Contents

1 Introduction
Jan Deckers and Francisco Lara

Part I Can an AI System Be Ethical?

2 Bias and Discrimination in Machine Decision-Making Systems
Jorge Casillas

3 Opacity, Machine Learning and Explainable AI
Alberto Fernández

4 The Moral Status of AI Entities
Joan Llorca Albareda, Paloma García, and Francisco Lara

Part II Ethical Controversies About AI Applications

5 Ethics of Virtual Assistants
Juan Ignacio del Valle, Joan Llorca Albareda, and Jon Rueda

6 Ethics of Virtual Reality
Blanca Rodríguez López

7 Ethical Problems of the Use of Deepfakes in the Arts and Culture
Rafael Cejudo

8 Exploring the Ethics of Interaction with Care Robots
María Victoria Martínez-López, Gonzalo Díaz-Cobacho, Aníbal M. Astobiza, and Blanca Rodríguez López

9 Ethics of Autonomous Weapon Systems
Juan Ignacio del Valle and Miguel Moreno

Part III The Need for AI Boundaries

10 Ethical Principles and Governance for AI
Pedro Francés-Gómez

11 AI, Sustainability, and Environmental Ethics
Cristian Moyano-Fernández and Jon Rueda

12 The Singularity, Superintelligent Machines, and Mind Uploading: The Technological Future?
Antonio Diéguez and Pablo García-Barranquero

Chapter 1

Introduction

Jan Deckers and Francisco Lara

A significant stimulus for co-editing a book on AI ethics comes from a collaboration between Francisco Lara and Jan Deckers that started when Francisco spent some time, in 2017, as a visiting researcher in the School of Medicine at Newcastle University (United Kingdom). Both of us had been interested for quite some time in the ethics of human enhancement by biotechnological means. Both saw significant problems with these ambitions and associated technologies, recognising at the same time the value of, and the need for a particular type of human enhancement, the improvement of our moral skills. This collaboration resulted in the idea that artificial intelligence (AI) might potentially fulfil a useful role in this respect, and led to the publication of an article that makes a case for designing and using 'Artificial Intelligence as a Socratic Assistant for Moral Enhancement' (Lara and Deckers 2020). Francisco subsequently developed this idea, arguing that such a virtual assistant might be even better than a human being in morally enhancing AI users (Lara 2021), whilst Jan spent some time thinking through the nature of the unnatural or artificial (Deckers 2021). Meanwhile, both of us developed a greater awareness of the moral problems and opportunities of AI in general, and of the urgent need to explore, resolve, and govern the ethical issues raised by AI.

Artificial intelligence has become ubiquitous and is significantly changing human lives, in many cases, for the better. However, given the magnitude of the change, questions inevitably arise about its ethical challenges. These are beginning


to be of great concern to society and those who govern it, leading to many attempts to design ethical regulatory frameworks for the development and application of AI. In the academic sphere, this concern has translated into a growing number of publications on the subject of what is already a new discipline within applied ethics, the ethics of artificial intelligence, even if the limits of this discipline are not clearly defined. Work on the ethics of AI has exploded in recent years, but there are relatively few edited collections that aim to provide an overview of the key ethical issues in the debate around AI. This book aims to offer specialists and the general public a tool to develop a rigorous understanding of the key debates in the emerging ethics of AI.  These have been selected carefully and experts have been commissioned to work on detailed reviews of the state of the art of each debate. In order to present the reader with a comprehensive and structured understanding of AI ethics, the book is divided into three sections. The first part addresses how and to what extent AI systems could function ethically, and even be attributed moral status. The second part reviews and analyses the ethical issues raised by particular applications of AI, such as virtual assistants, care robots, technological warfare, social virtual reality and deepfakes. Finally, the third part considers possible contributions to the ethics of AI from other fields such as governance, experimental philosophy, environmental science and philosophy of science. So, the first question we have to ask ourselves is the very meaning of AI ethics. Are there moral problems in the very development of AI systems? There is a belief that when there is a machine involved in decisions, everything works better. Many people trust machines because they believe that they can be relied on not to make mistakes and that they may be better informed than we are to deduce what is best. However, machines are designed by humans, and machines are fed with data produced by humans. So, whether by design flaws (for example by unintentionally collecting data from biased human behaviour), or by intentionally biased design, in the end, the machine can be affected by the same problems as those we face without using machines. At this point, we might think that, in the end, although a machine may also have biased behaviour, it will be no worse than that of a human. However, here we run into another reality: any failure of a machine can be much more serious than that of a human. This is mainly due to the fact that machine automatisms can be massive, invisible, and sovereign. This, together with the belief that a machine is unbiased – masking possible discriminatory effects, and unconscious technological dependence, form a perfect storm. In addition, we must think of scenarios where the machine’s decisions are simply relied on, and on particularly sensitive matters such as granting a subsidy, a loan, passing a court sentence, or diagnosing a disease. In addition, there are other daily issues where machines condition our ways of thinking, such as what news should be interesting for us and what videos we should watch. Naturally, a well-designed automatic decision-making system would not cause these problems. In the second chapter (“Bias and Discrimination in Machine Decision-Making Systems”), Jorge Casillas focuses on situations where this


design is not correct, or is subject to an interpretation of what is correct that might be questionable or skewed. Under these premises, a machine can be biased and lead to discriminatory decisions against certain social groups. Once the reasons for these malfunctions have been identified, the chapter also looks at solutions for good design. However, for AI to be reliable, it is not enough for it to be aligned with values such as the avoidance of unfair discrimination of certain groups. Its decisions must also be understandable, something that does not seem to be within the reach of certain AI developments and techniques. Thus, machine learning includes training and testing stages where, in the former stage, algorithms construct models based on input data, which are then tested on new data to obtain new understanding and to make decisions. Machine learning models include deep neural networks, support vector machines, and ensembles of classifiers. Machine learning has become a powerful decision-making tool, but also raises significant ethical concerns, for example issues related to the transparency and explainability of the algorithms used (Burkart and Huber 2021). In order to deal with some of these challenges, Alberto Fernández identifies, in the third chapter, Explainable Artificial Intelligence (XAI) as an emerging field of research that focuses on developing machine learning models, and AI systems in general, that are transparent, interpretable, and explainable (Barredo Arrieta et al. 2020). In order to ‘open the black box’ or reduce the opacity of AI systems, Fernández focuses on a typology called ‘counterfactual examples’, which can audit AI systems and be usefully applied to various scenarios, including decision-­making related to bank loans, hiring policies, and health. So far, we have referred to the problems related to how AI systems could operate in accordance with our values, for example our interests in impartiality and understandability. However, AI systems are increasingly required to develop their own values, to work out their own ethics, allowing them to make decisions as ethical subjects (Wallach and Allen 2009; Anderson et al. 2004). A clear example of this is the controversy surrounding how to design driverless vehicles for possible scenarios in which they have to choose between lives in danger. These developments question standard conceptions of moral status. In Chap. 4 on “The Moral Status of AI Entities”, Joan Llorca, Paloma García and Francisco Lara explore the possibility of AI stretching the boundaries of moral status: whilst clarity is much needed in relation to the concept of intrinsic value (see e.g. Deckers 2022), it is worth asking whether AI systems can be capable of possessing the necessary properties to be morally considered in themselves. A more challenging issue is whether an AI system could be understood as a moral agent with responsibilities. Here, the authors separate the issue of moral agency from that of having responsibilities. The authors underscore that there is a broad consensus in the literature about the inability of AI systems to be held responsible. At the same time, they entertain the notion of artificial moral agents (AMAs) that can think and act morally, and that might become essential if they turn out to be better than humans in deliberating and acting morally (Chomanski 2020) and may be able to cope better with the complexity of contemporary societies (Wallach and Allen 2009). They perceive that this theory may be


odd, given that artifacts are not ordinarily understood as agents due to the view that they lack minds. However, they argue that one cannot deny a priori, from a narrow and anthropocentric conception of morality, that morally relevant properties might be possessed by artificial entities. They proceed by reviewing “the relational turn”, the thought that moral value should not be determined by an entity’s moral properties, but by meaningful relationships (see e.g. Coeckelbergh 2014; Gunkel 2018). Having reviewed these basic questions about the ethical limitations of AI itself, the second part of the book (‘Ethical Controversies about AI Applications’) focuses on controversies related to the design and use of AI systems in different domains, including the use of AI as virtual assistants, the use of care robots, and the use of deepfakes in the arts and culture. In Chap. 5 on “Ethics of Virtual Assistants”, Juan Ignacio del Valle, Joan Llorca, and Jon Rueda write about artificial virtual assistants, the use of which has increased significantly in recent decades. They define such systems as ‘not strictly linked to hardware’ because of their virtual nature and rich interfaces, which include but are not limited to text and voice. Their definition includes things that many people associate with virtual assistants, such as Siri and Alexa, but also other systems that are less frequently associated with them, such as recommender systems. The authors recognise that such assistants can fulfil useful functions, but warn that many users may not be (fully) aware of their (secret) presence. More generally, the authors identify three significant ethical concerns. Firstly, they can undermine human autonomy through direct surreptitious manipulation. In addition, they can lead to dependency and de-skilling, which – in the extreme – can lead to the second issue that they discuss: human obsolescence. Thirdly, the authors dive deeply into privacy issues, particularly when this value is jeopardised by such assistants recording intimate conversations or producing personal profiles that are used commercially, without consent. The authors also raise the question of whether artificial virtual assistants could be moral assistants, where they engage with our proposal for a ‘Socratic assistant’ (Lara and Deckers 2020), also called ‘SocrAI’ (Lara 2021). Whereas SocrAI’s suggestions to resolve moral dilemmas should only be trusted where they are understood by the moral agent, the authors warn that fairness might be compromised by such an approach (due to the necessity to trade off explainability with accuracy in some situations), that the potential moral de-skilling might be serious, and that systems that are presented as virtual moral assistants may be designed badly, either intentionally or unintentionally (Rueda et al. 2022; Vallor 2015). If, as we have seen, the possibility of having a virtual assistant that performs functions similar to its human equivalent raises important new ethical questions, think of those that lie ahead with AI implementations that allow us to experience everything around us, and even ourselves, as virtual. Some of these new questions are set out in Chap. 6 “Ethics of Virtual Reality”, by Blanca Rodríguez López. She dedicates part of her chapter to the philosophical debate on whether the virtual is real, emphasising the shocking but convincing arguments of “virtual realism” (versus “virtual fictionalism”) (Chalmers 2017). 
Virtual realism is reflected, for example, in the relevance of avatars, which reveal important aspects of the user’s


personality. Avatars have significant behavioural implications for the user himself/herself and for many others. The chapter reviews studies that reveal the psychological relevance of avatars. The author also considers the contributions of the specialised literature regarding the benefits, risks and “rarities” of virtual reality. Personal risks include the physical damage associated with the prolonged use of technological devices, for example “virtual reality sickness”, as well as psychological risks, such as addiction or depersonalisation disorder. She also discusses social damage, for example the loss of a common environment, or public sphere, and the loss of a common perceptual basis that undergirds statements that claim to be true (Slater et al. 2020). Finally, the chapter considers two important ethical issues related to virtual reality: the threat to privacy and the psychological or emotional damage that a user of interactive virtual platforms can inflict on others. On a different level and in a different domain, the distinction between the virtual and the real is also blurred when certain AI technologies are used to fake audiovisual material very accurately. We are referring here to deepfakes and this is the subject of Rafael Cejudo’s Chap. 7, entitled “Ethical Problems of the Use of Deepfakes in the Arts and Culture”. These emerged initially when AI was used to create pornographic material by impersonating the features of celebrities in videos. They have now become more wide-ranging in the creation of films and fine art. Whilst deepfakes can be sophisticated forms of art where they use deep intelligence to create a fake that is still perceived for what it is – a fake, Cejudo also warns that they can undermine people’s trust in audiovisual materials. When a person’s face is changed for that of another (‘a deepface’) and when a person is simulated to say something that they did not say (‘a deepsound’), they can be used for malicious purposes, for example to spread slander and to demand ransom after a faked kidnap. Additionally, the use of deepfakes in the arts and culture raises specific ethical problems, such as including the performance of deceased persons in audiovisual materials and simulating real authors’ artistic styles. The chapter also engages with the question whether some AI systems should be granted rights that come with authorship, particularly copyright. Another domain with a great future, and at the same time with very dangerous and unfair effects, is the usage of care robots. In Chap. 8 on “Exploring the Ethics of Interaction with Care Robots”, María Victoria Martínez-López, Gonzalo Díaz-­ Corbacho, Aníbal M.  Astobiza, and Blanca Rodríguez López focus on how care robots have altered the scene of human care significantly. They explore how this change compares with care without the assistance of these sophisticated machines. In order to shed light on what these are, the authors establish a taxonomy of care robots. As new care robots are being launched on a daily basis, the authors rightly focus on the need for these robots to be designed in order to promote the autonomy and well-being of their users. 
The authors question Sparrow’s (2016) dystopian prediction as to what these robots might look like in the future, but argue that some care robots may not be welcome where the human agents who support their use increase inequalities in health care, make such robots difficult to handle or understand, impair accountability because of the ‘many-hands problem’ (Van De Poel and Zwart


2015), gather data from vulnerable users who may not consent, reduce jobs for human carers, ignore human oversight, or replace real care with simulated care. At the same time that AI systems and robots are being designed to save and improve the lives of many sick or disabled people, research is also being done, paradoxically, on how to increase their destructive and murderous potential in war or terrorism scenarios. In Chap. 9 “Ethics of Autonomous Weapons Systems”, Juan Ignacio del Valle and Miguel Moreno focus on the legal and ethical issues raised in particular by autonomous weapons systems (AWS). After explaining what these are and showing some examples, the authors highlight the limitations of current legislation to adequately regulate these technological developments. In particular, they show how International Humanitarian Law (IHL), enacted in a context where warfare technology was under direct human control, requires a thorough revision if it is to be useful in the present. They introduce some of the main problems we encounter in current debate about AWS and argue that this is jeopardized not only by political reasons and different ethical views (e.g., regarding the value of human dignity), but also by other elements potentially easier to set down, like a definition of autonomy, understanding some characteristics, such as the operational context, the concept of unpredictability, and other high-level technical concepts. The main objective of the chapter is to show how an enhanced understanding of these latter aspects could greatly facilitate the discussion and proper regulation of such a relevant area of AI implementation as the use of AWS. Having considered the ethical issues underlying the main applications of AI, in the final part of the book (“The Need for AI Boundaries”) we turn to the question how AI should be regulated to address the current relative lack of governance, to deal appropriately with ecological challenges raised by AI and other technologies, and to ensure that some AI developers’ quest for Artificial General Superintelligence (AGSI) neither diminishes nor elevates human control inappropriately through projects related to the ‘Singularity’ or ‘mind uploading’ (MU). In Chap. 10 on “Ethical Principles and Governance for AI”, Pedro Francés-­ Gómez points out that most AI is self-regulated by the companies that develop it and by various professional organisations that have produced a number of statements and declarations. Proper legal frameworks have not been established as yet. Francés-­ Gómez explores some principles that have been proposed in the voluntary codes that have been developed in relation to AI research and development before analysing the ‘Ethics Guidelines for Trustworthy AI’ published by an expert group appointed by the EU Commission (High-Level Expert Group on Artificial Intelligence 2019). Whilst the European Union is developing an AI Act, progress has been slow. Even so, the European Union may be ahead of the game compared to many other (inter)national jurisdictions, and the impact of its regulations, once developed, may have global reach. One of the issues addressed in the European proposal for an AI Act (European Commission 2021), albeit in a cursory manner, is the potential ecological impacts of this new technology. Such effects are significant, as Cristian Moyano-Fernández and Jon Rueda state in their Chap. 11 “AI, Sustainability, and Environmental Ethics”. However, since AI could also help us to address our ecological crisis, they


refer here to the use of ‘AI for sustainability and the sustainability of AI’ (van Wynsberghe 2021; Coeckelbergh 2021). The authors warn that the widespread use of the label ‘sustainable’ may lead to ‘ethics washing’ and ‘greenwashing’ (Heilinger et  al. 2023), summarise dominant approaches to understanding this concept, and address the sustainability of AI from different approaches in ecological ethics. To develop human health holistically, it is indeed impossible to ignore either the significant social costs associated with AI, for example those associated with the hazards of mining for precious metals, or the ecological impacts of AI. These ecological costs are wide-ranging: AI is resource-intensive, the development and use of AI systems require considerable energy, and the negative health impacts of AI on nonhuman organisms, for example through habitat loss and climate change, are considerable and widely neglected. Finally, in Chap. 12 “The Singularity, Superintelligent Machines, and Mind Uploading: The Technological Future?”, Antonio Diéguez and Pablo García-­ Barranquero discuss AGSI, which may lead to the advent of the Singularity, but also to the possibility of MU. Rather than focus on dystopian scenarios associated with the Singularity, the authors argue that we should be more fearful of companies who run AI systems dictating how such systems should operate, given that this age has arrived already, as well as with bad uses of AI systems in the realistically foreseeable future (Véliz 2020). The authors warn that we should get serious about governing AI as the development of AGSI would herald an era where AI systems might snatch even more control from us and could potentially even eliminate the human species. Whilst AI systems might encroach on our space by becoming more like us, the authors also consider MU, which might be perceived as a move in the opposite direction whereby concrete, organic entities become virtual, inorganic entities, a project that some embrace to achieve immortality. The authors analyse this issue from various theoretical perspectives about personal identity, dismissing the thought that one could achieve a continuation of personal existence and immortality through MU. All the debates in this book are only a part, albeit a very representative one, of the many novel discussions on ethical issues that the rapid advance of AI has generated. Their recent emergence in social and academic spheres explains the absence of a consolidated discipline that studies these issues and the scarcity of specialised literature on some topics. However, society urgently needs answers to many of the moral controversies that AI entails. This task is urgent as the developments of this new technology are rushing ahead at great speed; as it is difficult to coordinate its regulation at a global level, especially if there is no agreed ethical basis; and above all as, despite the great benefits of AI, its effects can also be, as we have seen, very negative. This book, in its attempt to bring together the most important debates and their key issues, is presented as a contribution to this urgent need to think about what we want to do with AI and, correspondingly, with our ways of life. 
We would not like to end this introduction without thanking all the authors for their participation in this book and for their willingness to collaborate diligently with the deadlines and requests of the editors; the Springer publishing house for their interest in making this editorial project a reality; and finally, the Great Britain


Sasakawa Foundation and the State Research Agency of the Government of Spain for having funded, respectively, the research projects “Ethics of Artificial Intelligence in Relation to Medicine and Health” and “Digital Ethics: Moral Enhancement through an Interactive Use of AI”, within the frameworks of which most of the works presented here have been carried out.

References

Anderson, M., S.L. Anderson, and C. Armen. 2004. Towards machine ethics. Proceedings of the AAAI-04.
Barredo Arrieta, A., N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia, S. Gil-Lopez, D. Molina, R. Benjamins, R. Chatila, and F. Herrera. 2020. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58: 82–115.
Burkart, N., and M.F. Huber. 2021. A survey on the explainability of supervised machine learning. Journal of Artificial Intelligence Research 70: 245–317.
Chalmers, D.J. 2017. The virtual and the real. Disputatio 9 (46): 309–352.
Chomanski, B. 2020. Should moral machines be banned? A commentary on van Wynsberghe and Robbins "Critiquing the reasons for making artificial moral agents". Science and Engineering Ethics 26 (6): 3469–3481. https://doi.org/10.1007/s11948-020-00255-9.
Coeckelbergh, M. 2014. The moral standing of machines: Towards a relational and non-Cartesian moral hermeneutics. Philosophy & Technology 27 (1): 61–77. https://doi.org/10.1007/s13347-013-0133-8.
———. 2021. AI for climate: Freedom, justice, and other ethical and political challenges. AI and Ethics 1 (1): 67–72.
Deckers, J. 2021. On (Un)naturalness. Environmental Values 30 (3): 297–318.
———. 2022. A critique on recent Catholic Magisterium's thinking on animal ethics. Dilemata 39: 33–49.
European Commission. 2021. Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts. COM(2021) 206 final, 2021/0106(COD), 21.4.2021.
Gunkel, D. 2018. Robot rights. Cambridge, MA: MIT Press.
Heilinger, J.C., H. Kempt, and S. Nagel. 2023. Beware of sustainable AI! Uses and abuses of a worthy goal. AI and Ethics: 1–12.
High-Level Expert Group on Artificial Intelligence. 2019. Ethics guidelines for trustworthy AI. Brussels: European Commission. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai.
Lara, F. 2021. Why a virtual assistant for moral enhancement when we could have a Socrates? Science and Engineering Ethics 27 (4): 1–27.
Lara, F., and J. Deckers. 2020. Artificial intelligence as a Socratic assistant for moral enhancement. Neuroethics 13 (3): 275–287.
Rueda, J., J. Delgado Rodríguez, I. Parra Jounou, J. Hortal-Carmona, T. Ausín, and D. Rodríguez-Arias. 2022. "Just" accuracy? Procedural fairness demands explainability in AI-based medical resource allocations. AI & Society.
Slater, M., C. Gonzalez-Liencres, P. Haggard, C. Vinkers, R. Gregory-Clarke, S. Jelley, Z. Watson, G. Breen, R. Schwarz, W. Steptoe, D. Szostak, S. Halan, D. Fox, and J. Silver. 2020. The ethics of realism in virtual and augmented reality. Frontiers in Virtual Reality 1: 1. https://doi.org/10.3389/frvir.2020.00001.


Sparrow, R. 2016. Robots in aged care: A dystopian future? AI & Society 31: 445–454.
Vallor, S. 2015. Moral deskilling and upskilling in a new machine age: Reflections on the ambiguous future of character. Philosophy & Technology 28 (1): 107–124.
Van De Poel, I., and S.D. Zwart. 2015. From understanding to avoiding the problem of many hands. In Moral responsibility and the problem of many hands, ed. I. Van De Poel, L. Royakkers, and S.D. Zwart, 209–218. New York: Routledge.
Van Wynsberghe, A. 2021. Sustainable AI: AI for sustainability and the sustainability of AI. AI and Ethics 1 (3): 213–218.
Véliz, C. 2020. Privacy is power: Why and how you should take back control of your data. London: Penguin Books.
Wallach, W., and C. Allen. 2009. Moral machines. Oxford: Oxford University Press.

Part I

Can an AI System Be Ethical?

Chapter 2

Bias and Discrimination in Machine Decision-Making Systems

Jorge Casillas

Abstract  There exists a perception, which is occasionally incorrect, that the presence of machines in decision-making processes leads to improved outcomes. The rationale for this belief is that machines are more trustworthy since they are not prone to errors and possess superior knowledge to deduce what is optimal. Nonetheless, machines are crafted by humans and their data is sourced from human-generated information. Consequently, the machine can be influenced by the same issues that afflict humans, whether these are caused by design inadequacies, by deliberately skewed design, or by biased data resulting from human actions. There is, however, an added problem: any failure of a machine is much more serious than that of a human, mainly due to three factors: machine decisions are massive, invisible, and sovereign. When machine decision-making systems are applied to very sensitive problems such as employee hiring, credit risk assessment, granting of subsidies, or medical diagnosis, a failure means thousands of people are disadvantaged. Many of these errors result in unfair treatment of minority groups (such as those defined in terms of ethnicity or gender), thus incurring discrimination. This chapter reviews different forms and definitions of machine discrimination, identifies the causes that lead to it, and discusses different solutions to avoid or, at least, mitigate its harmful effect.

2.1 Introduction

Human decision-making is a complex and nuanced process that involves a multitude of factors and variables. From the most mundane decisions, like what to wear in the morning, to the most critical ones, like choosing a career or partner, our decision-making processes shape our lives and determine our future.


Today, many of our decisions are conditioned by the assistance of automatic systems that help digest information to suggest the best decision or, in more and more situations, we delegate directly to these machines to decide for us. This delegation is sometimes conscious but, in many other cases, happens without our even being aware of it. Automation helps humans by improving efficiency, reducing the time and effort required to complete tasks, and reducing errors caused by fatigue, distraction, or oversight. Its most interesting aspect is its capability to provide innovation and launch us into doing things that we would never have thought we would be capable of. Perhaps because of the attractive opportunities it offers, we are falling into a new alienation, trusting more and more in technology with unlimited power, forgetting along the way that machines also fail, and that the consequences are a thousand times more serious.

Indeed, machine learning algorithms are increasingly being used to make decisions—such as in hiring, lending, and criminal justice—that have significant impacts on people's lives. When these algorithms are biased, they can perpetuate and even exacerbate existing discrimination and inequality. For example, if a hiring algorithm is biased against a certain demographic group, it can result in fewer members of that group being hired, thus depriving them of work experience and perpetuating the existing discrimination. Similarly, if a lending algorithm is biased against certain groups, it can result in those groups having less access to credit, perpetuating the existing economic inequality. In addition to perpetuating existing discrimination, machine discrimination can also lead to new forms of discrimination. Indeed, today's algorithms are so powerful at discovering the unthinkable that they can learn to make predictions based on proxies for protected characteristics, such as ZIP codes or educational attainment, resulting in discriminatory outcomes for certain groups. It has already been observed how difficult it is to tame them (Dastin 2018).

Therefore, it is crucial to ensure that machine learning algorithms are designed and evaluated with fairness in mind, and that they do not perpetuate or create new forms of discrimination. By addressing machine discrimination, we can work towards a more equitable and just society. If we do not soon become aware of how serious the problem is, it will be very difficult to redirect the orientation of these automatisms, to the point of their becoming irreversible in certain scenarios. Some may accept risk for reward, but others may think that we are going too far and need to slow down and give ourselves time to reflect on this acceleration of artificial intelligence (AI) (Bengio et al. 2023).

Only a society that is aware and knowledgeable about these issues can build the foundations for a reliable development of AI. To this end, this chapter aims to open a space for reflection on the potential discriminatory danger of machine decision-making systems. I will begin by reflecting on why the failure of a machine is more serious than that of a human; then I will give an informative introduction to the basics of machine learning (without knowing how it works, it is not possible to become aware of the problem); I will continue by analyzing what is meant by machine discrimination; then I will identify the main reasons that cause such discrimination; and I will finish with solutions to avoid or alleviate this discrimination or, in other words, how to ensure fairness.

2.2 Why Machine Failure Is More Serious

In later sections we will see what is meant by discrimination but, first, I want to start by deliberating on why it matters that a machine discriminates. Only when we are able to visualize the magnitude and scope of the problem will we be able to assess the seriousness of a machine discriminating and look for solutions to alleviate it.

Automation (control, big data, AI...) is used to gain efficiency (resources and time) and effectiveness (better performance). In the end, it all comes down to that: making something faster, cheaper, or better. Who is going to give up those advantages! Indeed, using automation to make decisions is a very tempting and sometimes unavoidable solution in today's society. Companies need it to remain competitive; people need it to do their jobs better or to spend less time on them in an increasingly demanding world, or they simply use it because the app on their mobile, or the website they visit, offers no alternative.

There is a (sometimes false) belief that when there is a machine involved in decisions, everything works better. We trust machines because we assume they do not make mistakes and are better informed than we are to deduce what is best. However, machines are designed by humans, and machines are fed with data produced by humans. So, whether by design flaws, by intentionally biased design, or by data collected from biased human behavior, in the end, the machine can be affected by the same problems as humans. At this point, we might think that, even if a machine may also behave in a biased way, it will be no worse than a human. But here we run into another reality: any failure of a machine is much more serious than that of a human, mainly due to three factors. The automatisms executed by machines are, or can be, massive, invisible, and sovereign.

They are massive because they are highly scalable, making millions of decisions in a second, so any failure in their decisions is magnified to a scale unthinkable for a human. Suppose a postal officer decides to make black customers wait in line twice as long as white ones. First, such behavior would be inconceivable and illegal in our times. But, at the end of the day, the impact of this discriminatory act would reach no more than 100 people. Now, when Amazon decided in 2016 to offer same-day service in select ZIP codes in major U.S. cities based on an analysis of data about its customers, it marginalized neighborhoods where primarily black residents live. The impact of that measure affected millions of people every day and caused such a scandal that Amazon had to rectify the decision (Ingold and Soper 2016).

They are invisible because the automatisms are often not perceived; there is no awareness that a machine is behind the decision-making process. The case I have just cited of same-day service was very visible but, on other occasions, the victim of certain decisions does not know that it was all or partly due to a machine. For instance, Deloitte, one of the Big Four accounting firms, explains that it uses powerful machine learning for Credit Risk Management (Phaure and Robin 2020); they all do it, and so do the banks. When someone is turned down for a credit application, they are simply told that the operation is not viable; they are not told that there is an algorithm behind it, let alone how the algorithm arrived at that conclusion (usually because no one really knows how the machine made the decision). Advertising is also a good example of this invisibility: we already know that digital advertising is automated, but we get ads on our screens from Facebook without knowing that they have been targeted at us because of our race, religion, or national origin (Benner et al. 2019).

They are sovereign because the machine does not usually "assume" responsibility; there is a lack of accountability. They simply decide, and their decisions are considered final. So it is when you search Google Images for 'unprofessional hairstyles': the photos that come up are mostly of black women. This is not the case if you search for 'professional hairstyles'. Since this became known in 2016 (Alexander 2016), the results have been nuanced, but differences are still observed nowadays (include '-google' in the query to avoid images related to the scandal). Google quickly disassociated itself from this by arguing that it did nothing: it is simply what you see on the internet. The question is whether the many people who give these systems more credence than a god really believe that, or whether they actually expect to find a mere reflection of society when they search.

Add to this the fact that the belief that a machine is unbiased masks possible discriminatory effects, and that technology generates a dependence from which there is no turning back, and we have a perfect storm. Finally, let us think about scenarios in which the machine already decides by itself, on particularly sensitive matters such as granting a subsidy or a loan, passing a court sentence, or diagnosing a disease, in addition, of course, to other daily issues where machines condition your way of thinking, such as which news should interest you and which videos you should watch.

Naturally, a well-designed automatic decision-making system would not cause these problems. In this chapter, I will focus on situations where this design is not correct or is subject to an interpretation of what is correct that might be questionable or skewed. Under these premises, a machine can be biased and lead to discriminatory decisions against certain social groups. Therefore, once the reasons for this malfunction have been identified, the chapter will also look at solutions for good design.

2.3 How Machine Learning Works

Before continuing, it is useful to briefly review how to build a model that, based on data, ends up supporting decision making in a given problem, or even directly makes the decision itself. The decision-making system (generally called model) most commonly used in AI is based on machine learning from data.
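
To make the pipeline described in the rest of this section more concrete, the short sketch below trains a toy classifier for the social-subsidy scenario used later in the section. It is only an illustrative sketch: the feature names, the values, the labels and the choice of scikit-learn's DecisionTreeClassifier are assumptions made for the example, not material taken from the chapter.

```python
# A toy, hand-labelled dataset. Each row describes one family unit with
# hypothetical features: [monthly_income, fixed_expenses, num_dependents,
# stable_job (1/0)]. The label is the "correct" human decision.
from sklearn.tree import DecisionTreeClassifier

X_train = [
    [900, 700, 3, 0],
    [3200, 1100, 0, 1],
    [1500, 1200, 2, 1],
    [4000, 900, 1, 1],
    [800, 650, 4, 0],
    [2800, 1000, 0, 1],
]
y_train = [1, 0, 1, 0, 1, 0]  # 1 = grant the subsidy, 0 = deny it

# Training phase: the algorithm builds a model from the labelled examples
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X_train, y_train)

# From now on the model itself "decides" for new, unseen family units
new_family = [[1100, 900, 2, 0]]
print(model.predict(new_family))  # e.g. [1] -> grant
```

The rest of the section walks through the same ingredients one by one: the features, the human-provided labels, the training phase, and the performance measures used to judge the resulting model.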


The data contain values of input variables (attributes or features) that determine specific cases, and, in the case of supervised learning (the most common type of machine learning), each data is accompanied by the value of the dependent variable (the one that constitutes the case study or target) that defines what should be the correct response of the system for that combination of input variables. This is why the data is called example (or instance), because it serves to teach the algorithm what is the optimal response it should give for a particular case. When the system is built to predict a certain nominal variable, which takes different categories as possible values, it is called classification. For example, a model, based on different attributes/features such as the income of a family unit, fixed monthly expenses, work stability, outstanding debts, minors and other dependents under their care, can decide whether that family is worthy or not (two classes or categories, yes or no) of receiving a social subsidy. Another example could be using an automated system to recruit applicants. Based on several characteristics such as level of education, affinity of their education to the job position, previous work experience in the sector, outstanding work achievements, etc., the machine can decide to hire or not to hire. I will return to this last example later. For now, let us move on to a less controversial one. Let us imagine that we want to build a model (decision-making system) that is able to classify traffic related images, something very useful in the driverless era. For instance, it could detect if the image corresponds to a car, a speed limit sign, or a traffic light. The process is depicted in Fig. 2.1. To do this, we extract different attributes (features) of the images that describe what is there (things such as colors and shapes). Actually, in the era of deep learning (the most successful machine learning tool today), the image is sent raw to the


Fig. 2.1  Illustrative example of the machine learning pipeline, from data and labeling to model generation and subsequent prediction


algorithm, but for our purpose of illustrating how machine learning works in general, we will use features. To perform machine learning to build a model capable of automatic classification, we will need to label (or tag) each image. This labeling must be done by humans; this is how we transfer our knowledge to machines. It is still necessary today, despite the overwhelming power of the algorithms currently in use. So, either they pay ridiculous salaries as micro jobs (Hara et al. 2018) for a person to spend hours in front of a screen labelling, or they take advantage of the free labor of millions of people who every day fill out a reCAPTCHA to claim that ‘we are not a robot’ (Von Ahn et al. 2008). At the end, we will build a data table where each row represents an example or instance, that is, a particular image. Each column will contain the value of each attribute/feature, plus a special last column that tells if that image is of a car, traffic light, etc. This data set constitutes all the knowledge we have about traffic images, and will serve to illustrate the algorithm, to teach it how to respond in each situation, i.e., it will serve to train it. Thus, in the training phase, the algorithm will build a model whose responses are as close as possible to the real ones in the hope that it will work as well (or even better) as the humans so that we can dispense with their services and use, from now on, the model built by the algorithm to classify traffic images. Since errors will inevitably be made, we will have different ways of measuring where that error occurs, so there will be multiple possible measures of performance (cost functions), and part of the design of the algorithm will be to decide which measure is best for our interests. For instance, when diagnosing a disease, it is preferable to reduce false negatives (avoid missing someone who does have the disease), while a system that issues traffic tickets is preferable to minimize false positives (avoid issuing tickets to innocent drivers). However, even if the ticketing machine is conservative, it cannot be so conservative that it does not serve its purpose, and it is not helpful to diagnose everyone for a disease either. Ultimately, the performance measure (cost function) should be a trade-off between hit and miss. In a binary classification problem where a decision is made between two possible alternatives (usually called the positive and the negative class, the positive being the target of the problem), the outcomes of a classifier can be summarized in a contingency table that collects true positives (cases that are positive and are indeed predicted to be so), true negatives (cases that are negative and are so predicted), false negatives (cases that are positive but are erroneously predicted to be negative), and false positives (cases that are negative but are erroneously predicted to be positive). From these values, a series of measures are derived that assess the performance of the classifier from different points of view. Figure 2.2 shows those of interest to us throughout the chapter to define different fairness criteria. In summary, we can see that in these machine learning tasks there are some key ingredients that determine the whole process. On the one hand, we have the data, which condense the human knowledge that we want to imitate. It is clear that biased data will lead to a biased algorithm. We also have the features, the variables with which we define each possible case, the lenses we use to see the real world. Poorly


Fig. 2.2  Contingency table and some performance measures derived therefrom

Poorly chosen or faulty features can also lead to undesired algorithm behavior. I have also mentioned the performance measure or cost function, which is the way we decide to quantify what is right or wrong; consequently, the algorithm will generate a model that satisfies that criterion in the best way it is able to find. Other important design decisions concern the structure and size of the model we want to generate. If it is too simple (for example, a small decision tree), it will have very low efficacy and will not be useful. If it is excessively intricate (for example, a huge artificial neural network), it becomes challenging or even unfeasible to interpret. Consequently, we might not be able to explain the rationale behind its decisions, a necessity for addressing especially sensitive issues like the social subsidy or hiring practices introduced earlier in this section. Understanding the role played by each of these ingredients (data, variables, cost function, model type and size...) is key to identifying the risks where a bad design can lead to a discriminating machine.
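To make the measures derived from the contingency table concrete, the following minimal sketch (in Python; the label vectors are invented for illustration and are not part of the chapter's examples) computes the quantities used throughout the chapter.

```python
# Minimal sketch: performance measures derived from a binary contingency table.
# The two label vectors below are illustrative, not real data.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]  # 1 = positive class, 0 = negative class
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # decisions made by the classifier

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

tpr = tp / (tp + fn)            # true positive rate (sensitivity, recall)
fpr = fp / (fp + tn)            # false positive rate
ppv = tp / (tp + fp)            # positive predictive value (precision)
npv = tn / (tn + fn)            # negative predictive value
acc = (tp + tn) / len(y_true)   # accuracy

print(f"TPR={tpr:.2f} FPR={fpr:.2f} PPV={ppv:.2f} NPV={npv:.2f} ACC={acc:.2f}")
```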

2.4 What Is Meant by Machine Discrimination

Machine discrimination refers to the unfair treatment of individuals or groups based on the results of automated decision-making algorithms. A recurring question here is whether the machine discriminates, whether something inert can indeed discriminate, or whether, ultimately, it is just a tool at the service of the human, who is the one who really discriminates. I believe this is a superfluous question: a smokescreen to draw attention away from algorithms, to offload responsibility onto AI, and to frame the debate exclusively around humans. Although, for the moment, the truth is that there are no laws for algorithms; there are only laws for the humans who design and use those algorithms. Perhaps these words are still premature in 2024: to talk about an algorithm discriminating may seem futuristic, unrealistic, or simply sensationalist. I take that risk in this chapter. The reader will end up drawing his or her own conclusions after following this book and completing the puzzle with other sources.


In my view, the algorithm does discriminate. Something that makes an autonomous decision is responsible for its actions; so is the adult who decides based on the education he or she received from his or her parents. Perhaps in an early version of algorithms, say a decade ago, they were still in their teens and were somewhat irresponsible for their actions, as they were dedicated to supporting human decision making rather than deciding for themselves. But that phase is over, that level has been cleared; what we have now, well into the twenty-first century, is a scenario in which there are algorithms designing other algorithms, machines trained with trillions of data points representing trillions of real-world cases that decide, and their word is law: unquestionable, irrefutable, irreversible.

In these preceding paragraphs I have deviated from the interest of this chapter, but they have been necessary to justify why throughout the text I will speak of machine discrimination. Those who do not identify with this position can continue to speak of human discrimination through the machine...; the result, after all, is the same.

Let us put this notion of discrimination on a somewhat more formal footing. Since it has already been dealt with in depth and at length elsewhere, I refer the reader to sources such as Ntoutsi et al. (2020) and Hardt et al. (2023) for a deeper understanding of the issue. Here I will limit myself to summarizing some keys that may help to explain the milestones of this chapter. I will try to do it in an informative way that brings this field closer to the general reader, so I will skip some excessively formal and rigorous descriptions.

We have seen how we built a machine to recognize traffic-related images. Suppose we reduce it to just recognizing whether there is a car in the image. This is the typical case of binary classification: classification, because the decision consists of choosing (predicting) a category within a possible set of alternatives (with no order among them); binary, because there are only two possible categories: there is a car or there is not. To now bring this problem of binary classification to the field where discrimination is relevant, let us replace images with persons: 'there is a car' would stand for 'being hired'.

Before continuing, we will call a decision-making system that chooses a response for a given situation a decider. Each situation is measured through different attributes/features/variables, and among them there will be at least one that we will call the protected attribute, that is, an attribute that determines a group against which discrimination could be exercised. Examples of protected attributes are ethnicity, gender, or socioeconomic status. A decider discriminates with respect to a protected attribute if, for cases that differ only in their protected attribute, it makes different decisions (chooses different classes). For example, if the machine systematically decides to hire a male person and not a female one, even though both are equally qualified, the system is discriminating against women and the protected attribute is 'gender'.

To discriminate is to make an unfair decision, so discrimination can be measured in terms of fairness: greater fairness means less discrimination. There is extensive literature around the definition of fairness (Mitchell et al. 2018; Barocas et al. 2017;


Gajane and Pechenizkiy 2017); here I just introduce those best known and most widely used in machine learning.

2.4.1 Fairness Through Unawareness

A decider is said to achieve fairness through unawareness if protected attributes are not explicitly used in the decision process (Chen et al. 2019). In our example, the decider ignores gender when hiring people, which can easily be done by simply hiding this attribute during the training stage. This approach may be naïve because the interdependence with other factors may mean that, even without knowing the protected attribute, one is discriminating. For example, it may reward a type of work experience that, due to past discrimination, has been more accessible to one gender than to the other. Besides, while there may be situations where concealing the protected attribute is sufficient—think of blind auditions for musicians (Goldin and Rouse 2000)—at other times it proves to be insufficient—e.g., in race-blind approaches (Fryer Jr et al. 2008). In general, rather than relying on the system to decide fairly because it is not told the protected attribute, it is better to have control over the process to measure and regulate the degree of unfairness, for which it is necessary to know the value of the protected attribute.
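In practice, 'unawareness' amounts to nothing more than dropping the protected column before training. The sketch below illustrates this; the toy data, column names, and the use of scikit-learn are my own assumptions for illustration, not part of the original example.

```python
# Sketch of fairness through unawareness: hide the protected attribute at training time.
# Column names ('gender', 'hired') and the toy data are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = pd.DataFrame({
    "experience": [2, 5, 3, 7, 1, 6],
    "education":  [1, 2, 2, 3, 1, 3],
    "gender":     [0, 1, 0, 1, 0, 1],   # protected attribute
    "hired":      [0, 1, 0, 1, 0, 1],
})

X = data.drop(columns=["gender", "hired"])   # the decider never sees 'gender'...
y = data["hired"]
model = LogisticRegression().fit(X, y)

# ...but proxies may remain: a feature strongly correlated with the protected
# attribute can still encode it indirectly.
print(data.drop(columns=["hired"]).corr()["gender"])
```

The last line hints at why the approach can fail: even when the protected column is hidden, another feature may carry almost the same information.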

2.4.2 Individual Fairness

A different approach is proposed in the seminal work of Dwork et al. (2012), who, aware that failure to control fairness leads to discriminatory systems, propose a mechanism to guarantee fairness based on an irrefutable premise: two equal cases (except for the difference that they belong to different groups) should be treated equally. In the hiring example, if a man and a woman are equally qualified (equal education, experience, professional achievements, etc.) when applying for a job, both should be treated equally in terms of being hired or not. Since this definition considers fairness on an individual basis, the authors call it individual fairness: similar individuals are treated similarly. The question here is how to define what is similar, how we assess that two individuals are alike except for the group to which they belong. To address this in machine learning, we must define a metric, i.e., a quantitative measure that assesses the degree to which two individuals are equal or not. On paper, this seems an ideal fairness criterion, provided that the metric is well chosen and made public. However, it is sometimes not an easy task to quantify social similarities, and only to the extent that this metric is fair will individual fairness be achieved. In addition, on many occasions it will not be easy to find similar individuals in the data set; given an individual, we may have


difficulty finding his or her peer in the other group, so individual fairness can be difficult to verify.
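As a rough illustration of the definition, the sketch below checks a Lipschitz-style condition on a pair of candidates: the gap between predictions should not exceed the distance between the individuals. Both the similarity metric and the stand-in decider are invented here, which is precisely where the practical difficulty lies.

```python
# Sketch of an individual-fairness check: similar individuals should receive
# similar predictions. The distance metric and the decider are assumptions.
import numpy as np

def distance(a, b):
    # Hypothetical task-specific metric over (education, experience, achievements).
    weights = np.array([0.5, 0.3, 0.2])
    return float(np.sum(weights * np.abs(np.array(a) - np.array(b))))

def decider(profile):
    # Stand-in for a trained model returning a hiring score in [0, 1].
    education, experience, achievements = profile
    return min(1.0, 0.1 * education + 0.08 * experience + 0.05 * achievements)

candidate_m = (5, 4, 3)   # identical qualifications...
candidate_f = (5, 4, 3)   # ...only the (hidden) group membership differs

d = distance(candidate_m, candidate_f)
gap = abs(decider(candidate_m) - decider(candidate_f))
# Individual fairness requires the output gap to be bounded by the input distance.
print(f"distance={d:.2f}, prediction gap={gap:.2f}, fair={gap <= d + 1e-9}")
```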

2.4.3 Counterfactual Fairness

Another way of looking at fairness is to analyze what would happen if an individual were to be changed from one group to another. Ideally, the system should produce the same result, which would be a sign that it is making a fair decision. In other words, a decision is counterfactually fair toward an individual if it is the same in (a) the real world and (b) a counterfactual world in which the individual belonged to a different demographic group (Kusner et al. 2017). Finding this counterfactual world is not so simple; it is not enough to flip the protected attribute. In fact, that attribute is normally not shown to the algorithm, so there is nothing to change there. There really is a causal relationship that makes one attribute influence others. For example, even if race is not considered when hiring someone, that condition may have influenced a whole prior history with respect to the educational opportunities or prior work experience the person had. Therefore, it becomes necessary to first define this causal graph (on which there may be no consensus) and then determine the process by which reversing the protected attribute that is not being directly observed triggers changes in other observable attributes that, in turn, propagate further changes according to that causal graph.
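The following toy sketch mimics this logic (recover the individual's latent factors, intervene on the protected attribute, re-evaluate the decision) on an invented causal graph; none of the equations come from Kusner et al. (2017), they only illustrate the mechanics.

```python
# Toy sketch of a counterfactual check under an assumed causal graph:
# group -> opportunity -> decision. The structural equations are invented.
import numpy as np

rng = np.random.default_rng(0)

def opportunity(group, noise):
    # Structural equation: the disadvantaged group (1) historically gets less opportunity.
    return 5.0 - 1.5 * group + noise

def decision(opp):
    return 1 if opp >= 4.0 else 0   # the decider only sees 'opportunity'

# Factual individual: group = 1, with his/her own latent noise (abduction step).
group, noise = 1, rng.normal(0, 0.5)
factual = decision(opportunity(group, noise))

# Counterfactual world: same latent noise, flipped group (action + prediction steps).
counterfactual = decision(opportunity(1 - group, noise))

print(f"factual={factual}, counterfactual={counterfactual}, "
      f"counterfactually fair={factual == counterfactual}")
```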

2.4.4 Group Fairness

Because of the above difficulties in achieving individual or counterfactual fairness, in most cases where machine learning is applied other measures are sought. Instead of following an individual definition—which does assess the specific discrimination suffered by the individual—group definitions are chosen, in which the discrimination of an individual is not analyzed, but rather that of the group as a whole to which the individual belongs. Within this type of group fairness criteria we find, in turn, several families of definitions (a small computational sketch follows this list):

• Demographic parity (also known as independence or statistical parity): it refers to a situation where the results of the decider ensure a proportional balance between the groups. For example, in a hiring process, to select a similar proportion of men and women: if ten employees are to be hired, five should be men and five women. It should be noted that this measure does not assess the correctness of the decision; it does not ultimately matter whether those selected are well qualified, only the final proportion of the decision is assessed. The fifth best qualified candidate in one group may be much less qualified than the candidate holding a similar position in the other group.


• Equalized odds (also known as separation or positive rate parity): it measures the degree to which a decider provides similar rates of true positive (among all truly positive cases, how many are chosen as positive by the decider) and false positive (among all truly negative cases, how many are chosen as positive by the decider) predictions across different groups (Hardt et al. 2016). It can be relaxed to ensure only an equal true positive rate (what is known as equal opportunity). In the hiring example, equal opportunity is guaranteed if in both groups (men and women) the same percentage of applicants is selected from all qualified candidates in that group. If, in addition, the percentage of those selected among all the unqualified (those who do not deserve the position) is similar in both groups, we have equalized odds.

• Predictive rate parity (also known as sufficiency): both the positive predictive ratio (among all cases where the decider chooses the positive class, how many of them are truly positive) and the negative predictive ratio (likewise for the negative class) are equal in the two groups. If a decider is 80% correct in its choices in the men's group (8 out of 10 cases hired were actually qualified for the position), a similar percentage of correct choices should be made in the women's group to ensure positive predictive parity. If, in addition, the precision in denying employment is also similar, we have predictive rate parity.
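The sketch announced above computes the three families of group criteria as absolute differences between two groups; the arrays are invented and would in practice come from a validation set. Demographic parity corresponds to the DR difference, equalized odds to the TPR and FPR differences, and predictive rate parity to the PPV and NPV differences.

```python
# Sketch: the three group fairness criteria as differences between two groups (0 and 1).
# The arrays are invented; in practice they would come from a validation set.
import numpy as np

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])   # ground truth
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 1])   # decisions of the decider
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # protected attribute

def rates(mask):
    t, p = y_true[mask], y_pred[mask]
    return {"DR":  p.mean(),                # decision rate        -> demographic parity
            "TPR": p[t == 1].mean(),        # true positive rate   -> equalized odds
            "FPR": p[t == 0].mean(),        # false positive rate  -> equalized odds
            "PPV": t[p == 1].mean(),        # positive pred. value -> predictive rate parity
            "NPV": (1 - t[p == 0]).mean()}  # negative pred. value -> predictive rate parity

r0, r1 = rates(group == 0), rates(group == 1)
for key in r0:
    print(f"{key} difference between groups: {abs(r0[key] - r1[key]):.2f}")
```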

2.4.5 Impossibility of Fairness

The main challenge is that these three group measures of fairness are mathematically incompatible: they cannot be simultaneously satisfied (except under unrealistic circumstances), so satisfying two of them results in non-compliance with the third. Even under conditions that are easily encountered in real problems, these three fairness conditions are mutually exclusive. This is known as the impossibility of fairness (Miconi 2017). To illustrate the complexity of satisfying different criteria of fairness at the same time, I will borrow the clever example given by Zafar et al. (2017), with some modifications that will better serve the purpose of our exposition. Suppose we want to build a classifier to decide whether to stop a person on suspicion of carrying a prohibited weapon. For this aim, we have a data set based on real cases where it was known whether the subject was carrying a weapon. As features, we know whether the person had a visible bulge in his/her clothing and whether he/she was in the vicinity of a place where a crime had been committed. We also know the gender (male or female), which is the protected attribute. The data set is shown in Table 2.1. We will analyze the results, shown in Table 2.2, of four classifiers (each one identified with 'C') that decide whether the subject should be stopped or not, depending on the case. For each of them, the results obtained with the three group fairness criteria are shown (positive difference of the corresponding measurement between the two groups), as well as the degree of accuracy achieved.


Table 2.1  Illustrative example of a data set and a hypothetical response from four classifiers

Gender  Clothing bulge  Prox. crime  Ground truth (has weapon)  C1   C2   C3   C4
m       Yes             Yes          Yes                        Yes  Yes  Yes  Yes
m       Yes             No           Yes                        Yes  Yes  Yes  No
m       No              Yes          No                         No   Yes  No   No
f       Yes             Yes          Yes                        Yes  Yes  Yes  Yes
f       Yes             No           No                         No   Yes  Yes  No
f       No              No           No                         No   Yes  No   No

Table 2.2  Results obtained by the four classifiers in Table 2.1 with respect to various measures of fairness

                                      C1     C2     C3     C4
Demographic parity      DR diff.      33%    0%     0%     0%
Equalized odds          TPR diff.     0%     0%     0%     50%
                        FPR diff.     0%     0%     50%    0%
Predictive rate parity  PPV diff.     0%     33%    50%    0%
                        NPV diff.     0%     0%     0%     50%
                        Accuracy      100%   50%    83%    83%

• C1 gives the same response as the ground truth knowledge, so its accuracy is obviously 100%. This is an inconceivable situation in a moderately complex real problem: there is no such thing as a perfect classifier; some mistakes are always made. However, if this perfect classifier existed, although it would logically satisfy equalized odds (since TPR would be 100% and FPR 0% in both groups) and predictive rate parity (both PPV and NPV would be 100% in both groups), it could not guarantee demographic parity, since DR (demographic rate) would be 66.6% in men (two of the three men are stopped) and 33.3% in women. Nor does it comply with individual fairness, given that when faced with two identical cases that differ only in gender (both with a bulge in their clothing and not in the vicinity of a crime), the classifier decides to stop the man but not the woman.

• C2 manages the problem with a simple decision: stop everyone. In this way, it treats everyone with individual fairness, it also ensures demographic parity, and it even holds equalized odds. However, while the PPV for men is 66.6% (of the three cases it decides to stop, it is right in two of them), for women it is 33.3% (it fails in two of the three cases). In addition, the classifier makes many errors (accuracy of 50%), with an FPR of 100% in both genders, which is totally unacceptable.


• C3 and C4 offer two alternative solutions with a good accuracy of 83% while guaranteeing demographic parity. In both cases, individual fairness is also achieved. However, C3 has an FPR of 0% in men and 50% in women, while its PPV is 100% in men and 50% in women. On the other hand, C4 has a TPR of 50% in men and 100% in women, in addition to an NPV of 50% in men and 100% in women. Thus, neither of them can satisfy either equalized odds or predictive rate parity.

Which classifier would one choose between C3 and C4? Or, in other words, if we are going to make a mistake, where do we want it to be made? If we prioritize the safety of no one escaping with a weapon, C3 is better. If we prioritize the individual's right not to be unfairly stopped, C4 is more appropriate. But both commit discrimination by treating men and women differently. Each problem has different social implications, and it is up to experts in the field to decide what orientation should be given to the machine decider, assuming that it will inevitably be biased.
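These figures can be re-derived mechanically. The sketch below encodes Table 2.1 and recomputes, for each classifier, the group differences reported in Table 2.2 (using the same conventions as the previous sketch); note that the NPV is undefined for C2, which never predicts the negative class.

```python
# Sketch: recompute the fairness differences of Table 2.2 from the data in Table 2.1.
import numpy as np

male  = np.array([1, 1, 1, 0, 0, 0])            # 1 = man, 0 = woman
truth = np.array([1, 1, 0, 1, 0, 0])            # ground truth: carries a weapon
preds = {"C1": np.array([1, 1, 0, 1, 0, 0]),
         "C2": np.array([1, 1, 1, 1, 1, 1]),
         "C3": np.array([1, 1, 0, 1, 1, 0]),
         "C4": np.array([1, 0, 0, 1, 0, 0])}

def rate(values):
    return values.mean() if values.size else float("nan")   # undefined if no such cases

def measures(t, p):
    return {"DR": rate(p), "TPR": rate(p[t == 1]), "FPR": rate(p[t == 0]),
            "PPV": rate(t[p == 1]), "NPV": rate(1 - t[p == 0])}

for name, p in preds.items():
    men = measures(truth[male == 1], p[male == 1])
    women = measures(truth[male == 0], p[male == 0])
    diffs = {k: round(abs(men[k] - women[k]), 2) for k in men}
    print(name, diffs, f"accuracy={np.mean(truth == p):.2f}")
```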

2.5 What We Are Talking About: Example of Machine Discrimination

There are multiple examples where the use of machine learning has generated cases of discrimination (O'Neil 2016; Barocas et al. 2017; Eubanks 2018), and there are also many data sets available for studying fairness (Fabris et al. 2022). Among all of them, if I have to choose one to illustrate the situation, I will inevitably choose the ProPublica case (Angwin et al. 2016), for multiple reasons. It is a situation with special social connotations where an error has a significant discriminatory impact, and where the positions of those who use the system and those who suffer from it are drastically opposed; the system is still being used and its use is on the rise; it has been widely studied; and, in a way, it marked a before and after in the way machine learning is approached by clearly revealing the difficulty of solving the problem with fairness awareness.

In the machine learning community, it is known as the ProPublica case after the name of the media outlet in which the investigation by four journalists was published under the title "Machine Bias" in 2016. It was already known that several U.S. courts use a decision support system that rates the risk that a defendant may reoffend with a score between 1 and 10—see, e.g., State v. Loomis (Liu et al. 2019). U.S. Attorney General Eric Holder had even warned in 2014 that the use of data-driven criminal justice programs could harm minorities (Holder 2014): "By basing sentencing decisions on static factors and immutable characteristics—like the defendant's education level, socioeconomic background, or neighborhood—they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society." However, until the article in ProPublica, it had not been possible to demonstrate with data how the system known as COMPAS works. Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) is software that has been used for years (Brennan et al. 2009), developed by Northpointe (Equivant since 2017) as a


decision-making system powered by data collected from a defendant about different constructs such as current charges, criminal history, family criminality, peers, residence, social environment, or education, among others. Journalists were able to access records on thousands of cases collected between 2013 and 2014 in which they knew the risk assessment made by COMPAS and whether or not each person reoffended in the two subsequent years—which is the criterion used by the company to validate COMPAS (Equivant 2019). This data is now public and is being intensively used by the fairness machine learning community (Larson 2023).

COMPAS was found to score African Americans at higher risk of recidivism than Caucasians, as shown in Fig. 2.3. However, Northpointe argued that there really is equal treatment between blacks and whites because the hit rate is similar in both groups. For this, they refer to the PPV (positive predictive value), which measures, among all cases predicted as positive (medium-high risk of recidivism), the percentage that are truly positive (actually recidivated). According to this measure, COMPAS obtains a value of 63% correct for blacks and 59% for whites, i.e., only a 4-point difference, which is considered within the reasonably allowable range. As for the NPV (negative predictive value), i.e., the percentage of those predicted not to reoffend who indeed did not reoffend, the results were 65% for blacks and 71% for whites (a 6-point difference).


Fig. 2.3 Number of cases according to race and recidivism for each score assigned by COMPAS. Percentage of persons out of the total group assigned to each risk score is shown in square brackets above each bar. Percentage of repeat offenders for each set of scores is shown inside each bar. For example, 10% of blacks are scored with a risk of 9 versus 4% of whites. However, the recidivism rate among those who are rated a 9 is similar in both groups (71% vs. 69%)


In short, from the point of view of the degree of precision, it could be said that the COMPAS treatment is fair. But there is another way of looking at things. If we put the focus on the individual and the way in which he or she is severely harmed by a COMPAS prediction error, we can analyze the FPR (false positive rate: among those who did not reoffend, the proportion who nevertheless received a high-risk score), where a significant imbalance is observed between the black and white groups, against the former. The error made in wrongly predicting that someone will reoffend when in fact he or she does not is 22 points higher for blacks (45%) than for whites (23%). Figure 2.4 compares the fairness interpretations of Northpointe and ProPublica.

The reason for this huge discrepancy lies in a social reality: more blacks than whites recidivate in the data set, 51% versus 39% respectively, as shown in Fig. 2.5. Discussing the causes of this higher black recidivism is beyond the scope of this chapter, but it clearly has its roots in a multitude of historical circumstances. I will only note one reflection: what does recidivism mean? One might think that it refers to "the act of continuing to commit crimes even after having been punished," as defined by the Cambridge Dictionary. But, in the eyes of procedural law in any state under the rule of law, recidivism is more than just committing a crime again... as a recidivist must also get caught. A policing system that is focused more on catching blacks will logically find more crime and make more arrests in that community, reinforcing the perception that there is more crime among blacks. This is a classic chicken-and-egg example that generates a spiral from which it is difficult to escape.


Fig. 2.4  Fairness results obtained in COMPAS as interpreted by Northpointe on the left (predictive rate parity satisfied) and ProPublica on the right (equalized odds violated). From Northpointe’s point of view, the system is fair because it maintains a balance in PPV (percentage of cases that do reoffend among all cases that the system predicts will reoffend). However, from ProPublica’s point of view, the system is not fair because there is a high disparity in FPR (percentage of cases that the system predicts will reoffend but actually do not) and TPR (percentage of cases that the system predicts will reoffend and actually do). Due to the difference in prevalence (the percentage of recidivists in the black group is higher than in the white group) as shown in Fig. 2.5, it is not possible to achieve both fairness criteria (balance of PPV and FPR at the same time), so it is crucial to understand the nature of the case and choose the most appropriate fairness criterion



Fig. 2.5  The marked difference in prevalence between blacks and whites makes it impossible to reconcile the two fairness criteria. The reason why there is more recidivism among blacks is complex, but let us keep in mind that recidivism is not only repeating a crime; the subject must also be arrested and prosecuted

Whilst race is an attribute that is hidden from the machine, among the 137 features (variables) used by COMPAS, questions are asked and records are analyzed that disadvantage blacks (e.g., questions about neighborhood or arrest history). So looking the other way does not solve the problem; it just lets it get out of control. Indeed, in this case, where there is a marked difference in recidivism prevalence between the two groups, forcing the system to have an equal PPV mathematically forces an unequal FPR, making it discriminatory from that point of view. In other words, as it is not possible to satisfy different criteria of fairness at the same time (Chouldechova 2017), an in-depth study of the problem is necessary to design the best strategy. The issue here is that the efficiency interest of the company or administration (low false negatives) is not aligned with the interest of the individual (low false positives). For this reason, the ProPublica case opens an interesting debate that questions the convenience of using machine decision-making systems in a situation where there is no optimal solution and the error is critical, as it seriously harms the individual.

Finally, I show the result of a machine learning algorithm (Valdivia et al. 2021) that, based on a few variables together with the COMPAS prediction, can significantly improve its performance, as proof of how far we can go in fairness (see Fig. 2.6). Firstly, the COMPAS system is easily improved in terms of accuracy—see Dressel and Farid (2018)—and, secondly, a well-designed algorithm can offer a range of possible best alternatives with different trade-offs between accuracy and fairness.


Fig. 2.6  Using a multi-objective optimization technique (Valdivia et al. 2021), multiple alternative classifiers can be generated. Each orange point is the result of an alternative decider, while the blue dots represent the average behavior. It can be seen how COMPAS (red dot) is easily outperformed in accuracy and fairness and that, with the same error rate as COMPAS (about 35%), it is possible to improve equality of opportunity significantly (reducing FPR difference from 12.5% to 4.5%)

As shown, the algorithm can generate solutions with an error rate similar to that obtained by COMPAS (about 35%), but with a difference in FPR between blacks and whites of less than 5%, instead of the 12% obtained by COMPAS. In other words, if you want to be fairer, you can. It is just a matter of having fairness awareness and the machine learning skills necessary to employ algorithms that do not discriminate. The ProPublica case exemplifies many of the bad practices that can lead to discrimination, such as the choice of variables, the selection of the set of examples, the use of a certain cost function to guide the algorithm, or the existence of proxies that correlate with race. In the next section I formalize some of the main reasons why a machine can discriminate.
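Before moving on, readers who want to reproduce the core of the audit themselves can follow the rough sketch below. The file name, the column names (race, decile_score, two_year_recid), and the threshold separating low from medium-high risk are assumptions about the publicly released ProPublica data and may need to be adapted.

```python
# Rough sketch of the ProPublica-style audit. File location, column names, and the
# risk threshold are assumptions about the public release; adjust to the data at hand.
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")    # assumed local copy of the public data
df = df[df["race"].isin(["African-American", "Caucasian"])]
df["high_risk"] = (df["decile_score"] >= 5).astype(int)   # medium-high risk (assumed cut-off)

for race, g in df.groupby("race"):
    y, p = g["two_year_recid"], g["high_risk"]
    ppv = y[p == 1].mean()     # Northpointe's preferred view: balanced precision
    fpr = p[y == 0].mean()     # ProPublica's preferred view: unbalanced false positives
    print(f"{race:17s}  PPV={ppv:.2f}  FPR={fpr:.2f}")
```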

2.6 Why Machine Learning Can Discriminate

Machine learning can end up generating decision-making systems that are unfair or cause discrimination for different reasons (Barocas and Selbst 2016). Sometimes it is due to the use of data that teaches the algorithm discriminatory behaviors previously performed by humans. At other times, the performance measure that is used—which, in short, is the reference used by the algorithm to know whether the system it is generating is right or wrong—leads to biased or unfair behavior.


The problem may also lie in the fact that the system bases its decisions on variables that offer an incomplete or distorted view of reality. Specifically, we can distinguish five causes that lead the algorithm to generate unfair or discriminatory deciders:

1. The use of contaminated examples: any machine learning system maintains the bias existing in the old data caused by human bias. Such bias can be aggravated over time as future observations confirm the prediction and there are fewer opportunities to make observations that contradict it. An example is the case of Amazon (Dastin 2018), which in 2014 developed an automated recruitment system based on screening résumés. To do so, it trained its algorithm with the history of recruitment decisions carried out by the company over the previous 10 years. The project ended up failing because it was found to exhibit gender-biased behavior, favoring the hiring of men over women with equal qualifications. The data used to train the algorithm were biased and, consequently, the algorithm learned to mimic that bias to maximize its performance.

2. The choice of the wrong performance measure: the algorithm is guided by a performance measure (cost function) to generate models that maximize (or minimize) it; if that measure rewards a certain balance to the detriment of another, the solution may discriminate from some point of view. The ProPublica case is a clear example of this (Angwin et al. 2016). While the company that developed COMPAS, the algorithm that scores an offender's risk of recidivism, claims that its system is fair because it satisfies equal PPV (positive predictive value: the proportion of subjects with a high risk score who actually recidivate) between blacks and whites, in terms of FPR (false positive rate: the proportion of non-recidivists who nevertheless receive a high risk score) there is a high imbalance between the black and white groups, against the former.

3. The use of non-representative samples: if the training data coming from the minority group are scarcer than those coming from the majority group, the minority group is less likely to be well modeled. Consider the case of AI models used for medical diagnosis based on genomic information. A 2016 meta-analysis of 2511 studies from around the world found that 81% of participants in genome mapping studies were of European ancestry (Popejoy and Fullerton 2016). The data overrepresent people who get sicker; in addition, demographic information on the neighborhood where a hospital is located, how it advertises clinical trials, and who enrolls in them further exacerbates the bias. Another study (Aizer et al. 2014) pointed out that the lack of diversity of research subjects is a key reason why black Americans are significantly more likely to die of cancer than white Americans. In short, the survivorship bias (Brown et al. 1992) coined during World War II when analyzing the impact of projectiles on aircraft—information from heavily damaged aircraft, the most interesting to analyze, was not available because they did not return from battle—is still in force today, especially in the field of health: the algorithm cannot understand what is not shown to it.


4. The use of limited attributes: some attributes (variables) may be less informative or less reliably collected for minority groups, making the system less accurate for those groups. This problem is aggravated when the poorly representative variable is the object of study, i.e., the dependent variable or target. The scandal in the Netherlands, where 26,000 families (mostly immigrants) were unjustly classified as fraudsters through data analysis between 2013 and 2019, and thus forced to repay thousands of euros worth of benefits, is an example (Peeters and Widlak 2023). This was partly due to the use of a form that was cumbersome and difficult to understand for non-native Dutch speakers, so that erroneous data collected on it were interpreted as fraud on the part of these families. In a nutshell, attributes were used that were not representative for certain social groups, causing them harm.

5. The effect of proxies: even if the protected attribute (e.g., ethnicity) is not used to train a system, other features may be proxies for the protected attribute (e.g., neighborhood). If such features are included, bias will occur, and it is sometimes very difficult to determine these dependencies. This is a classic case in the field of sociology: there are different causal relationships among variables that generate indirect relationships and back doors. An example is provided by Mitchell et al. (2018), where a complex historical process creates an individual's race and socioeconomic status at birth, both of which affect the hiring decision, including through education. Even if we hide an individual's race and socioeconomic status from the algorithm, other variables, such as the quality of their education, will condition their chances of being hired. This variable is not likely to be independent, but rather entangled with their race and socioeconomic status.

Unfortunately, in a complex real-world problem, several of these factors combine, making it very difficult to ensure that machine learning does not discriminate, or that a system that is fair from one point of view is not unfair from another. However, there are solutions for all these drawbacks. The key is to include validation mechanisms to detect these discriminations and their causes, in order to take action to mitigate discriminatory effects.
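As a simple instance of such a validation mechanism, the sketch below checks for proxies (cause 5) by measuring how well the non-protected features predict the hidden protected attribute; the synthetic data and feature names are invented for illustration.

```python
# Sketch: detecting proxies by trying to predict the (hidden) protected attribute
# from the remaining features. Data and feature names are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 500
protected = rng.integers(0, 2, n)                       # e.g., ethnicity (never given to the model)
neighborhood = protected * 2 + rng.normal(0, 0.5, n)    # strongly entangled proxy
experience = rng.normal(5, 2, n)                        # roughly independent feature
X = np.column_stack([neighborhood, experience])

# If this score is well above 0.5, some feature is acting as a proxy.
score = cross_val_score(LogisticRegression(), X, protected, cv=5).mean()
print(f"protected attribute predictable from the features with accuracy {score:.2f}")
```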

2.7 How Machine Discrimination Can Be Overcome

Fortunately, various solutions have been proposed by the machine learning community to address fairness. These techniques can help reduce bias in machine learning models, even if they are not a panacea; it remains crucial to continually assess and monitor the performance of the model to ensure that it stays fair and unbiased. Different approaches have been adopted that can be grouped into three categories, depending on the point in the machine learning pipeline at which the mechanism for correcting the process towards better fairness is incorporated: the pre-processing stage, the in-training stage, or the post-processing stage. These


can be combined, so that it is possible to improve the data first, then apply algorithms designed for fairness, and finally polish the results in post-processing. I will describe some existing alternatives for each approach.

2.7.1 Pre-processing for Fairness

Pre-processing approaches address fairness in machine learning by manipulating the input data before they are used to train a model. They attempt to obtain new representations of the data that satisfy fairness definitions, and they are especially useful when the cause of the discrimination is biased data. Among the most common pre-processing techniques used for fairness we find the following (a small sketch of the reweighting idea closes this subsection):

• Data sampling/weighting: one approach is to sample the data in a way that ensures equal representation of all groups. For example, if a dataset contains unequal representation of different races, we can oversample the underrepresented groups (or undersample the overrepresented one) to create a more balanced dataset (Kamiran and Calders 2010; Gu et al. 2020). Another way to reach a similar effect is to weight the data (Krasanakis et al. 2018) in an iterative process, together with the algorithm (making it a hybrid of pre-processing and in-training). Either by repeating data from the minority group (or reducing data from the majority group), or by giving more weight to data from the minority group, the objective of these techniques is to balance the representativeness of each group in the hope of reducing the imbalances that affect fairness.

• Data generation: by using generative adversarial networks it is also possible to generate high-quality fairness-aware synthetic data (Xu et al. 2018; Sattigeri et al. 2019). Here, again, the aim is to create new minority-group data, but in this case the data are fictitious (based on real data), so there is more flexibility to direct the generation towards reducing unfairness.

• Feature selection: another approach is to select features that are not biased towards any particular group (Grgić-Hlača et al. 2018). This can be done by using statistical methods to identify features that have a low correlation with protected attributes (such as race or gender) or by using domain knowledge to select relevant features that do not discriminate. To the extent that there are attributes correlated with the protected attribute, this selection approach can be effective.

• Data encoding: fairness can be framed as an optimization challenge in which we aim to discover an intermediate data representation that optimally captures the information while hiding features that could reveal the protected group membership (Zemel et al. 2013; Calmon et al. 2017). Here we are looking for a data transformation aimed at reducing the disparity that causes the lack of fairness.

• Pre-processing with fairness constraints: some pre-processing techniques add constraints that promote fairness (Donini et al. 2018). We can also create a new attribute optimized by a kind of adversarial debiasing that trains a model to


minimize the accuracy of a discriminator that tries to predict protected attributes like race or gender (Zhang et al. 2018).

The main advantage of pre-processing is that the modified data can be used for any subsequent task. This helps people who are not very skilled in machine learning or who do not have access to the means to develop new algorithms on an ad hoc basis. Besides, there is no need to access protected attributes at prediction time, which is sometimes a limitation in projects where it is not legal or feasible to know that information. However, pre-processing approaches are inferior to in-training approaches in terms of both accuracy and fairness, as well as being less flexible than post-processing approaches.
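The sketch announced above illustrates the reweighting idea in the spirit of Kamiran and Calders (2010): each (group, label) combination receives a weight that makes group membership and the favourable label statistically independent. The toy arrays are invented, and real implementations differ in the details.

```python
# Sketch of reweighting: give each (group, label) combination a weight so that the
# group and the favourable label become statistically independent. Toy data only.
import numpy as np

group = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])   # protected attribute
label = np.array([1, 1, 1, 1, 0, 0, 1, 0, 0, 0])   # favourable outcome = 1

weights = np.empty(len(label), dtype=float)
for g in (0, 1):
    for y in (0, 1):
        mask = (group == g) & (label == y)
        expected = (group == g).mean() * (label == y).mean()   # frequency if independent
        observed = mask.mean()
        weights[mask] = expected / observed                    # > 1: under-represented combination

print(np.round(weights, 2))
# These weights are then passed to any learner that accepts sample weights,
# e.g., model.fit(X, y, sample_weight=weights).
```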

2.7.2 In-training for Fairness

The methods applied in the training phase consist of modifying the classification algorithm by adding fairness criteria or by developing an optimization process that considers these fairness measures. Since they work in the training phase, that is, when the algorithm determines how to generate the decider, they have great potential. Some possible in-training techniques are the following (a small sketch of the regularization idea closes this subsection):

• Fairness cost-sensitive regularization: one approach is to incorporate fairness constraints into the model's objective function. This can be done by adding a regularization term that penalizes the model (decider) for making predictions that are biased towards certain groups. For example, we can use a fairness metric (usually a group one) to penalize the model for making biased predictions, or incorporate these measures when deciding components of the model such as nodes, rules, or weights (Zafar et al. 2017b; Agarwal et al. 2018).

• Adversarial training: another approach is to use adversarial training to make the model more robust to biases in the input data (Kearns et al. 2018; Zhang et al. 2018). This involves training a discriminator model that tries to predict the protected attributes of the input data (such as race or gender) and using the output of the discriminator to update the classifier's weights. In other words, one algorithm oversees the potential discrimination caused by the models generated by another algorithm, so that, iteratively, the second one improves its solution until it passes the scrutiny of the first.

• Counterfactual data augmentation: in some cases, it may be possible to generate counterfactual examples that help to mitigate bias in the input data. Counterfactual data augmentation involves generating new training examples that are similar to the original ones but with modified attributes that remove the bias, following the given causal graph (Kusner et al. 2017).

• Individual fairness: given a metric to assess similarities among data, it can be used in the training stage to force the algorithm to generate equal predictions for similar data (Dwork et al. 2012).


• Multiobjective optimization: finally, a powerful approach is to develop a wrapping scheme (a kind of meta-learning) in which the hyperparameters of a standard algorithm are optimized to direct the learner to generate models with specific fairness measures (Valdivia et al. 2021; Villar and Casillas 2021). In this sense, fairness is not conceived as a constraint but as a guide for optimizing the model. When multiobjective optimization is incorporated into this meta-learning approach, it is possible to generate a wide variety of models with different accuracy-fairness trade-offs.

These in-training approaches achieve the best results in both accuracy and fairness, and they offer greater flexibility to choose the desired balance between the two; any fairness criterion can be incorporated at this stage. However, they require adapting existing algorithms or creating new ones, which represents a major development effort, and in some projects where fairness awareness is added to previous developments it is not easy to access the underlying algorithms.
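The sketch that closes this subsection illustrates the regularization idea: a standard logistic loss plus a penalty on the difference in mean predicted score between groups, with a coefficient that trades accuracy against fairness. It is entirely illustrative and much cruder than the methods cited above.

```python
# Sketch of in-training fairness regularization: logistic loss plus a penalty on the
# difference in mean predicted score between groups (a demographic-parity proxy).
# Synthetic data; real methods (e.g., Zafar et al. 2017b) use more refined constraints.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 400
group = rng.integers(0, 2, n)
X = np.column_stack([rng.normal(group, 1.0, n), rng.normal(0, 1.0, n), np.ones(n)])
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0.5).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, lam):
    p = sigmoid(X @ w)
    log_loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    fairness_penalty = abs(p[group == 0].mean() - p[group == 1].mean())
    return log_loss + lam * fairness_penalty

for lam in (0.0, 2.0):          # lam controls the accuracy-fairness trade-off
    w = minimize(loss, np.zeros(X.shape[1]), args=(lam,)).x
    pred = sigmoid(X @ w) >= 0.5
    print(f"lambda={lam}: accuracy={np.mean(pred == y):.2f}, "
          f"DP difference={abs(pred[group == 0].mean() - pred[group == 1].mean()):.2f}")
```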

2.7.3 Post-processing for Fairness

Post-processing methods aim to eliminate discriminatory decisions once the model has been trained, by manipulating the output of the model according to some criterion. Some common post-processing techniques are the following (a small sketch of threshold adjustment closes this subsection):

• Calibrated equality: one approach is to adjust the model's output using statistical methods or post-hoc adjustments that calibrate the output for each group (Canetti et al. 2019). This is the case of the equalized odds technique (Hardt et al. 2016), which ensures that the model predicts outcomes with equal false positive and false negative rates for all groups. This is possible in deciders that return an output with a degree of certainty, so that by varying the threshold on that certainty it is possible to alter the final output. For example, if output 0 means that a loan is not granted and 1 that it is, then, initially, a certainty greater than 0.5 concludes that the loan is granted (1) and a lower value that it is not (0). The 0.5 boundary can then be varied, for example to 0.7, as long as better equity between groups is achieved. As can be seen, neither the data nor the algorithm is modified; what is altered is the output of the decider generated by the algorithm.

• Rejection sampling: in some cases, it may be necessary to reject certain predictions that are likely to be biased. The reject option involves refusing or revising the predictions about which the model is too uncertain, i.e., those close to the decision boundary (Kamiran et al. 2018). By doing so, the model can avoid making biased predictions that are likely to be incorrect.

• Model regularization: finally, model regularization can be used to ensure fairness in machine learning. Regularization techniques can penalize, constrain, or modify the model so that it produces outputs consistent with a fairness


metric (Pedreschi et al. 2009; Calders and Verwer 2010; Kamiran et al. 2010). Here a transformation of the generated decision boundary is pursued in order to improve fairness. The effect is similar to that of calibrated equality, but achieved through a more complex and potentially more effective process.

These post-processing approaches perform well, especially with group fairness measures. As in the case of pre-processing approaches, there is no need to modify the algorithm; they work with standard machine learning algorithms. However, they are more limited than in-training approaches in reaching the desired balance between accuracy and fairness. Moreover, the protected attribute must be accessible at the prediction stage, which is sometimes not possible: whilst the sensitive information can be known during the development of the algorithm for its correct design, it will not always be available (for legal or practical reasons) at the time of using the already trained model.
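The sketch that closes this subsection illustrates threshold adjustment, the simplest post-processing device: the trained scorer is left untouched and each group receives its own decision threshold, chosen here to roughly equalize false positive rates. Scores and labels are simulated, and the search is deliberately naïve.

```python
# Sketch of a post-processing adjustment: keep the trained model untouched and use
# group-specific decision thresholds chosen to equalize false positive rates.
# Scores and labels are simulated for illustration.
import numpy as np

rng = np.random.default_rng(3)
n = 1000
group = rng.integers(0, 2, n)
y = rng.binomial(1, 0.35 + 0.15 * group)                                 # different prevalence
score = np.clip(0.4 * y + 0.2 * group + rng.normal(0.3, 0.2, n), 0, 1)   # biased scorer

def fpr(threshold, g):
    mask = (group == g) & (y == 0)
    return (score[mask] >= threshold).mean()

# Fix group 0's threshold at 0.5 and search a threshold for group 1 with a similar FPR.
target = fpr(0.5, 0)
candidates = np.linspace(0, 1, 101)
best = min(candidates, key=lambda t: abs(fpr(t, 1) - target))
print(f"group 0 threshold: 0.50 (FPR={target:.2f}); "
      f"group 1 threshold: {best:.2f} (FPR={fpr(best, 1):.2f})")
```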

2.8 Conclusion

Throughout the chapter we have seen many examples where the use of algorithms and massive data analysis generates discrimination based on gender, race, origin, or socioeconomic status in fields as diverse as recruitment, recidivism assessment, advertising, internet searches, credit risk assessment, subsidy payment, health treatment, and online shopping delivery. When discrimination is generated by automated systems, its consequences are much more serious because of their scalability, invisibility, and lack of accountability. Different perspectives on discrimination have been reviewed and different causes have been analyzed. We have also seen that there are solutions for most cases, as long as there is the will and the means to apply them.

AI will continue to bring wonderful things to society, but it must also be constrained by the values of that same society. It is advancing at great speed, and the best way to deal with this evolution is to move at the same pace in ethics, awareness, education, and regulation. Legislators should work closely with experts to investigate, prevent, and mitigate malicious uses of AI. AI experts must take the nature of their work seriously, proactively communicating with relevant stakeholders when harmful applications are foreseeable. External audits should be incorporated into potentially discriminatory projects, both in the private and public sectors. We should bring together data scientists and experts in the social sciences to infuse a social lens into solutions and examine potential discrimination. I hope that this chapter contributes to raising awareness of the serious discriminatory potential that machine decision-making systems can have, opens the eyes of AI experts to the consequences of their work, and shows that, although there are solutions to alleviate it, these are not as simple as fixing a bias in the data.


References

Agarwal, A., A. Beygelzimer, M. Dudík, J. Langford, and H. Wallach. 2018. A reductions approach to fair classification. In International conference on machine learning, 60–69. PMLR.
Aizer, A.A., T.J. Wilhite, M.H. Chen, P.L. Graham, T.K. Choueiri, K.E. Hoffman, et al. 2014. Lack of reduction in racial disparities in cancer-specific mortality over a 20-year period. Cancer 120 (10): 1532–1539.
Alexander, L. 2016. Do Google's 'unprofessional hair' results show it is racist? The Guardian. https://www.theguardian.com/technology/2016/apr/08/does-google-unprofessional-hair-results-prove-algorithms-racist-.
Angwin, J., J. Larson, S. Mattu, and L. Kirchner. 2016. Machine bias. ProPublica, May 23, 2016.
Barocas, S., and A.D. Selbst. 2016. Big data's disparate impact. California Law Review: 671–732.
Barocas, S., M. Hardt, and A. Narayanan. 2017. Fairness in machine learning. NIPS tutorial 1: 2017.
Bengio, Y., et al. 2023. Pause giant AI experiments: An open letter. Future of Life Institute. https://futureoflife.org/open-letter/pause-giant-ai-experiments/.
Benner, K., G. Thrush, and M. Isaac. 2019. Facebook engages in housing discrimination with its ad practices, U.S. says. The New York Times, March 28. https://www.nytimes.com/2019/03/28/us/politics/facebook-housing-discrimination.html.
Brennan, T., W. Dieterich, and B. Ehret. 2009. Evaluating the predictive validity of the COMPAS risk and needs assessment system. Criminal Justice and Behavior 36 (1): 21–40.
Brown, S.J., W. Goetzmann, R.G. Ibbotson, and S.A. Ross. 1992. Survivorship bias in performance studies. The Review of Financial Studies 5 (4): 553–580.
Calders, T., and S. Verwer. 2010. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery 21: 277–292.
Calmon, F., D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K.R. Varshney. 2017. Optimized pre-processing for discrimination prevention. In Advances in neural information processing systems, vol. 30. NIPS.
Canetti, R., A. Cohen, N. Dikkala, G. Ramnarayan, S. Scheffler, and A. Smith. 2019. From soft classifiers to hard decisions: How fair can we be? In Proceedings of the conference on fairness, accountability, and transparency, 309–318. ACM.
Chen, J., N. Kallus, X. Mao, G. Svacha, and M. Udell. 2019. Fairness under unawareness: Assessing disparity when protected class is unobserved. In Proceedings of the conference on fairness, accountability, and transparency, 339–348. ACM.
Chouldechova, A. 2017. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 5 (2): 153–163.
Dastin, J. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.
Donini, M., L. Oneto, S. Ben-David, J.S. Shawe-Taylor, and M. Pontil. 2018. Empirical risk minimization under fairness constraints. In Advances in neural information processing systems, vol. 31. NIPS.
Dressel, J., and H. Farid. 2018. The accuracy, fairness, and limits of predicting recidivism. Science Advances 4 (1): eaao5580.
Dwork, C., M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. 2012. Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference, 214–226. ACM.
Equivant. 2019. Practitioner's guide to COMPAS core. https://www.equivant.com/practitioners-guide-to-compas-core/.
Eubanks, V. 2018. Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press.
Fabris, A., S. Messina, G. Silvello, and G.A. Susto. 2022. Algorithmic fairness datasets: The story so far. Data Mining and Knowledge Discovery 36 (6): 2074–2152.
Fryer, R.G., Jr., G.C. Loury, and T. Yuret. 2008. An economic analysis of color-blind affirmative action. The Journal of Law, Economics, & Organization 24 (2): 319–355.


Gajane, P., and M. Pechenizkiy. 2017. On formalizing fairness in prediction with machine learning. arXiv preprint arXiv:1710.03184.
Goldin, C., and C. Rouse. 2000. Orchestrating impartiality: The impact of "blind" auditions on female musicians. American Economic Review 90 (4): 715–741.
Grgić-Hlača, N., M.B. Zafar, K.P. Gummadi, and A. Weller. 2018. Beyond distributive fairness in algorithmic decision making: Feature selection for procedurally fair learning. In Proceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1. AAAI.
Gu, X., P.P. Angelov, and E.A. Soares. 2020. A self-adaptive synthetic over-sampling technique for imbalanced classification. International Journal of Intelligent Systems 35 (6): 923–943.
Hara, K., A. Adams, K. Milland, S. Savage, C. Callison-Burch, and J.P. Bigham. 2018. A data-driven analysis of workers' earnings on Amazon Mechanical Turk. In Proceedings of the 2018 CHI conference on human factors in computing systems, 1–14. ACM.
Hardt, M., E. Price, and N. Srebro. 2016. Equality of opportunity in supervised learning. In Advances in neural information processing systems, vol. 29. NIPS.
Hardt, M., S. Barocas, and A. Narayanan. 2023. Fairness and machine learning: Limitations and opportunities. The MIT Press. (ISBN 9780262048613).
Holder, E. 2014. Attorney General Eric Holder speaks at the National Association of Criminal Defense Lawyers 57th annual meeting and 13th State Criminal Justice Network conference. The United States Department of Justice.
Ingold, D., and S. Soper. 2016. Amazon doesn't consider the race of its customers. Should it? Bloomberg, April 21.
Kamiran, F., and T. Calders. 2010. Classification with no discrimination by preferential sampling. In Proceedings 19th machine learning conference of Belgium and The Netherlands, vol. 1, no. 6. Citeseer.
Kamiran, F., T. Calders, and M. Pechenizkiy. 2010. Discrimination aware decision tree learning. In 2010 IEEE international conference on data mining, 869–874. IEEE.
Kamiran, F., S. Mansha, A. Karim, and X. Zhang. 2018. Exploiting reject option in classification for social discrimination control. Information Sciences 425: 18–33.
Kearns, M., S. Neel, A. Roth, and Z.S. Wu. 2018. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In International conference on machine learning, 2564–2572. PMLR.
Krasanakis, E., E. Spyromitros-Xioufis, S. Papadopoulos, and Y. Kompatsiaris. 2018. Adaptive sensitive reweighting to mitigate bias in fairness-aware classification. In Proceedings of the 2018 world wide web conference, 853–862. ACM.
Kusner, M.J., J. Loftus, C. Russell, and R. Silva. 2017. Counterfactual fairness. In Advances in neural information processing systems, vol. 30. NIPS.
Larson, J. 2023. COMPAS recidivism risk score data and analysis. ProPublica, April 2023. https://www.propublica.org/datastore/dataset/compas-recidivism-risk-score-data-and-analysis.
Liu, H.W., C.F. Lin, and Y.J. Chen. 2019. Beyond State v Loomis: Artificial intelligence, government algorithmization and accountability. International Journal of Law and Information Technology 27 (2): 122–141.
Miconi, T. 2017. The impossibility of "fairness": A generalized impossibility result for decisions. arXiv preprint arXiv:1707.01195.
Mitchell, S., E. Potash, S. Barocas, A. D'Amour, and K. Lum. 2018. Prediction-based decisions and fairness: A catalogue of choices, assumptions, and definitions. arXiv preprint arXiv:1811.07867.
Ntoutsi, E., P. Fafalios, U. Gadiraju, V. Iosifidis, W. Nejdl, M.E. Vidal, et al. 2020. Bias in data-driven artificial intelligence systems—An introductory survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10 (3): e1356.
O'Neil, C. 2016. Weapons of math destruction: How big data increases inequality and threatens democracy. New York: Crown Books. (ISBN 978-0553418811).


Pedreschi, D., S. Ruggieri, and F. Turini. 2009. Measuring discrimination in socially-sensitive decision records. In Proceedings of the 2009 SIAM international conference on data mining, 581–592. Society for Industrial and Applied Mathematics.
Peeters, R., and A.C. Widlak. 2023. Administrative exclusion in the infrastructure-level bureaucracy: The case of the Dutch daycare benefit scandal. Public Administration Review 83: 1–15. https://doi.org/10.1111/puar.13615.
Phaure, H., and E. Robin. 2020. Artificial intelligence for credit risk management. Deloitte. https://www2.deloitte.com/content/dam/Deloitte/fr/Documents/risk/Publications/deloitte_artificial-intelligence-credit-risk.pdf.
Popejoy, A.B., and S.M. Fullerton. 2016. Genomics is failing on diversity. Nature 538 (7624): 161–164.
Sattigeri, P., S.C. Hoffman, V. Chenthamarakshan, and K.R. Varshney. 2019. Fairness GAN: Generating datasets with fairness properties using a generative adversarial network. IBM Journal of Research and Development 63 (4/5): 3–1.
Valdivia, A., J. Sánchez-Monedero, and J. Casillas. 2021. How fair can we go in machine learning? Assessing the boundaries of accuracy and fairness. International Journal of Intelligent Systems 36 (4): 1619–1643.
Villar, D., and J. Casillas. 2021. Facing many objectives for fairness in machine learning. In Quality of information and communications technology: 14th international conference, QUATIC 2021, Algarve, Portugal, September 8–11, 2021, proceedings, vol. 1439, 373–386. Springer International Publishing.
Von Ahn, L., B. Maurer, C. McMillen, D. Abraham, and M. Blum. 2008. reCAPTCHA: Human-based character recognition via web security measures. Science 321 (5895): 1465–1468.
Xu, D., S. Yuan, L. Zhang, and X. Wu. 2018. FairGAN: Fairness-aware generative adversarial networks. In 2018 IEEE international conference on big data (Big Data), 570–575. IEEE.
Zafar, M.B., I. Valera, M. Gomez Rodriguez, and K.P. Gummadi. 2017. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th international conference on world wide web, 1171–1180. ACM.
Zafar, M.B., I. Valera, M.G. Rogriguez, and K.P. Gummadi. 2017b. Fairness constraints: Mechanisms for fair classification. In Artificial intelligence and statistics, 962–970. PMLR.
Zemel, R., Y. Wu, K. Swersky, T. Pitassi, and C. Dwork. 2013. Learning fair representations. In International conference on machine learning, 325–333. PMLR.
Zhang, B.H., B. Lemoine, and M. Mitchell. 2018. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, 335–340. ACM.

Chapter 3

Opacity, Machine Learning and Explainable AI

Alberto Fernández

Abstract  Artificial Intelligence is being applied in a multitude of scenarios that are sensitive to the human user, e.g., medical diagnosis, granting loans, or human resources management, among many others. Behind most of these Artificial Intelligence tools is a pattern recognition model generated by Machine Learning. Building such a model requires a dataset that characterizes the problem under study, on which the model is "trained" to represent this information through different mathematical approximations. Thus, when sensitive applications and mathematical models are placed in the same equation, mistrust arises about the correct functioning of Artificial Intelligence systems. The central question is why the model makes one decision and not another. The answer lies in the interpretability or transparency of the model itself, i.e., in its components being directly understandable by the human user. When this is not possible, a posteriori explainability mechanisms are used to reveal which variables or characteristics the model has taken into account. Throughout this chapter, we will introduce the current trends for achieving trustworthy Artificial Intelligence. We will describe the components that allow a model to be transparent, as well as the existing techniques for explaining more complex models such as those based on Deep Learning. Finally, we will present some prospects for further improving explanations and for allowing a wider use of Machine Learning solutions in all fields of application.

A. Fernández (*) DASCI Andalusian Institute, Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 F. Lara, J. Deckers (eds.), Ethics of Artificial Intelligence, The International Library of Ethics, Law and Technology 41, https://doi.org/10.1007/978-3-031-48135-2_3


3.1 Introduction

Artificial Intelligence (AI) has become an increasingly popular technology across various industries and is expected to have a significant impact on our society (Sousa et al. 2019). However, there are concerns about the uncontrolled use of AI and its potential impact on human life, such as on education and on communication (Rahwan et al. 2019). These concerns fall under the Responsible AI / Ethical AI thematic area, which is the focus of this book. There is an ongoing global debate about the ethical aspects of AI, particularly regarding its impact on human dignity, fundamental rights, and privacy (Siau and Wang 2020). Although some risks of AI are known, such as the possibility of automated decisions harming vulnerable groups, there are less obvious risks, such as hidden biases that may arise from the data used to train AI systems. European regulations such as the AI Act (https://artificialintelligenceact.eu/) aim to establish an ethical and regulatory framework to protect individual and collective rights (Commission 2021). The current challenge for both academia and corporations is to find a trade-off between the benefits and risks of AI without impeding innovation. This means ensuring that AI systems are secure, transparent, fair, and respectful of privacy, while being free of discrimination and having clear traceability and accountability (Kaur et al. 2023). To achieve these goals, it is essential to analyze and address the various dimensions of AI and to ensure that its implementation is ethical.

Among the different subfields of AI, Machine Learning (ML) has become a powerful tool that is increasingly used in various fields to automate decision-making processes (Witten et al. 2016). It studies algorithms capable of constructing data-driven models that can be mainly descriptive, when they are used to provide a summarization of the data for a better understanding of the case study (Celebi and Aydin 2016), or predictive, when they are used to provide values or labels for new incoming data from the same problem (Aggarwal 2014). The process involves two stages, namely training and test. During the training stage, the algorithm constructs a model based on input data, which stands for a mathematical representation/approximation of the case study being analysed. Then, during test, the trained model produces results on new data exclusively provided during this stage, which can be used to evaluate its performance and to obtain insights and make decisions.

As with general AI systems, one of the main concerns with the use of ML is the lack of transparency of the algorithms used (Burkart and Huber 2021). Indeed, models that are not transparent or explainable include deep neural networks (DNN), support vector machines (SVMs), and ensembles of classifiers (a combination of several models), among others. These models can be difficult to interpret because they often involve complex mathematical transformations and multiple layers of processing that are not directly interpretable by humans. For example, in DNN, the input data is transformed through a series of non-linear operations, making it difficult to understand how the model arrived at its output.



Similarly, in SVMs, the model relies on complex decision boundaries that may not be easy to interpret. Additionally, some ML models that rely on black-box methods, such as reinforcement learning and unsupervised learning (Arulkumaran et al. 2017), can also be difficult to interpret. Reinforcement learning models, for example, learn by trial and error, making it difficult to determine exactly how the model arrived at its decision. Similarly, unsupervised learning models do not rely on labeled data, making it difficult to determine the criteria the model used to cluster or associate the data, which means the result cannot be objectively validated.

In accordance with the above, Explainable Artificial Intelligence (XAI) is an emerging field of research that focuses on developing ML models that are transparent, interpretable, and explainable (Barredo Arrieta et al. 2020). The goal of XAI is to make it easier for humans to understand and trust the decisions made by ML models. By providing explanations of how a ML model arrived at its decisions, XAI can help users to understand and to identify potential biases, errors, or ethical concerns (Mehrabi et al. 2021). This can be particularly important in fields such as healthcare, finance, and law, where decisions made by ML models can have significant impacts on people's lives. Finally, transparency can help researchers and developers identify and address issues with ML models, such as overfitting or data leakage, that may not be apparent without clear explanations of the model's inner workings. In other words, practitioners may easily check whether the built model is too specific for the current data and does not provide good predictions on new data, or whether it shows surprisingly good behavior that could be due to confounding variables, and understand why this is the case.

In this chapter, we want to emphasize the importance of transparency of AI and ML algorithms through the XAI trend. To do so, we must show the main fundamentals on which this line of action is based, and its relationship with the directives and regulations that are currently being developed, mainly at the European level. In this sense, we will lay the foundations of trustworthy AI by means of tools that boost the transparency of models. To achieve this goal, it is necessary to know which tools and strategies currently exist to develop models that are transparent and interpretable by definition, or to "open the black box" of those that are more complex a priori. We must analyze the different approaches and learn how they can help to correctly audit the performance of ML solutions. In particular, we will focus on a typology called "counterfactual examples", which identify the changes to a query input that would flip the model's decision. This type of approach is very interesting in various scenarios such as bank loans, hiring policies, or even health diagnosis. Finally, it is necessary to look to the future to determine what the challenges and prospects in this relevant and necessary area of work will be. Among others, we will analyze how explainability techniques can take advantage of multimodal data fusion to improve the explainability of models. In addition, we will show how to simplify the auditing procedure of these systems to achieve reliable solutions. Finally, brief comments will be made on General Purpose AI (GPAI) and how algorithms themselves can learn to explain.


The remainder of this chapter is organized as follows. Section 3.2 introduces the basics of XAI as they relate to trustworthy AI. Existing tools and strategies to achieve interpretability of ML models are presented in Sect. 3.3. Section 3.4 focuses on counterfactual examples as one of the most interesting approaches that can currently be applied. Some of the main challenges and prospects in the XAI field are compiled in Sect. 3.5. Finally, Sect. 3.6 summarizes and concludes this chapter.

3.2 Fundamentals of Trustworthy and Explainable Artificial Intelligence

The foundation of responsible AI entails meeting several key requirements for trustworthy AI. It is crucial to ensure compliance with these requirements throughout the entire life cycle of AI systems, and this involves not only technical methodologies but also human supervision at different levels of control, namely what is known as "Human in the loop" and "Human on the loop" (Estévez Almenzar et al. 2022). According to the European Union (EU) High-Level Expert Group (Commission 2019), responsible and trustworthy AI must rest on three pillars:

1. lawful, ensuring compliance with applicable laws and regulations;
2. ethical, ensuring adherence to ethical principles and values; and
3. robust, both technically and socially.

While all three components are necessary, they are not sufficient to achieve trustworthy AI. Ideally, they should work together in harmony, but tensions may arise between them, and a collective and individual responsibility as a society is required to resolve them. For instance, AI must be based on ethical principles and values in addition to being technically sound. It must also be recognised that, in terms of their impact on the socio-economic environment in which they are implemented, AI systems, even if they are well-intended, may cause accidental harm to people, their privacy, and/or their security.

The EU Ethics Guidelines for Trustworthy AI (Commission 2019) are based on multiple principles and requirements, as depicted in Fig. 3.1. However, the EU High-Level Expert Group has identified seven principles as fundamental requirements that AI must satisfy. These are defined as follows:

• Human Agency and Supervision.
• Technical Robustness and Security.
• Privacy and Data Governance.
• Transparency.
• Diversity, non-discrimination and fairness.
• Environmental and Social Welfare, Sustainability.
• Accountability or responsibility.


Fig. 3.1  14 general principles and requirements for reliable AI: All are significant, complement each other, and should be applied throughout the life cycle of the AI system (partially adapted from Hasani et al. 2022)

As listed above, one of the principles towards Trustworthy AI is Transparency. Within this principle, XAI emerges as a necessary mechanism to validate black box models, i.e., systems in which only inputs and outputs are observed without knowing the internal details of how they work. This can be problematic, as we cannot predict how the system may behave in unexpected situations, or how it can be corrected if something goes wrong. Despite the progress made on the topic of explainable and responsible AI (Barredo Arrieta et al. 2020), there is a large gap in how the available explainability techniques satisfy the EU's ethical requirements for trustworthy AI (Commission 2019), as well as the EU's upcoming AI regulation, the AI Act (Commission 2021).


Specifically, the AI Act regulates eight main families of high-risk AI systems (refer to Art. 6 of the AI Act and Section 2.5 (Commission 2021)). Moreover, the need to certify models is becoming increasingly prevalent, even in international ISO standards for AI robustness. These standards are extending formal methods of requirements verification or requirements satisfaction, which are typical in software engineering, to verify desirable properties of neural networks, such as stability, sensitivity, relevance, or reachability (Standars 2021). All of the above imply a high current demand for interpretability, explainability and transparency (Barredo Arrieta et al. 2020). Interpretability is the characteristic of a model that measures the degree of meaning and understanding that a human can obtain from it. This is very similar to the concept of explainability, although the latter refers to the use of explanations to achieve comprehension. Transparency goes a step further, measuring both the degree of understandability of the model and that of the process involved in its generation.

For ML in particular, there could be a mismatch between the objectives of the model and those of the users. This emphasizes the need for more stable and robust models, as well as actionable explainability techniques. Therefore, in addition to high accuracy of ML algorithms, users have additional desiderata, which are listed below (Doshi-Velez and Kim 2017):

• Fairness: Assure that protected groups (e.g. gender, ethnicity) are not somehow discriminated against (explicitly or implicitly);
• Privacy: Assure that sensitive information is protected;
• Reliability/Robustness: Assure high algorithmic performance despite variations in parameters or inputs;
• Causality: Assure that the predicted change in output due to a perturbation will occur in the real system;
• Trust: Allow users to trust a system capable of explaining its decisions rather than a black box that just outputs the decision itself.

Achieving interpretability in models depends on several properties (Carvalho et al. 2019). Models with simpler inputs and outputs, such as fewer features or predictions per instance, and those with more comprehensible feature meanings tend to be more interpretable. Additionally, factors such as the number of leaves in a decision tree, the number of rules or the size of the antecedents and consequents in a decision list, or the number of features in a logistic regression can affect interpretability. The type of model can also impact its interpretability. Models with reasoning similar to humans, such as decision trees, rule lists, K-nearest neighbors, and logistic regressors, are generally more interpretable. In contrast, black box models like neural networks or random forests are less transparent and require post-hoc explanation.

In addition to aiding in understanding the decision process, interpretable models offer other advantages (Rudin 2019). They are easier to audit and debug, as their creators and users can more readily comprehend them and identify flaws. They are also more simulatable, meaning users can trace the model's inference and perform


the classification process, generating trust. Furthermore, interpretable models allow users to understand how modifying a sample's features could alter the classification decision. If interpretable models are not effective or feasible, there is a wide variety of techniques that create an a posteriori (post-hoc) explanation (Barredo Arrieta et al. 2020); e.g., LIME creates a simpler local model that acts as a proxy (or surrogate model) to explain a more complex one (Ribeiro et al. 2016). Some families of methods are model-agnostic (such as SHAP), while others are specific to certain models such as neural networks, e.g. Layer-wise Relevance Propagation (LRP) or Gradient Class Activation Mapping (Grad-CAM). However, there is no agreement on standard XAI methodologies to provide a concrete level of explainability or confidence. Ideally, and to facilitate the task of algorithm auditors, the quality level of a responsible AI model should indicate, for example, the degree of model unbiasedness, the amount of bias mitigated in the training data, or the level of robustness or privacy achieved. Therefore, it is necessary to validate and verify the algorithms for this purpose in order to meet the stated requirements.
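To make the idea of a post-hoc local surrogate more concrete, the following sketch (a simplified, LIME-style illustration rather than the actual LIME implementation; the black-box model, dataset, and hyper-parameters are assumptions made for the example) perturbs a single instance, queries the black-box model on that neighbourhood, and fits a weighted linear model whose coefficients act as a local explanation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# A black box standing in for any opaque classifier
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def local_surrogate(instance, predict_proba, n_samples=1000, scale=0.3):
    """LIME-style sketch: explain one prediction with a locally weighted linear model."""
    rng = np.random.default_rng(0)
    # 1. Perturb the instance with Gaussian noise to create a local neighbourhood
    Z = instance + rng.normal(0.0, scale, size=(n_samples, instance.shape[0]))
    # 2. Query the black box on the perturbed points
    p = predict_proba(Z)[:, 1]
    # 3. Weight neighbours by proximity to the original instance
    dist = np.linalg.norm(Z - instance, axis=1)
    weights = np.exp(-(dist ** 2) / (2 * scale ** 2))
    # 4. Fit an interpretable (linear) surrogate on the neighbourhood
    surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=weights)
    return surrogate.coef_  # local feature contributions

coefs = local_surrogate(X[0], black_box.predict_proba)
print(sorted(enumerate(coefs), key=lambda c: -abs(c[1]))[:3])  # top-3 local features
```

The coefficients indicate which features push this particular prediction up or down in the local neighbourhood, which is precisely the per-instance insight that post-hoc techniques such as LIME aim to provide.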

3.3 Dimensions and Strategies for Promoting Explainability and Interpretability

The significance of developing interpretable and explainable solutions for addressing applications via ML and AI is beyond doubt. Reaching this objective is not always an easy task, as it strongly depends on the complexity of the problem and on the characteristics of the methodology and model used for this purpose. Although there is a large variety of learning algorithms that provide "transparent models" by default, such as logistic regression, decision trees, or kNN, the trade-off between accuracy and interpretability is always present, often leading to the choice of more sophisticated yet opaque solutions. In the remainder of this section, the dimensions that characterize interpretable methods are first introduced in some detail (Sect. 3.3.1); then, some interesting and widely used strategies to boost the explainability of the obtained models are described (Sect. 3.3.2).

3.3.1 Dimensions of Explainability and Interpretability

Interpretability methods can be characterized by a set of dimensions (Molnar 2020): global and local interpretability, intrinsic and post-hoc interpretability, and model-specific and model-agnostic interpretability. These will be described in what follows.


1. Global and Local Interpretability: this dimension pertains to the extent of a model's interpretability and the portion of its predictions that can be explained (Lundberg et al. 2020). For every ML task, the learning algorithm creates a data-driven model based on a set of input features, thus selecting important features and learning relationships between them and the target output during the training phase. Global interpretability aims to analyze the model's parameters and learned relationships in order to understand common patterns in the overall data that help make decisions. Local interpretability, on the other hand, focuses on understanding the relationship between the set of input features of a specific case and the model's decision. Global interpretability is useful for understanding which relationships the model learned and identifying any non-random sources of noise that may have affected the model's learning, such as artifacts. On the other hand, local interpretability helps to understand the importance of input features for the particular case study being solved.
2. Intrinsic and Post-hoc Interpretability (Gill et al. 2020): Intrinsic interpretability refers to models that are inherently interpretable because of their simplicity, such as decision trees or sparse linear models. To increase intrinsic interpretability in complex models, constraints can be added, such as sparsity, monotonicity, or limiting the number of neurons or layers in ANN, among others. Domain knowledge can also be included to simplify the model's behavior. Post-hoc interpretability, on the other hand, involves applying interpretability methods after the model's training. These methods help to understand how the model works without imposing constraints on the model. For example, feature importance can be used as a post-hoc method to understand the importance of input features in a complex model like MLP.
3. Model-specific and Model-agnostic: Interpretability methods can also be categorized based on their dependency on the type of model being explained. Model-agnostic methods are independent of the underlying model, thus being more general and usable. On the contrary, model-specific methods are limited to a particular type of paradigm, which usually implies a more reliable outcome as it takes into account the intrinsic characteristics of the model. A minimal sketch after this list illustrates a method that is global, post-hoc, and model-agnostic at the same time.
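As a simple illustration of these dimensions, the sketch below (an illustrative toy example with an assumed dataset and model, not a procedure prescribed in this chapter) computes permutation feature importance by hand. The method is global (it summarizes the model over a whole dataset), post-hoc (it is applied after training without constraining the model), and model-agnostic (it only needs the prediction function).

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
baseline = model.score(X_te, y_te)

rng = np.random.default_rng(0)
importances = []
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])               # break the link between feature j and the target
    importances.append(baseline - model.score(X_perm, y_te))   # accuracy drop = importance

top = np.argsort(importances)[::-1][:5]
print("Most influential features (by accuracy drop):",
      list(zip(top, np.round(np.array(importances)[top], 3))))
```

Because the loop treats the model purely as a prediction function, it would work unchanged for a neural network or an SVM; a model-specific method would instead inspect internal structures such as tree splits or gradients.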

3.3.2 Interpretability Strategies

There are different ways to "open the black box" in order to determine why a certain output has been provided by a ML model. Among them, we should highlight feature importance, saliency maps, model visualization, surrogate models, domain knowledge, and example-based explanations. These strategies are described in some detail next.


1. Feature Importance: One explanation method that has been extensively explored is feature importance, which determines the significance or impact of an input feature on the prediction of an example. There are two primary approaches for computing feature importance: sensitivity analysis (Baehrens et al. 2010) and decomposition (Bach et al. 2015). On the one hand, sensitivity analysis determines the effects of variations in the input variables on the model's output and can help answer the question "What change would make the instance more or less like a specific category?" On the other hand, decomposition approaches sequentially break down the importance of the output of a layer into previous layers until the contribution of the input features to the output is identified. This approach can help answer the question "What was the feature's influence on the model's output?" The interpretation of feature importance values obtained from an example's decision varies depending on the method used. High sensitivity values for two given attributes imply that an increase in these would also increase the prediction for a certain class. Conversely, high contribution values for the same attributes imply that the prediction of a class label was highly influenced by the value of these features.
2. Saliency Map: In the context of images, saliency maps (also known as heatmaps) can be utilized to depict variations in the significance of different features visually (Simonyan et al. 2014). The weight of a pixel in a given prediction can be conveyed using color. Two primary methods are used to obtain pixel values for saliency maps, similar to feature importance. In the case of DNNs, which are mostly used for solving image data problems, backpropagation methods compute pixel relevance by propagating a signal backward from the output neuron through the layers to the input image in a single pass. On the other hand, perturbation methods compute pixel relevance by making small changes to the pixel value of the input image and determining how these changes affect the prediction.
3. Model Visualization: In the internal process of the ML algorithm, a combination of input features is created, known as abstract features. There are several strategies for visualizing the patterns detected by the algorithm, such as those described in Yosinski et al. (2015). Other strategies focus on visualizing the distribution of features in the dataset (McInnes and Healy 2018). Additionally, some techniques are designed to help locate an image containing a specific pattern detected by an ANN (Wu et al. 2018), while others generate artificial images that highlight the same patterns (Nguyen et al. 2016).
4. Surrogate Model: A surrogate model is an interpretable model that is trained to provide an explanation for the predictions of a black-box model. For example, a rule list (Ribeiro et al. 2018) can be derived from a complex system, allowing final users to gain insight into the knowledge generated by the algorithm. Each rule specifies a condition that, when evaluated as true, produces a result, i.e. positive/negative class label in the case of classification problems. To accomplish this, a new dataset is created where each example of the dataset used to train the original model is paired with its prediction, and the task of the surrogate model is to predict these values.


While global surrogate models approximate the black-box model across the entire input space, local surrogate models approximate individual predictions, which results in greater accuracy and fidelity to the model being explained (a brief global-surrogate sketch follows after this list).
5. Domain Knowledge: While DNN and other complex learning algorithms are capable of automatically extracting internal features during the training phase, the domain knowledge possessed by professionals and stakeholders can be utilized to validate the model's decisions. Incorporating domain knowledge from experts during the training process can result in models that are similar to how professionals make decisions or that focus on specific features or areas (Xie et al. 2021). As an example, we may focus on the context of malignancy diagnosis. Here, domain knowledge can be included directly as an input feature, such as a discrete value that indicates the tumor's shape. Additionally, domain knowledge can be used as an additional target variable (such as shape or density) in addition to malignancy, allowing for the evaluation of how well the model predicts both target variables, similar to how clinicians consider those variables in their diagnosis.
6. Example-Based Explanation: These methods aim to explain the behavior of the final model by selecting specific examples from the dataset (Molnar 2020). In the case of having internal features (a combination of input features), these are also gathered from the selected examples to explain the model's behavior. To find similar examples, we look for other examples in the dataset that have similar values on the internal features and produce the same prediction as the example we are trying to explain (Caruana et al. 1999). Typically, examples in a dataset can be grouped together based on existing patterns, and a prototype is a particular example that represents its group. Another way to proceed is via counterfactual explanations (Stepin et al. 2021a, 2021b). In this particular case, predictions are explained by finding small changes in the example that cause the model to change its prediction.
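To illustrate the surrogate-model strategy, the sketch below (an illustrative construction with an assumed black-box model and synthetic data, not a method taken from the chapter) trains a shallow decision tree to imitate a black-box classifier and reports its fidelity, i.e., how often the surrogate agrees with the model it is meant to explain.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=8, random_state=1)
black_box = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# The surrogate is trained on the black box's predictions, not on the true labels
y_bb = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, y_bb)

fidelity = np.mean(surrogate.predict(X) == y_bb)   # agreement with the black box
print(f"Fidelity to the black box: {fidelity:.2%}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(8)]))
```

A high fidelity suggests that the tree's readable rules are a reasonable global approximation of the black box; a low fidelity warns that the explanation should not be trusted, which is why fidelity is usually reported alongside the surrogate itself.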

3.4 Digging Deeper on Counterfactual Explanations

As presented in the previous section, various strategies are employed in post-hoc explanations, with most techniques focusing on explaining why a specific class was assigned to a particular sample (Ribeiro et al. 2016). This type of explanation is known as a "factual explanation" (Stepin et al. 2021a, 2021b). Another type is the "counterfactual explanation", or simply "counterfactuals", which demonstrates how small modifications to the initial sample could have resulted in a different class assignment (Stepin et al. 2021a, 2021b). Both types of explanations are valuable, but counterfactual explanations offer greater benefit in cases where a user wants to challenge a decision, as they provide actionable insights on how to change the outcome. With this in mind, in the remainder of this section the basics of counterfactual explanations (their concept, benefits, and expected qualities) are explained first, in Sect. 3.4.1; then, a brief reference to some significant works on the topic is included in Sect. 3.4.2.


3.4.1 Basics of Counterfactual Explanations

Counterfactual explanations are typically understandable by humans because they are contrastive: they provide information about the critical features for the decision and are expressed in natural language involving data features and values. They not only offer insight into how the decision was made, but they also indicate, in an intuitive way, the small changes required to lead to a different outcome. For instance, when applying for a job, knowing that acquiring a Master's Degree or gaining an additional year of work experience would have prevented rejection can guide the applicant's actions towards future job applications. Understanding which features contributed to a class change is also critical for model auditing, as it not only verifies the correct feature usage but also aids in comprehending the model as a whole and potentially identifying biases or unrealistic feature values by evaluating these explanations for multiple instances. In particular, counterfactuals allow developers to read the explanation and see whether the model uses the right features to decide the class change; unexpected feature changes can indicate model inadequacies. Furthermore, they become powerful tools for end-users in terms of understanding and challenging a model's decisions. This is particularly important from a fairness and ethics standpoint, since sensitive/protected features in a counterfactual explanation can highlight classification bias explicitly, leading to accountability.

The evolution of counterfactual explanations has been ongoing since their inception. Initially, generating counterfactuals for a specific instance required solving an optimization problem that sought the closest point to the sample instance with a different prediction using differentiable terms (Wachter et al. 2017). However, multiple constraints or modifications have been added to these problems to obtain more realistic and useful counterfactuals (Verma et al. 2020). For example, constraints such as specifying certain features as non-actionable (i.e., they cannot be changed) may be added based on the problem's requirements. Additionally, sparsity terms that penalize changes in multiple features can be added to the validity terms from the base counterfactual problem to obtain shorter and more easily understandable counterfactuals. Additional terms may also verify data manifold closeness, ensuring that the generated counterfactuals are realistic and follow the relationships found in the training data, such as preserving feature correlations. A minimal, search-based sketch of this idea is given after the list of desiderata below.

For high-quality counterfactual explanations, several characteristics are recommended or mandatory (Guidotti 2022; Verma et al. 2020; Chou et al. 2022), from which the objective ones, and therefore those that can mostly be measured, are enumerated below.

• Validity: the counterfactual must have a different class from the sample.
• Sparsity: a good counterfactual should change a few features. There is no definitive consensus on the optimum number of changes, but it is recommended to change only 1–3 features (Keane et al. 2021).


• Similarity: a counterfactual must be close to the sample, and this closeness could be measured using different distance metrics.
• Plausibility: a counterfactual should be in-distribution according to the observable data and it should not be considered an outlier.
• Actionability: a good counterfactual must not change features that cannot be acted upon, such as immutable features.
• Causality: a good counterfactual should respect the causal relations from the dataset, as otherwise it might not be plausible.
• Diversity: if multiple counterfactuals are given for a single sample, they should be as diverse as possible; that is, they should change different features so that actions can be taken in different ways.
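The following sketch (a deliberately simple, instance-based illustration with an assumed model and dataset; real counterfactual generators rely on the optimization and search strategies surveyed in the next subsection) produces a candidate counterfactual by returning the closest training instance that the model assigns to a different class, which directly targets the validity and similarity desiderata above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
Xs = StandardScaler().fit_transform(X)
model = RandomForestClassifier(random_state=0).fit(Xs, y)

def nearest_unlike_neighbour(x, X_ref, model):
    """Closest reference point that the model classifies differently from x."""
    pred_x = model.predict(x.reshape(1, -1))[0]
    preds = model.predict(X_ref)
    candidates = X_ref[preds != pred_x]              # validity: a different class
    dists = np.linalg.norm(candidates - x, axis=1)   # similarity: distance to x
    return candidates[np.argmin(dists)]

x = Xs[0]
cf = nearest_unlike_neighbour(x, Xs, model)
changed = np.argsort(-np.abs(cf - x))[:3]            # features that change the most
print("Largest feature changes (indices):", changed)
```

Such a naïve search says nothing about sparsity, plausibility, actionability, or causality; dedicated generators add exactly those constraints, which is what distinguishes the families of techniques reviewed next.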

3.4.2 Overview on Techniques for Counterfactual Explanations

This section provides a concise overview of the current status of counterfactual explanations. With the growing interest in XAI in today's society, there has been a surge in the number of approaches for generating counterfactuals, with notable differences between them. To facilitate understanding, these can be organized according to various critical aspects, such as model specificity, techniques used, data and model requirements, target data or problem context, and generated outcomes. Readers who seek more detailed information on specific topics can consult the multiple surveys on this matter published in recent years (Stepin et al. 2021a, 2021b; Guidotti 2022; Verma et al. 2020; Chou et al. 2022; Keane et al. 2021).


Access to different types of information is required by various counterfactual explainers. For instance, some models such as instance-based explainers (Laugel et al. 2018) may necessitate access to training data. While all counterfactual explainers demand access to parts of the prediction model, certain methods need complete information from the model (Karimi et al. 2020), while others only require access to gradients (Wachter et al. 2017) or the prediction function (Laugel et al. 2018). When dealing with categorical features, some counterfactual generators might need to pre-process them, often through encoding (Wachter et al. 2017). Besides that, not all counterfactuals are created with the same target data. Counterfactual generators have been designed to explain models that work on different types of data, which are generally tabular data (Mothilal et al. 2020), images (Vermeire et al. 2022) or, in a less explored manner, texts (Wu et al. 2021). While some explainers work for any of them (Poyiadzi et al. 2020), most have been created for specific types to preserve their simplicity. In addition, the outputs produced by counterfactual generators can also differ, depending on the intended audience. Some generators may simply present the alterations in feature values in a technical manner (Mothilal et al. 2020), whereas others may process the outcomes to provide explanations in natural language (Stepin et al. 2020), visualizations (Poyiadzi et  al. 2020), or a combination of both (Kanehira et al. 2019).

3.5 Future Challenges for Achieving Explainable Artificial Intelligence

In recent years, significant improvements have been made in the topic of XAI. However, achieving full explainability is still a major challenge, and future research must address several issues to advance the field further. In this section, we will present and discuss some challenges that could be addressed for XAI to reach its full potential. Firstly, we will explore the need for multimodal data fusion to improve explainability. Secondly, we will discuss the importance of developing reliable and auditable ML systems. Finally, we will explore the potential of using GPAI algorithms to learn to explain complex models.

3.5.1 Multimodal Data Fusion for Improved Explainability

While the literature has proposed using XAI to achieve accountable AI, there are several challenges facing this area that need to be addressed, including the crossroads of data fusion and explainability. Despite the availability of model-specific and model-agnostic techniques that provide explanations or are inherently explainable, there is a need to make XAI more actionable and usable by


auditors. For example, some visual explanation techniques generate a heat map as an explanation of a computer vision model, but there are generally no labels beyond the model class that can be used to verify and validate both the output and the explanation, compromising its validation and often making it subjective (Hendricks et al. 2018). One criticism of some explainable models is that they use external knowledge, such as dialogue models that answer questions interactively, relying on expert knowledge bases from sources such as Wikipedia. However, this external data fusion technique does not always imply that the model has learned the true reasoning behind its decisions (Bennetot et al. 2019). Thus, there is a need for models that are inherently explainable by design, to ensure that the explanations provided for AI models are faithful to their underlying reasoning.

One way to enhance the explainability of AI models is to use multimodal data fusion, which combines different types of data to provide adaptive explanations for different audiences. Since annotated data is not always available, information fusion can be used to complete the data, and different modalities such as images, plain text, and tabular data can be used to facilitate model explainability when one modality is insufficient. After unifying the different modalities through data augmentation, an automatic reasoner can be used to automate the output of the model for specific modalities or audiences based on the unified symbolic elements. Examples of enriched models could include a natural language summary of an image, or a causal model used as prior knowledge of the problem. However, aligning data of different modalities may involve some drawbacks, as data fusion could (1) make it challenging to trace the origin of the data, thus preventing us from pointing to a specific modality as responsible for a particular model decision; and (2) compensate for the lack of interpretability due to the unavailability of a data modality during inference when the model is in production. Therefore, data fusion techniques could achieve greater data utilization or completeness, which, in turn, could translate into better explainability, creating more comprehensive and satisfying explanations for diverse audiences.

To overcome the first challenge of multimodality, data reduction and smart data techniques can be used to employ quality data prototypes (Triguero et al. 2019). For the second issue, distillation techniques might be developed (Lopez-Paz et al. 2016). The explanation of such multimodal models using this technique could utilize previously unused modalities (such as medical annotations in the form of text accompanying the classification of a radiograph) and thus facilitate the correction and explanation of models in future real-life cases in which only a radiograph is available to diagnose a lung disease. In addition, validation and verification methods should be developed for auditing multimodal classification models, Q&A, and text generators. The methodology for merging multimodal data into deep learning models can be evaluated in the healthcare field, which has expert knowledge available in the form of controlled vocabularies and ontologies.


3.5.2 Reliable and Auditable Machine Learning Systems

The use of black-box models for AI applications in high-risk scenarios has led to significant advances in AI accuracy. However, designing reliable models requires technical certification to ensure safety. To address this need, inspiration can be drawn from software engineering methodologies for verifying and validating programs, applying them to AI models (Pohl et al. 2005). This approach is gaining momentum due to the increasing awareness of the risks associated with foundational AI models. In risky scenarios, different audiences will need to verify, validate and audit the model and the AI system of which it is part at the MLOps level (Tamburri 2020). This is why the different data modalities and their provenance sources must be considered simultaneously, to ensure that the explainability of the model acts as a useful interface, capable of providing a comprehensive view of how the model works and of being tailored to the specific needs of each audience. Verification aims to ensure that the model was built correctly, while validation ensures that the correct model was built. Both processes are crucial for developing effective and trustworthy AI systems that can be audited and certified.

3.5.3 GPAI Algorithms to Learn to Explain

AI algorithms can be explained in various ways, and XAI includes post-hoc methods that explain why an algorithm makes a certain prediction rather than endowing a black box with explanation capabilities, since it is not always possible to make any complex model interpretable (please refer to Sect. 3.3). However, most of these techniques have some weaknesses that remain unaddressed. In cases where there are large black box models or complex inputs, some neighbourhood-based solutions become computationally demanding. To make these methods more robust, several variants have been proposed in the literature, such as avoiding the generation of artificial examples that do not follow the distribution of the problem, which further increases the computational cost. For instance, a study by Meng et al. (2022) trains an Autoencoder (Bengio et al. 2013) to learn the data distribution before providing an explanation. Furthermore, another issue with these techniques is that they tend to be unreliable, and explanations can vary significantly between similar examples (Zhou et al. 2021). Additionally, generating counterfactual explanations is difficult because the minimum variation required in the data for the model to consider a different output is not always plausible (Del Ser et al. 2022). As such, the two main issues facing post-hoc explainability techniques are the computational cost and the stability of the explanations they produce. However, since many models being explained are similar and work with similar data, the idea of "learning to explain" has emerged (Dhebar 2022). Although there are only a few papers on this topic, some researchers have used meta-learning to generate


local explanations for each prediction or to explain deep convolutional neural networks. These approaches show promise, but they do not incorporate recent meta-learning trends such as Few-Shot learning, stacking, or AutoML-zero (Real et al. 2020). A direct approach could be "learning to explain" with an AutoML-like approach (Garouani 2022). Depending on the application problem, certain hyper-parameters and (post-hoc) explainability techniques may be more suitable for the task, and/or more concise and interesting for the final user. By considering the working procedure of AutoML for obtaining the "best" hyper-parameters of a given model, the same idea can be adapted for selecting the most appropriate explainability technique or approach for a given (new) problem. Thus, the goal would be to replace the explainability expert with techniques that learn to decide how best to explain. In this case, it would be possible to optimize the explanation, its degree of adaptation to the audience that expects it, its actionability and/or its plausibility (Del Ser et al. 2022).
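To make the stability concern tangible, the sketch below (a self-contained toy check with an assumed model and sampling scheme) builds two sampling-based local explanations for the same instance under different random seeds and measures how much their feature rankings agree; low agreement is exactly the kind of unreliability that stabilization proposals such as S-LIME aim to correct.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def sampled_explanation(x, seed, n=200, scale=0.5):
    """Local linear explanation fitted on a (small) random neighbourhood sample."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n, x.shape[0]))
    p = model.predict_proba(Z)[:, 1]
    return Ridge(alpha=1.0).fit(Z, p).coef_

e1 = sampled_explanation(X[0], seed=1)
e2 = sampled_explanation(X[0], seed=2)
rho, _ = spearmanr(np.abs(e1), np.abs(e2))
print(f"Rank agreement between the two runs: {rho:.2f}")  # values well below 1.0 signal instability
```

Repeating the comparison over many seeds, or enlarging the neighbourhood sample, exhibits the trade-off discussed above: more stable explanations generally require more queries to the black box and hence a higher computational cost.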

3.6 Concluding Remarks

In this chapter, we have shown that opacity in ML and AI systems remains a significant challenge in developing reliable and trustworthy algorithms. The increasing complexity of these systems and the vast amounts of data used to train them have made it difficult to understand the reasoning behind their decisions. We have described the main pillars for responsible AI and the principles and requirements that need to be fulfilled to comply with future EU regulations. In fact, transparency and interpretability should not be viewed as mere technical issues, but as ethical imperatives, as they are essential for promoting accountability, fairness, and human-centric AI. In order to attain a better understanding of how we could address explainability and interpretability, we have listed several effective and efficient methods for interpreting and visualizing ML models, focusing on the use of counterfactual explanations as one very interesting approach for auditing AI systems. Finally, we have stressed some challenges for future work, which involve topics such as multimodal data fusion, the achievement of reliable and auditable ML solutions by design, and the use of GPAI algorithms to learn to explain. To sum up, we must understand that a multidisciplinary approach that involves collaboration between experts in computer science, ethics, law, and other relevant fields is necessary to tackle the challenges of opacity and to ensure that AI systems serve the interests of society as a whole.


References Aggarwal, C.C. 2014. Data classification: Algorithms and applications. CRC Press. Almenzar, Estévez, D.  Fernádez Llorca Maylen, E.  Gómez, and F.  Martinez Plumed. 2022. Glossary of human-centric artificial intelligence. Luxembourg: Publications Office of the European Union. Antorán, J., U. Bhatt, T. Adel, A. Weller, and J.M. Hernández-Lobato. 2021. Getting a CLUE: A method for explaining uncertainty estimates. ICLR. Arulkumaran, K., M.P. Deisenroth, M. Brundage, and A.A. Bharath. 2017. Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine 34 (6): 26–38. Bach, S., A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS One 10 (7): 1–46. Baehrens, D., T.  Schroeter, S.  Harmeling, M.  Kawanabe, K.  Hansen, and K.-R.  Müller. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11: 1803–1831. Barredo Arrieta, A., N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia, S. Gil-Lopez, D. Molina, R. Benjamins, R. Chatila, and F. Herrera. 2020. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58: 82–115. Bengio, Y., A. Courville, and P. Vincent. 2013. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (8): 1798–1828. Bennetot, A., J.-L. Laurent, R. Chatila, and N. Díaz-Rodríguez. 2019. Towards explainable neural-­ symbolic visual reasoning. arXiv:1909.09065: 1–10. Burkart, N., and M.F. Huber. 2021. A survey on the explainability of supervised machine learning. Journal of Artificial Intelligence Research 70: 245–317. Caruana, R., H. Kangarloo, J.D.N. Dionisio, U. Sinha, and D.B. Johnson. 1999. Case-based explanation of non-case-based learning methods. AMIA. Carvalho, D.V., E.M. Pereira, and J.S. Cardoso. 2019. Machine learning interpretability: A survey on methods and metrics. Electronics 8 (8): 832. Celebi, M.E., and K. Aydin. 2016. Unsupervised learning algorithms. Springer. Chou, Y.-L., C. Moreira, P. Bruza, C. Ouyang, and J. Jorge. 2022. Counterfactuals and causability in explainable artificial intelligence: Theory, algorithms, and applications. Information Fusion 81: 59–83. Commission, European. 2019. Ethics guidelines for trustworthy AI. ———. 2021. Proposal for a Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts. Dhebar, Yash. 2022. Toward interpretable-AI policies using evolutionary nonlinear decision trees for discrete-action systems. IEEE Transactions on Cybernetics: 1–13. Doshi-Velez, Finale and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. Cite arxiv:1702.08608. Garouani, Mohamed. 2022. Towards efficient and explainable automated machine learning pipelines design: Application to industry 4.0 data. PhD Thesis, Universite´ Hassan II. Gill, Nick, P. Hall, K. Montgomery, and N. Schmidt. 2020. A responsible machine learning workflow with focus on interpretable models, post-hoc explanation, and discrimination testing. Information (Switzerland) 11 (3): 137. Guidotti, Riccardo. 2022, April. Counterfactual explanations and how to find them: Literature review and benchmarking. Data Mining and Knowledge Discovery: 1–10.


Guidotti, Riccardo, A.  Monreale, F.  Giannotti, D.  Pedreschi, S.  Ruggieri, and F.  Turini. 2019, November. Factual and counterfactual explanations for black box decision making. IEEE Intelligent Systems 34 (6): 14–23. Conference Name: IEEE Intelligent Systems. Hasani, Narges, M.A. Morris, A. Rahmim, R.M. Summers, E. Jones, E. Siegel, and B. Saboury. 2022. Trustworthy artificial intelligence in medical imaging. PET Clinics 17 (1): 1–12. Hendricks, Lisa Anne, K. Burns, K. Saenko, T. Darrell, and A. Rohrbach. 2018. Women also snowboard: Overcoming bias in captioning models. In ECCV (3), Volume 11207 of lecture notes in computer science, ed. V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss, 793–811. Springer. Kanehira, Atsushi, K. Takemoto, S. Inayoshi, and T. Harada. 2019, June. Multimodal explanations by predicting counterfactuality in videos. In 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 8586–8594. ISSN: 2575. Karimi, A.-H., G. Barthe, B. Balle, and I. Valera. Model-agnostic counterfactual explanations for consequential decisions. In Proceedings of the twenty third international conference on artificial intelligence and statistics, 895–905. PMLR, June 2020. ISSN: 2640-3498. Kaur, D., S.  Uslu, K.J.  Rittichier, and A.  Durresi. 2023. Trustworthy artificial intelligence: A review. ACM Computing Surveys 55 (2).: 39: 1–38. Keane, M. T., E. M. Kenny, E. Delaney, and B. Smyth. 2021. If only we had better counterfactual explanations: Five key deficits to rectify in the evaluation of counterfactual XAI techniques. In Proceedings of the thirtieth international joint conference on artificial intelligence, Montreal, Canada, 4466–4474. International Joint Conferences on Artificial Intelligence Organization, August 2021. Laugel, T., M.-J.  Lesot, C.  Marsala, X.  Renard, and M.  Detyniecki. 2018. Comparison-based inverse classification for interpretability in machine learning. In Information processing and management of uncertainty in knowledge-based systems. Theory and foundations, communications in computer and information science, 100–111. Cham: Springer International Publishing. Lopez-Paz, D., L. Bottou, B. Schölkopf, and V. Vapnik. 2016. Unifying distillation and privileged information. In ICLR, eds. Y. Bengio and Y. LeCun. Lundberg, S.M., G. Erion, H. Chen, A. DeGrave, J.M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S. Lee. 2020. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence 2 (1): 56–67. McInnes, L. and J.  Healy. 2018. UMAP: Uniform manifold approximation and projection for dimension reduction. CoRR abs/1802.03426: 1–10. Mehrabi, N., F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan. 2021. A survey on bias and fairness in machine learning. ACM Computing Surveys 54 (6): 1–18. Meng, H., C.  Wagner, and I.  Triguero. 2022. Feature importance identification for time series classifiers. In 2022 IEEE international conference on systems, man, and cybernetics (SMC), 3293–3298. IEEE. Molnar, C. Interpretable machine learning. Lulu.com, 2020. Mothilal, R.K., A.  Sharma, and C.  Tan. 2020. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 conference on fairness, accountability, and transparency, 607–617. ACM. Nguyen, A.M., A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune. 2016. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. 
In Advances in neural information processing systems (NIPS), ed. D.D. Lee, M. Sugiyama, U. von Luxburg, I. Guyon, and R. Garnett, vol. 29, 3387–3395. Pohl, K., G. Böckle, and F. van Der Linden. 2005. Software product line engineering: Foundations, principles, and techniques. Springer. Poyiadzi, R., K. Sokol, R. Santos-Rodriguez, T. De Bie, and P. Flach. 2020. FACE: Feasible and actionable counterfactual explanations. In Proceedings of the AAAI/ACM conference on AI, ethics, and society, 344–350. ACM. Rahwan, I., M.  Cebrian, N.  Obradovich, J.  Bongard, J.F.  Bonnefon, C.  Breazeal, et  al. 2019. Machine behavior. Nature 568 (7753): 477–486.



Chapter 4

The Moral Status of AI Entities

Joan Llorca Albareda, Paloma García, and Francisco Lara
Department of Philosophy I, University of Granada, Granada, Spain

Abstract  The emergence of AI is posing serious challenges to standard conceptions of moral status. New non-biological entities are able to act and make decisions rationally. The question arises, in this regard, as to whether AI systems possess or can possess the necessary properties to be morally considerable. In this chapter, we have undertaken a systematic analysis of the various debates that are taking place about the moral status of AI. First, we have discussed the possibility that AI systems, by virtue of their new agential capabilities, can be understood as moral agents. The discussion between defenders of mentalist and anti-mentalist positions has revealed many nuances and theoretically significant distinctions. Second, given that an AI system can hardly be an entity qualified to be responsible, we have delved into the responsibility gap and the different ways of understanding and addressing it. Third, we have provided an overview of the current and potential patientist capabilities that AI systems possess. This has led us to analyze the possibilities of AI possessing moral patiency. In addition, we have addressed the question of the moral and legal rights of AI. Finally, we have introduced the two most relevant authors of the relational turn on the moral status of AI, Mark Coeckelbergh and David Gunkel, who have been led to defend a relational approach to moral life as a result of the problems associated with the ontological understanding of moral status.

4.1 Introduction

The question regarding moral status is not new; there is a long history behind it. However, it was not explicitly formulated until the birth of applied ethics. The main assumption in moral consideration during the greater part of the history of Western
thought was that only human beings possessed moral status. They were conceived as the only entities that are both moral agents and moral patients. Very recently this presupposition began to be problematized when we started to critically question our relationship with non-human animals (Hursthouse 2013). This led to challenging the preponderant role of moral agency as the fundamental criterion used to demarcate the boundaries of the circle of moral consideration (Singer 1981). There are beings, such as animals and children, that can have rights despite not being able to make moral decisions. They can have them because they possess the capacities needed to be wronged and harmed by others. There are entities that, despite falling outside moral agency, can be morally considered in virtue of their moral patiency. Numerous moral properties have been proposed to underpin moral patiency, such as being sentient (DeGrazia 2008; Singer 2011) or being a subject-of-a-life (Regan 2004).1 The major change introduced by the advocates of the moral consideration of non-human moral patients, which is at the core of the concept of moral status, lies in their defense that these entities possess non-derivative value. Previously they had been morally considered in virtue of their relations with human beings; now they are valued in themselves.2

The tremendous development of AI places us in a new scenario that will require us to consider the question of moral status from an equally intellectually disruptive perspective. To start with, thanks to AI, we could begin to design machines that are programmed to act ethically, not only in the sense that their behavior always ends up being aligned with ethical principles, such as transparency or non-discrimination, but also in the sense that their behavior responds to prior computational reasoning in which ethical principles are represented and applied to previously interpreted contexts (Moor 1985). In the case of machines of the latter type, we could be dealing with artificial moral agents (AMAs) that can think and act in a moral manner (Anderson 2011; Anderson et al. 2011; van Wynsberghe and Robbins 2019). Such AMAs would not only be possible; they could become essential. First, because, according to some authors, machines are or will be better than humans at deliberating and acting morally (Chomanski 2020)3 due to their lack of biases in moral reasoning (Dietrich 2001; Nadeau 2006), their capabilities to pursue more consistent moral goals (Wallach 2010), and their capabilities to gather relevant information for moral deliberation (Etzioni and Etzioni 2016). Second, and more realistically, the increasing complexity of contemporary societies brings the need to develop artificial entities that can cope with it (Wallach and Allen 2008).

1  In a similar vein, environmentalist proposals argue for the balance of the biotic community as a fundamental property of non-human moral status (Callicott 1980; Leopold 2020).
2  Animals and ecosystems had previously been morally considered, but instrumentally so. Examples are the doctrine of indirect duties (Kant 2017), which understands that we have duties towards animals because of the effects that cruel treatment of them could have on the consideration of other human beings; and the conservationist doctrine that defended the preservation of natural spaces on account of the aesthetic pleasure that they produced in human beings (Callicott 1990).
3  Some have argued that AI may one day become a superagent with a greater capacity for moral agency than humans (Bostrom 2017).

As Picard (1997, p. 19)
puts it, "The greater the freedom of a machine, the more it will need moral standards". Besides, the current deployment of AI systems such as social robots and virtual assistants invites the development of their ethical reasoning in the light of the moral problems derived from the exponentially increasing interactions between humans and artificial entities (Floridi 2010; Sullins 2011). Third, the existential risks posed to humanity by the potential emergence of Singularity and Superintelligence require the introduction of ethical standards in artificial agents (Bostrom 2003; Formosa and Ryan 2021). Nevertheless, underlying the possibility and necessity of ethical machines, we encounter the fundamental question of whether such machines, because they act according to ethical programs, could therefore be considered moral agents. Is there not something conceptually wrong in such a claim? This poses a radical challenge to contemporary moral philosophy since machines have always been at the other extreme of moral consideration. They are understood as artifacts, prostheses or complex means that cannot be morally considered in themselves (Gunkel 2012). However, the debates on moral status teach us that we cannot start from anthropocentric or biocentric prejudices (Torrance 2013), but from the question of to what extent these entities do not and/or cannot possess morally relevant properties. It cannot be denied a priori that relevant properties such as rationality, autonomy, consciousness, or inner life cannot be possessed by artificial entities. To assume that these entities cannot have them because they do not have an organic existence (Bryson and Kime 2011) would be ungrounded and can lead to morally reprehensible discriminations (Bostrom and Yudkowsky 2018). Moreover, if they could be considered moral agents, we could always ask ourselves whether this would not mean that we could attribute responsibilities to them to the point of holding them accountable for not having fulfilled their duties. But the debate does not end there. Some might argue that in addition to duties, they might also have rights. For this reason, it is important to determine whether they could ever be sentient or, alternatively, moral patients. These and other questions are pertinent to this debate, whose complexity and thematic diversity are beginning to gain ground in the literature (Harris and Anthis 2021).

This chapter will endeavor to provide a systematic overview of these questions about the moral status of AI entities. Although some authors have argued that this debate is underrepresented in the main literature (Levy 2009; Wareham 2021), its recent development has been significant. First, we will review the main contributions to the question of whether machines, by virtue of future technical development, could one day be considered to some extent as moral agents. Although, as we shall see, alternative conceptions of moral agency have been put forward that would leave room for machines to be included in this concept, some authors argue that these reformulations leave out elements that are essential for defining a moral agent, namely internal experiences somehow related to a consciousness that machines can never have.
We therefore devote a second section to analyze to what extent we need a mind in order to attribute moral agency; however, since being a moral agent and having its correlative responsibilities are two formally distinct categories, it might be that AI entities, while conceived of as artificial
agents, are nevertheless not capable of being held responsible for the potential negative effects of their actions. This may bring us perilously closer, given the increasing automation of tasks previously performed by humans and of great social impact, to a scenario in which no one would be responsible for the bad consequences caused by machines. How this major conceptual and, above all, social problem might be solved will be dealt with in the third section. In the fourth section we will set out the main proposals regarding the possible consideration of AI entities as moral patients and their derivation into rights. Before drawing conclusions, we will use a fifth section to analyze the theories that attempt to avoid the difficulties seen so far, formulating proposals for moral status that are not based on the ontological properties of the entities but on our relations with them.

4.2 Can Machines Be Moral Agents?

According to the standard conceptions of moral status, the answer is categorical: not at all.4 Their advocates adduce that granting moral agency to AI systems can only be explained by the very human tendency to anthropomorphize. Claims such as "nature has taken its revenge" or "the sea is angry" are good examples of this. Studies have shown that even at one year of age, before similar social experiences have been had, judgments are made about the agential movements of geometrical figures on an inclined plane, i.e., some "helped" and some "annoyed" the others (Heider and Simmel 1944). But apart from this natural tendency, it is conceptually impossible, add the standard conceptions, that a machine could ever be a moral agent. This is justified by appeal to a symmetrical relationship between agency and moral patiency: every moral agent is a moral patient and vice versa (Floridi and Sanders 2004). Christian ethics and Kantian morality, among others, have developed their ethical systems on these assumptions. These philosophical positions have tended to start from the concept of moral person or full moral status (Gordon and Gunkel 2022; Jaworska and Tannenbaum 2013; Warren 1997) to speak of moral status. That is, entities that deserve moral consideration are persons and possess the attributes that characterize them as such. This type of argumentation has given rise to a conception of AI as an instrument in the hands of human beings, produced for the realization of certain ends (Hall 2011; Johnson and Miller 2008). Its status as an instrument is not temporary, nor is it conditioned by technological development.

4  By standard conceptions of moral status, we refer to the most dominant accounts that have been developed to answer the question about the possible inclusion of different entities in the circle of moral consideration before the emergence of AI. Christian ethics, Kantianism or utilitarianism give different answers to the inquiry about the criteria for determining moral status. However, they all agree that, because artifacts lack such properties relevant to moral status as sentience or rationality, they cannot be morally considered in themselves.

On the contrary, there will always be an ontological difference between human beings and technological artifacts that
will prevent the latter from being morally considered in themselves (Miller 2015). Only human beings are persons and, therefore, only they deserve moral consideration on their own.5 The concept of personhood has been a recurrent topic of discussion in academic spaces and the criteria proposed for its demarcation have been many and very diverse. However, we can identify some of the most important properties in the debate about the moral status of AI: a conscious mind,6 free will, rationality, and responsibility. Consciousness has been the most widely endorsed, since it is usually understood as comprising other relevant properties such as the capacity to possess mental states, intentionality, or the capacity to make moral deliberations (Himma 2009). Without it, the other morally relevant properties could not exist; that is, it is the necessary condition for moral personhood (Mosakas 2021). But one might ask whether, in demanding these internal states associated with personhood, we might not be indulging in a narrow, anthropocentric conception of morality (Llorca Albareda 2022). Thus, it could be argued that it is no longer a matter of looking for those properties that make human beings persons and analyzing whether they can be possessed by AI systems; on the contrary, it is a matter of understanding that AI has a very particular type of moral agency (Sullins 2011).7 AI is a moral agent, but it does not have to be a moral patient, so the concept of moral status derived from it will be very different from the one we use with humans (Johnson 2006; Powers 2013). Its cognitive capabilities do not have the same moral weight as those of humans, but AI systems can be included in the circle of moral consideration by virtue of their new agential capabilities. This position would be more consistent with the current state of technology. From this perspective, it is postulated that AI can perform tasks and actions according to moral standards without necessarily possessing the internal properties traditionally linked to moral agency (Behdadi and Munthe 2020). Humans and AMAs can be considered homogeneous entities because both can accomplish the same moral goals (Fossa 2018). The latter is a functionalist proposal that explores a novel dimension of moral agency arising from AI systems. Distinctions between humans and artificial entities do not invalidate the question of moral agency.

5  Utilitarianism defends an asymmetric conception of moral status, but only on the side of patiency. That is, it is not necessary to be both a moral agent and patient to have moral status, but it is sufficient to be solely the latter. This means that artifacts cannot have moral status either under this viewpoint.
6  When we refer to a conscious mind, we are understanding it as a biological mind. Even though some proposals in the philosophy of mind envision a full mind without biological anchorage (Fodor 2000), the lack of agreement leads us to endorse this minimum definition.
7  This is not to say that moral personhood is coincident with the human species. Human beings can have full moral status if they are moral persons, that is, if they have the properties required to have full moral status, such as consciousness, rationality, etc. This should be kept in mind since being human is often confused with being a moral person. Thanks to an anonymous reviewer for pointing this out.

In an increasingly complex world in which the role of AI is growing (Wallach and Allen 2008), it is important to
recognize the different types of moral agency that exist. After all, entities such as corporations have been recognized as legal agents with special rights and duties (Floridi 2016; Hanson 2009; Søraker 2014). The leading exponents of this functionalism are Floridi and Sanders (2004). Both authors argue that the traditional refusal to recognize AI systems as moral agents is due to a conceptual error: the standard conception of moral agency assumes that every moral agent must necessarily have moral responsibility (Himma 2009), when this is not a necessary condition for moral agency. It is possible to reason morally without having to take responsibility for one's actions. This leads to the defense of a mindless morality: concern for the internal states of artificial entities takes us to the wrong level of abstraction, since in our social and personal interactions with AI it is irrelevant whether the entity in question possesses a mind or not.8 If we take a more abstract perspective and look only at particular observable behaviors that are considered indicators of moral agency in humans, we could determine three requirements of agency that might be identified in certain AI entities. The first of these is interactivity, whereby the entity receives external influences that affect its states and actions; secondly, autonomy, whereby it can change its state or action independently of external influences; and, thirdly, adaptability, which requires the ability to learn to cope with new and complex situations by modifying internal rules of response according to the context (Floridi and Sanders 2004, pp. 357–8). When these requirements are met by an entity whose agential behavior also entails moral effects on others, actions that cause harm or produce goods for moral patients, we could be dealing with a moral agent. If these criteria for demarcating moral agents are valid, there would only be a difference of degree between human beings and artificial moral agents, but in no case a qualitative difference. Some would be more autonomous or interactive than others and would more or less give rise to moral consequences, but they would basically be examples of the same category (Fossa 2018, pp. 116–7).
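Read schematically, the three requirements lend themselves to a level-of-abstraction test in which only observable behavior is consulted. The following minimal Python sketch is purely illustrative; the class, field names and example entities are hypothetical and not drawn from Floridi and Sanders' own formalism:

    from dataclasses import dataclass

    @dataclass
    class ObservedBehavior:
        """Behavioral record of an entity at a chosen level of abstraction."""
        responds_to_stimuli: bool       # interactivity: state changes triggered by external input
        acts_without_stimuli: bool      # autonomy: state changes with no external trigger
        revises_own_rules: bool         # adaptability: updates its response rules from experience
        morally_relevant_effects: bool  # its actions can harm or benefit moral patients

    def counts_as_moral_agent(b: ObservedBehavior) -> bool:
        # All three agency requirements plus morally relevant effects,
        # read off logged behavior alone, with no appeal to inner states.
        return (b.responds_to_stimuli and b.acts_without_stimuli
                and b.revises_own_rules and b.morally_relevant_effects)

    # A learning triage system that adjusts its own prioritization rules would pass;
    # a thermostat (interactive but neither autonomous in this sense nor adaptive) would not.
    print(counts_as_moral_agent(ObservedBehavior(True, True, True, True)))    # True
    print(counts_as_moral_agent(ObservedBehavior(True, False, False, True)))  # False

On this reading, the difference-of-degree claim amounts to saying that the fields would be better modeled as graded scores than as booleans; the check itself stays the same.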

4.3 Do We Need a Mind to Attribute Moral Agency?

Many authors would not be satisfied with functionalist arguments and would object that, if machines were to behave like humans, they would do so in a simulated manner (Véliz 2021). Machines will never possess mental states and, in that sense, would not be genuine moral agents (Purves et al. 2015; Champagne and Tonkens 2015; Friedman and Kahn 1992; Johnson and Miller 2008; Johnson and Powers 2006; Sparrow 2007). The question of the extent to which having a mind or conscience should be an inescapable requirement, then, needs to be explored further.

8  The importance of the level of abstraction can be argued using a thought experiment developed by Nicholas Agar (2020). If you found out after years of marriage that your wife is a robot who has no mind, even though she always behaved as if she had one, would you stop considering her morally? Intuitive rejections of this question place serious limits on markedly internalist approaches.

A prior matter of analysis would be whether machines will one day have
consciousness. Here we find a confrontation between materialist positions that assume that the biological mind will one day be able to more or less replicate itself entirely in electronic formats (Dennett 1997); and those that have denied that AI has or could in the future have these properties (Miller 2015) or, due to the current state of technologies and the implausibility that it could be achieved in the medium term, dismissed giving weight to this possibility (Müller 2021).9 But independently of these considerations, we confront the fundamental question of whether consciousness is a necessary condition for possessing moral status. Some authors object to the consciousness criterion by adducing that this ontological requirement is not essential, in the same way that human beings are morally considered without our determining whether they really have consciousness (Anderson 2008; Coeckelbergh 2010; Floridi and Sanders 2004; Gerdes and Øhrstrøm 2015; Veruggio and Operto 2008). In fact, we only ascribe moral agency to humans in virtue of their behavioral observables, just as functionalists do with AMAs. This is to understand the debate about the moral status of AI as an epistemological problem. Moral status is still defined through the internal properties that the entity in question possesses, but it is claimed that we do not have sufficient epistemological access to be able to know the internal states of another entity (Gunkel 2012), and this turns out to be crucial for different types of entities. Even if the same kinds of entities can have similar mental processes, we can never know what it means to be another kind of entity (Nagel 1974). The different versions of the problem of other minds make the project of knowing the moral status of AI difficult. Thus, some authors have chosen to defend a behaviorist notion of moral status in which the possession of internal properties depends on the entities' behavior leading the spectators to think that they possess the fundamental properties (Danaher 2020); or a precautionary notion of moral status, that is, since we cannot fully know the mental states of the other entity, we should be as inclusive as possible in our moral considerations so as not to leave anyone morally relevant out (Neely 2014). This can be reinforced by a criterion of argumentative economy: if we cannot maintain a concept of moral agency that requires demonstrable consciousness and we, in fact, can function without it, we might as well dispense with it (Floridi and Sanders 2004). However, the latter could be contested by arguing that, even if the existence of consciousness is a dispensable phenomenal requirement, it might be useful as a methodological assumption for determining the conditions of moral agency. And it could be maintained that when we assume consciousness in other humans we do so not because we demonstrate that they possess it but because they behave as if they do: with free will, rationality or a certain kind of properly moral knowledge. Is it possible to maintain that machines could be recognized as moral agents by exercising these capacities proper to agents with consciousness? With respect to free will, some argue that AMAs will never have it (Bringsjord 1992, 2007; Shen 2011).

9  Although few authors argue that AI already possesses moral status (Nadeau 2006), many have argued that AI systems may in the future possess the internal properties that are associated with moral personhood (Ashrafian 2015; Schwitzgebel and Garza 2015).

However, without entering the debate between
compatibilism and incompatibilism, others believe that AMAs could be recognized as having freedom in the same basic sense we do with humans, that is, as being the source of control over their own actions (McKenna and Justin 2015), insofar as they fulfill requirements of autonomy and adaptability. But are they not programmed instruments? Ultimately, artificial intelligence is an artifact that derives its cognition and intentionality from programming, i.e., it is an entity capable of manipulating symbols by means of rules, but incapable of any kind of understanding (Searle 1980). All the meaning of its actions is derived from what its programmers imprint in its internal code. This has two problems. First, human beings are in a sense genetically programmed entities that are also socially conditioned: they require family and social education and training that also imprint a certain meaning on their lives in a way that is not fully self-determined (Gunkel 2012); and this does not prevent us from recognizing that humans have a certain degree of control that machines could also have (Floridi and Sanders 2004, pp. 371–2), being “weakly programmed” (Matheson 2012), i.e., leaving room for an autonomy that allows rules to be modified by virtue of experience and reasoning (Nagenborg 2007). Furthermore, even if we were to concede that AI, though partially autonomous, has no free will, we cannot use it as an argument against the possibility that it may possess moral status: there is an ongoing philosophical debate as to whether humans possess free will (Gordon 2020). Second, deep learning and artificial neural networks are overcoming the linear workings of the old AI. Programmers can no longer predict everything that AI will do (Floridi and Sanders 2004). Can we speak of a proper moral rationality that occurs in machines and is the same as the rationality we presuppose in rational moral agents? To some extent, one could speak of artificial rationality if by it we mean any processing that is done to reach a conclusion. This basic sense of rationality would be the one used when ascribing implicit or unconscious inferences to humans or when talking about automated reasoning in computer systems. However, there would also be a narrower sense of rationality, mostly used for proper moral rationality, which would entail both intentionality and an understanding and relevance to the issues at stake. It is usually argued that this second sense of rationality, having “reasons for action”, would not be appropriate for machines as these reasons would ultimately respond to internal states such as preferences, contextual beliefs, or desires to act; that is, a set of capacities needed to have ends and interests (Johnson 2006, p. 198; Kolodny and John 2016; Wallace 2014). But for some authors these capacities could be understood as dispositional and may require only a stimulus-response pattern (Behdadi and Munthe 2020, p. 203). Thus, every artifact would be intentional insofar as it is always directed towards the performance of a certain activity (Verbeek 2005); that is, it possesses an external intentionality and, therefore, to attribute intentional actions to it would not require biological support (Friedman and Kahn 1992; Dennett 1997; Sullins 2011). However, AI goes beyond mere artifactual intentionality because, through the use and management of syntactic rules, it can have representations that are directed towards the world. 
Its first-person experience can be very different from the human one and it cannot be ruled out that it has an internal life (Powers 2013).

Finally, assuming that machines can be considered free and rational, it remains to be determined whether they can gain knowledge in the same way as conscious humans. For some authors, moral competence (Sliwa 2015), or moral sensitivity (Macnamara 2015), is not only a matter of factual perception and mere deliberative interpretation, but sometimes also involves a less rationalistic element, combining intuition and imagination. These are proposals advocated mainly by virtue ethicists and particularists. As far as ethical decision-making machines are concerned, the debate is between those who argue that such intuitions could be artificially emulated (Behdadi and Munthe 2020, p. 205), for example, by virtue-inspired hybrid programming that combines top-down implementation of ethical principles with bottom-up learning as a child does as it grows up and interacts socially (Wallach and Allen 2008); and those who deny that machines can actually possess intuitions (Purves et al. 2015). There would likewise be debate about the role that capacities and attitudes such as empathy would play in adequate moral knowledge and, consequently, to what extent AMAs should also be required to emulate them. On the one hand, even though it is commonly presupposed that empathy is unnecessary for moral skills (Goldie 2006; Maibom 2009), some authors hold that it is a useful tool, even built into machines, to know the perspective of the other and to motivate us to do the right thing, especially if we talk about empathic rationality (Asaro 2006; Purves et al. 2015; Rueda and Lara 2020). On the other hand, reciprocity is a relevant component of human morality and we should account for the capabilities necessary for this to take place (van Wynsberghe 2022, p. 479). Empathy plays a crucial role in moral life since it motivates pro-social behavior (sharing, comforting, helping and caring for others) and inhibits aggression. In this sense, it has been defended that empathy would be fundamental for AI entities to know the perspective of potential users. They would be capable of recognizing the emotional mental states of humans through their expressions in order to gain a better understanding of what the right decision is (Asaro 2006; Purves et al. 2015).

4.4 The Challenge of Responsibility

Traditionally, agency and moral responsibility have gone hand in hand. Only moral agents could be responsible. But this need not necessarily be the case if we undertake an exercise of conceptual clarification, as we did in the previous section. However, some authors from the standard and new conceptions of moral agency agree on the impossibility of AI being held responsible. Both equate responsibility with the capacity both to perform moral actions and to justify them if asked to do so. In this sense, one could defend that ethical machines can be accountable for what they do because they are rational agents, have rational qualities and develop rational actions. But machines, like animals and children, would never be responsible in the sense of being fully accountable to others, since they would not really be aware of what they are doing and they would not possess the internal properties needed to understand the meaning of their action. This is the reason why they would
not deserve to be praised or blamed. To be morally culpable one must have higher order intentionality; that is, beliefs about beliefs and desires about desires, beliefs about their fears, thoughts, hopes, etc. (Frankfurt 1988), something that machines do not possess, even if there are authors such as Dennett (1997) who do not rule out that they might do so in the future.10 This impossibility of attributing moral responsibility to AMAs generates a major socio-moral problem. If the AMAs cannot be blamed, and if very negative consequences of their actions cannot be attributed to designers or owners, we will have a responsibility gap (Matthias 2004), which has especially been the case since the overcoming of GOFAI (Good Old-Fashioned Artificial Intelligence). Artificial neural networks and genetic and reinforcement algorithms have given rise to new forms of artificial learning (mainly deep learning) in which it is increasingly difficult to explain and interpret why the machine acts as it does (Russell and Norvig 2005). How then are we to deal with the responsibility gap that will inevitably arise with the development of AI? Who will be held responsible for its immoral behavior? (De Jong 2020; Matthias 2004). It could be argued that this responsibility gap has always existed and, if we would like to emphasize its growing role in our contemporaneity, we should point to the complexity of our societies and not simply to AI. Responsibility usually depends on the fitting between two elements (Tigard 2021a): the circumstances and the agent. On the one hand, if the agent lacks information to perform different courses of action or is forced to act due to certain constraints, we will hardly attribute responsibilities to them for morally harmful consequences arising from their action (Loh 2019). On the other hand, it seems that in order to be able to attribute moral responsibility to a certain entity it needs to possess certain characteristics that confer on it the status of a full moral agent (Himma 2009; Neuhäuser 2015). As in the case of children and animals, certain entities produce morally harmful effects without our ascribing responsibility to them. These circumstantial and agential responsibility gaps are enlarged by the increased level of complexity, but, on this view, AI would not bring about any qualitative or substantially quantitative transformation. However, this position fails to consider three important reasons. First, the degree of autonomy gained by AI with respect to instruments (Gunkel 2020) and machines (Matthias 2004; Sparrow 2007).11

10  This perspective can be denied from two different angles. On the one hand, although AI currently lacks the properties necessary to be a responsible moral agent, it is possible that in the future, due to technological progress, it may possess them. On the other hand, as we will see later in the discussion of those who have advocated a profound change in the concept of responsibility, AI can be responsible in the wake of totally new relations of responsibility.
11  We distinguish between instrument and machine because of the argument offered by Gunkel (2020). Gunkel argues that the main reason why the responsibility gap occurs is that we try to respond to the advent and importance of AI from the instrumentalist paradigm. AI, however, can be understood not only as an instrument, but also as a machine. This has the problem that a machine, although distinct from an instrument in the independence of its behavior, is not widely different in terms of autonomy.

Developments in machine learning and artificial neural networks are leading to AIs whose functioning and behaviors are not previously programmed, so they cannot be fully anticipated, making it difficult to
attribute responsibilities to programmers, manufacturers or users (Danaher 2016). It does not seem that these, despite possessing the mental qualities necessary to be moral or legal subjects, are sufficiently causally linked either to the facts in order to satisfy retributive psychological needs (Danaher 2016), or to the current legal systems (Kiener 2022).12 Second, the speed of data processing and decision making of AI systems prevents humans from being in-the-loop at all times. These kinds of AI capabilities far exceed human capabilities, making constant monitoring very difficult (Coeckelbergh 2020). Third, the first and second reasons would not matter if the role of AI were residual. However, its importance is growing, and, because of the level of social complexity, it occupies a fundamental place in areas such as medicine and finance (Wallach and Allen 2008). According to Daniel Tigard (2021a), three types of attitudes have been held toward the responsibility gap: techno-optimism, techno-pessimism, and advocacy of conceptual change. Techno-optimists hold that, despite the existence of this gap and the problems it entails, the benefits brought by AI are so great that the gap should be overlooked (Marino and Tamburrini 2020). Techno-pessimists, on the contrary, argue that the existence of this gap implies such deep problems that we should limit the development of AI and reduce its role and social importance (Asaro 2012). Finally, some authors have interpreted diagnoses of the responsibility gap as an opportunity to undertake a major conceptual shift. The reasons for exploring new ways of understanding the concept of responsibility have been mainly twofold: (i) the rejection of the existence of the responsibility gap (Tigard 2021a); and (ii) the existence of the responsibility gap has led some to propose conceptual change as a solution to the problem (Champagne and Tonkens 2015; Kiener 2022). In the remainder of this section, we will be concerned with setting out the most relevant types of positions regarding the moral responsibility of AI. First, we will analyze the assumptions common to techno-optimists and techno-pessimists; that is, the assumption that there is a retributive problem and an agential problem in the social deployment of AI systems. Second, we will analyze the various proposals for conceptual shift. On the one hand, we will deal with those that, despite their transformations, have maintained the individual dimension of responsibility. On the other hand, we will deal with those whose main novelty is the collective or distributed dimension of responsibility. The responsibility gap brought about by the massive development of AI is justified by two problems: the retribution gap and the agent decoupling problem. On the one hand, the retribution gap refers to the widespread impossibility of blaming or punishing those entities causally responsible for certain damages. It owes its existence to two types of reasons: the psychological retributive interest of the injured party and the impossibility of punishing or blaming the causally responsible party.

12  Strict liability is defined as an offense that, although punishable as the transgression of a norm, does not imply blameworthiness. By this we refer to the legal conditions on which the assumptions of strict liability are based. Kiener (2022) refers in particular to three: i) the offenses must be "not very serious"; ii) the ones responsible are usually those who benefit most from the damage produced; iii) these offenses do not carry any sufficiently serious stigma.

First, retributive psychological interest is strongly rooted in three types of evidence:
the human tendency to attribute the causes of events to the actions of a presumed agency; ethnographic evidence on the universality of punishment and blame; and the fact that retributive interest, although what is typically made explicit is the contrary, is usually what drives punishment (Danaher 2016). If this psychological interest cannot be satisfied by blaming and punishing the AI or the programmers, manufacturers, or users in question, the retributive gap appears. The difficulty in blaming and punishing human beings for the harms produced by AI will be dealt with shortly. Second, the AI, causally responsible for the harms produced, cannot be blamed and punished because it lacks the fundamental property to be able to experience such a thing: sentience (Sparrow 2007). The damaged party could satisfy the desire for revenge, Sparrow argues, as we do on many occasions when we get "angry" with our objects, just as I kick my car when it stops working. However, guilt and punishment require some kind of response from the entity to which such charges are attributed. They must suffer to some degree, so the possession of the capacity to suffer must be an indispensable condition for being blamed and punished. On the other hand, there is a decoupling of the entity causally responsible for the damage from the moral responsibility for the state of affairs produced. This is mainly because the entity in question lacks the properties required to be a full moral agent and is thus incapable of taking moral responsibility for the situation. Moral agency is a necessary and sufficient condition for moral responsibility (Himma 2009). Another important topic is the extent of responsibilities. As Neuhäuser (2015) argues, AI leads to an extensive network of responsibilities which implies that, in the face of the responsibility gap, certain humans must take responsibility for the consequences produced by AI. The problem lies in the fact that it is not easy to extend responsibilities to programmers, manufacturers or users for two reasons: first, because we do not do so with the unintended consequences of other types of products and artifacts, either; and, second, because the functional independence achieved thanks to machine learning and artificial neural networks makes it difficult to identify human beings with AI behaviors (Danaher 2016). Concerning conceptual change, we find, in the first place, those who have tried to find new formulas for individual responsibility. On the one hand, an initial group of authors has adduced that the impossibility of extending moral responsibility to programmers, manufacturers and users is due to the fact that our conceptual usages are too embedded in a retrospective paradigm; that is, one can only be held responsible for something once the harm has been committed. However, it is stated that it is possible to be held responsible as long as an individual declares in advance that all those actions carried out in a certain activity context are under his or her responsibility (Champagne and Tonkens 2015). This is the case of military commanders or managers of companies and administrations who, by taking the oath of office, become responsible for those damages that may occur under their mandate. This change introduces a prospective moral responsibility which is based on the legal concept of strict liability. On the other hand, Tigard (2021a) has argued for the non-existence of the responsibility gap with the emergence of AI.
The main usages of the concept of responsibility are fully embedded in the property view (2021b) or possessive responsibility (2021c). Namely, responsibility is always a quality of an
entity that possesses a series of properties by virtue of which it is a moral agent. Such conceptions, Tigard argues, prevent us from seeing the procedural nature of responsibility (Strawson 1962), for it depends, in the first instance, on the relations we have with others and our expectations of them. Pluralistic theories of responsibility have been elaborated on the basis of procedural responsibility (Shoemaker 2015; Watson 1996), which have addressed aspects such as the capacity to respond for the harm we commit (answerability) or the coherence of the action with respect to the whole of our character (attributability). From these positions it is possible to glimpse a non-traditional manner of holding AI accountable (Tigard 2021b). Secondly, some authors have proposed a shift in the agential dimension of responsibility. We find two groups of these. On the one hand, there are those authors who have defended, through the concepts of extended agency (Hanson 2009) and distributed responsibility (Floridi 2016), the need to attend to those types of actions that are not produced by individual entities but, on the contrary, involve numerous agents carrying out non-intentional actions that give rise to certain harmful effects. This type of collective action cannot be imputed to one or several individuals, so responsibility must undergo significant modifications. Floridi (2016) proposes, for example, that we prospectively conceive the concept of responsibility in its collective dimension and abandon the retributive or punitivist paradigm. That is, AI that malfunctions should not be punished, but rather appropriate measures should be taken to ensure that such harm does not recur (Floridi and Sanders 2004). This may include programming revisions, effective interpretability and explainability systems, or withdrawal from the market. The goal is to optimize the behavior of AIs by anticipating and reacting to potential failures. On the other hand, theories of the moral responsibility of new technologies have been developed based on the actor-network theory of Bruno Latour (1999) and the post-phenomenology of Don Ihde (1990). The author who has most developed this position is Peter-Paul Verbeek (2011). Verbeek (2005) holds that artifacts mediate our experiences and actions in very significant ways. We interpret and live in the world through our artifacts, which gives them a crucial role in our lives. This is why we cannot impute all responsibility to human beings, for artifacts possess a morality of their own, either because they possess a particular kind of intentionality (Verbeek 2011) or because human beings have inscribed some kind of moral prescription on them (Latour 1999). This makes responsibility not only a human issue, but also an issue of non-human entities. It must be understood in the sense of a responsibility shared with artifacts and, fundamentally, humans must try to predict and anticipate, in the design stages, the intentionalities embedded in them.

4.5 Artificial Moral Patients and Rights

The debate on the moral status of AI not only forces us to revise our concept of moral agency; we might also think that, by virtue of the properties it possesses or the type of relationships we have with it, we could cause it harm and, therefore, it
would be morally considerable. These kinds of arguments have led to interesting analogies between the moral status of animals and that of AI (Gellers 2020; Gunkel 2012). As with animals, the circle of moral consideration must be expanded to include other entities that do not meet all the criteria for being a moral person. Many authors have doubted the usefulness of this analogy, as the current state of technologies does not invite one to envision that AI systems can be sentient in the short to medium term (Hogan 2017). The approximation to moral status from moral patiency has been mainly carried out from the question of whether AI can become sentient. It is argued that AI is not sentient for two reasons. Firstly, it lacks the minimum level of agency to have moral status, as it does not possess evaluative consciousness (Gibert and Martin 2022). Joshua Shepherd (2018) argues that the standard notion of moral status refers back to evaluative consciousness, since we can only have moral obligations toward those entities that possess affective mental states. We can grant moral value to an entity to the extent that it has an interest in things and can give them a positive or negative mental valuation. Véliz (2021) reaches a similar conclusion by arguing that sentience is a necessary condition for the possession of mental states. Only those entities that have a minimum of sentience can be harmed by what is done to them. Secondly, sentience requires a physical-body architecture. The impossibility of AI possessing sentience lies in the fact that its material substrate does not allow it, at least for the time being (Donath 2020), to be sentient (Johnson and Verdicchio 2018). However, some believe that AI could become a hyper-sentient being by virtue of an artificial robotic system of highly sentient sensors (LaChat 1986). Other approaches anchored in moral patiency have responded to the epistemological problem of other minds. In other words, moral obligations towards AI depend on AI being able to behave in such a way as to appear to possess the properties relevant to moral patiency. Henry Shevlin (2021) has applied John Danaher's (2020) moral behaviorism to the analysis of moral patiency, leading him to argue that AI will be a moral patient whenever it behaves in a way that makes it appear to possess sentience. This approach depends on an epistemological presupposition: the impossibility of knowing for certain whether an AI system is sentient or not. Gunkel (2012) pointed out that, along with the impossibility of accessing sentient mental states, there is also a problem of conceptual vagueness. We do not have a clear definition of sentience, which makes it difficult to recognize it in other entities. The following question then arises: if, in the future, AI could have the properties necessary for the possession of moral status, should it have rights? This is a recurring question in the literature (Gunkel 2018; Inayatullah 2001; McNally and Inayatullah 1988; Miller 2015; Schwitzgebel and Garza 2015) and is deeply linked to the moral status inquiry. Indeed, it is often at the origin of the discussions about the moral status of AI: the practical challenges that AI is presenting to individuals, companies, and governments are so relevant that the question of what legal and moral protections AI should have deserves to come to the forefront.
To answer this question, it is important to make two distinctions and present the analyses that have been made around them; namely, the different rights that can be possessed according to their content, or so-called Hohfeldian rights, and those according to the sphere of which they are part, i.e., moral and legal rights.

First, Wesley Newcombe Hohfeld’s (1919) theory of rights constitutes a fruitful theoretical coordinate to analyze the type of rights AI could deserve in the present and future (Andreotta 2021; Gunkel 2018).13 This English jurist argued that rights can belong to four types: privileges, when they exempt their holder from the performance of certain duties that others must fulfill; claims, that is, claims by the holder of the right that give rise to positive duties in other agents; power, which refers to those rights proper to those who hold a certain authority; and immunities, whose task is to prevent certain forms of interference of some agents over others with the aim of preventing damage that may be produced by the action of one over the other. Andreotta (2021) argues that, due to the difficulty and remoteness of a world in which AIs hold positions of power and derive certain privileges from them, the most relevant categories are claims and immunities, which are precisely those from which positive duties derive. Thinking about this question, therefore, requires urgency, for we may be morally impelled towards the fulfillment of certain duties and the renewal of our legal systems. However, privileges may also be relevant, as these are not always the result of a position of power. Privileges may derive from the very nature of the activity being performed by the AI.  For example, think of an autonomous ambulance driving system that can bypass traffic codes or a medical AI that has access to sensitive patient data. Second, it is worth distinguishing between moral rights and legal rights. Although they are intertwined, the nature of their issues is different and failure to differentiate between them can lead to analytical problems. On the one hand, two criteria have been addressed for determining whether an entity satisfies the conditions necessary to have moral rights: for either the capacities or the interests of the holder thereof (Raz 1984). The former derives their protections from the capacities of the possessor, particularly those referring to the free will of the individual to carry out his life projects. The latter focuses on the interests possessed by the subject of rights, since, although he may not have sufficient agential capacities to direct his life freely and autonomously, he may have fundamental interests that affect his well-being. The same reasoning follows as in the case of the moral rights of animals and ecosystems14: even though they are not moral agents, AI system have certain interests that must be protected. On the other hand, discussions about the legal rights of AI began to be held at length and in earnest after the famous paper written by Solum (1992). Most of the debates have revolved around two issues: the legal personhood of AI and the appropriate legal instruments for it to have legal protection. First, the legal personhood of AI has been defended and rejected from two schools of thought: positivism and legal realism (Chesterman 2020). Legal positivism understands that law is created by legislating and that there is no instance prior to positive law that justifies the  Hohfeld develops it as a theory of legal rights. However, because its contents are not specific to legal rights, but are shared by moral and political rights, some ethicists have applied them to these debates in a general manner (Andreotta 2021; Gunkel 2018). 14  The interests of animals and ecosystems are not based on the same assumptions. 
The defense of animal rights presupposes the possibility of being able to subjectively experience a certain degree of well-being. In contrast, ecosystem interests are holistically understood as needs for a certain natural environment to maintain its ecological and/or biotic balance (Callicott 1980). 13

74

J. Llorca Albareda et al.

granting of certain rights. This may lead to defending the granting of legal personhood to AI if a given political society determines it necessary, either for intrinsic or instrumental reasons. Legal realism, on the other hand, understands that there is an instance prior to positive law from which the latter derives. Thus, the full or partial legal personhood of AI would be based on the presence or absence of moral and natural rights. Second, it has been discussed whether the legal instruments currently in place in legislation are sufficient to accommodate AI. This depends on the reason behind the granting of legal rights to AI. If the reasons are instrumental, the granting of rights will depend on the benefits provided at societal level. Accountability gaps could, for example, be eliminated by imputing partial accountability or more effective means of accountability elucidation (Koops et  al. 2010). It is usually argued that legal instruments such as strict liability (Champagne and Tonkens 2015), animal rights (Gellers 2020), children’s rights (Chesterman 2020), and rights of fictitious entities such as corporations (Hanson 2009) are adequate. However, if the reasons are intrinsic —that is, if the granting of rights occurs by virtue of some property possessed by that entity—legal instruments are not sufficient. This is so because, on the one hand, AI has for the time being not reached a level of development such that its capabilities can be equated to those of human beings and, therefore, it is not feasible to confer the same rights to them (Schwitzgebel and Garza 2015); and, on the other hand, because it may be that AI possesses a number of intrinsic properties that lead it to be substantially different from humans in terms of rights. This is the position sustained by Miller (2015), whose argument holds that AI is ontologically distinct from humans by virtue of the fact that, at a general level,15 it depends on humans constructing it for a purpose. This existence, tied to a substantive purpose, is in his view incompatible with the granting of the whole package of human rights. However, this does not mean that it cannot possess some rights such as the right to intellectual property (Chesterman 2020).

4.6 Relationalist Proposals The conceptions put forward so far have followed an ontological or quasi-­ ontological16 approach to the moral status of AI (Coeckelbergh 2012). According to their rationale, for an entity to have moral status it must possess certain properties

 By general level we refer to a specific part of Miller’s argument. One of the criticisms that can be made of Miller is that human beings can also be created for some purpose and that this does not make them lose their rights. However, Miller argues that being or not being produced according to a purpose should not be understood at the individual level but at the level of the existential origin of a species or typology of artifacts. The human species is the result of natural selection, a blind process devoid of teleology, quite the opposite of AI, a product of human purpose. 16  We use the term quasi-ontological to express the possibility of partially relational approaches. There are patientist positions that emphasize relations; however, these relations are marked by certain ontological properties such as moral agency (in the case of virtue ethics, see Cappuccio et al. 2020). 15

or attributes. First, one defines what that entity is and subsequently derives how the entity in question should be considered morally. Gunkel (2012) has identified four major problems in the ontological understanding of moral status: the terminological problem, which points to the difficulty of defining the content of the fundamental ontological property with precision; the epistemological problem, which points to the difficulty of knowing with certainty whether the entity in question possesses the fundamental ontological property; the ethical problem, which states that ontological conceptions have historically excluded numerous entities worthy of moral consideration; and the methodological problem, which emphasizes that the ontological conception, by its own internal functioning, constitutes an inclusion/exclusion mechanism that always leaves morally relevant entities outside the circle of moral consideration.

Many authors have taken these objections very seriously and proposed an alternative approach to the analysis of moral status (Llorca Albareda 2023; Llorca-Albareda and Díaz-Cobacho 2023). They argue that the concept of moral status cannot be understood from the ontological coordinates of patiency and moral agency, but that one must attend directly to the personal and social relations that entities maintain with each other, without paying attention to what that entity is or can be. These perspectives are usually encompassed within what is understood as relationalism, although there are important differences in this regard (Cappuccio et al. 2020; Müller 2021). We will explore the proposals of the two most renowned authors among those who reject the standard approaches to moral status: Mark Coeckelbergh and David Gunkel.

Coeckelbergh has approached the relationalist turn in moral status from two perspectives: the as-if strategy and the transcendentalist proposal. The first states that what is important in moral relationships is appearance, how we perceive and interpret AI. If we see it as a relevant social actor, with whom we are able to have meaningful relationships because it appears to possess the relevant properties of moral patiency, it will be worthy of moral consideration. The difference with respect to the standard conceptions of moral status is that the relevant properties are constructed by social relations and that is precisely why they are apparent: their properties are not intrinsic (Coeckelbergh 2009, 2010). Secondly, this author uses what he calls the argument of the conditions of possibility of moral status (Coeckelbergh 2012). Moral status in an ontological sense can only take place on the condition that there be a prior relationship between the entities susceptible of entering the circle of moral consideration. The fundamental ontological properties are a by-product of the interpretations and valuations of the entities that are part of the world. The role of sentience and consciousness in moral status, in Coeckelbergh’s view, depends on the fact that it is these properties and not others that have been chosen as relevant. Their identification and demarcation depend on the type of relationships between human and non-human entities.
In this sense, Coeckelbergh leaves behind his former position (2009, 2010), in which appearance determines the moral consideration of an entity insofar as we perceive it as having the relevant internal property, to understand the relationship transcendentally: rather than being a factual state of affairs, the relationship is the non-empirical way in which morality is structured in
the world (2012, 2014). Understanding the relationship transcendentally avoids the usual criticisms of relationalism (Gibert and Martin 2022; Müller 2021): just as it cannot be argued that human individuals who are part of a given racial or cultural community have a higher moral status than those who are not part of it, we cannot use relationship as a discriminatory criterion on these terms.

David Gunkel (2012, 2018) proposes “thinking otherwise” about the moral status of AI, turning the traditional modes of moral reflection on their head and approaching the problem differently. To do so, he starts from Emmanuel Lévinas’ philosophy of otherness. The philosophical method by which moral status is analyzed, Gunkel argues, falls prey again and again to the ontological rationale; the same problem that Lévinas encountered in his critique of traditional philosophical thought. This is why the latter reversed the hierarchy among philosophical disciplines: ethics, instead of ontology, becomes the primordial philosophy. This implies that the moral relationship with the Other no longer consists, first, of identifying a series of general features and characteristics and then deriving from them the kind of moral treatment she deserves; on the contrary, the encounter with the Other must take place in the beginning and then certain ontological properties can be derived. The face of the Other is the source of moral obligations. As Gunkel himself acknowledges (2012, 2014), Levinasian philosophy is markedly anthropocentric: categories such as face or encounter have a distinctly human character. Nevertheless, it is possible to understand this relation in minimalist terms and to conceive a kind of moral consideration that emerges from unbounded and unrestricted dealings with an ontologically unknown Other.

4.7 Conclusion

The moral status of AI is becoming an increasingly discussed topic in ethics of technology. Current and future developments in the field (will) pose serious challenges to standard conceptions of moral consideration. Whether we should include these entities in the circle of moral consideration or grant them legal rights are questions that are still in the process of being answered. In this chapter, we have attempted to provide a systematic overview and outline the most important theoretical issues in the debate on this topic.

First, AI capabilities are opening up the possibility that non-biological entities can be moral agents and, for this reason, extend the boundaries of the circle of moral consideration. According to the standard conception of moral status, the answer is negative: AI systems could not be considered moral agents since they lack fundamental moral properties such as a conscious mind, and derivatively free will, rationality and responsibility (Johnson 2006). However, the consideration of moral status according to internal states operates from a narrow and anthropocentric conception of morality (Behdadi and Munthe 2020). For this reason, some authors argue that AI entities have a particular type of moral agency (Sullins 2011).

Can these entities be considered morally by virtue of their intrinsic value? The epistemological limitations to accessing internal properties limit, according to some authors, the strength of the standard conception (Danaher 2020; Neely 2014; Floridi and Sanders 2004). In this sense, these objections offer the opportunity to elaborate a concept of mindless moral agency. The debate revolves around topics such as whether the capacities needed for moral rationality could be understood as dispositional (Behdadi and Munthe 2020), or about the role of intuitions and emotions—in particular empathy—in moral reasoning.

Second, we have dealt with the question of whether AI entities can be morally responsible. Moral responsibility is usually equated with the capacity both to perform moral actions and to justify them. AI entities, therefore, do not possess the properties necessary to be held responsible for the morally harmful consequences of their actions, which leads to a major socio-moral problem: the responsibility gap (Matthias 2004). Following Tigard (2021a), we have presented three types of attitudes regarding the responsibility gap: techno-optimism, techno-pessimism, and the advocacy of a conceptual shift. The first two are based on two assumptions. On the one hand, they take for granted the existence of a retributive gap; that is, the impossibility of blaming or punishing entities causally responsible for certain damages. On the other hand, they assume the so-called agential problem: AI systems lack the qualities required to be morally responsible. Regarding those who advocate for conceptual change, some authors have been determined to find new formulas for individual responsibility, such as prospective moral responsibility (Champagne and Tonkens 2015) or pluralistic conceptions of responsibility (Tigard 2021b). Others, in turn, have endorsed a collective conception of responsibility such as extended agency (Hanson 2009), distributed responsibility (Floridi 2016), and postphenomenological co-responsibility (Verbeek 2011).

Third, we have also delved into the counterpart of moral agency, that is, moral patiency. Can AI entities be considered moral patients that deserve rights? There is widespread agreement in the literature on this topic: AI entities do not possess the necessary condition of moral patiency, sentience. AI systems lack other correlative properties of sentience, such as evaluative consciousness (Shepherd 2018) and interests of their own (Véliz 2021). However, the present state of this technology does not preclude AI from having rights in the future. We have analyzed which kinds of rights AI could have according to the categories proposed by Wesley Hohfeld (1919). Furthermore, we have presented the distinction between legal and moral rights, and the question of the legal personhood of AI.

Finally, we have also paid heed to the relational turn, focusing on the approaches developed by Mark Coeckelbergh and David Gunkel. The relational turn endorses that moral value is not the result of a recognition of one or more moral properties; on the contrary, we ascribe value depending on how meaningful the relationships entities maintain with each other are. On the one hand, Mark Coeckelbergh has taken two stances toward the moral status of AI: the “as-if” strategy, and the transcendentalist argument of the conditions of possibility of moral status. On the other hand, Gunkel aims to enlarge the boundaries of the moral circle applying Lévinas’ ethics
to the debate on the moral status of AI. In this perspective, ethics cannot depend on ontology and, for this reason, the moral status of AI cannot depend on the possession of certain ontological properties. He thus invites us to “think otherwise” and to recognize that the Other is the source of moral obligations.

Acknowledgements  This chapter was written as a part of the research projects Digital Ethics. Moral Enhancement through an Interactive Use of Artificial Intelligence (PID2019-104943RB-I00), funded by the State Research Agency of the Spanish Government, and Moral enhancement and artificial intelligence. Ethical aspects of a Socratic virtual assistant (B-HUM-64-UGR20), funded by FEDER/Junta de Andalucía—Consejería de Transformación Económica, Industria, Conocimiento y Universidades. The authors are also grateful for the insightful comments of Jan Deckers on a previous version of this chapter.

References

Agar, N. 2020. How to treat machines that might have minds. Philosophy & Technology 33 (2): 269–282. https://doi.org/10.1007/s13347-019-00357-8. Anderson, S.L. 2008. Asimov’s “three laws of robotics” and machine metaethics. AI & SOCIETY 22 (4): 477–493. https://doi.org/10.1007/s00146-007-0094-5. ———. 2011. Philosophical concerns with machine ethics. In Machine ethics, ed. M. Anderson and S.L. Anderson, 162–167. Cambridge: Cambridge University Press. Anderson, M., S.L. Anderson, J.H. Moor, J. Stirrs, C. Allen, W. Wallach, I. Smit, et al. 2011. In Machine ethics, ed. M. Anderson and S.L. Anderson. Cambridge: Cambridge University Press. https://doi.org/10.1017/cbo9780511978036. Andreotta, A.J. 2021. The hard problem of AI rights. AI & SOCIETY 36 (1): 19–32. https://doi.org/10.1007/s00146-020-00997-x. Asaro, P.M. 2006. What should we want from a robot ethic? International Review of Information Ethics 6 (12): 9–16. https://doi.org/10.29173/irie134. Asaro, P. 2012. On banning autonomous weapon systems: Human rights, automation, and the dehumanization of lethal decision-making. International review of the Red Cross 94 (886): 687–709. Ashrafian, H. 2015. Artificial intelligence and robot responsibilities: Innovating beyond rights. Science and Engineering Ethics 21 (2): 317–326. https://doi.org/10.1007/s11948-014-9541-0. Behdadi, D., and C. Munthe. 2020. A normative approach to artificial moral agency. Minds and Machines 30 (2): 195–218. https://doi.org/10.1007/s11023-020-09525-8. Bostrom, N. 2003. Ethical issues in advanced artificial intelligence. In Science fiction and philosophy: From time travel to superintelligence, 277–284. ———. 2017. Superintelligence. Paris: Dunod. Bostrom, N., and E. Yudkowsky. 2018. The ethics of artificial intelligence. In Artificial intelligence safety and security, ed. R.V. Yampolskiy, 57–69. London: Routledge. Bringsjord, S. 1992. What robots can and can’t be. New York: Kluwer Academic. ———. 2007. Ethical robots: The future can heed us. AI & SOCIETY 22 (4): 539–550. Bryson, J.J., and P.P. Kime. 2011. Just an artifact: Why machines are perceived as moral agents. In Twenty-second international joint conference on artificial intelligence, vol. 22, 1641. Callicott, J.B. 1980. Animal liberation: A triangular affair. Environmental ethics 2 (4): 311–338. ———. 1990. Whither conservation ethics? Conservation Biology 4 (1): 15–20. Cappuccio, M.L., A. Peeters, and W. McDonald. 2020. Sympathy for Dolores: Moral consideration for robots based on virtue and recognition. Philosophy & Technology 33 (1): 9–31. https://doi.org/10.1007/s13347-019-0341-y.

Champagne, M., and R.  Tonkens. 2015. Bridging the responsibility gap in automated warfare. Philosophy & Technology 28 (1): 125–137. https://doi.org/10.1007/s13347-­013-­0138-­3. Chesterman, S. 2020. Artificial intelligence and the limits of legal personality. International & Comparative Law Quarterly 69 (4): 819–844. https://doi.org/10.1017/s0020589320000366. Chomanski, B. 2020. Should moral machines be banned? A commentary on van Wynsberghe and Robbins ‘critiquing the reasons for making artificial moral agents’. Science and Engineering Ethics 26 (6): 3469–3481. https://doi.org/10.1007/s11948-­020-­00255-­9. Coeckelbergh, M. 2009. Virtual moral agency, virtual moral responsibility: On the moral significance of appearance, perception and performance of artificial agents. AI & SOCIETY 24 (2): 181–189. https://doi.org/10.1007/s00146-­009-­0208-­3. ———. 2010. Moral appearances: Emotions, robots, and human morality. Ethics and Information Technology 12 (3): 235–241. https://doi.org/10.1007/s10676-­010-­9221-­y. ———. 2012. Growing moral relations: Critique of moral status ascription. New York: Palgrave Macmillan. ———. 2014. The moral standing of machines: Towards a relational and non-Cartesian moral hermeneutics. Philosophy & Technology 27 (1): 61–77. https://doi.org/10.1007/ s13347-­013-­0133-­8. ———. 2020. Artificial intelligence, responsibility attribution, and a relational justification of explainability. Science and Engineering Ethics 26 (4): 2051–2068. https://doi.org/10.1007/ s11948-­019-­00146-­8. Danaher, J. 2016. Robots, law and the retribution gap. Ethics and Information Technology 18 (4): 299–309. https://doi.org/10.1007/s10676-­016-­9403-­3. ———. 2020. Welcoming robots into the moral circle: A defense of ethical behaviourism. Science and Engineering Ethics 26 (4): 2023–2049. https://doi.org/10.1007/s11948-­019-­00119-­x. De Jong, R. 2020. The retribution-gap and responsibility-loci related to robots and automated technologies: A reply to Nyholm. Science and Engineering Ethics 26 (2): 727–735. https://doi. org/10.1007/s11948-­019-­00120-­4. DeGrazia, D. 2008. Moral status as a matter of degree? The Southern Journal of Philosophy 46 (2): 181–198. https://doi.org/10.1111/j.2041-­6962.2008.tb00075.x. Dennett, D. 1997. When HAL kills, who’s to blame? Computer ethics. In HAL’s legacy: 2001’s computer as dream and reality, ed. D. Stork, 351–366. Cambridge, Massachusetts: MIT Press. Dietrich, E. 2001. Homo sapiens 2.0: Why we should build the better robots of our nature. Journal of Experimental & Theoretical Artificial Intelligence: JETAI 13 (4): 323–328. https://doi. org/10.1080/09528130110100289. Donath, J. 2020. Ethical issues in our relationship with artificial entities. In The Oxford handbook of ethics of AI, ed. M.D.  Dubber, F.  Pasquale, and S.  Das, 53–73. Oxford: Oxford University Press. Etzioni, A., and O. Etzioni. 2016. AI assisted ethics. Ethics and Information Technology 18 (2): 149–156. https://doi.org/10.1007/s10676-­016-­9400-­6. Floridi, L. 2010. Artificial companions and their philosophical challenges. In Close engagements with artificial companions, ed. J.  Wilks, 23–28. Amsterdam: John Benjamins Publishing Company. ———. 2016. Faultless responsibility: On the nature and allocation of moral responsibility for distributed moral actions. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374 (2083): 20160112. https://doi.org/10.1098/ rsta.2016.0112. Floridi, L., and J.W.  Sanders. 2004. On the morality of artificial agents. 
Minds and Machines 14 (3): 349–379. Fodor, J. 2000. The mind doesn’t work that way: The scope and limits of computational psychology. Cambridge, Massachusetts: MIT Press. Formosa, P., and M. Ryan. 2021. Making moral machines: Why we need artificial moral agents. AI & SOCIETY 36 (3): 839–851. https://doi.org/10.1007/s00146-­020-­01089-­6.

Fossa, F. 2018. Artificial moral agents: Moral mentors or sensible tools? Ethics and Information Technology 20 (2): 115–126. https://doi.org/10.1007/s10676-­018-­9451-­y. Frankfurt, H.G. 1988. Freedom of the will and the concept of a person. In What is a person, ed. M.F. Goodman, 127–144. Totowa: Humana Press. Friedman, B., and P.H. Kahn. 1992. Human agency and responsible computing: Implications for computer system design. Journal of Systems Software 17 (7): 7–14. Gellers, J.C. 2020. Rights for robots: Artificial intelligence, animal and environmental law. London: Routledge. Gerdes, A., and P. Øhrstrøm. 2015. Issues in robot ethics seen through the lens of a moral Turing test. Journal of Information, Communication and Ethics in Society 13 (2): 98–109. https://doi. org/10.1108/jices-­09-­2014-­0038. Gibert, M., and D. Martin. 2022. In search of the moral status of AI: Why sentience is a strong argument. AI & SOCIETY 37 (1): 319–330. https://doi.org/10.1007/s00146-­021-­01179-­z. Goldie, P. 2006. Anti-empathy: Against empathy as perspective-shifting. In Empathy: Philosophical and psychological perspectives, ed. P. Goldie and A. Coplan, 302–317. Oxford: Oxford University Press. Gordon, J.S. 2020. What do we owe to intelligent robots? In Smart technologies and fundamental rights, ed. J.S. Gordon, 17–47. Leiden: Brill. Gordon, J.S.D., and J. Gunkel. 2022. Moral status and intelligent robots. The Southern Journal of Philosophy 60 (1): 88–117. https://doi.org/10.1111/sjp.12450. Gunkel, D. 2012. The machine question: Critical perspectives on AI, robots, and ethics. Cambridge, Massachusetts: MIT Press. ———. 2018. Robot rights. Cambridge, Massachusetts: MIT Press. Gunkel, D.J. 2014. A vindication of the rights of machines. Philosophy & Technology 27 (1): 113–132. https://doi.org/10.1007/s13347-­013-­0121-­z. ———. 2020. Mind the gap: Responsible robotics and the problem of responsibility. Ethics and Information Technology 22 (4): 307–320. https://doi.org/10.1007/s10676-­017-­9428-­2. Hall, J.S. 2011. Ethics for machines. In Machine ethics, ed. M.  Anderson and S.L.  Anderson, 28–44. Cambridge: Cambridge University Press. Hanson, F.A. 2009. Beyond the skin bag: On the moral responsibility of extended agencies. Ethics and Information Technology 11 (1): 91–99. https://doi.org/10.1007/s10676-­009-­9184-­z. Harris, J., and J.R. Anthis. 2021. The moral consideration of artificial entities: A literature review. Science and Engineering Ethics 27 (4): 1–95. https://doi.org/10.1007/s11948-­021-­00331-­8. Heider, F., and M.  Simmel. 1944. An experimental study of apparent behavior. The American Journal of Psychology 57 (2): 243–259. Himma, K. 2009. Artificial agency, consciousness, and the criteria for moral agency: What properties must an artificial agent have to be a moral agent? Ethics and Information Technology 11 (1): 19–29. https://doi.org/10.1007/s10676-­008-­9167-­5. Hogan, K. 2017. Is the machine question the same question as the animal question? Ethics and Information Technology 19 (1): 29–38. https://doi.org/10.1007/s10676-­017-­9418-­4. Hohfeld, W.N. 1919. Fundamental legal conceptions as applied in judicial reasoning. New Haven: Yale University Press. Hursthouse, R. 2013. Moral status. In International encyclopedia of ethics, ed. H.  LaFollette. New York: John Wiley & Sons. https://onlinelibrary.wiley.com/doi/10.1002/9781444367072. wbiee076. Ihde, D. 1990. Technology and the lifeworld. Bloomington: Indiana University Press. Inayatullah, S. 2001. 
The rights of robot: Inclusion, courts and unexpected futures. Journal of Future Studies 6 (2): 93–102. Jaworska, A., and J. Tannenbaum. 2013. The grounds of moral status. In The Stanford encyclopedia of philosophy, ed. E.N. Zalta. https://plato.stanford.edu/entries/grounds-­moral-­status/. Johnson, D.G. 2006. Computer systems: Moral entities but not moral agents. Ethics and Information Technology 8 (4): 195–204. https://doi.org/10.1007/s10676-­006-­9111-­5.

Johnson, D.G., and K.W. Miller. 2008. Un-making artificial moral agents. Ethics and Information Technology 10 (2): 123–133. https://doi.org/10.1007/s10676-­008-­9174-­6. Johnson, D.G., and T.M. Powers. 2006. Computer systems and responsibility: A normative look at technological complexity. Ethics and Information Technology 7 (2): 99–107. https://doi. org/10.1007/s10676-­005-­4585-­0. Johnson, D.G., and M. Verdicchio. 2018. Why robots should not be treated like animals. Ethics and Information Technology 20 (4): 291–301. https://doi.org/10.1007/s10676-­018-­9481-­5. Kant, I. 2017. Kant: The metaphysics of morals. Cambridge: Cambridge University Press. Kiener, M. 2022. Can we bridge AI’s responsibility gap at will? Ethical Theory and Moral Practice 25 (4): 575–593. https://doi.org/10.1007/s10677-­022-­10313-­9. Kolodny, N., and B. John. 2016. Instrumental rationality. In The Stanford encyclopedia of philosophy, ed. E.N. Zalta. http://plato.stanford.edu/archives/spr2016/entries/rationality-­instrumental/. Koops, B.J., M. Hildebrandt, and D.O. Jaquet-Chiffelle. 2010. Bridging the accountability gap: Rights for new entities in the information society. Minnesota Journal of Law, Science & Technology 11 (2): 497. https://scholarship.law.umn.edu/mjlst/vol11/iss2/4/. LaChat, M.R. 1986. Artificial intelligence and ethics: An exercise in the moral imagination. AI Magazine 7 (2): 70–79. Latour, B. 1999. Pandora’s hope: Essays on the reality of science studies. Cambridge: Harvard University Press. Leopold, A. 2020. A Sand County almanac: And sketches here and there. Oxford: Oxford University Press. Levy, D. 2009. The ethical treatment of artificially conscious robots. International Journal of Social Robotics 1 (3): 209–216. https://doi.org/10.1007/s12369-­009-­0022-­6. Llorca Albareda, J. 2022. Agencia (y) moral en la era de la inteligencia artificial. In Filosofía, tecnopolítica y otras ciencias sociales: nuevas formas de revisión y análisis del humanismo, ed. M. Bermúdez and A. Sánchez Cotta, 127–147. Madrid: Dykinson. ———. 2023. El estatus moral de las entidades de inteligencia artificial. Disputatio. Philosophical Research Bulletin 12 (24): 241–249. Llorca-Albareda, J., and G. Díaz-Cobacho. 2023. Contesting the consciousness criterion: A more radical approach to the moral status of non-humans. AJOB Neuroscience 14 (2): 158–160. https://doi.org/10.1080/21507740.2023.2188280. Loh, J. 2019. Responsibility and robot ethics: A critical overview. Philosophies 4 (4): 58. https:// doi.org/10.3390/philosophies4040058. Macnamara, C. 2015. Blame, communication, and morally responsible agency. In The nature of moral responsibility: New essays, ed. R.  Clarke, M.  McKenna, and A.M.  Smith, 211–236. Oxford: Oxford University Press. Maibom, H. 2009. Feeling for others: Empathy, sympathy, and morality. Inquiry 52: 483–499. Marino, D., and G. Tamburrini. 2020. Learning robots and human responsibility. In Machine ethics and robot ethics, ed. W. Wallach and P. Asaro, 377–382. London: Routledge. Matheson, B. 2012. Manipulation, moral responsibility, and machines. In The machine question: AI, ethics and moral responsibility, ed. D.  Gunkel, J.  Bryson, and S.  Torrance, 25–29. The Society for the Study of Artificial Intelligence and Simulation of Behaviour. Matthias, A. 2004. The responsibility gap: Ascribing responsibility for the actions of learning automata. Ethics and Information Technology 6 (3): 175–183. https://doi.org/10.1007/ s10676-­004-­3422-­1. McKenna, M.A.C., and D. Justin. 2015. Compatibilism. 
In The Stanford encyclopedia of philosophy, ed. E.N. Zalta. http://plato.stanford.edu/archives/sum2015/entries/compatibilism/. McNally, P., and S. Inayatullah. 1988. The rights of robots: Technology, culture and law in the 21st century. Futures 20 (2): 119–136. Miller, L.F. 2015. Granting automata human rights: Challenge to a basis of full-rights privilege. Human Rights Review 16 (4): 369–391. https://doi.org/10.1007/s12142-­015-­0387-­x. Moor, J.H. 1985. What is computer ethics? Metaphilosophy 16 (4): 266–275.

Mosakas, K. 2021. On the moral status of social robots: Considering the consciousness criterion. AI & SOCIETY 36 (2): 429–443. https://doi.org/10.1007/s00146-­020-­01002-­1. Müller, V.C. 2021. Is it time for robot rights? Moral status in artificial entities. Ethics and Information Technology 23 (4): 579–587. https://doi.org/10.1007/s10676-­021-­09596-­w. Nadeau, J.E. 2006. Only androids can be ethical. In Thinking about android epistemology, ed. K.M. Ford, C. Glymour, and P. Hayes, 241–248. Palo Alto: AAAI Press. Nagel, T. 1974. What is it like to be a bat? The Philosophical Review 83 (4): 435–450. Nagenborg, M. 2007. Artificial moral agents: An intercultural perspective. International Review of Information Ethics 7 (9): 129–133. https://doi.org/10.29173/irie14. Neely, E. 2014. Machines and the moral community. Philosophy & Technology 27 (1): 97–111. https://doi.org/10.1007/s13347-­013-­0114-­y. Neuhäuser, C. 2015. Some Sceptical remarks regarding robot responsibility and a way forward. In Collective agency and cooperation in natural and artificial systems, ed. C.  Misselhorn, 131–146. Cham: Springer. Picard, R. 1997. Affective computing. Cambridge, MA: MIT Press. Powers, T.M. 2013. On the moral agency of computers. Topoi 32 (2): 227–236. https://doi. org/10.1007/s11245-­012-­9149-­4. Purves, D., R.  Jenkins, and B.J.  Strawser. 2015. Autonomous machines, moral judgment, and acting for the right reasons. Ethical Theory and Moral Practice 18 (4): 851–872. https://doi. org/10.1007/s10677-­015-­9563-­y. Raz, J. 1984. On the nature of rights. Mind 93 (370): 194–214. Regan, T. 2004. The case for animal rights. Oakland: University of California Press. Rueda, J., and F. Lara. 2020. Virtual reality and empathy enhancement. Ethical aspects. Frontiers in Robotics and AI 7: 506984. https://doi.org/10.3389/frobt.2020.506984. Russell, S., and P. Norvig. 2005. AI: A modern approach. Learning 2 (3): 4. Schwitzgebel, E., and M. Garza. 2015. A defense of the rights of artificial intelligences. Midwest Studies in Philosophy 39: 98–119. https://doi.org/10.1111/misp.12032. Searle, John. 1980. Minds, brains, and programs. Behavioral and Brain Sciences 3 (3): 417–424. Shen, S. 2011. The curious case of human-robot morality. In Proceedings of the 6th international conference on human-robot interaction, 249–250. New York: Association for Computer Machinery. Shepherd, J. 2018. Consciousness and moral status. London: Routledge. Shevlin, H. 2021. How could we know when a robot was a moral patient? Cambridge Quarterly of Healthcare Ethics 30 (3): 459–471. https://doi.org/10.1017/S0963180120001012. Shoemaker, D. 2015. Responsibility from the margins. Oxford: Oxford University Press. Singer, P. 1981. The expanding circle: Ethics and sociobiology. Oxford: Clarendon Press. ———. 2011. Practical ethics. Cambridge: Cambridge University Press. Sliwa, P. 2015. Moral worth and moral knowledge. Philosophy and Phenomenological Research 93 (2): 393–418. https://doi.org/10.1111/phpr.12195. Solum, L.B. 1992. Legal personhood for artificial intelligences. North Carolina Law Review 70: 1231. Søraker, J.H. 2014. Continuities and discontinuities between humans, intelligent machines, and other entities. Philosophy & Technology 27 (1): 31–46. https://doi.org/10.1007/s13347-­013-­0132-­9. Sparrow, R. 2007. Killer robots. Journal of Applied Philosophy 24 (1): 62–77. https://doi. org/10.1111/j.1468-­5930.2007.00346.x. Strawson, P.F. 1962. Freedom and resentment and other essays. Proceedings of the British Academy 48: 1–25. Sullins, J.P. 
2011. When is a robot a moral agent? In Machine ethics, ed. M.  Anderson and S.L. Anderson, 151–161. Cambridge: Cambridge University Press. Tigard, D.W. 2021a. There is no techno-responsibility gap. Philosophy & Technology 34 (3): 589–607. https://doi.org/10.1007/s13347-­020-­00414-­7.

———. 2021b. Artificial moral responsibility: How we can and cannot hold machines responsible. Cambridge Quarterly of Healthcare Ethics 30 (3): 435–447. https://doi.org/10.1017/ S0963180120000985. ———. 2021c. Responsible AI and moral responsibility: A common appreciation. AI and Ethics 1 (2): 113–117. https://doi.org/10.1007/s43681-­020-­00009-­0. Torrance, S. 2013. Artificial agents and the expanding ethical circle. AI & SOCIETY 28 (4): 399–414. https://doi.org/10.1007/s00146-­012-­0422-­2. van Wynsberghe, A. 2022. Social robots and the risks to reciprocity. AI & SOCIETY 37 (2): 479–485. https://doi.org/10.1007/s00146-­021-­01207-­y. van Wynsberghe, A., and S. Robbins. 2019. Critiquing the reasons for making artificial moral agents. Science and Engineering Ethics 25 (3): 719–735. https://doi.org/10.1007/s11948-­018-­0030-­8. Verbeek, P.P. 2005. What things do? Philosophical reflections on technology, agency, and design. Pennsylvania: Pennsylvania State University Press. ———. 2011. Moralizing technology: Understanding and designing the morality of things. Chicago: University of Chicago Press. Veruggio, G., and F.  Operto. 2008. Roboethics: Social and ethical implications of robotics. In Springer handbook of robotics, ed. B. Siciliano and O. Khatib, 1499–1524. Berlin: Springer. Véliz, C. 2021. Moral zombies: why algorithms are not moral agents. AI & SOCIETY 36: 487–497. https://doi.org/10.1007/s00146-­021-­01189-­x. Wallace, R.J. 2014. Practical reason. In The Stanford encyclopedia of philosophy, ed. E.N. Zalta. http://plato.stanford.edu/archives/sum2014/entries/practical-­reason/. Wallach, W. 2010. Robot minds and human ethics: The need for a comprehensive model of moral decision making. Ethics and Information Technology 12 (3): 243–250. https://doi.org/10.1007/ s10676-­010-­9232-­8. Wallach, W., and C.  Allen. 2008. Moral machines: Teaching robots right from wrong. Oxford: Oxford University Press. Wareham, C.S. 2021. Artificial intelligence and African conceptions of personhood. Ethics and Information Technology 23 (2): 127–136. https://doi.org/10.1007/s10676-­020-­09541-­3. Warren, M.A. 1997. Moral status: Obligations to persons and other living things. Oxford: Clarendon Press. Watson, G. 1996. Two faces of responsibility. Philosophical Topics 24 (2): 227–248.

Part II

Ethical Controversies About AI Applications

Chapter 5

Ethics of Virtual Assistants

Juan Ignacio del Valle, Joan Llorca Albareda, and Jon Rueda

Abstract  Among the many applications of artificial intelligence (AI), virtual assistants are one of the tools most likely to grow in the future. The development of these systems may play an increasingly important role in many facets of our lives. Therefore, given their potential importance and present and future weight, it is worthwhile to analyze what kind of challenges they entail. In this chapter, we will provide an overview of the ethical aspects of artificial virtual assistants. First, we provide a conceptual clarification of the term ‘virtual assistant’, including different types of interfaces and recommender systems. Second, we address three ethical issues related to the deployment of these systems: their effects on human agency and autonomy and the subsequent cognitive dependence they would generate; the human obsolescence that a generalized extension of the dependence problem may cause; and the invasions of privacy that virtual assistants may cause. Finally, we outline the debates about the use of virtual assistants to improve human moral decisions and some areas in which these systems can be applied.

5.1 Introduction

Virtual assistants are everywhere, helping us in the way we shop, write, look for information, find new friends and even love, maintain a healthy lifestyle, educate our children, clean our houses, drive our cars, and a long—and continuously increasing—list of tasks that they can facilitate.

Although they might seem something rather new, unveiled with the arrival of well-known systems such as Alexa, Siri, or Echo, virtual assistants have been around since the early days of computing, as Expert Systems used in very specific domains like finance or medicine. When computers became widely available, and
particularly with the advent of the Internet, we saw the emergence of applications that helped us in different tasks, from the primitive (and often hated) Microsoft Office assistants to the initial Recommender Systems used primarily in commercial websites. The range of tasks they can be used for, as well as their performance and the complexity of their interface (i.e., how we interact with them), have been increasing ever since and, with the use of Artificial Intelligence in the last decade, they have evolved to the multipurpose, conversational, ubiquitous, smart assistants that we know today. It does not stop here; on the contrary, new technical developments like generative AI (e.g., ChatGPT) or the expected “quantum leap” in computing power will lead us to unlock yet-to-be-discovered applications of virtual assistants.

These developments promise huge benefits for individuals and for society at large, but they also entail several issues, some already evidenced in the last years, which have raised significant public concern. This includes voice assistants sending fragments of people’s discussions to the system’s developer, or the disclosure of images taken by vacuum cleaners, which are a clear breach of our privacy, regardless of what their purpose is. The Cambridge Analytica scandal is another issue that received widespread media coverage. In this case, user profile information was allegedly used to create targeted political advertisements to influence the 2016 US presidential election and the Brexit referendum. They used recommender systems to target users with personalized political ads based on their personality traits, preferences, and behaviors, clearly entailing both privacy and manipulation problems.

This chapter discusses the ethical impact of virtual assistants, including, but not limited to, the aforementioned problems. First, we briefly characterize virtual assistants, highlighting their most important features for the purpose of an ethical assessment. Then, we dive into prominent issues concerning these systems. Among the different problems often discussed in the AI ethics domain, we have selected human agency and autonomy, human obsolescence, and privacy and data collection as the most relevant ones in the context of virtual assistants. Finally, since increasing philosophical literature is dealing with the impacts of AI-powered virtual assistants on human morality, we devote the last section to a specific case thereof: the ethics of using AI to support moral decision-making and to improve moral abilities.

5.2 What Are Virtual Assistants?

The aspiration to relate and converse with machines is not new. Three centuries ago, we already found attempts to reproduce human speech and language in artificial entities (Guy 2022). That dream would be fulfilled in the mid-twentieth century: Bell Laboratories created Audrey, an automatic speech system that was able to recognize digits coming from a single voice (Pinola 2011). After Audrey came others, such as Shoebox, whose progressive advances have eventually led us to Siri and Alexa, the most used virtual assistants nowadays.

The common usage of the term ‘virtual assistant’ generally refers to voice assistants (Wald et al. 2023). Voice assistants are defined as all those software agents that are able to recognize spoken language, identify commands and queries, and perform tasks accordingly (Hoy 2018). The tasks can be very diverse and have very different levels of complexity: from sending and reading messages to telling jokes and setting alarms. However, the equivalence between virtual assistants and voice assistants incurs the well-known synecdochic fallacy (Berry and Martin 1974): the part is taken for the whole. The growing popularity of Siri, Alexa, Cortana, or Google Assistant has led to thinking, in common usage, that virtual assistants are voice assistants; when, on the contrary, voice assistants are just one of the devices that can be categorized as virtual assistants (Rafailidis and Manolopoulos 2019). This is the main source of misunderstandings about the concept of ‘virtual assistant’, but it is not the only one. That is why we will now clarify three types of misunderstandings to which this concept is usually subject: the input-output models; the underlying functions; and the type of support.

First, input models refer to the initial form of the information units that are captured by the virtual assistant, while output models shape the way in which the results of the task carried out by the system are communicated (Spallazzo et al. 2019). For example, in current voice assistants, this process takes place through the following steps: deep learning strategies are used to develop Automatic Speech Recognition (ASR) models; to subsequently translate speech to text and capture the meaning of the command or query through Natural Language Understanding (NLU) models; this information is processed by inference engines and interaction models to produce the final output; and finally, it is translated again through Natural Language Generation (NLG) and Text to Speech Synthesis (TTS) to be understood by the user (Kepuska and Bohouta 2018). It seems, in light of these types of systems, that the interfaces of virtual assistants can only recognize spoken and written language and that their responses are translated into written and/or spoken language. However, the possibilities of virtual assistants are much greater. Input models such as graph models, which analyze videos and images in real-time (Schmidt et al. 2018), or gesture models, which allow the analysis of users’ body movements and facial gestures (Someshwar et al. 2020), can fall within the possibilities of virtual assistants. The combination of many of these models is giving rise to multimodal Next-Generation Virtual Assistants (Kepuska and Bohouta 2018).1 (A minimal code sketch of such a pipeline appears at the end of this section.)

1 However, due to the poor development and low commercialization of other types of virtual assistants, the technical literature usually speaks of two typologies: smart speakers and chatbots (Rafailidis and Manolopoulos 2019).

Second, the underlying goals of virtual assistants refer to the main purposes they should satisfy with their operation. This issue is particularly complicated to define for two reasons: first, the variability of functions and tasks that virtual assistants can perform is enormous; second, the technical development of these types of devices means that they are continually introducing new changes, which makes it impossible to have a closed picture of the functions and tasks they can perform. Precisely because of this breadth of functions, we propose to incorporate recommender
systems into the concept of ‘virtual assistant’. To explain the feasibility of this approach, it is essential to distinguish between a technical sense and a conceptual sense. In the technical sense, virtual assistants and recommender systems are distinct. This is mainly because the technical efforts regarding virtual assistants focus on the best ways to recognize spoken, written, or body language and respond to queries, commands, or situations that are recognized; whereas recommender systems are about streamlining and facilitating the flow of information on the network to prioritize user preferences (Lü et al. 2012; Park et al. 2012; Ricci et al. 2011). In particular, many important differences can be identified: the type of information needed, informational scarcity about user preferences, or adaptation to changing preferences (Rafailidis and Manolopoulos 2019). Virtual assistants usually rely on a broader and more general database, the internet, and do not have to customize their databases diachronically or synchronously, as is the case with recommender systems.2 However, this has not prevented technical projects that take both types of systems at the same time. It is argued that the development of a virtual assistant linked to a recommender system would greatly improve the quality of the latter, as it could make explicit—through conversations in which the system would elaborate questions to resolve possible ambiguities and clarify desires—the preferences of users (Darshan et al. 2019; Güell et al. 2020; Jannach et al. 2021).

2 By both terms, we refer to the two main types of recommender systems: content-based and collaborative. The former orders certain items according to their affinity with specific contents and preferences (synchronously), while the latter prioritize adaptation to user preferences over a period of time (diachronically). Hybrid approaches are also possible (Ricci et al. 2011).

The conceptual sense is different. Our aim is to delimit the object of ethical research and to analyze, through their technical and social particularities, what kinds of ethically relevant features and consequences derive from virtual assistants. We believe that this work invites us to understand recommender systems as a type of virtual assistant. We think this is the case for two reasons: (i) the problem of ethical proliferation; (ii) the weak ethical uniqueness of the difference. On the one hand, Henrik Sætra and Danaher (2022) have warned of the danger of fragmenting ethical discussions of a general nature into different ethical sub-disciplines or artifacts with morally relevant consequences. Ethical problems related to issues such as privacy, manipulation, or human obsolescence, which we will address in the next section, are common to many subdisciplines and technological artifacts. It is thus a mistake to treat them separately when their meaning remains general. This is the case of virtual assistants and recommender systems. On the other hand, each ethical artifact and sub-discipline has a specific ethical importance, so that, although purified of those problems that can only be answered at a more general level, a thorough ethical analysis must show this specificity and distinguish between higher and lower levels of analysis (Llorca Albareda and Rueda 2023). The generality of ethical issues concerning virtual assistants and recommender systems can be extended to other AI and technological artifacts, but not in a way that can only be discussed at a general level. Recommender systems and virtual assistants possess an ethical uniqueness that invites particular scrutiny; but they are
not singular separately; rather, both are jointly singular with respect to other AI, technological artifacts, and ethical subdisciplines. And this is due, we believe, to their ubiquity, their role as cognitive delegates, and their general function. First, virtual assistants and recommender systems are present in the most widely used devices and applications (Rafailidis and Manolopoulos 2019) and their degree of pervasiveness is not as clear as with other systems whose hardware plays a more important role (Weiser 1991). Second, they are two sides of the same phenomenon. The technical approaches of virtual assistants emphasize the most effective ways to communicate with machines, while those of recommender systems focus on narrowing choices based on consumer preferences. However, both seek to assist humans in performing certain tasks, be they the acquisition of certain information or, conversely, the filtering of information based on user preferences. In particular, both recommender systems and commonly understood virtual assistants have cognitive delegation as their main function: they are elaborated in order to outsource certain tasks and mental operations to an AI. The specific objective and the modes of communication and interaction are therefore part of the same general function. Hence, we understand that, as a type of specific goal and task, a recommender system can be conceived as a type of virtual assistant. Thirdly, this broadening of the concept of virtual assistant must at the same time underline the role and importance of the hardware. We pointed out earlier that one of the fundamental ethical aspects of virtual assistants—understood in the broad sense just described—is their ubiquity. This ubiquity is accompanied by a wide degree of pervasiveness that, by virtue of being eminently virtual, goes largely unnoticed (Weiser 1991). This aspect is fundamental in moral terms and differentiates it from ethical problems associated with robots. Robots (a) can intervene in the external state of affairs and (b) possess a robotic body that occupies a certain place in space (Lin et al. 2014). This makes their workings easier to identify, and their ethical problematics take on a different tinge. Thus, this element constitutes a relevant ingredient when conducting an ethical analysis of virtual assistants and stands out as one of the most important aspects of their ethical uniqueness.3 In sum, the clarification of the three misunderstandings that usually accompany the understanding and common use of the term ‘virtual assistants’ has led us to conceive of them, in the ethical analysis we will carry out in the next section, as follows: as AI systems that, firstly, can recognize different types of linguistic and non-linguistic inputs as units of information and produce different types of linguistic outputs; secondly, as AI systems that perform various tasks that support human beings; and, thirdly, as AI systems that are not strictly linked to hardware due to their virtual character.4

3 This is not to say that the hardware of virtual assistants is unimportant. Some of them, such as Alexa, are “embodied” and invite certain kinds of physical relationships with them. For further development of the issue, see Spallazzo et al. (2019).
4 It should also be noted that, apart from the multiplicity of functions and tasks, they can be used in different domains (Barricelli and Fogli 2021; Lugano 2017; Wald et al. 2023). However, some have questioned whether the transfer of virtual assistants to other types of activities is appropriate due to the increasing specificity of their operations (Tenhundfeld et al. 2021).
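To make the preceding characterization more tangible, the following minimal sketch illustrates the kind of speech-oriented input-output pipeline described in this section: recognize the input, interpret the command, execute a task, and generate a response. Every class, function, and intent name here is a hypothetical placeholder invented for illustration, not any real assistant's API; actual systems implement each stage with trained ASR, NLU, and NLG/TTS models rather than the toy rules used below.

```python
# Illustrative sketch of one interaction turn in a voice assistant (hypothetical names).
# Real assistants replace each stage with trained models (ASR, NLU, NLG, TTS).

from dataclasses import dataclass


@dataclass
class Intent:
    name: str    # e.g., "set_alarm"
    slots: dict  # extracted parameters, e.g., {"time": "7 am"}


def automatic_speech_recognition(audio: bytes) -> str:
    """ASR stage: audio in, text out (stubbed here with a fixed utterance)."""
    return "set an alarm for 7 am"


def natural_language_understanding(text: str) -> Intent:
    """NLU stage: map the utterance to an intent and its slots (toy rule)."""
    if "alarm" in text:
        return Intent(name="set_alarm", slots={"time": "7 am"})
    return Intent(name="unknown", slots={})


def execute_task(intent: Intent) -> str:
    """Inference/interaction stage: carry out the task and report the result."""
    if intent.name == "set_alarm":
        return f"Alarm set for {intent.slots['time']}."
    return "Sorry, I did not understand that."


def natural_language_generation(result: str) -> str:
    """NLG stage: phrase the result for the user; TTS would then synthesize speech."""
    return result


def assistant_turn(audio: bytes) -> str:
    """One turn: ASR -> NLU -> task execution -> NLG (-> TTS in a real system)."""
    text = automatic_speech_recognition(audio)
    intent = natural_language_understanding(text)
    result = execute_task(intent)
    return natural_language_generation(result)


print(assistant_turn(b"<audio bytes>"))  # -> Alarm set for 7 am.
```

The same loop generalizes to the non-speech input models mentioned above (images, gestures) by swapping the first stage, which is what the multimodal assistants referred to earlier do.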

5.3 What Ethical Issues Do Virtual Assistants Raise?

The last section provided an insight into the main characteristics of virtual assistants. It highlighted that they are being used for a wide range of tasks, which is continuously increasing due to their technical evolution, particularly since the advent of AI. Indeed, virtual assistants represent a paradigmatic use case of AI and entail several problems usually found in the AI Ethics literature such as data collection, privacy, fairness, transparency, explainability, human agency and oversight, and value alignment. Since many of these problems are addressed elsewhere in this book and detailing each one is beyond the scope of this chapter, we will focus on those issues particularly relevant to the use of virtual assistants and the cognitive offloading they entail. This will lead us to consider the virtual assistants’ impact on the user’s agency and autonomy and the problems usually addressed under this label, like manipulation, cognitive degeneration, and technological dependency. This will provide the grounds to assess the longer-term effect of human obsolescence due to the continuous increase in the tasks being outsourced to AI Assistants. Finally, we will deal with privacy and data collection issues, which have often been in the spotlight in recent years.

5.3.1 Human Agency and Autonomy

5.3.1.1 Manipulation and Undue Influences

Manipulation with virtual assistants arises when the system performs its tasks in line with hidden third-party goals that are not necessarily aligned with the user’s preferences, values, or desires. This issue has been systematically analyzed by Christopher Burr and colleagues (Burr et al. 2018). Their often-cited work assesses the interaction between Intelligent Software Agents (ISA), a concept largely coextensive with virtual assistants, and human users, and provides a taxonomy of ISA/user interactions including different forms of direct manipulation:

• Deception
• Persuasion, best represented by nudging
• Coercion

Deception occurs when a decision is based on false or misrepresented information. Most examples involve the use of commercial information disguised as relevant advice, with so-called clickbait—i.e., content designed to attract the user’s attention and generate clicks on a particular link—being a paradigmatic example. The use of AI has led to more sophisticated deception attempts, particularly those based on a
more accurate user segmentation, also known as micro-targeting. This technique uses the result of the correlation of a large amount of data from the user to dynamically predict the information that would attract their attention. Nevertheless, due to its immediate effect, deception is normally detected straightforwardly, which leads to more limited impacts.

Persuasion with nudges is another type of manipulation. Nudges are designed to exploit known biases below the level of the person’s awareness, therefore effectively bypassing their deliberation capability (Thaler and Sunstein 2008). Nudges work because the resources that a person can devote to decision-making (e.g., deliberation time, manageable information) are limited and most of our decisions are based on cognitive shortcuts and heuristics (Kahneman 2011). Thus, presenting the space of options, or “choice architecture” (Thaler and Sunstein 2008), in a specific way directly influences that decision-making. Traditional examples of nudging include placing healthier food more prominently by default, using different sizes and colors for different options, arranging diverse suggestions in a particular way, and default opt-ins for some allegedly better options (Thaler and Sunstein 2008). Nudges can be well-intentioned, and Thaler and Sunstein portray many examples of their good use when introducing them within their liberal paternalism5 proposal. However, in the last few years, the use of AI techniques has changed how the choice architectures are presented, making them more dynamic (i.e., providing a real-time change of the space of options) and based on the user’s feedback and predicted preferences. This has led to so-called hypernudges, an extremely effective type of nudge, which should be clearly regulated as their effects cannot be controlled easily by the user (Yeung 2017).

5 Liberal paternalism (Thaler and Sunstein 2008) is a concept in behavioral economics and political philosophy that suggests that it is possible to nudge people towards making better choices without infringing on their freedom of choice. It involves designing policies and interventions that steer people towards choices that are in their best interests, while still allowing them the freedom to make their own decisions.

Direct coercion with virtual assistants occurs when the system requests something to access its services. This can be details about the user’s identity from Google or Facebook, the acceptance of endless pages of incomprehensible terms and conditions, or the prior viewing of a promotional video or advertisement (Burr et al. 2018). There are other, less straightforward examples of coercion such as the use of virtual assistants due to peer pressure to align with current trends or to enhance performance at work. These cases, not directly linked to the system’s design, might lead to dependency problems, which might be particularly important depending on the task being delegated (see below).

We now turn to the longer-term effects of the use of virtual assistants. When we refer to those systems powered by AI, it is common to use the concept of “utility” for both the user and the system, and the problems entailed by their potential misalignment. Whereas utility in a human being refers to the subjective utility and is mainly linked with their preferences, values, and desires, utility in AI-powered systems refers to the utility function, i.e., the goal that the system will attempt to
maximize. Indeed, a key feature of this technology is that the systems are not programmed but trained with a specific goal, their inner workings—i.e., the trained model—being largely opaque to their users and even their designers. The main issue is that, even if the system designers were committed to fully aligning with the user’s subjective utility, it would not be possible to translate ambiguous and seldom explicit preferences, values, and desires into a utility function. Rather, the development process is based on maximizing an allegedly relevant indicator (e.g., user engagement, explicit feedback, and so on) used to train an artificially generated model. This might lead to some unintended consequences. A clear example is the case of recommender systems trained to maximize the user’s attention or engagement. A priori, one could think that the length of time a user is using the system is a measure of their satisfaction. A posteriori, one may realize that the best way to succeed in this goal is to provide the user only with what they like, potentially resulting in a polarization and a radicalization of the user’s preferences and values. Interestingly, in most cases of polarization, the system designers do not try to manipulate their users; polarization is a non-intended effect of a system trained with a wrong (although not necessarily malicious) goal (a toy sketch at the end of the next subsection illustrates this mechanism).

5.3.1.2 Cognitive Degeneration and Dependency

Human cognitive degeneration is the consequence of not exercising the cognitive capabilities that we are outsourcing to the AI Assistant. For example, some of us have become unable to find an address or read a map since Google Maps is in everyone’s phone, or we might be unable to write without grammar or spelling mistakes without MS Word Editor. As already mentioned, with the use of AI the range of tasks that virtual assistants can carry out has multiplied. Think for example of Generative AI (e.g., ChatGPT) that can not only correct your text but write the text on your behalf. Will this kind of assistance lead to a degeneration in people’s writing skills, and what exactly is the issue?

John Danaher has analyzed the degeneration effect caused by the use of virtual assistants (Danaher 2018). He echoes previous accounts of this effect, mainly relying on the degeneration argument as presented by Nicholas Carr in his book The Glass Cage (2014). Although Danaher accepts the deskilling risks resulting from the use of virtual assistants, he invites us not to take the slippery slope that leads us to think that we will become a “chronically distracted and suckers for irrelevancy” type of being, unable “to think deep thoughts by ourselves” (Danaher 2018). He takes the traditional concepts of instrumental and intrinsic value from moral philosophy to differentiate between instrumentally and intrinsically valued tasks. He contends that “the intrinsic and instrumental rewards of thinking are dissociable, and this has an important impact on the ethics of AI outsourcing” (Danaher 2018). Intrinsically valued tasks are those directly linked with the user’s well-being and characterized by a sense of reward when they are accomplished, whereas instrumental tasks are those that enable us to get other good things, and their outsourcing seems
less problematic, even if it entails the abovementioned degeneration effect. A remaining issue, also identified by Danaher, is the dependency problem. Indeed, even for purely instrumental tasks, the degeneration effect might have important consequences for the user's well-being if the AI assistant were temporarily or permanently unavailable.

From a practical point of view,6 the burden of deciding which tasks are delegated should fall to the user. For example, as philosophers, writing this chapter in the hope that the concepts we explain are clear and useful for future readers is intrinsically valuable for us and entails the sense of reward directly linked with our well-being, as mentioned above. This is something that we do not find in other activities that we consider purely instrumental, such as finding the shortest path somewhere. Thus, we should not rely on ChatGPT to write our texts, but we certainly may use—and be dependent on—Google Maps, and if our phones run out of battery, we can ask someone for directions or just be late.

6. We avoid entering into the theoretical discussion about whether intrinsically valuable tasks exist or not.
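As a minimal illustration of the proxy-objective problem discussed in Sect. 5.3.1.1, the following toy simulation (our own sketch, not a description of any deployed recommender) shows how a system that only maximizes predicted engagement tends to narrow what a user is shown. The topics, numbers, and update rule are illustrative assumptions.

    import random
    from collections import Counter

    random.seed(0)
    TOPICS = ["politics_a", "politics_b", "sports", "science", "music"]

    # A toy user whose initial interests are evenly balanced.
    interests = {topic: 1.0 for topic in TOPICS}

    def predicted_engagement(topic: str) -> float:
        # The proxy indicator: expected engagement, approximated here by the
        # user's current interest in the topic plus a little noise.
        return interests[topic] + random.uniform(0.0, 0.1)

    shown = []
    for _ in range(200):
        # Greedy policy: always recommend whatever maximizes the proxy.
        choice = max(TOPICS, key=predicted_engagement)
        shown.append(choice)
        # Feedback loop: consuming a topic slightly reinforces the measured
        # interest in it, so the same topic becomes ever more likely.
        interests[choice] += 0.05

    print(Counter(shown))   # one topic quickly dominates the feed

Nothing in this loop is malicious; the narrowing emerges solely from optimizing the engagement proxy, which is precisely the sense in which polarization can be a non-intended effect of a system trained with the wrong goal.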

5.3.2 Human Obsolescence

One of the most well-known risks of AI is human obsolescence. It is predicted that AI will reach such high levels of cognitive performance that it will be able to do all tasks better than humans (Agar 2019).7 Most analyses on this topic have taken labor and the danger of complete automation as their main object of discussion (Danaher 2019a). It has been mostly economists and sociologists who have participated in them. Three types of positions can be identified (Boyd and Holton 2018): (i) researchers who state that the transformations of AI and robotics repeat the trends of other technological revolutions (Cowen 2011); (ii) those who argue that AI and robotics do entail profound changes, but that these will bring great benefits to humanity (Brynjolfsson and McAfee 2014); and (iii) those who adduce that, in the face of the profound changes accompanying AI, numerous existential dangers lie ahead (Ford 2015). The ethical analysis of human obsolescence has, however, been more limited (Agar 2019; Danaher 2019a, 2022; Sparrow 2019, 2015).

7. The displacement and replacement of humans by AI need not occur only at the individual level, i.e., that an AI can single-handedly surpass all the capabilities of a human being to perform a given task. It can also occur at the system level: AIs with certain capabilities might work together to do everything that used to be done by a human (Danaher 2019a).

Virtual assistants may become one of the technologies that most threaten the importance of humans in many spheres of activity. With a simple verbal or written request, the virtual assistant could perform many tasks at greater speed and with a higher degree of efficiency. To see what is real about these dangers and how we can understand them ethically, let us start with the automation paradox. This indicates that the better and more useful AI is in complementing and optimizing our
work and tasks, the more we are threatened by the danger of obsolescence (Danaher 2019a). This paradox can be subdivided into two: the paradox of complexity and the paradox of complementarity. The first indicates that the more complex an AI performing a certain task is, and the more developed its rational capabilities are, the closer its properties will be to those of humans and the more difficult it will be to see it as a simple tool. The second refers to the fact that the better an AI performs its complementary task, the more danger there is that it will be able to do the task better on its own than as a complement to humans. Each paradox presupposes a different conception of human obsolescence. We will argue that only the second conception is ethically acceptable and show what ethical positions can be developed from it.

At a generic level, obsolescence is used in common speech to refer to an object that ceases to be used or produced. Human obsolescence refers, by extension, to the fact that human capabilities are no longer suitable for the performance of certain tasks.8 This presupposes a certain scale of values: one ceases to be useful and adequate for the performance of something (Sparrow 2019). The complexity paradox presupposes that general human obsolescence is possible, i.e., that humans can become obsolete in the performance of any activity (Danaher 2022). This is problematic for three reasons. First, although the degree of autonomy of AI has increased greatly with the development of deep learning and artificial neural networks, the advent of the Singularity remains distant and difficult to forecast (Boden 2016). Second, general obsolescence implies that either humans are defined by a single value scale, or all value scales will be co-opted by AI. The historicity and plasticity of human beings raise serious objections to the first option (Danaher 2022), and the second would imply the emergence of a superintelligence, which would lead us back to the first reason. Third, even if AI were instrumentally superior in certain spheres of activity and value scales, that does not mean that these cannot remain valuable to humans, that is, intrinsically valuable (Danaher 2022). The existence of unbeatable AI at chess, for example, has not led humans to stop playing chess. Derek Thompson (2015) has shown how craftsmanship, coupled with total control of the production process by the worker, is considered an intrinsically valuable activity beyond the remuneration it brings.

Thus, given the difficulty of the scenario of general human obsolescence, we will limit ourselves to an ethical analysis of the possibility of narrow human obsolescence. This is the presupposition underlying the paradox of complementarity. It is defined as a state of affairs in which human capabilities cease to be useful and adequate for the performance of some tasks (Danaher 2022). AI would cease to be complementary to human activities and be able to outperform humans unaided and autonomously. This is not only a future possibility: we already find economic sectors in which a good part of the jobs is susceptible to automation (Acemoglu and Restrepo 2018; Danaher 2019a; Frey and Osborne 2017).
8. Danaher (2022) argues that human obsolescence can occur in three ways: when internal human capabilities disappear or diminish; when standards or expectations about what capabilities must be possessed to perform a given task increase; and because of the failure of society as a whole to create new economic and social possibilities for human capabilities to be accommodated.

Rather than engaging in factual discussions about the present and the future, we will be concerned with showing the normative positions that follow from the narrow human obsolescence that may result from the use and proliferation of virtual assistants. Is narrow human obsolescence desirable?

On the one hand, it can be argued that it is not, for two reasons: its individual undesirability and its social undesirability. First, virtual assistants can lead to deskilling. That is, by outsourcing human tasks to them, it is possible that some of the capabilities that were previously considered valuable may cease to be developed to the same degree as in the past. These capabilities can be of three types: cognitive, moral, and interactional. At the cognitive level, capabilities such as personal reflection or mathematical thinking require continuous effort and practice that may be lost when these tasks are outsourced to virtual assistants. At the moral level, ethical advisors (Giubilini and Savulescu 2018) may negatively affect the moral capabilities of human beings. The user would simply enter certain values into the system and wait for the system to give the most consistent courses of action in accordance with them. Critical thinking (Lara 2021; Lara and Deckers 2020) or capacities linked to moral agency (Danaher 2019b) could be severely affected. At the interactional level, romantic or friendship relationships with virtual assistants may involve deception and manipulation. The interlocutors would not possess the inner qualities necessary to engage in a true relationship (Nyholm 2020). Second, virtual assistants could have very harmful effects at the cultural and economic level. They could destroy numerous jobs and involve profound social transformations. Moreover, contemporary culture is eminently contributivist, i.e., the sources of social satisfaction are to be found in the contributions that individuals make to society (Danaher 2022). To the extent that these contributions are no longer possible, general dissatisfaction will grow.9

9. If we follow the ways in which human obsolescence occurs as described in footnote 8, a large part of these problems stems from the excessive increase in social standards with the advent of AI. However, even if it is not the cause of obsolescence, the effect of obsolescence is the worsening of certain human capabilities.

On the other hand, narrow human obsolescence may be desirable. We find five reasons in the literature. First, it may make virtual assistants take over unpleasant or worthless social activities. Danaher (2019a) argues, for example, that work brings so much trouble and suffering that a world without human labor would be desirable: (i) virtual assistants can perform dull, dirty, and dangerous jobs; (ii) in a world where working conditions are precarious and psychologically demanding, virtual assistants can prevent humans from working under such conditions. Second, the benefits from the development of AI, particularly of virtual assistants, may be so great that it is not desirable to put up barriers to these technologies. Third, as stated above, an activity for which a human being becomes instrumentally obsolete need not lose its value. In fact, it may be intrinsically valuable (Danaher 2022). Fourth, a new culture of obsolescence may be desirable. A culture that locates the sources of social satisfaction in external standards is dangerous. This transformation may lead
to more emphasis on those activities with intrinsic value.10 Fifth, virtual assistants need not worsen our capabilities; they can also enhance them. As we will see in Sect. 5.4, virtual assistants can even play a role in improving some of our moral abilities.

10. Danaher (2022) gives the following example. Let us assume that we can only be self-fulfilled to the extent that we collaborate in socially recognized goals. We will feel good to the extent that we participate in these socially recognized goals. If we lived in a society with morally reprehensible goals, such as a Nazi society, we would hardly argue for a contributivist culture. Therefore, it is appropriate to value activities in themselves and not for the social benefits they bring.

5.3.3 Privacy and Data Collection

When interacting with users, virtual assistants collect personal data. The nature of this data and its collection, together with possible subsequent uses by the companies managing virtual assistants, raise privacy issues. Privacy is "the quality of having one's personal information and one's personal 'sensorial space' unaccessed" (Véliz 2019). We are particularly interested in informational privacy. Broadly speaking, informational privacy refers to the possibility of keeping certain information about ourselves away from the knowledge or observation of others. Since virtual assistants can collect information about us that we would prefer not to be available to others, these devices can pose a threat to our privacy.

But why is privacy an important ethical concern? The value of privacy has been a widely discussed topic in moral philosophy and the philosophy of law. There are different reasons for seeking to be free from intrusions that violate our privacy. For Rachels (1975), privacy is important because of the function personal information plays in social relationships. Many of our social roles are sustained because of the information we give and withhold about ourselves. Intimate relationships, for example, are built on and tightened with some doses of more sensitive personal information that we would hardly want to be available to everyone. Being able to control what personal information others access, therefore, is important for maintaining different sorts of meaningful social relationships. Moreover, others have argued that privacy is relevant on ethical grounds because it is fundamentally connected to personhood. Jeffrey H. Reiman (1976) claimed that the protection of privacy is based on a set of social rituals that allow the "continuing creation of 'selves' or 'persons'" (p. 40). In addition to its value for maintaining social relationships and for the development of our personal identity, privacy is important for autonomy. According to James Griffin (2007), privacy facilitates normative agency since it allows for developing confidence in our capacities to make autonomous decisions. Being subject to the observation of others undoubtedly conditions our behavior, as attested by renowned philosophical works (Bentham 1791/1995; Foucault 1975/1977).

Virtual assistants raise privacy issues in ethically problematic ways. For instance, consider the example of personal assistants (such as Alexa, Siri, or Google Assistant). Personal assistants based on voice-activated systems have generated substantial privacy-related concerns. These assistants have powerful microphones and are activated through audio input. Virtual private voice assistants are "always on, always listening" (Anniappa and Kim 2021). The reason is that these devices are constantly waiting for the wake-up word. After hearing the wake-up word, personal assistants record the conversation that follows the activation and upload it to the Internet, which may also occur through unintentional activation by a similar-sounding phrase (Anniappa and Kim 2021). A schematic sketch of this wake-word loop appears at the end of this section.

Privacy issues are then sometimes related to vulnerabilities in cybersecurity. There could be attacks on the cloud servers that store users' personal data (which is allegedly important to improve the device's performance), producing a breach of privacy. Other cybersecurity problems may involve the direct and malicious hacking of virtual assistants. Given that many voice assistants are located in domestic environments, they raise the risk of diminishing users' privacy in a context that is traditionally considered particularly privacy-respectful. In consequence, if the risk is perceived as considerable, consumers may diminish their trust in virtual assistants (Wilson and Iftimie 2021).

Finally, there is a more systemic aspect of privacy regarding the developers of the most widely used virtual assistants. The companies that develop the most popular digital technologies may have contentious business models based on the collection and commercialization of personal data. Big tech companies have a controversial reputation as "data predators", with many cases of abuses disregarding users' personal privacy (Véliz 2021). Thus, data capitalism and the potential for surveillance by malicious actors elicit further problems of privacy regarding virtual assistants, unearthing the underlying unequal power relations between citizens, big corporations, and governments (Véliz 2021).

All in all, intruding on personal privacy seems ethically undesirable in the absence of justified reasons. On ethical grounds, robust protection of the right to privacy of virtual assistants' users therefore seems crucial. The translation of this sensitive issue into the legal sphere may take the form of strict regulation of the collection of personal data in interactions with these digital devices, such as the European Union's General Data Protection Regulation of 2016.
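The following sketch makes the "always on, always listening" point from the preceding paragraphs concrete. It is a deliberately simplified toy, not the code of any actual assistant: the wake word, the simulated audio stream, and the helper functions are all illustrative assumptions.

    WAKE_WORD = "hey assistant"   # illustrative wake word

    # Simulated audio, standing in for a continuous microphone stream:
    # most chunks are ambient sound, one contains the wake word.
    AUDIO_STREAM = [
        "tv playing in the background",
        "hey assistant what is the weather tomorrow",
        "private family conversation",
    ]

    def spot_wake_word(chunk: str) -> bool:
        # Stand-in for on-device keyword spotting; in real devices only a
        # small local model inspects the audio buffer at this stage.
        return WAKE_WORD in chunk

    def upload_to_cloud(chunk: str) -> None:
        # Stand-in for sending the recording to the provider's servers,
        # where it may be stored, transcribed and reused.
        print("uploaded:", chunk)

    for chunk in AUDIO_STREAM:      # the device is always listening...
        if spot_wake_word(chunk):   # ...but only acts on the wake word,
            upload_to_cloud(chunk)  # and only then does audio leave the home

The privacy worry discussed above sits exactly at the boundary drawn in the last three lines: a misfired wake-word detector, or an attacker who tampers with it, moves audio from the local buffer to remote storage without the user's intention.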

5.4 Should We Use Virtual Assistants to Improve Ethical Decisions?

Among the many applications of virtual assistants, their use in influencing human morality is a matter of growing interest. Can AI assist us with moral decisions? Could we turn to AI for moral advice? What theoretical models have been proposed for AI-based ethical assistants or moral advisors? What are the ethical issues in the
use of these systems? In this section, we offer a brief introduction to the ethical debate on AI-aided moral improvement and decision-making.

Many proposals to use AI assistants in the field of human morality were born in relation to the moral enhancement debate (Savulescu and Maslen 2015; Klincewicz 2016; Giubilini and Savulescu 2018; Lara and Deckers 2020; Rueda 2023). Moral enhancement proposals had focused primarily on the use of drugs, neurotechnologies, or genetic innovations to positively influence cognitive, emotional, and motivational aspects of morality (see Rueda 2020, for a review of the debate). This is what has been known as 'moral bioenhancement', because it was based on biotechnological methods. The use of AI was, according to these publications, opening up new possibilities to influence moral behavior in a beneficial way, even overcoming some of the criticisms of moral bioenhancement, such as that a biologically based change in moral conduct is incomplete if it does not provide agents with good reasons to behave in that way (Lara and Deckers 2020; O'Neill et al. 2022).

The very term 'moral enhancement', however, generates certain difficulties. Seemingly descriptive definitions of moral enhancement are used to mask normative views about what is most desirable in evaluative terms (Raus et al. 2014). Moreover, the very concept of 'enhancement' often refers to interventions that lead to certain capabilities beyond what is normal or typical (in a statistical sense) in a population (Daniels 2000; Rueda et al. 2021). However, although the degree of normality of particular moral functioning can be studied empirically, this type of framing is again clouded by the normative controversy over which capabilities are relevant in moral terms. Thus, many contributions dealing with AI assistants for making more ethical decisions have moved away from this rhetoric about moral enhancement (O'Neill et al. 2022). Here, accordingly, we will attend to different proposals by which AI can assist in human morality, without necessarily enhancing it.

Let us start with a proposal by Julian Savulescu and Hannah Maslen (2015). In a seminal publication, these authors argued that the limitations of human moral psychology should encourage us to turn to AI systems. Savulescu and Maslen argued that AI could adopt different functions to improve moral decision-making. First, the "moral environment monitor" function would alert the agent to the presence of suboptimal environmental, physiological, and psychological factors that may negatively influence and impair decisional capabilities, such as sleep deprivation, hunger, physiological arousal, hormonal changes, psychoactive influence, environmental heat, crowdedness, or noisiness. Second, the function of "moral organizer" would help to establish and achieve a series of moral goals proposed by the agent. Third, the function of "moral prompter" would help to deliberate on morally problematic situations. And fourth, the "moral advisor" function would help to choose a course of action based on the agent's preferences and values. Savulescu and Maslen, in short, pointed out that AI systems could help us to behave more morally, while respecting the moral autonomy of agents.

The idea of using AI-based ethical advisors has gained further traction. In another important publication, Alberto Giubilini and Julian Savulescu proposed the model of the "artificial moral advisor" (Giubilini and Savulescu 2018). This proposal partially took up Roderick Firth's famous idea of the "ideal observer", namely, the
concept of an observer with ideal qualities who would be omniscient of the facts, omnipercipient in the sense of being able to take into account all information simultaneously, dispassionate, and consistent in judging (Firth 1952). The artificial moral advisor, according to Giubilini and Savulescu, would try to emulate some of these characteristics, consisting of AI software that outperforms the human brain in speed and efficiency in giving moral advice that takes into account the given moral preferences of individuals. Furthermore, more recent proposals have looked at particular issues related to seeking moral advice from AI assistants. For example, some have advocated labelling as "moral experts" the AI devices that we might have an interest in approaching for moral advice (Rodríguez-López and Rueda 2023). The ethical justification for turning to AI systems for moral advice is similar to the rationale we may have for relying on other humans for the same purpose.

AI-based moral advisors, however, are not exempt from ethical problems. The types of ethical issues are in fact conditioned by the specific forms of the envisioned assistants (O'Neill et al. 2022). Some proposals have tried to overcome certain problems of those just presented. Perhaps the most paradigmatic case is that of the "Socratic assistant" for moral education offered by Francisco Lara and Jan Deckers (2020), also called "SocrAI" (Lara 2021). According to Lara and Deckers, the proposals by Savulescu and Maslen (2015) and Giubilini and Savulescu (2018) were problematic in an ethical sense: although supposedly intended to respect and implement users' values, they relegate the agents to a secondary role in decision-making. To overcome this deficiency, the Socratic assistant (i.e., an AI-based voice assistant) would allow users to improve their moral decision-making skills through a conversation, but without receiving a direct indication of how to act. In this process of Socratic dialogue with the AI system, agents could receive empirical support, increase their conceptual knowledge, understand argumentative logic, learn about their personal limitations, determine the plausibility of a moral judgment, and receive advice on how to execute a decision. This would respect and reinforce the autonomy of moral agents, offering a kind of training through discursive and deliberative interaction with the Socratic assistant.

Given the differences in previous models of AI-based ethical assistants, Elizabeth O'Neill, Michal Klincewicz, and Michiel Kemmer have distinguished between two main types of assistance (O'Neill et al. 2022). On the one hand, "preparatory assistance" would include the anticipatory advisory and training roles of the ethical assistants. That is, it would consist in giving support (through advice and training) in advance of moral deliberations. On the other hand, "on-the-spot assistance" would focus on content-specific support with particular moral situations. In the latter case, these AI assistants may help make decisions or facilitate how to deploy a particular course of action. With this distinction in mind, it is easier to appreciate the different roles and virtues of the models discussed above. The Socratic assistant, for example, seems more focused on fulfilling a preparatory role than on giving on-the-spot advice, although it may help with facilitation.

In addition, ethical assistants may also be designed for specific domains. Healthcare is a salient field in this respect. For example, AI-guided ethical assistants may be beneficial for health professionals, physicians, researchers, or members of
clinical ethics committees. So far, there have been two prominent proposals to use AI advisors to make better ethical decisions in the clinical setting. MedEthEx and METHAD are prototypes of AI counselling systems that aim to guide healthcare workers in ethical dilemmas (Anderson et al. 2005; Meier et al. 2022). Curiously, both MedEthEx and METHAD are based on principlism—the methodology proposed by Beauchamp and Childress (1979/2001) for biomedical ethics, built on the prima facie principles of respect for autonomy, non-maleficence, beneficence, and justice. Furthermore, other authors have also argued for using AI to support bioethical decisions. AI could improve moral judgements in bioethical contexts and help overcome many of the sources of human error, which could be particularly interesting for value alignment in the allocation of scarce resources such as organs for transplant (Skorburg et al. 2020; Sinnott-Armstrong and Skorburg 2021).

Those initiatives are praiseworthy. Since healthcare is an ethically sensitive area, providing AI decision support systems to assist moral decision-making is a positive effort. Nonetheless, those particular proposals are far from uncontroversial. To begin with, although principlism is a dominant theory in medical ethics, it is not a unanimously endorsed normative methodology (see McMillan 2018). Moreover, MedEthEx and METHAD are two proof-of-concept systems that show that making principlism computable in an algorithm is not always an easy task (a deliberately crude sketch at the end of this section illustrates why). For instance, Lukas Meier et al. (2022) were not able to include the principle of justice in the METHAD algorithm. As justice is an undeniable aspiration in some health decisions, this absence is regrettable. The role of autonomy, furthermore, is not completely well developed in either MedEthEx or METHAD. Following the rationale of the Socratic assistant, using an algorithm to point out those cases in which one ethical principle should outweigh others does not necessarily enhance the moral capacities of decision-makers, nor does it ensure that these decisions are fully autonomous. Ideally, AI systems should assist decision-makers so as to help them decide according to their own values, not according to those of the expert or of a concrete ethical theory (Elliott 2006). Finally, even if healthcare professionals—like all other humans—are plagued by errors and problematic psychological tendencies, AI systems can also be biased and still have important technical limitations. In the aforementioned case of transplant allocation, for example, the current tradeoff between explainability and accuracy is an issue important enough in terms of fairness to temper the enthusiasm for using AI to guide such ethically important distributional decisions (Rueda et al. 2022).

Last but not least, there are further concerns related to AI-based ethical assistants—regardless of the scope of application. Undoubtedly, we may have good reasons to use these aids to improve our moral decisions. Getting accustomed to using these ethical assistants, however, may have unintended consequences. As we saw, creating a problematic dependence on these systems would be undesirable (Danaher 2018). This dependence is even more problematic in the moral realm, since the ability to make moral decisions may be more significant than the ability to, for example, orient oneself without digital maps. So, due to habituation, there could also be losses in some significant moral capabilities, known as the problem of "moral
deskilling” (Vallor 2015). Another potential issue is related to the ethical problems of data processing, such as where the data needed to train the algorithms of these systems would be taken from and what would happen to the data generated by these assistants in the interaction with users. Finally, the problems (addressed in the third section) of undue influence and risks of manipulation, both by well-intentioned and malicious designs of these systems, deserve further attention in the future.
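To see why the "computability of principlism" is less straightforward than it sounds, consider the following deliberately crude sketch. It is not the logic of MedEthEx or METHAD, only a toy weighted-scoring model under assumptions invented for illustration: the principle scores, the weights, and the two options are all made up.

    from dataclasses import dataclass

    @dataclass
    class Option:
        name: str
        # How well the option honours each prima facie principle, on a
        # scale from -1 to 1; assigning these numbers is already a
        # normative judgement, not a neutral measurement.
        autonomy: float
        non_maleficence: float
        beneficence: float

    # Any choice of weights encodes a view of how the principles trade off.
    # Justice is simply absent, mirroring the difficulty reported for METHAD.
    WEIGHTS = {"autonomy": 1.0, "non_maleficence": 1.2, "beneficence": 1.0}

    def score(option: Option) -> float:
        return (WEIGHTS["autonomy"] * option.autonomy
                + WEIGHTS["non_maleficence"] * option.non_maleficence
                + WEIGHTS["beneficence"] * option.beneficence)

    options = [
        Option("respect the patient's refusal", 0.9, -0.3, -0.4),
        Option("treat against expressed wishes", -0.9, 0.4, 0.7),
    ]
    for option in sorted(options, key=score, reverse=True):
        print(f"{option.name}: {score(option):+.2f}")

Small changes to the weights or to the scores flip the ranking, which is exactly why such a system cannot settle the underlying normative question and why, as argued above, it should at most support decision-makers in reasoning from their own values.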

5.5 Concluding Remarks

Artificial virtual assistants occupy an increasingly dominant social role. They are found in our mobile devices and personal computers without us being fully aware of their existence. While it is true that they bring extensive advantages and may help us in the performance of many tasks, both their technical operation and the contexts in which they operate result in significant ethical issues. In this chapter, we have provided an overview of the ethics of artificial virtual assistants, from the conceptual foundations to specific ethical challenges.

Firstly, we have unraveled some misunderstandings about the concept of the artificial virtual assistant. In common speech, the term is thought to refer merely to voice assistants, such as Siri or Alexa, when, on the contrary, their interfaces can be diverse. An analysis of their general function has led us to incorporate recommender systems within this concept and to underscore, as some of their defining characteristics, their ubiquity—mainly due to the residual role played by their embodiment—and the cognitive delegation of tasks previously performed by human beings.

Having defined our object of research, we have analyzed, secondly, three major ethical challenges posed by artificial virtual assistants: the atrophy of capabilities linked to human agency and autonomy, human obsolescence, and negative impacts on privacy. The surreptitious operation of these systems can involve deception, persuasion, and coercion, limiting all those conditions that have traditionally been understood as necessary for the exercise of human agency and autonomy. This can also lead to a high degree of dependency: many tasks that could previously be done independently can only be done with the help of an artificial virtual assistant. Human obsolescence implies taking this last idea to its ultimate consequences, i.e., thinking about the ethical implications of humans ceasing to be the optimal entities for performing certain activities. Despite rejecting the possibility of general human obsolescence, we have presented various reasons for and against partial human obsolescence. Finally, the most relevant ethical issues concerning privacy and data collection have been outlined. We have analyzed how privacy constitutes an important value in our societies, due to the role it plays in our interpersonal relationships and in the construction of our identity. The ways in which artificial virtual assistants are deployed may jeopardize this value by recording some of our intimate conversations or producing commercial profiles without our consent.

Thirdly, the question of whether artificial virtual assistants can constitute moral assistants has been addressed. Due to the controversies raised by the use of the
concept of human enhancement, we have chosen to understand them as assistants in general. Different types of systems have been proposed, all of which possess ethical virtues and vices that are important to note. We have also discussed the use of artificial moral assistants in some contexts of activity, such as in the medical field.

Acknowledgments  This chapter was written as a part of the research projects Digital Ethics. Moral Enhancement through an Interactive Use of Artificial Intelligence (PID2019-104943RB-I00), funded by the State Research Agency of the Spanish Government, and Moral enhancement and artificial intelligence. Ethical aspects of a Socratic virtual assistant (B-HUM-64-UGR20), funded by FEDER/Junta de Andalucía – Consejería de Transformación Económica, Industria, Conocimiento y Universidades.

References Acemoglu, D., and P. Restrepo. 2018. Artificial intelligence, automation, and work. In The economics of artificial intelligence: An agenda, 197–236. University of Chicago Press. Agar, N. 2019. How to be human in the digital economy. MIT Press. Anderson, M., S.L. Anderson, and C. Armen. 2005. MedEthEx: Toward a medical ethics advisor. In Proceedings of the AAAI fall symposium on caring machines: AI in elder care, technical report FS-05-02, 9–16. AAAI Press. Anniappa, D., and Y. Kim. 2021. Security and privacy issues with virtual private voice assistants. In 2021 IEEE 11th annual computing and communication workshop and conference (CCWC), 0702–0708. IEEE. Barricelli, B.R., and D. Fogli. 2021. Virtual assistants for personalizing IoT ecosystems: Challenges and opportunities. In CHItaly 2021: 14th biannual conference of the Italian SIGCHI chapter, 1–5. ACM. Beauchamp, T.L., and J.F.  Childress. 1979/2001. Principles of biomedical ethics. 5th ed. NY: Oxford University Press. Bentham, J. 1791/1995. The Panopticon writings. New York: Verso. Berry, K.J., and T.W. Martin. 1974. The synecdochic fallacy: A challenge to recent research and theory-building in sociology. Pacific Sociological Review 17 (2): 139–166. Boden, M.A. 2016. AI: Its nature and future. Oxford University Press. Boyd, R., and R.J. Holton. 2018. Technology, innovation, employment and power: Does robotics and artificial intelligence really mean social transformation? Journal of Sociology 54 (3): 331–345. Brynjolfsson, E., and A. McAfee. 2014. The second machine age: Work, progress, and prosperity in a time of brilliant technologies. WW Norton and Company. Burr, C., N.  Cristianini, and J.  Ladyman. 2018. An analysis of the interaction between intelligent software agents and human users. Minds and Machines 28 (4): 735–774. https://doi. org/10.1007/s11023-­018-­9479-­0. Carr, N. 2014. The glass cage: Automation and us. W. W. Norton & Co. Cowen, T. 2011. The great stagnation. Penguin. Danaher, J. 2018. Toward an ethics of AI assistants: An initial framework. Philosophy and Technology 31 (4): 629–653. ———. 2019a. Automation and utopia. Harvard University Press. ———. 2019b. The rise of the robots and the crisis of moral patiency. AI & Society 34 (1): 129–136. ———. 2022. Technological change and human obsolescence: An axiological analysis. Techné: Research in Philosophy and Technology 26 (1): 31–56.

Daniels, N. 2000. Normal functioning and the treatment-enhancement distinction. Cambridge Quarterly of Healthcare Ethics 9 (3): 309–322. https://doi.org/10.1017/s0963180100903037. Darshan, B.S., S. Ajay, C. Akshatha, V. Aishwarya, and S.G. Shilpa. 2019. Virtual assistant based recommendation system. International Journal of Advance Research, Ideas and Innovations in Technology 5 (3): 1191–1194. Elliott, K.C. 2006. An ethics of expertise based on informed consent. Science and Engineering Ethics 12(4): 637–661. Firth, R. (1952). Ethical absolutism and the ideal observer. Philosophy and Phenomenological Research, 12 (3), 317–345. Ford, M. 2015. Rise of the robots: Technology and the threat of a jobless future. Basic Books. Foucault, Michel. 1975/1977. Discipline and punish: The birth of the prison. New  York: Random House. Frey, C.B., and M.A. Osborne. 2017. The future of employment: How susceptible are jobs to computerisation? Technological Forecasting and Social Change 114: 254–280. Giubilini, A., and J.  Savulescu. 2018. The artificial moral advisor. The “ideal observer” meets artificial intelligence. Philosophy & technology 31 (2): 169–188. Griffin, J. 2007. The human right to privacy. San Diego Law Review 44: 697–722. Güell, M., M. Salamo, D. Contreras, and L. Boratto. 2020. Integrating a cognitive assistant within a critique-based recommender system. Cognitive Systems Research 64: 1–14. Guy, J.B.B. 2022. Artificial interactions: The ethics of virtual assistants. Hoy, M.B. 2018. Alexa, Siri, Cortana, and more: An introduction to voice assistants. Medical Reference Services Quarterly 37 (1): 81–88. Jannach, D., A. Manzoor, W. Cai, and L. Chen. 2021. A survey on conversational recommender systems. ACM Computing Surveys (CSUR) 54 (5): 1–36. Kahneman, D. 2011. Thinking, fast and slow. Farrar, Straus and Giroux. Kepuska, V., and G. Bohouta. 2018. Next-generation of virtual personal assistants (microsoft cortana, apple siri, amazon alexa and google home). In 2018 IEEE 8th annual computing and communication workshop and conference (CCWC), 99–103. IEEE. Klincewicz, M. 2016. Artificial intelligence as a means to moral enhancement. Studies in Logic, Grammar and Rhetoric 48 (1): 171–187. https://doi.org/10.1515/slgr2016-­0061. Lara, F. 2021. Why a virtual assistant for moral enhancement when we could have a socrates? Science and Engineering Ethics 27 (4): 1–27. Lara, F., and J. Deckers. 2020. Artificial intelligence as a socratic assistant for moral enhancement. Neuroethics 13 (3): 275–287. Lin, P., K. Abney, and G.A. Bekey, eds. 2014. Robot ethics: The ethical and social implications of robotics. MIT press. Llorca Albareda, J., and J. Rueda. 2023. Divide and rule: Why ethical proliferation is not so wrong for technology ethics. Philosophy & Technology 36: 10. Lü, L., M. Medo, C.H. Yeung, Y.C. Zhang, Z.K. Zhang, and T. Zhou. 2012. Recommender systems. Physics Reports 519 (1): 1–49. Lugano, G. 2017. Virtual assistants and self-driving cars. In 2017 15th international conference on ITS telecommunications (ITST), 1–5. IEEE. McMillan, J. 2018. The methods of bioethics: An essay in meta-bioethics. Oxford: Oxford University Press. Meier, L.J., A. Hein, K. Diepold, and A. Buyx. 2022. Algorithms for ethical decision-making in the clinic: A proof of concept. The American Journal of Bioethics 22 (7): 4–20. https://doi.org/1 0.1080/15265161.2022.2040647. Nyholm, S. 2020. Humans and robots: Ethics, agency, and anthropomorphism. Rowman & Littlefield Publishers. O’Neill, E., M. Klincewicz, and M. Kemmer. 2022. 
Ethical issues with artificial ethics assistants. In The oxford handbook of digital ethics, ed. C. Véliz. Oxford: Oxford University Press. https:// doi.org/10.1093/oxfordhb/9780198857815.013.17.

Park, D.H., H.K. Kim, I.Y. Choi, and J.K. Kim. 2012. A literature review and classification of recommender systems research. Expert Systems with Applications 39 (11): 10059–10072. Pinola, M. 2011. Speech recognition through the decades: How we ended up with siri. Web log post. techHive. IDGtechNetwork, 2. Rachels, J. 1975. Why privacy is important. Philosophy & Public Affairs 4: 323–333. Rafailidis, D., and Y. Manolopoulos. 2019. Can virtual assistants produce recommendations? In Proceedings of the 9th international conference on web intelligence, mining and semantics, 1–6. ACM. Raus, K., F.  Focquaert, M.  Schermer, J.  Specker, and S.  Sterckx. 2014. On defining moral enhancement: A clarificatory taxonomy. Neuroethics 7 (3): 263–273. https://doi.org/10.1007/ s12152-­014-­9205-­4. Reiman, J.H. 1976. Privacy, intimacy, and personhood. Philosophy & Public Affairs 6 (1): 26–44. Ricci, F., L. Rokach, and B. Shapira. 2011. Introduction to recommender systems handbook. In Recommender systems handbook, 1–35. Boston, MA: Springer. Rodríguez-López, B., and J. Rueda. 2023. Artificial moral experts: Asking for ethical advice to artificial intelligent assistants. AI & Ethics. https://doi.org/10.1007/s43681-­022-­00246-­5. Rueda, J. 2020. Climate change, moral bioenhancement and the ultimate mostropic. Ramon Llull Journal of Applied Ethics 11: 277–303. https://www.raco.cat/index.php/rljae/article/ view/368709. Rueda, J. 2023. ¿Automatizando la mejora moral humana? La inteligencia artificial para la ética: Nota crítica sobre LARA, F. y J.  SAVULESCU (eds.) (2021), Más (que) humanos. Biotecnología, inteligencia artificial y ética de la mejora. Madrid: Tecnos. Daimon Revista Internacional de Filosofia, (89), 199–209. https://doi.org/10.6018/daimon.508771. Rueda, J., P. García-Barranquero, and F. Lara. 2021. Doctor, please make me freer: Capabilities enhancement as a goal of medicine. Medicine, Health Care and Philosophy 24: 409–419. https://doi.org/10.1007/s11019-­021-­10016-­5. Rueda, J., J. Delgado Rodríguez, I. Parra Jounou, J. Hortal, and D. Rodríguez-Arias. 2022. “Just” accuracy? Procedural fairness demands explainability in AI-based medical resource allocations. AI & Society. https://doi.org/10.1007/s00146-­022-­01614-­9. Sætra, H.S., and J. Danaher. 2022. To each technology its own ethics: The problem of ethical proliferation. Philosophy & Technology 35 (4): 1–26. Savulescu, J., and Maslen, H. 2015. Moral enhancement and artificial intelligence: Moral AI? In J. Romportl, E. Zackova, and J. Kelemen (eds.), Beyond artificial intelligence. The disappearing human-machine divide (pp. 79–95). Springer. Schmidt, B., R. Borrison, A. Cohen, M. Dix, M. Gärtler, M. Hollender, and S. Siddharthan. 2018. Industrial virtual assistants: Challenges and opportunities. In Proceedings of the 2018 ACM international joint conference and 2018 international symposium on pervasive and ubiquitous computing and wearable computers, 794–801. ACM. Sinnott-Armstrong, W., and J.A. Skorburg. 2021. How AI can aid bioethics. Journal of Practical Ethics 9 (1). https://doi.org/10.3998/jpe.1175. Skorburg, J.A., W.  Sinnott-Armstrong, and V.  Conitzer. 2020. AI methods in bioethics. AJOB Empirical Bioethics 11 (1): 37–39. https://doi.org/10.1080/23294515.2019.1706206. Someshwar, D., D. Bhanushali, V. Chaudhari, and S. Nadkarni. 2020. Implementation of virtual assistant with sign language using deep learning and TensorFlow. In 2020 second international conference on inventive research in computing applications (ICIRCA), 595–600. IEEE. 
Spallazzo, D., M. Sciannamè, and M. Ceconello. 2019. The domestic shape of AI: A reflection on virtual assistants. In 11th proceedings of design and semantics of form and movement international conference (DeSForM) MIT Boston, 52–59. Scopus. Sparrow, R. 2015. Enhancement and obsolescence: Avoiding an “enhanced rat race”. Kennedy Institute of Ethics Journal 25 (3): 231–260. ———. 2019. Yesterday’s child: How gene editing for enhancement will produce obsolescence— And why it matters. The American Journal of Bioethics 19 (7): 6–15.

Tenhundfeld, N.L., H.M. Barr, H.O. Emily, and K. Weger. 2021. Is my Siri the same as your Siri? An exploration of users’ mental model of virtual personal assistants, implications for trust. IEEE Transactions on Human-Machine Systems 52 (3): 512–521. Thaler, R.H., and C.R. Sunstein. 2008. Nudge. Penguin. Thompson, D. 2015. A world without work. The Atlantic 316 (1): 50–56. Vallor, S. 2015. Moral deskilling and upskilling in a new machine age: Reflections on the ambiguous future of character. Philosophy & Technology 28 (1): 107–124. Véliz, C. 2019. The internet and privacy. In Ethics and the contemporary world, ed. David Edmonds, 149–159. Abingdon: Routledge. ———. 2021. Privacy and digital ethics after the pandemic. Nature Electronics 4 (1): 10–11. Wald, R., J.T. Piotrowski, T. Araujo, and J.M. van Oosten. 2023. Virtual assistants in the family home. Understanding parents’ motivations to use virtual assistants with their Child (dren). Computers in Human Behavior 139: 107526. Weiser, M. 1991. The computer for the 21st century. Scientific American 3 (265): 94–104. Wilson, R., and I. Iftimie. 2021. Virtual assistants and privacy: An anticipatory ethical analysis. In 2021 IEEE international symposium on technology and society (ISTAS), 1–1. IEEE. Yeung, K. 2017. ‘Hypernudge’: Big data as a mode of regulation by design. Information, Communication & Society 20 (1): 118–136. https://doi.org/10.1080/1369118X.2016.1186713.

Chapter 6

Ethics of Virtual Reality

Blanca Rodríguez López

Abstract  Human beings have been spending a lot of time in front of a screen for many years. Through the computer and other electronic devices, we interact with our friends and colleagues, we maintain contact with our relatives, we carry out part of our work, we study and we engage in leisure activities. Despite the many advantages that all this has brought to our lives, ethical and social problems have also arisen. Although these electronic devices have become increasingly immersive and interactive, virtual reality seems to present a qualitative leap. In this chapter we will analyse how big this leap is, we will review the ethical problems that may appear, we will ask ourselves whether they are really new and we will study how and to what extent we can face them.

B. R. López (*)
Department of Philosophy and Society, Complutense University of Madrid, Madrid, Spain
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
F. Lara, J. Deckers (eds.), Ethics of Artificial Intelligence, The International Library of Ethics, Law and Technology 41, https://doi.org/10.1007/978-3-031-48135-2_6

6.1 Introduction

The term "virtual reality" (VR) has been with us for a long time, and it is used not only in academia or technology but also in common, daily-life contexts, but this does not mean that there is a single definition common to all. On the contrary, there are quite a few. Some of them are so wide as to include any visual representation, and others so restrictive as to exclude anything that does not require the use of helmets and haptic gloves (Brey 1999). There are still others that beg the question in some important respects, for instance by including the term "illusory" (Cranford 1996). However, most of them include what are usually considered the core characteristics of VR (Brey 1999; Chalmers 2017): interactivity, the use of three-dimensional graphics and a first-person perspective. Although, as long as we keep these characteristics in mind, it is not of paramount importance to have a definition, we are going to adopt the one provided by Chalmers in this chapter: a virtual reality environment
is an immersive, interactive, computer-generated environment (Chalmers 2017, p. 312). Of course, interactivity and immersiveness have degrees.1 Due to these characteristics, VR can produce the experience of "presence", the illusion of being and acting in a virtual place. This is a unique feature of VR that sets it apart from experiences produced in previous media and raises new questions. Of course, VR presents continuities as well as discontinuities with other technologies and the experiences they provide us. In this chapter we will try to make clear what they are.

1. For instance, it is quite common to establish three levels of immersiveness: https://intenta.digital/perspectives/virtual-reality-opportunities-risks/

VR has been used as a tool in many fields, from medicine to education, training and entertainment. Although each field of application has its own characteristics, here we will focus on what they all have in common.

This chapter is divided into eight sections. We will start with a brief look at the history of VR, a good way to start understanding continuities and discontinuities. Then we will ask ourselves, also briefly, whether the virtual is real. After these preliminary comments (Sect. 6.2) we will consider in more detail the important question of avatars (Sect. 6.3). The next three sections will address what is good (Sect. 6.4), what is bad (Sect. 6.5), and what is just weird in VR (Sect. 6.6). Sect. 6.7 analyses the main ethical questions brought about by VR. The chapter will end with some conclusions (Sect. 6.8).

6.2 Preliminaries

6.2.1 Prehistory and History

Although the term "virtual reality" was coined in the 1980s (Lauria 1997), it has its history, and even its prehistory. When it comes to technologies, "prehistory" usually refers to something technically very different that nevertheless shares some characteristic with the later technology, usually its function. It is therefore what we could call a "functional prehistory". From a non-technological point of view, prehistory is important to understand why humans engage in "virtual worlds". In this sense, some authors (Horsfield 2003a, b; Damer and Hinrichs 2014) begin the prehistory of VR in literal prehistory, linking it with such things as paintings in caves, rites performed in temples and other kinds of "nonphysical, dreamlike realities" (Damer and Hinrichs 2014, p. 18), products of the human imagination. In this vein, in one of the first books entirely devoted to VR, Lévy (1998) speaks of three "virtualization processes" that have led to humanity as we know it today: virtualization associated with signs, which allows us to detach ourselves from the immediate here and now; virtualization associated with technology, which produces "the virtualization of action, the body, and the physical environment" (p. 97); and
virtualization associated with social relations, which includes sophisticated phenomena such as rituals, laws and economics. Whether you agree with Lévy's processes or not, the important thing is the idea of virtuality as something inseparable from and constitutive of humanity, and the functions it plays. It makes creation possible, allows us to explore alternatives and to test limits, and provides us with hope.

The history begins much later. If we consider that virtual reality sensu stricto involves sophisticated technology to create an illusion that involves the senses, what we can consider VR's Middle Ages began with the invention of the camera obscura. Though it was anticipated by the Chinese philosopher Mo-Ti (5th century BC) and by Aristotle himself (384–322 BC), the first experiments are due to the Islamic scholar and scientist Alhazen (Abu Ali al-Hasan Ibn al-Haitham) (c.965–1039). In the fifteenth through eighteenth centuries this device, given its name by Johannes Kepler in the early seventeenth century, was common knowledge among painters and astronomers. The nineteenth century witnessed another two steps on the path to VR: the 360-degree murals (or panoramic paintings), intended to fill the viewer's entire field of vision and make viewers feel present at some historical event or scene; and the invention of the stereoscope in 1838 by Charles Wheatstone.

Then, in the twentieth century, as with many other things, the history of VR accelerated and became familiar. In the late 1920s the Link Trainer, a mechanical flight simulator, was used to train military aeroplane pilots, and in the 1950s Morton Heilig developed the Sensorama, the very first multisensory virtual experience, including not only film but also sound, vibration, wind and even odour. He shot, produced and edited six short films especially designed for his Sensorama, including one titled "I'm a coca cola bottle!". The title could not speak more loudly of the Sensorama's immersive intentions. From then on, research and development acquired an increasingly fast pace, and now here we are.

6.2.2 Is It Real?

This is not a banal question. Quite the contrary. Only real things are important, especially when it comes to ethics, law and governance. We usually do not pay a lot of attention to things that are "just in your imagination". Although many people may feel compelled to answer this question with a resounding negative, the answer is far from self-evident or uncontroversial. Many philosophers and other scholars would agree, but not all. For some, virtual reality is real. According to Horsfield (2003a, b), VR is one of the forms of reality, along with actual, physical and material realities. The best known of them all, David Chalmers (2017, 2022), agrees: VR is real.

The general question can be reduced to others, such as: are virtual objects (e.g., a chair) real objects? Are virtual properties (e.g., being green) real? All these questions belong to the philosophical discipline we call ontology. Although the
object of this chapter is not ontology but ethics, it is convenient to begin by briefly clarifying this issue.2 Chalmers (2017) calls the two possible basic answers "virtual realism", which answers in the affirmative, and "virtual fictionalism", which answers in the negative. Contrary to what it may seem at first glance, virtual realism is a sensible and defensible position. Unless we want to deny that virtual objects exist, it is at least shocking to say that something exists but is not real. It is much more accurate and less confusing to say that virtual objects exist as virtual objects, much as we say that drawn objects exist as drawn objects. The possible objection that virtual objects are not real because everything real has a physical basis is not very convincing: in the first place, because very few people would be willing to affirm that only what has a physical basis is real, and in the second instance because virtual objects do have a physical basis. They are digital, computer-generated objects, and a computer is a physical object. What is most important, at a pragmatic level, is that we can interact with them. After all, an electronic book is a book that you can read.

2. Interested readers can find an accessible introduction to this complex problem in Brey (2014).

6.3 My Avatar

From the Sanskrit avatāra, avatars are well known in Hinduism, where the term means an incarnation of a god. Much more recently, since 2009, they have also become well known in our cultural tradition: that year a movie came to our screens, Avatar. Our concept of avatars is different from both of these. For our purposes, an avatar is a digital representation of a human in cyberspace. When we began using digital communications such as emails, social media or SMS, an avatar was a simple photograph or some other kind of graphic representation and was generally static. Nowadays, with the advent of more interactive digital means, the concept of avatar is understood as "an image that represents you in online games, chat rooms, etc. and that you can move around the screen" (Cambridge Dictionary) or "an electronic image that represents and may be manipulated by a computer user (as in a game)" (Merriam-Webster).

Static or not, avatars have aroused great interest from the beginning. This is not surprising, since avatars are important in at least three respects: what they say about the user, how they influence the behaviour of others in online interactions and how they affect the user's own behaviour. In order to understand the importance of avatars, it is useful to start with a few words about how we present ourselves offline. In the offline world, we also present ourselves to others in a certain way. Our identity is multiple and complex as it
comprises many aspects and roles. For this reason, offline, we change our image according to the interaction context. We choose what we want to accentuate or hide, more or less consciously. Online, we find differences of degree. In technologically mediated interactions, from chats to virtual spaces, the possibilities of manipulation and partial presentation are much greater, as avatars are highly customizable. Some people always use the same avatar and others change it depending on the context, or even within the same interaction (to represent different moods, for example), and the possibilities of showing only a part of ourselves are greater.

6.3.1 What They Reveal About the User

There are three interesting and related questions: what type of person chooses an avatar with certain characteristics, how far the avatar is from the actual offline self, and what motivates people to choose a particular avatar. Although we cannot expand on these aspects here, we can offer a few brushstrokes. For example, men and women both select avatars that differ from themselves in ways that correspond to societal norms about ideal appearance (e.g., women being thinner and men being bulkier), though men do not appear to value height as much as we might expect. Extroverts are more likely to select avatars with fewer discrepancies from themselves than introverts are, and neurotic participants select avatars with more discrepancies from themselves than non-neurotic participants do. Men high in openness, as well as people with low self-esteem, are more likely to choose darker skin tones relative to their actual skin tone (Dunn and Guadagno 2012). Other studies have found that non-white participants selected avatars with lighter skin tones and white participants selected avatars with darker skin tones (Dunn and Guadagno 2019). People prefer avatars matched to their gender, but gender swapping occurs frequently in online games. Regarding users' personality, some findings suggest that individuals who are more agreeable, as well as those reporting an average personality, are more likely to create an avatar that others want to befriend (Fong and Mar 2015). There are also some interesting studies on the degree to which our real selves resemble our virtual selves (Jin 2012). Finally, it should be noted that more experienced users choose avatars that are thinner, taller and more attractive (relative to their real selves); they are probably more aware of the importance of the avatar's appearance.

Although motivations are always diverse, there are some interesting discoveries. For instance, gender swapping is associated more with practical benefits than with the expression of personal identity. These practical benefits are different for women and men. Women's motives are very often related to the desire to prevent unsolicited male attention and to be treated as equals in a male-dominated environment. Men, on the other hand, tend to swap gender in order to gain competitive advantages.


6.3.2 How They Influence the Behaviour of Other Users

Whatever the motivation for choosing an avatar, and however closely it resembles the real self, it is unquestionably true that our avatar has a great influence on the behaviour of the people with whom we interact online. First, we have to consider what the avatar says about the user to other users. Individuals tend to draw conclusions about users from their avatars, which is important because many relationships that start online go offline. This tendency is no different from what we do offline; it is only a question of degree. In an avatar there are fewer sources of information, but they carry more weight because individuals know that there are multiple, conscious choices behind them. This can be explained by uncertainty reduction theory (Berger and Calabrese 1975). According to this theory, the fundamental objective of an initial interaction is to reduce uncertainty about the person with whom we interact, which is indispensable for understanding and predicting behaviour. In face-to-face interactions, visual cues are heavily relied on (they are easy to access and trustworthy, as they are quite stable). We also put more trust in characteristics that are not easily controllable by the subject. Conclusions are drawn about personality and, correct or not, people tend to trust them. Online, people know that the appearance of an avatar is chosen consciously, is easy to change and is not stable. We may wonder whether we are right to attach such importance to appearances, but studies show that avatars can communicate some personality traits (extroversion, likeability, and neuroticism) fairly accurately and that extroversion and likeability are perceived most clearly (Fong and Mar 2015). As early as 2005, Nowak and Rauh published a paper reporting a study in which participants evaluated static avatars. They found that anthropomorphic avatars are perceived as more attractive and reliable, and that people tend to choose them. They also found that non-androgynous avatars are more attractive and reliable (partly because they make anthropomorphism easier to recognize). In a later study (Nowak and Rauh 2008) they replicated these findings, although they also found that participants with more online experience judge avatars as less anthropomorphic in general and consider their partners more credible, regardless of the avatar they choose. The most attractive avatars are perceived as more credible (Nowak and Rauh 2005) and are judged as more socially and intellectually competent and better socially adapted (Khan and De Angeli 2009). Beyond attractiveness, avatars manipulated to maintain eye contact or to imitate the movements of other participants generate greater agreement with those participants (Bailenson et al. 2005). Interestingly, avatars with white clothes provoke more harmonious interactions (Peña et al. 2009). In this way, the appearance of an individual's avatar has an important impact on other users' behaviour, as it influences such important things as perceptions of credibility or competence.


6.3.3 How They Influence the User's Behaviour

The most interesting aspect of avatars is, however, the impact they have on the user's own behaviour. Part of this can be explained by behavioural confirmation: our avatar creates certain expectations in others, and we tend to confirm these expectations with our behaviour. However, there is more to it than this, a phenomenon known as the Proteus effect. First identified and named by Yee and Bailenson (2007) and confirmed many times since, it refers to the phenomenon by which an individual's behaviour conforms to their avatar independently of how others perceive them. Among the experimental findings we can mention the following examples: subjects who are assigned a more attractive avatar appear more intimate, reveal more and keep less distance from others, and subjects with a taller avatar are more confident in negotiations. According to the authors, explanations can be found in self-perception theory and deindividuation theory. With roots in the philosophy of mind, the core of self-perception theory is that "individuals come to 'know' their own attitudes, emotions, and other internal states partially by inferring them from observations of their own overt behaviour and/or the circumstances in which this behaviour occurs. Thus, to the extent that internal cues are weak, ambiguous, or uninterpretable, the individual is functionally in the same position as an outside observer, an observer who must necessarily rely upon those same external cues to infer the individual's inner states" (Bem 1972). Deindividuation theory says that identity cues have a particularly strong impact when people are deindividuated. Yee and Bailenson claim that the anonymity afforded by online environments is ideal ground for deindividuation to occur, so users may adhere to the identity that can be inferred from their avatars. Last but not least, we should mention the relation between avatars and self-presence. Self-presence (or embodiment) is a psychological state in which users experience their virtual self as if it were their real self. Empirical studies show that self-presence decreases when users perceive many discrepancies between their avatars and their real selves, and that it correlates positively with flow, an optimal psychological experience (Jin 2012).

6.4 What Is Good

Virtual reality has much that is good about it. It provides us with advantages and benefits that were unimaginable until recently. Thanks to it we can live, in the first person, a multitude of experiences that would otherwise be out of our reach. We can experience fantastic worlds with green-coloured skies, populated by dragons and other fantastic creatures, with non-existent plants and fountains that flow with water in all the colours of the rainbow.


We can also travel to existing places that would otherwise cost us a lot of time, money, and effort to visit: we can take a few days and travel to New York, Egypt, Iceland and Machu Picchu in a short vacation period. We can experience places that no longer exist, from ancient Rome to mediaeval London. We can play cricket with Alice, have lunch with the Mad Hatter, play chess with friends who live many miles away, and learn unfamiliar games with other users we have never met in person. We can attend that Genesis concert that took place when we were still in the crib. We can wield Lightbringer in the battle for the Dawn. We can visit museums and admire pieces of art slowly and with delight, without having to see everything in one afternoon because the museum closes and tomorrow we have to go back home; tomorrow we can always return to the museum and contemplate another piece. Virtual reality makes the world bigger and at the same time puts it within our reach. And it is not just entertainment and culture. We can also do business, buy and sell, meet with our partners and network. We can visit a property before buying it or before deciding to visit it offline. Many professional fields find a training ground in virtual reality. We can design and test our designs. We can train to fly planes and pilot ships, and to drive in bad weather, and we can do so without serious risk to ourselves or to others. Also, in the field of education, virtual reality offers multiple possibilities, some of which have been used successfully for a long time (Kavanagh et al. 2017). It can also be used in the healthcare field, including mental health care (Bell et al. 2020) and pain management (Matamala-Gomez et al. 2019a, b). The effectiveness of these uses will likely increase as virtual reality becomes more immersive and realistic. Finally, among the good things we cannot fail to mention the possibilities that virtual reality offers us to experiment with our identity and explore different options. And it is possible that I have overlooked some good things and that others may arise with the development of this technology. Unfortunately, there are no roses without thorns.

6.5 What Is Bad

Like all technologies, virtual reality has risks. We can classify these risks into two categories: personal and social.

6.5.1 Personal Risks

Personal risks are risks for the individual user and can be physical or psychological. The risk of physical harm has always accompanied the use of technology. Think about computers: their prolonged use, which is all too common in our societies, has well-known side effects, such as stiff shoulders, carpal tunnel syndrome and eye-strain headaches.


Excessive use of screens, whether mobile phones, tablets or television sets, is harmful to the eyes, especially for very young users. The use of headphones can be harmful to hearing, especially if the sound is too loud, as is often the case. Unsurprisingly, virtual reality brings new physical risks while increasing some of those already mentioned. Highly immersive virtual reality, which involves the use of headsets, has given rise to what is known as "virtual reality sickness" or "cybersickness". Similar to motion sickness, its symptoms include nausea, dizziness, sweating, pallor and loss of balance. It is produced when the user's brain receives conflicting information about self-movement, typically when the user's eyes send the signal of walking while the body remains seated. Although it is estimated that only 25% of users experience VR sickness in a significant way (roughly the same percentage of people who experience motion sickness during air travel), it has become a high-priority issue for the industry, and many empirical studies have been conducted (Chang et al. 2020) to develop means to reduce it. Another physical risk for users of highly immersive virtual reality has recently been studied by researchers from Oregon State University (Penumudi et al. 2020); in this case, the object of study was the risk of muscle damage in the neck and shoulders due to the typical gestures that accompany the use of VR. Other very striking physical risks are associated with the use of augmented reality. Augmented reality is a related, less immersive technology that adds computer-generated images to the user's offline environment. The concept was popularised in 2016, when a mobile application, Pokémon Go, was released. It was a blockbuster, though not very long-lived, success that filled the streets with young people whose eyes were fixed on their mobiles looking for Pokémon. So much concentration on the chase made many participants unaware of dangers in their offline physical environment, and many incidents were reported in the media. It cannot be denied that this type of activity is risky, as is walking through a city with your eyes fixed on your mobile phone and your headphones at full volume. Virtual reality also poses psychological and mental health risks. First of all, we should mention a risk that is not new at all: addiction. Addiction is an old companion of humanity. There are addictions to various substances, to gaming, to gambling, to sex, to food. Everything that produces pleasure can generate addiction, and the internet, video games and mobile phones have added new ones. Addiction is a serious matter: in extreme cases it can be the cause of serious mental illness, and it is especially linked to depression, anxiety and attention deficit disorder. In terms of addiction, virtual reality shows continuities with these previous technologies. There are some interesting recent studies that show this risk, especially in games (Rajan et al. 2018) and virtual tourism (Merkx and Nawijn 2021). The risk of addiction increases as VR becomes more immersive. However, some empirical results suggest that the addictive risk of VR is similar to that of previous related technologies, affecting between 2% and 20% of users (Barreda-Ángeles and Hartmann 2022), though, as feelings of embodiment positively predict addiction, there is the possibility of increased risk if the technology becomes more and more developed.


With addiction to VR come other problems usually linked to it: neglect of users' own actual bodies and real physical environments. On the bright side, VR can also be used for the treatment of addictions (Segawa et al. 2019; Mazza et al. 2021), as has been acknowledged by the European Union (Willmer 2022). The most important psychological risk is depersonalization/derealization disorder: feelings of alienation towards one's own self and of detachment from reality. It is important to stress that these are "feelings" and not "beliefs". As Madary and Metzinger (2016) point out, this distinction is essential to understand the phenomena associated with virtual reality. It is not that the user believes the virtual environment to be a physical reality, or their avatar to be their own body; with very rare exceptions, this does not happen. The problem is that the user feels the virtual environment as the one they are inhabiting at a certain moment and experiences their avatar as if it were themselves. In one of the first empirical studies of this disorder, Aardema et al. (2010) found an increase in dissociative experience (depersonalization and derealization) after exposure to VR, though higher pre-existing levels of dissociation and a tendency to become more easily absorbed were associated with higher increases. They also claim that such an increase is not necessarily pathological, as these experiences sit on a continuum that includes such normal experiences as daydreaming, and their results were not in the clinical range. A recent study that addressed this question (Peckmann et al. 2022) found that dissociative symptoms were higher immediately after the VR experience and were significantly higher in participants who used a head-mounted VR display than in the group that played the same game on a PC. The effect did not last long and the symptoms were mild. Of course, the subjects of empirical studies are usually healthy individuals representative of the general public, and the possibility exists that potential risk groups may present more serious and long-lasting effects. Another widely treated and documented psychological effect is "time compression", in which time goes by faster than the user thinks. This phenomenon has recently been studied experimentally by Mullen and Davidenko (2021), who found that it is not simply due to the enjoyable experience of a game but can be isolated and attributed to VR as compared with conventional screens. Time compression is not necessarily bad. In fact, it can be useful in some situations, such as a long flight or an unpleasant medical procedure. However, it can be detrimental in other circumstances.

6.5.2 Social Risks

Some of the risks posed by virtual reality can be described as social. They are different from those mentioned under the label "personal risks", although the latter, when they affect many individuals, can undoubtedly constitute a public health problem. Nor are social risks the same as the risk of social isolation, for two reasons. First, social isolation only occurs when there is compulsive use of virtual reality in environments with no contact with other individuals, just as happens in some video games.


In many other cases it could rather be said that VR facilitates contact with other individuals. Secondly, when social isolation does occur, it is more properly classified as an individual risk; only in the worst-case (and highly unlikely) scenario in which a large number of people become socially isolated would there be a social risk. In this chapter, by "social risks" we mean risks that affect society even if they are not directly related to risks for individuals. Slater et al. (2020) identify two risks of this kind. In the offline world we share the same environment, not only natural but above all social. This environment provides us with rules of conduct and helps us to build our identity. Thanks to it we maintain public and social mechanisms through which we keep norms or discuss and change them. Digital environments, by contrast, are not only highly immersive but can also be highly personalised. The first risk in this category appears here: if spending a lot of time in these environments becomes normal, we could lose the common environment and what we know as the public sphere. What would happen in that case is a matter of speculation, but without a doubt society would change radically. The second risk has to do with truth. We assume that the world supplies us with sensory data that provide evidence on which to base our knowledge. Such data are public and intersubjective, which means that we have a common base to appeal to. This is important in all of our interactions, from everyday ones to the evidence we present in legal settings. Virtual reality can jeopardise this common perceptual base and, as in the previous case, the consequences are difficult to predict. All the risks mentioned above are due to the technology itself. Other bad things are due to specific content and will be addressed below. However, first we will briefly cover some things that are neither good nor bad, but just … weird.

6.6 What Is Weird

If you google "weird" + "virtual reality" you can find some amazing examples, from the spooky to the ridiculous. There are even lists of the weirdest experiences you can have in virtual reality, from being beheaded in front of a crowd in a guillotine simulator to marrying a manga character. In this section, we will focus on an example that is undoubtedly disturbing and that provoked a lot of debate in the media. In 2020 a South Korean television network broadcast a documentary titled Meeting You.3 In it, we see a still-young, discreetly dressed woman, Jang Ji-sung, who, wearing virtual reality glasses and haptic gloves, begins to walk in front of the now-familiar green screen. The green screen disappears and turns into a park, and a little girl who was hiding behind a bush appears and approaches the woman, very happy. She calls her Mom. They talk, sit at a table in the park, share a cake and sing a birthday song.

3. If you feel like it, you can watch a clip at https://www.youtube.com/watch?v=uflTK8c4w0c and join the many millions of people who have already watched it to date.


They are celebrating the birthday of the girl, whose name is Nayeon. There are also green, white and pink rice cakes, and seaweed soup. Nayeon makes birthday wishes: may my father stop smoking, may my brothers not fight, may my mother not cry. The girl takes photos of the woman. After eating, Nayeon plays, jumps and runs, and she gives her mother a bouquet of flowers that she has picked in the park. The girl then lies on a bed. She is sleepy. They say goodbye, saying that they love each other and that they will see each other again. Finally, she turns into a butterfly. It is clear that what appears in the documentary is a virtual reality experience. But it is a very special experience. Nayeon is a real girl recreated by artificial intelligence, or rather, she was a real girl. She really was the daughter of Jang Ji-sung. Nayeon had died 3 years earlier, at the age of 7, as a result of an incurable disease. During those 3 years, the mother had filled the house with photos of her daughter, had her name tattooed on her body and wore a pendant with her ashes. Not many words are needed to describe the mother's grief; this is something that we can all understand well. When the television network contacted her with the proposal, she accepted. This technique has been used before, and it is not surprising that it is controversial, since what it does is recreate real deceased people.4 This is undoubtedly disturbing and generates many doubts and problems, especially regarding who can give permission to use the image of a deceased person. No doubt this matter needs to be carefully regulated. But aside from the question of image rights, there are also questions about whether these recreations are good or bad. Of course, we can question the motives of the television network in producing and broadcasting this documentary. But apart from this, it does not seem easy to argue that it is bad from any point of view. It does not seem it was bad for the mother, who later referred to the experience on her blog, saying that she felt as if she were dreaming a dream she had always wanted to dream. The father and the siblings, who were among the audience, also seemed quite happy with the experience; in fact, one of Nayeon's sisters participated in the recreation. Today many psychologists believe that this and other means of continuing the emotional relationship with the deceased may actually help to overcome the loss (Stein 2021). On another level, the same can be said of the virtual experience of being beheaded or marrying a manga character, and of all the weird experiences that people can have thanks to virtual reality. The fact that other people find such experiences disturbing says nothing about their goodness or badness. We also find many non-virtual-reality experiences disturbing. Going back to Nayeon, she had asked for the rice cakes when she was in the hospital. They were a treat.

4. For example, it was used in Spain to recreate a peculiarly popular artist in a beer advertisement. You can see the process at https://www.youtube.com/watch?v=BQLTRMYHwvE


6.7 Ethical Issues

6.7.1 Privacy

Our societies have a two-sided relationship with privacy. On the one hand, we have become accustomed to sharing a lot of data, mainly through social networks. We willingly share photos of more or less private moments, our location, personal data such as age or name, information about where we live, study or work, our leisure activities and our opinions on a multitude of issues. We share more things than ever with more people than ever. This voluntary sharing can pose a personal risk that many people, especially the very young, are not fully aware of. For this reason, social networks have become the nightmare of many parents, aware that their children are sharing a lot of personal data with their hundred online friends, of whom probably only ten are really well-known people. On the other hand, we are tremendously concerned about privacy, which we believe is more threatened than ever. In this way, we share more than ever and we are more concerned than ever. In a sense, both things go hand in hand: we want to share what we want with whom we want, and we are concerned that our data may reach other people, or that things we do not want to share may become known about us. The privacy issue linked to virtual reality adds another layer, which is what we are going to talk about here. It concerns data that we do not share voluntarily and that most people are not even aware is being collected. In fact, for many, privacy is the biggest concern and risk of virtual reality, and it has figured as such in the literature from the beginning. In a widely cited paper, O'Brolcháin et al. (2016) categorised the threats to privacy as threats to informational privacy (e.g. when you share medical information required for virtual meetings with doctors), threats to physical privacy (e.g. if you are recorded in your physical surroundings by some device), and threats to associational privacy (related to interactions with other people). Two years later, Adams et al. (2018) presented an interesting piece of work at the Fourteenth Symposium on Usable Privacy and Security: a mixed-methods study involving semi-structured interviews with 20 VR users and developers, a survey of VR privacy policies, and an ethics co-design study with VR developers. They found that privacy was indeed one of the major concerns for developers. The incorporation of headsets and various haptic devices increases privacy concerns. VR headsets can gather biometric data, some of which can be very sensitive and intimate, including eye movements, temperature or muscle tension. All these data are important for improving the user's experience, but they can also be used for more sinister purposes: hackers can gain access to them, and companies can be tempted to share them with third parties. These are clear violations of privacy rights and, as such, a serious ethical concern.
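To make the sensitivity of such telemetry concrete, the toy sketch below (my own illustration, not drawn from the studies cited in this chapter) shows why even "anonymous" gaze traces can be identifying: assuming, as research on gaze biometrics suggests, that a user's summary gaze statistics are fairly stable across sessions, a trivial nearest-neighbour matcher is enough to re-link a new session to a previously recorded profile. All numbers and feature choices here are invented for illustration.

```python
# Toy illustration (assumptions only): if per-user gaze statistics are stable across
# sessions, simple distance matching can re-identify "anonymous" telemetry.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_features = 50, 8  # assumed: 8 summary features (fixation durations, saccade sizes, ...)

# Each user's "true" gaze style, unknown to the attacker but reflected in every session.
user_profiles = rng.normal(size=(n_users, n_features))

def record_session(profiles, noise=0.3):
    # One session of summary features per user, with within-user variation.
    return profiles + rng.normal(scale=noise, size=profiles.shape)

enrolled = record_session(user_profiles)  # data a platform already holds, linked to identities
later = record_session(user_profiles)     # a new, nominally anonymous session

# Match each new session to the closest enrolled profile (nearest neighbour).
dists = np.linalg.norm(later[:, None, :] - enrolled[None, :, :], axis=-1)
matches = dists.argmin(axis=1)
accuracy = (matches == np.arange(n_users)).mean()
print(f"Re-identified {accuracy:.0%} of 'anonymous' sessions")
```

The specific numbers are synthetic; the point is the mechanism. Stable behavioural signals function like fingerprints, which is why their collection and onward sharing raise the concerns just described.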


A related concern is the possibility of manipulating users' beliefs, emotions and behaviours for various purposes, from commercial to political. VR enables a powerful form of non-invasive psychological manipulation, which constitutes a threat to personal mental autonomy (Madary and Metzinger 2016).

6.7.2 Ethical Behaviour

It is convenient to start this section by pointing out two differences from what has been analysed so far. Whereas the problems analysed up to now depend on the technical aspects of virtual reality, the ones we are going to analyse now depend on its content. And whereas violations of privacy and autonomy are ethical problems that fall on the shoulders of developers and companies, there is another ethical problem that depends fundamentally on the users: their behaviour in virtual environments. Concern about the content to which individuals are exposed is not exactly new. The content of books and magazines, paintings and shows has been a matter of concern and debate for centuries. Concern and debate only increased with the arrival of the mass media, hand in hand with new ways of showing content in an increasingly graphic and realistic way through cinema, radio and photography. Content of a violent or sexual nature has given rise to the most controversy, and continues to do so. The arrival of virtual reality has meant a further, and very significant, step. In a virtual environment, due to its interactive nature, the user is not a mere spectator but an agent. As such, she makes decisions and performs actions that can be judged from a moral point of view. Brey (1999) makes a useful distinction between the ethical issues raised, respectively, by the representational aspects of virtual reality ("the way in which state-of-affairs and events are depicted or simulated") and by its interactive or behavioural aspects, concerning the actions of users in VR environments. Although Brey focuses on single-user VR environments, his approach can be extended to multi-user environments (Ford 2001). The main difference is that, in single-user environments, such as many virtual training programs and some games, the influence on the user's behaviour comes fundamentally from the representational aspects, while in multi-user environments there is also the influence of the actions and behaviour of other users and of non-user software robots (bots). Representational aspects depend almost entirely on the designers. They are the ones who make decisions about the degree of realism of the virtual world (which can range from completely fantastic to extremely realistic) and about how detailed it is (from simple strokes to an enormous amount of detail). They are also responsible for the depiction of characters and bots, which may be more or less stereotyped. Misrepresentation and bias are the two main problems that we face at the representational level. Without a doubt, entering a space where all characters are white (or black), where all enemies are Asian (or South American), where everyone who is on your side in battles and adventures is just like the user, and where all flower vendors are gypsy women can have an impact on how the user will perceive non-virtual reality.


In this way, the user's behaviour in virtual environments can affect their behaviour offline and thereby affect real people. Multi-user environments are far more problematic. In VR environments we can have virtually real experiences (Ramirez and LaBarge 2018). Even if, as is usual, after the experience we are aware that it has been a virtual experience, while we are having it, it seems real and we feel it as such. In fact, we react psychologically, emotionally, and even physically in much the same way as we do to a real experience (Fox et al. 2012). This is the reason why the use of virtual reality is effective for some therapies, such as the treatment of phobias (Freeman et al. 2017). If this is the case, violent acts committed against avatars or their property, such as assault, rape, or theft, have real consequences in the offline world. They may not physically harm the user, but they certainly cause psychological and emotional damage. Of course, not all virtual experiences are virtually real experiences. Think, for example, of World of Warcraft (WoW), one of the best-known virtual worlds, released in 2004 and played by millions. It is a multi-user game in which users interact with other users and with bots. You can harm and be harmed, kill and get killed, and no one gets hurt by it in real life. The reason why WoW can provide fun experiences instead of traumatising ones is twofold. The first has to do with avatars: in WoW there are humans, but also dwarves, elves, gnomes and other lesser-known characters, such as the Draenei or the Worgen. In all cases, your avatar has little to do with the real you. The second has to do with the virtual world it represents, which belongs to the realm of fantasy. World of Warcraft has a low degree of what Ramirez and LaBarge (2018) call "context realism", a concept that refers to the degree of physical and psychological plausibility of the virtual environment. Context realism is important when it comes to producing virtually real experiences. In virtual worlds with a higher degree of context realism, such as Second Life, things are different. For these cases, the principle of equivalence has been proposed: if it would be wrong to subject a person to an experience, then it would be wrong to subject a person to a virtually real analogue of that experience (Ramirez and LaBarge 2018, p. 259). In fact, Second Life has a Code of Conduct establishing behavioural standards against intolerance, assault or harassment. As virtual worlds become more realistic, we should pay more attention to the harm we can do to other people in the real world. At the very least, users must be made aware of the damage that they can cause and suffer.

6.8 Conclusions

The aim of this chapter has been to offer some reflections on one of the phenomena that is changing our lives the most and that in all probability will continue to do so: virtual reality. We have said something about its history and about the question of whether it is real. We have talked about avatars, the digital representations of virtual reality users, whose importance is difficult to exaggerate and of which we are not always aware.


We have seen what is good about VR: the many experiences it provides us with and the many beneficial uses it can have and, in fact, is already having. We have reviewed what is wrong with VR, insofar as it poses individual and societal risks. Some of these risks are old acquaintances and we know quite a bit about them; others are new. Some are more likely than others, and many depend on us being aware of them and using VR responsibly. Finally, we have raised some ethical issues. Privacy-related issues are already well known. There are other ethical issues related to the user's own behaviour. VR can influence our behaviour indirectly, to the extent that it can present us with depictions full of misrepresentations and biases that can make it easier for us to harm others in the offline world. We have also seen how and why our behaviour in virtual reality can harm other users with whom we interact. Everything indicates that virtual reality will continue to be part of our lives and will be increasingly immersive and interactive. Some even claim (Chalmers 2022) that virtual reality will be indistinguishable from physical reality, that the metaverse is around the corner and that we could have meaningful lives in it. I do not know if the metaverse is just around the corner. Properly understood, the metaverse is "A massively scaled and interoperable network of real-time rendered 3D virtual worlds that can be experienced synchronously and persistently by an effectively unlimited number of users with an individual sense of presence and with continuity of data, such as identity, history, entitlements, objects, communications and payments" (Ball 2022). We are not there yet. It is difficult to talk about something that does not yet exist, but to the extent that VR and augmented reality are generally considered the technologies on which the metaverse will grow, it will share many of their advantages and risks. In fact, concerns about physical and psychological risks and about privacy come up regularly when talking about the metaverse. The consequences of our behaviour for other users will be even greater. Better get ready now, just in case.

Acknowledgements This chapter was written as a part of the research projects Digital Ethics. Moral Enhancement through an Interactive Use of Artificial Intelligence (PID2019-104943RB-I00), funded by the State Research Agency of the Spanish Government, and Moral enhancement and artificial intelligence. Ethical aspects of a Socratic virtual assistant (B-HUM-64-UGR20), funded by FEDER/Junta de Andalucía – Consejería de Transformación Económica, Industria, Conocimiento y Universidades. The author is also grateful to Gonzalo Díaz for his help with bibliographic references.

References Aardema, Frederick, Kieron O’Connor, Sophie Côté, and Annie Taillon. 2010. Virtual reality induces dissociation and lowers sense of presence in objective reality. Cyberpsychology, Behavior and Social Networking 13 (August): 429–435. https://doi.org/10.1089/cyber.2009.0164. Adams, Devon, Alseny Bah, Catherine Barwulor, Nureli Musaby, Kadeem Pitkin, and Elissa M. Redmiles. 2018. Ethics emerging: the story of privacy and security perceptions in virtual reality. In Fourteenth symposium on usable privacy and security, 427–442. SOUPS. https:// www.usenix.org/conference/soups2018/presentation/adams.


Bailenson, Jeremy N., Andrew C. Beall, Jack Loomis, Jim Blascovich, and Matthew Turk. 2005. Transformed social interaction, augmented gaze, and social influence in immersive virtual environments. Human Communication Research 31: 511–537. https://doi.org/10.1111/j.1468­2958.2005.tb00881.x. Ball, Matthew. 2022. The metaverse: and how it will revolutionize everything. Liveright Publishing. Barreda-Ángeles, Miguel, and Tilo Hartmann. 2022. Hooked on the metaverse? Exploring the prevalence of addiction to virtual reality applications’. Frontiers in Virtual Reality 3: 1031697. https://www.frontiersin.org/articles/10.3389/frvir.2022.1031697. Bell, Imogen H., Jennifer Nicholas, Mario Alvarez-Jimenez, Andrew Thompson, and Lucia Valmaggia. 2020. Virtual reality as a clinical tool in mental health research and practice. Dialogues in Clinical Neuroscience 22 (2): 169–177. https://doi.org/10.31887/ DCNS.2020.22.2/lvalmaggia. Bem, Daryl J. 1972. Self-perception theory. In Advances in experimental social psychology, ed. Leonard Berkowitz, vol. 6, 1–62. Academic Press. https://doi.org/10.1016/ S0065-­2601(08)60024-­6. Berger, Charles R., and Richard J.  Calabrese. 1975. Some explorations in initial interaction and beyond: Toward a developmental theory of interpersonal communication. Human Communication Research 1 (2): 99–112. https://doi.org/10.1111/j.1468-­2958.1975.tb00258.x. Brey, Philip. 1999. The ethics of representation and action in virtual reality. Ethics and Information Technology 1 (1): 5–14. https://doi.org/10.1023/A:1010069907461. ———. 2014. The physical and social reality of virtual worlds. In The Oxford handbook of virtuality, ed. Mark Grimshaw, 42–54. Oxford University Press. Chalmers, David J. 2017. The virtual and the real. Disputatio 9 (46): 309–352. ———. 2022. Reality +: Virtual worlds and the problems of philosophy. Dublin: Allen Lane. Chang, Eunhee, Hyun Taek Kim, and Byounghyun Yoo. 2020. Virtual reality sickness: A review of causes and measurements. International Journal of Human–Computer Interaction 36 (17): 1658–1682. https://doi.org/10.1080/10447318.2020.1778351. Cranford, M. 1996. The social trajectory of virtual reality: Substantive ethics in a world without constraints. Technology in Society 18 (1): 79–92. https://doi.org/10.1016/0160-­791X(95)00023-­K. Damer, Bruce, and Randy Hinrichs. 2014. The virtuality andreality of avatar cyberspace. In The Oxford handbook of virtuality. Oxford University Press. https://doi.org/10.1093/oxfor dhb/9780199826162.001.0001. Dunn, Robert Andrew, and Rosanna E. Guadagno. 2012. My avatar and me – Gender and personality predictors of avatar-self discrepancy. Computers in Human Behavior 28 (1): 97–106. https://doi.org/10.1016/j.chb.2011.08.015. Dunn, Robert, and Rosanna Guadagno. 2019. Who are you online?: A study of gender, race, and gaming experience and context on avatar self-representation. International Journal of Cyber Behavior, Psychology and Learning 9 (July): 15–31. https://doi.org/10.4018/ IJCBPL.2019070102. Fong, Katrina, and Raymond A. Mar. 2015. What does my avatar say about me? Inferring personality from avatars. Personality & Social Psychology Bulletin 41 (2): 237–249. https://doi. org/10.1177/0146167214562761. Ford, Paul J. 2001. A further analysis of the ethics of representation in virtual reality: Multi-user environments. Ethics and Information Technology 3 (2): 113–121. https://doi.org/10.102 3/A:1011846009390. Fox, Jesse, J.N. Bailenson, and T. Ricciardi. 2012. Physiological responses to virtual selves and virtual others. 
Journal of Cyber Therapy & Rehabilitation 5 (March): 69–73. Freeman, D., S.  Reeve, A.  Robinson, A.  Ehlers, D.  Clark, B.  Spanlang, and M.  Slater. 2017. Virtual reality in the assessment, understanding, and treatment of mental health disorders. Psychological Medicine 47 (14): 2393–2400. https://doi.org/10.1017/S003329171700040X. Horsfield, Peter. 2003a. Continuities and discontinuities in ethical reflections on digital virtual reality. Journal of Mass Media Ethics 18 (3–4): 155–172. https://doi.org/10.1080/0890052 3.2003.9679662.


———. 2003b. The ethics of virtual reality: The digital and its predecessors. Media Development 50 (2): 48–59. Jin, Seung-A. Annie. 2012. The virtual malleable self and the virtual identity discrepancy model: Investigative frameworks for virtual possible selves and others in avatar-based identity construction and social interaction. Computers in Human Behavior 28 (6): 2160–2168. https://doi. org/10.1016/j.chb.2012.06.022. Kavanagh, Sam, Andrew Luxton-Reilly, Burkhard Wuensche, and Beryl Plimmer. 2017. A systematic review of virtual reality in education. Themes in Science and Technology Education 10 (2): 85–119. Khan, Rabia, and Antonella De Angeli. 2009. The attractiveness stereotype in the evaluation of embodied conversational agents. In Human-computer interaction  – Interact 2009, ed. Tom Gross, Jan Gulliksen, Paula Kotzé, Lars Oestreicher, Philippe Palanque, Raquel Oliveira Prates, and Marco Winckler, 85–97. Berlin, Heidelberg: Springer Berlin Heidelberg. Lauria, Rita. 1997. Virtual reality: An empirical-metaphysical testbed[1]. Journal of Computer-­ Mediated Communication 3 (2): JCMC323. https://doi.org/10.1111/j.1083-­6101.1997. tb00071.x. Lévy, Pierre. 1998. Becoming virtual, reality in the digital age. New York: Plenum Trade. Madary, Michael, and Thomas K.  Metzinger. 2016. Real virtuality: A code of ethical conduct. Recommendations for good scientific practice and the consumers of VR-technology. Frontiers in Robotics and AI 3: 3. https://www.frontiersin.org/articles/10.3389/frobt.2016.00003. Matamala-Gomez, Marta, Ana M. Diaz, Mel Slater Gonzalez, and Maria V. Sanchez-Vives. 2019a. Decreasing pain ratings in chronic arm pain through changing a virtual body: Different strategies for different pain types. The Journal of Pain 20 (6): 685–697. https://doi.org/10.1016/j. jpain.2018.12.001. Matamala-Gomez, Marta, Tony Donegan, Sara Bottiroli, Giorgio Sandrini, Maria V.  Sanchez-­ Vives, and Cristina Tassorelli. 2019b. Immersive virtual reality and virtual embodiment for pain relief. Frontiers in Human Neuroscience 13: 279. https://doi.org/10.3389/fnhum.2019.00279. Mazza, Massimiliano, Kornelius Kammler-Sücker, Tagrid Leménager, Falk Kiefer, and Bernd Lenz. 2021. Virtual reality: A powerful technology to provide novel insight into treatment mechanisms of addiction. Translational Psychiatry 11 (1): 617. https://doi.org/10.1038/ s41398-­021-­01739-­3. Merkx, Celine, and Jeroen Nawijn. 2021. Virtual reality tourism experiences: Addiction and isolation. Tourism Management 87 (December): 104394. https://doi.org/10.1016/j. tourman.2021.104394. Mullen, Grayson, and Nicolas Davidenko. 2021. Time compression in virtual reality. Timing & Time Perception 9 (4): 377–392. https://doi.org/10.1163/22134468-­bja10034. Nowak, Kristine L., and Christian Rauh. 2005. The influence of the avatar on online perceptions of anthropomorphism, androgyny, credibility, homophily, and attraction. Journal of Computer-­ Mediated Communication 11 (1): 153–178. https://doi.org/10.1111/j.1083-­6101.2006. tb00308.x. ———. 2008. Choose your “buddy icon” carefully: The influence of avatar androgyny, anthropomorphism and credibility in online interactions. Computers in Human Behavior, Including the Special Issue: Integration of Human Factors in Networked Computing 24 (4): 1473–1493. https://doi.org/10.1016/j.chb.2007.05.005. O’Brolcháin, Fiachra, Tim Jacquemard, David Monaghan, Noel O’Connor, Peter Novitzky, and Bert Gordijn. 2016. The convergence of virtual reality and social networks: Threats to privacy and autonomy. 
Science and Engineering Ethics 22 (1): 1–29. https://doi.org/10.1007/ s11948-­014-­9621-­1. Peckmann, Carina, Kyra Kannen, Max C.  Pensel, Silke Lux, Alexandra Philipsen, and Niclas Braun. 2022. Virtual reality induces symptoms of depersonalization and derealization: A longitudinal randomised control trial. Computers in Human Behavior 131 (June): 107233. https:// doi.org/10.1016/j.chb.2022.107233.


Peña, Jorge, Jeffrey Hancock, and Nicholas Merola. 2009. The priming effects of avatars in virtual settings. Communication Research 36 (November): 1–19. https://doi. org/10.1177/0093650209346802. Penumudi, Sai Akhil, Veera Aneesh Kuppam, Jeong Ho Kim, and Jaejin Hwang. 2020. The effects of target location on musculoskeletal load, task performance, and subjective discomfort during virtual reality interactions. Applied Ergonomics 84 (April): 103010. https://doi.org/10.1016/j. apergo.2019.103010. Rajan, A.V., N. Nassiri, V. Akre, R. Ravikumar, A. Nabeel, M. Buti, and F. Salah. 2018. Virtual reality gaming addiction. In 2018 fifth HCT information technology trends (ITT), 358–363. IEEE. https://doi.org/10.1109/CTIT.2018.8649547. Ramirez, Erick, and Scott LaBarge. 2018. Real moral problems in the use of virtual reality. Ethics and Information Technology 20 (December): 249–263. https://doi.org/10.1007/ s10676-­018-­9473-­5. Segawa, Tomoyuki, Thomas Baudry, Alexis Bourla, Jean-Victor Blanc, Charles-Siegfried Peretti, Stephane Mouchabac, and Florian Ferreri. 2019. Virtual reality (VR) in assessment and treatment of addictive disorders: A systematic review. Frontiers in Neuroscience 13: 1409. https:// doi.org/10.3389/fnins.2019.01409. Slater, Mel, Cristina Gonzalez-Liencres, Patrick Haggard, Charlotte Vinkers, Rebecca Gregory-­ Clarke, Steve Jelley, Zillah Watson, et  al. 2020. The ethics of realism in virtual and augmented reality. Frontiers in Virtual Reality 1: 1. https://www.frontiersin.org/articles/10.3389/ frvir.2020.00001. Stein, Jan-Philipp. 2021. Conjuring up the departed in virtual reality: The good, the bad, and the potentially ugly. Psychology of Popular Media 10: 505–510. https://doi.org/10.1037/ ppm0000315. Willmer, Gareth. 2022. Therapeutic Digital Gaming and VR to Level-up Treatment for Addiction | Research and Innovation. 2022. https://ec.europa.eu/research-­and-­innovation/en/ horizon-­magazine/therapeutic-­digital-­gaming-­and-­vr-­level-­treatment-­addiction. Yee, Nick, and Jeremy Bailenson. 2007. The Proteus effect: The effect of transformed self-­ representation on behavior. Human Communication Research 33 (July): 271–290. https://doi. org/10.1111/j.1468-­2958.2007.00299.x.

Chapter 7

Ethical Problems of the Use of Deepfakes in the Arts and Culture

Rafael Cejudo

Abstract  Deepfakes are highly realistic, albeit fake, audiovisual contents created with AI. This technology allows the use of deceptive audiovisual material that can impersonate someone’s identity to erode their reputation or manipulate the audience. Deepfakes are also one of the applications of AI that can be used in cultural industries and even to produce works of art. On the one hand, it is important to clarify whether deepfakes in arts and culture are free from the ethical dangers mentioned above. On the other hand, there are specific ethical issues in this field of application. Deepfake technologies can be used to include the performance of deceased persons in audiovisual materials for the dissemination of cultural heritage, and to generate images or sounds in the style of real authors, but deepfakes are also potentially misleading by blurring the boundary between artwork and reality, and raise questions about the relationship between technology, artificial creativity and authorship. The answer to these problems requires an analysis of the normative foundations of copyright to accommodate the new role of AI.

7.1 Introduction: What Is a Deepfake? Why Could It Be Dangerous?

The production of deepfakes is one of the technological breakthroughs achieved in the creation of audiovisual content through AI, alongside automated image databases, systems to improve photo resolution and algorithms for predicting video images.

This publication resulted from research supported by Grant PID2021-128606NB100, funded by the Spanish MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe".

R. Cejudo (*)
Department of Social Sciences and Humanities, University of Cordoba, Córdoba, Spain
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
F. Lara, J. Deckers (eds.), Ethics of Artificial Intelligence, The International Library of Ethics, Law and Technology 41, https://doi.org/10.1007/978-3-031-48135-2_7


More precisely, deepfakes are artificially produced images, videos or audios that human minds cannot distinguish from reality, that is to say, from audiovisual material that reproduces reality (Ajder et al. 2019; Paris and Donovan 2019; Spivak 2019). Deepfakes therefore exploit the difference between reality and appearance, and between artlessness and artifice. The first examples arose from the opportunity to use AI to generate pornographic material by impersonating the features of celebrities in videos (Cole 2018). Enthusiasts shared this type of pornographic content on the website Reddit and even disseminated software for the purpose. Someone identifying himself on Reddit by the word "deepfakes" provided a free and easy-to-use program (called FakeApp), and his nickname came to denote this type of AI technology. Reddit banned the community of deepfake fans in 2018. However, deepfakes have improved greatly since then, to such an extent that AI now makes it possible to generate images and voices of completely non-existent people. There are now many deepfakes on the internet. As early as 2019, a report identified 14,678 deepfake videos accessible online, 96% of which were pornographic in nature, but current uses are quite varied, including the arts and culture. Deepfakes are generated through AI techniques based on deep learning (Knight 2017; Westerlund 2019), and they require very large neural networks trained on huge amounts of data. There are two main methods for creating deepfakes. On the one hand there is the encoding-decoding system, which replaces visual or auditory features of one subject with those of another. This makes it possible to change one person's face for that of another (a deepface), to simulate someone saying something they do not say (a deepsound), and to give movement to still images, such as paintings. To do this, an AI program uses a huge set of audio and/or video recordings of real people, resulting in audiovisual material in which real people appear to say or do things that do not correspond to reality. On the other hand, the GAN (Generative Adversarial Network) method makes it possible to create images of objects or people that do not exist, although ultimately real images and audio are also used as initial inputs. The method sets one neural network (called a "generator") to produce audiovisual outputs while another neural network (called a "discriminator") tries to determine whether each output is real or fake. When the discriminator can no longer identify the potential deepfake as such, the adversarial process comes to an end.
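To make the adversarial mechanism concrete, here is a minimal sketch of a GAN training loop. It illustrates the general technique only and is not code from any actual deepfake tool: it uses PyTorch on toy vectors rather than images or audio, and the network sizes, the synthetic "real" data and the informal stopping rule are all assumptions of mine.

```python
# Minimal GAN sketch (illustrative assumptions only): a generator learns to produce
# samples that a discriminator can no longer tell apart from "real" ones.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64  # assumed toy dimensions

# "Generator": maps random noise to a fake sample (a stand-in for an image or audio clip).
generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh(),
)

# "Discriminator": outputs the probability that a sample is real rather than generated.
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def real_batch(n=32):
    # Stand-in for a batch of genuine recordings; here just structured random data.
    return torch.randn(n, data_dim) * 0.5 + 1.0

for step in range(5000):
    real = real_batch()
    noise = torch.randn(real.size(0), latent_dim)
    fake = generator(noise)

    # 1) Train the discriminator to separate real samples from generated ones.
    opt_d.zero_grad()
    d_loss = (loss_fn(discriminator(real), torch.ones(real.size(0), 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(real.size(0), 1)))
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to make the discriminator label its output as real.
    opt_g.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(real.size(0), 1))
    g_loss.backward()
    opt_g.step()

    # Informal stopping rule mirroring the text: stop once the discriminator is
    # reduced to roughly guessing (scores near 0.5 on generated samples).
    if step % 500 == 499 and discriminator(fake.detach()).mean().item() > 0.45:
        break
```

Production systems replace the toy vectors with image tensors and convolutional networks, and the encoding-decoding (face-swap) method described above typically trains one shared encoder with a separate decoder per person, but the basic division of labour between networks is of this kind.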


Automatically produced texts do not invite immediate belief: as with any other text, receivers cannot rule out in principle that they are an attempt at deception. In fact, there are many kinds of written messages that cannot be taken as literally true, for example ciphertexts or many literary texts. In contrast, before the existence of deepfakes, the only way to obtain images and audio of humans was as reproductions of real individuals and voices. Deepfakes will undoubtedly modify our perceptual habits, as did other audiovisual production techniques such as engraving, photography and film (Benjamin 1936; Habgood-Coote 2023). But until this happens, the spontaneous tendency is to believe that recorded images and voices provide truthful information if they are indistinguishable from those we would perceive directly. This is the reason why photographs are used in passports and identity documents to identify their holders. Therefore, the ethical risks of deepfakes are obvious. Not by chance is the word "deepfake" a play on words that simultaneously evokes deep learning networks and fakes. As stated above, deepfake technology began as a way to produce pornographic material featuring celebrities without their permission. Beyond impersonating someone for abusive sexual gratification, deceptive fabrications through deepfakes can erode someone's reputation through pornographic hoaxes, depicting them performing disgraceful actions or even committing crimes. Deepfakes are therefore powerful tools for spreading slander; for example, they can be used as a hoax for revenge after a relationship has ended. Also, deepvoices of celebrities and relatives can be used to fraudulently solicit money, and deepfaces and deepvoices of loved ones can be used to demand ransom by faking a kidnapping. Similarly, it is easy to falsify identification documents by means of deepfakes (De Rancourt-Raymond and Smaili 2022; De Ruiter 2021; Graber-Mitchell 2021; Meskys et al. 2020). Deepfakes also have perverse uses in the political arena (Chesney and Citron 2019a, b; Franks and Waldman 2019; Paterson and Hanley 2020). They can be a dangerous instrument of manipulation, as they can make people believe that some public figure is saying or doing whatever the hoaxer wants. More generally, deepfakes undermine the trust people usually have in audiovisual material, such as recordings or photographs (Harris 2021; Rini 2020). For this it would not even be necessary for there to be many deepfakes; it would be enough for society to consider the probability of encountering them to be high. In that case there might be a risk of epistemological apocalypse (Habgood-Coote 2023), because trust in the testimonial value of recordings would have disappeared. The following section addresses the ethical issues raised by the use of deepfakes in the field of art and culture. The aim is not to investigate each of them in depth, but rather to provide a catalogue of issues that demand the attention of ethicists, cultural managers, and audiences. For ease of reading, it is structured into subsections that deal with the differences between the two main techniques for obtaining deepfakes (Sect. 7.2.1), the possibility that artistic deepfakes maliciously deceive the audience (Sect. 7.2.2), the possibilities and risks of using deepfake avatars of deceased authors (Sect. 7.2.3), and the potential of GAN technology to copy the style of famous authors (Sect. 7.2.4). This use of deepfakes raises the more general question of whether AI technologies applied to arts and culture, or AI-for-the-arts, should be considered an author. Section 7.3 expands on this issue by distinguishing a deontological and a consequentialist approach to justifying copyright, arguing that the latter is more appropriate for the case of AI-for-the-arts. Section 7.4 is a brief recapitulation of the whole chapter.


7.2 Are Deepfakes Applied to Arts and Culture Harmful?

7.2.1 Encoding-Decoding and GAN Deepfakes

The application of deepfakes to arts and culture seems free of the above dangers. First, deepfakes produced through encoding-decoding can be deceptive, but they are beyond moral reprobation if confined to art. Regarding this, I shall support the Millian argument that artistic freedom should be unrestricted except when the artwork causes harm or directly instigates harm (and it is beyond doubt that artistic deepfakes do not represent harm in themselves, unlike a work of art based on damaging an animal, such as the genetically engineered fluorescent GFP Bunny by the artist Eduardo Kac). Therefore, deepfakes that illegitimately appropriate the personal appearance of real people are objectionable even if limited to artistic purposes, but only by reason of such illegitimate appropriation. If the individuals depicted in the deepfake freely consent, there is no reason for criticism, let alone censorship (unless one endorses a general rejection of the technological change involved in the use of deepfakes). This would be the case for the art project Spectre by Bill Posters and Daniel Howe, which includes a deepfake of Mark Zuckerberg (www.billposters.ch), and for the viral Tom Cruise deepfake videos on TikTok (Stump 2021). Deepfakes achieved through GAN technology do not raise issues of appropriation of personal traits, as they do not represent anyone real; thus, this sort of deepfake is free from the burden of lack of consent. On the other hand, GAN deepfakes challenge notions of mimesis and art. At the outset, it could be argued that they cannot be art, nor even be comparable to art, because of their perfection as purely specular, non-artistic representations. According to this view, the ethical neutrality of non-harmful works of art does not apply to such deepfakes. Incidentally, this point bears similarity to the view held when photography was in its infancy. Its corollary is a division of labour between the production of images or sounds through AI and their use by artists: it is in the latter that the artistic possibilities lie, not in the artifacts made by an AI robot or algorithm. AI is often seen as a technology that can help artists, not as a form of artistic expression. Like AI today, the daguerreotype photographic process was described as being mathematically precise. Daguerreotypes and deepfakes would both be obtained by a mechanical causality unrelated to artistic and cultural processes (Naude 2010). As with the daguerreotype, the illusion of reality implied by deepfakes depends on the sharpness and richness of detail of their perceptual contents. Unlike paintings, daguerreotypes (and later photographs) gratified the audience with the feeling that they were being confronted with a representation of something real. Richness of detail was the proof of authenticity, as also seems to be the case for deepfakes. Like the daguerreotype, AI is initially being used for commercial rather than artistic purposes, although this analogy cannot be taken very far, since creative industries did not exist in the mid-nineteenth century. Pornography was also one of the first uses of the daguerreotype (especially those obtained by the stereoscopic technique, which provided even greater realism) (Rosen 2023; Starl 1998). Later, coinciding with the invention of the calotype or talbotype, photographs came to be used as models for artists.


In his book The Pencil of Nature, Talbot (1844) included a picture of a bust of Patroclus, suggesting that it could serve as a model for artists. The title of Talbot's book illustrates the idea that photographs would be obtained through a purely causal relationship, with no human intervention: it is nature itself that creates the image and represents itself in the photograph. The current narrative on AI is similar, as deepfakes would be the pencil of technology that provides, almost miraculously, perceptual illusions. Moreover, the talbotype inaugurated modern photography because it was based on obtaining a negative from which many copies could be printed. For this reason, talbotypes initiated the emancipation of images from their originals, a process culminating in GAN technology (Starl 1998). This would be the "brush of technique" that generates self-referential audiovisual content, since GAN deepfakes refer only to data inputs. Aristotle, in Poetics IV 1448b.4, argued that we derive pleasure from representation, even of ugly things, for the pleasure of recognizing what is represented. As Gombrich noticed, it also works the other way around, because a real situation sometimes resembles "a Whistler or a Pissarro" (Gombrich 1982, p. 14). In contrast, GAN deepfakes are mere self-referential representations in which there is nothing to recognize. Once discovered, GAN deepfakes cause a feeling of perplexity and disorientation analogous to the uncanny valley effect elicited by androids (Mori et al. 2012; Wang et al. 2015). The standard relation of mimesis has been reduced in GAN deepfakes to such an extent that, because there is no original, it becomes a parodic form of mimesis. Such images provide a type of deception (Goodman 1976) more extreme than the usual forms of representational illusion, since they are based not on the confusion of the image with its object, but on the absence of an object. The image becomes a mere signifier, as Magritte's paintings illustrate, which raises the question: how does this affect art and culture from an ethical point of view?

7.2.2 The Moral Limit of Artistic Illusion

I shall assume that art (visual or literary) is not referential speech, so it is not empirically falsifiable. However, there are cases in which the artistic work uses real data. For example, there are non-fiction novels, such as Capote's In Cold Blood and Forsyth's The Day of the Jackal, not to mention a great deal of artistic photography. In these cases, artistic merit depends in part on fidelity to historical or empirical truth. Given that deepfakes seem completely real but are not, since the object they represent is either not shown as it is represented or does not exist at all, the question of the morality of the artistic illusion achieved through them must be raised. Applying the general rule, artistic illusion is morally right as long as the illusion does not cause harm to the audience. This means staying within the boundaries of art, i.e., the work must be recognizable as a work of art. Accordingly, the artist does not intend to deceive the audience. She is trying to communicate an authentic
experience without preventing the audience from identifying the object of the experience as fictitious. In other words, trompe l'oeil is admissible as long as it does not cause an accident. It is not permissible for someone to falsely shout fire and cause a panic, as Justice O. W. Holmes pointed out (Feinberg 1992, p. 136). Nor is it permissible if the shouting of "fire" were part of an artistic performance (Cejudo 2021). Orson Welles' radio narrative The War of the Worlds would be a borderline case. So would deepfakes.

Deepfakes were used to generate a younger version of actress Carrie Fisher for the film Rogue One (Edwards 2016), part of the Star Wars saga, and a posthumous appearance of Fisher was deepfaked for The Last Jedi (Johnson 2017). In these films a character (Princess Leia), not a real individual (Carrie Fisher), is impersonated by a deepfake. The result was surprising to a large portion of the public accustomed to associating the character of Princess Leia with the actress Carrie Fisher. However, the artistic illusion is still confined to fiction. The next case is a little more problematic, even if the technology used is less developed (Paris and Donovan 2019). In the movie Forrest Gump (Zemeckis 1994) there is a cameo of President Kennedy. Obviously, it is fake, since Kennedy died in 1963. In reality, the actor Tom Hanks (Gump) appears to interact with Kennedy thanks to digital compositing. This case is different from the previous ones because the audience is confronted with the appearance of someone who was not an actor and yet is interacting with a character in the film. Nevertheless, the deepfake can also be easily deactivated because the audience knows that Kennedy is dead and, therefore, can perceive and enjoy the film as such, i.e., as fiction.

The aforementioned Spectre project, by Bill Posters and Daniel Howe, is a case that goes one step further. It was an art installation intended to show how big tech companies threaten privacy and democracy. It included several videos of AI-generated deepfake celebrities and artists, such as Kim Kardashian, Freddie Mercury and Marcel Duchamp, and also a deepfake featuring Mark Zuckerberg accusing himself of controlling people through stolen personal data. Although the authors identified it as a deepfake, they state that their installation elicited "unexpected – and contradictory – official responses from Facebook, Instagram and Youtube" (https://billposters.ch/projects/spectre/). In fact, Zuckerberg's deepfake video went viral and Posters acknowledged that he was "operating ethically in a difficult space" (Pyatt 2020).

Indeed, interpreting an artistic representation requires some learning of the conventions of representation, including those of realistic representation (Goodman 1976; Gombrich 1987; Panofsky 1991). Viewers of early Lumière films would surely be frightened to see a train in the theater, but they quickly learned to distinguish appearance from reality. The problem with deepfakes is that this may be impossible. Hume (1826) considered that a work of art is aesthetically worse if it expresses a commitment to moral evil because, in such a case, it is more difficult to empathize with the overall meaning of the work. Hume did not advocate any moral absolutism, merely pointing out that immoral works of art are those that go against well-established patterns of moral approval and rejection. Such is the case of an artwork in which there is no way of knowing whether the whole work, or a substantial
element of it, is a fake. The impossibility of deactivating the illusory effect of the deepfake would prevent distinguishing the work from reality, so the work of art loses part of its potential. Even considering that the aesthetic experience entails identification with the work (as in Aristotle’s theory of tragedy), or the suppression of the boundary between the image or text and the environment (as in the guerrilla art movement), art is based on the separation between art and non-art. In other words, a basic goal of any artistic work is the possibility of being taken as an artistic work, and this implies some way of distinguishing the work of art from what it is not (reality). This distinction goes beyond the visual arts, since it is not a matter of distinguishing representation from what is represented, but of distinguishing something qua representation.

7.2.3 Resurrecting Authors

All ethical doubts are variants of the risk of lying implied by deepfakes, given that they are potentially a refined tool for deception. To a certain extent, postmodern authors such as Baudrillard (1981) and Deleuze (1990) foresaw the ethical harms of deepfakes. These authors denounced the way the cultural and entertainment industry multiplied copies without originals, in an exercise of confusion between meanings and signifiers, that is, signifiers that represent something non-existent. Since the 1980s, reality and fiction have been intertwined in contemporary culture, a process reinforced by social networks and the prospect of generalized virtual interaction in the metaverse. Consequently, deepfakes are the latest technological expression of a social ethos that began with postmodernity. Baudrillard understood that simulacra, circulating through marketing as mere signifiers, have an extraordinary field of application in economic relations. In fact, an important part of the ethical doubts and harms caused by deepfakes have repercussions in the creative industries and in cultural management.

An interesting case is that of an interactive avatar of Dalí in the Dalí Museum in St. Petersburg (Florida, USA). For the Dalí Lives exhibition, an advertising agency created a life-size recreation of the Spanish painter consisting of an interactive animation based on composition and decomposition technology using more than 6000 frames. An algorithm was trained to reproduce Dalí's face with 1000 hours of learning. These facial expressions were then imposed on an actor with Dalí's body proportions. Excerpts from his interviews and letters were synchronized with deepfake audio to mimic his accent. The result was that visitors could hold various conversations with the deepfake of Dalí, who even took a selfie and shared it with them (Lee 2019). The deepfake experience was designed to bring the famous painter closer to visitors, provoking in them feelings of proximity and familiarity as a way of creating a personal bond with the painter and his work. Dalí Lives displays the extraordinary possibilities of deepfakes for cultural management, specifically for heritage interpretation. This case shows how deepfakes can be used to broaden public involvement and engagement with heritage items.
Like augmented reality, deepfakes make it possible to insert a heritage element into the public’s frame of reference (what more could one ask for than to take a selfie with the painter himself!), thus providing a highly useful technique for cultural interpretation. According to Ham’s (1992) definition, this consists in translating specialized language into terms and ideas that the non-specialized public can understand. But this process of interpretation is not merely linguistic, since its ultimate goal is heritage conservation. Cultural interpretation aims to protect heritage through understanding (Tilden 1957), as it is only through understanding that it is possible to incorporate the heritage element into the personal reality of the public. On the other hand, cultural interpretation is characterized by the difference between the interpreter and expert, on the one hand, and the public on the other, who differ in terms of knowledge and interest in heritage. For this reason, surprise and intellectual provocation play an important role. Reviving Dalí through a deepfake is an extraordinary example of this. However, it is not difficult to see the risk of over-provocation, trivialization and even falsification in this resurrection of authors through deepfake avatars. The risk is greater in the domain of cultural management since the concept of heritage is not straightforward (Lowenthal 2015). Obviously, Dalí is no longer alive and cannot speak or correct what his avatar says. It is the professional responsibility of museum managers and heritage educators to ensure that the deepfake does justice to historical reality (to the real painter in Dalí Lives). The problem of factual veracity should not be obscured by the wow effect of the deepfake and its playful dimension. Who was the real Dalí? This problem is not even addressed in the deepfake avatar, which merely personifies the clichéd public figure. At the same time, the visitors’ idea of Dalí, i.e. their concept of who Dalí was, changes inadvertently, because the Dalí deepfake provides them with a new kind of information to construct their concept of Dalí. The historical figure is replaced by his virtual namesake, while the difficult access to the painter’s significance through the history of art is supplanted by Dalí himself, an affable and uncanny guide of his own works. On the other hand, the museum presents the Dalí deepfake as a fait accompli, without questioning its ontological ambiguity. Empathy with Dalí, or more precisely with the simplified and enjoyable character performed by the deepfake, comes to the forefront. Consequently, a direct emotional response replaces a critical vision, and a fabricated memory replaces the history of art. Indeed, it remains unclear whether audience participation is not more about testing the technology than learning about Dalí. In short, the use of deepfakes would be a new version of the frequent practice of inserting marketing into cultural management, as museums have emerged as markets for selling audience engagement (Mihailova 2021). As stated, cultural interpretation requires both fun and intellectual provocation. However, when managing cultural and museum resources, there are complex relationships that may require compromises between the different aims of cultural management, such as visitor maximization, economic sustainability, heritage conservation and transmission to different audiences. Therefore, deepfakes pose a new challenge to the professional ethics of cultural managers.

7.2.4 Falsifying Style

The counterfeiting of cultural items also affects the professional ethics of cultural managers, curators and heritage educators. Moreover, it extends to general issues of creativity and authorship that are discussed in the next section. AI allows for the surprising result of generating audiovisual content that is very similar to other content that is original, i.e. authentic reproductions. One application is to generate images or sounds in the style of real authors, be it a painting in the style of Rembrandt or a piece of music in the style of Bach. It is worth remembering that GAN technology is based on automated fraud detection: two deep neural networks compete with each other, one learning to produce examples ever more similar to those in the original database, the other learning to detect such copies ever more accurately, until the result exceeds human fraud-detection capability (a minimal code sketch of this adversarial loop is given at the end of this passage). Now, to generate an image in the style of Rembrandt is a form of copying Rembrandt. The most sophisticated style-copying initiative to date is The Next Rembrandt project by the Dutch public museum Mauritshuis, the Delft University of Technology, Microsoft and ING (https://www.nextrembrandt.com). The project has as its motto the question: can the great master be brought back to create one more painting? The result is a portrait (a deepface) made in the style of Rembrandt by a high-precision printer that reproduces the texture and thickness of a Rembrandt painting. Deepfakes ensure the possibility (or the illusion) of having additional works by the masters of art history, perhaps on subjects on which the artist did not work, and which in any case allow the museum's collection to be expanded virtually and without limit. And as in the case of Dalí Lives, deepfakes provide a new way of storing artworks (Mihailova 2021), a type of archive that allows not only unlimited storage, but also reproduction of the archived works.

From an ethical point of view, a distinction must be made between fakes and forgeries (Stalnaker 2005), because intentionality has prima facie relevance for ethical judgments. A fake is a non-deceptive copy, either because there is no intention to defraud or because it is too inaccurate to deceive. By contrast, a forgery is a deceptive copy fabricated with the intent to deceive. The Next Rembrandt case is one of fake, not forgery, but the extraordinary resemblance that deepfakes achieve represents an attractive opportunity for forgeries. Furthermore, there are two types of forgeries (Stalnaker 2005). Referential forgery is the copying of an existing work; in this case, it is possible to unmask the forgery by comparing it with the original. Inventive forgeries, however, are works created by emulating a known style. For example, Van Meegeren forged Vermeer's style in his The Supper at Emmaus (Blankert 2007). Vermeer had never painted this work, so the lack of a copied original made it impossible to verify that it was a forgery by this means. Van Meegeren's painting had Vermeer's signature, and although a forged signature is a very relevant issue in identifying a work as a forgery, it is not foolproof. However, GAN technology makes superbly executed inventive forgeries possible.
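The adversarial loop described above can be made concrete with a minimal sketch. The following toy example is purely illustrative: it uses PyTorch on a one-dimensional Gaussian "corpus" instead of images, and all architectures, names and hyperparameters are assumptions of this sketch, not a description of how any of the systems discussed in this chapter (GAN face generators, The Next Rembrandt, DALL·E) were actually built.

```python
# Minimal GAN sketch: a generator learns to imitate a "style"
# (here, a 1-D Gaussian standing in for a corpus of artworks),
# while a discriminator learns to tell real samples from generated ones.
import torch
import torch.nn as nn

def real_data(n):
    # Stand-in for the "original database" of authentic examples.
    return torch.randn(n, 1) * 0.5 + 2.0

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # 1) Train the discriminator: label real samples 1, generated samples 0.
    real = real_data(64)
    fake = G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator: try to make the discriminator label fakes as real.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

# Samples that should now resemble the "authentic" distribution.
print(G(torch.randn(5, 8)).detach().squeeze())
```

After enough iterations the generator's samples become statistically close to the "authentic" ones, which is the sense in which the text says the result eventually exceeds human fraud-detection capability; scaling this idea to a painter's style would replace the toy distribution with a large corpus of digitized works and far larger image networks.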

Just as resurrecting an author by means of his or her deepfake avatar is quite convenient in heritage interpretation, resurrecting that author’s style can also be very useful. In both cases, these interpretative strategies need proper ethical surveillance to prevent the public from confusing fact and fiction. Deepfake avatars are easy to detect contextually, i.e., the public can clearly know whether the deepfake corresponds to a deceased person and, if not, the deepfaked person can show his accord with the messages of his deepfake avatar. But this is not possible in the case of the reproduction of a style, especially of a deceased author. Only an expert audience, or some AI device, could identify the style as copied. However, any good forgery escapes the screening of the layperson. Indeed, given any team of experts equipped with the best available technology to identify forgeries, it is conceivable that there will be some forgery that such a team will not be able to identify. Goodman (1976) considers that the never-to-be ruled out possibility of telling the forgery and the original apart makes the original aesthetically superior even if it is indistinguishable, the reason being the original will always be capable of revealing qualities important for the valuation of the work. Nevertheless, this argument rests on factual premises, and we can envisage deepfake technologies that mimic an author’s style to the point that they are absolutely indistinguishable from an original work. On the other hand, a deepfake could be considered inferior to the original even if it is perceptually identical. Apart from aesthetic authenticity, historical authenticity refers to the possibility of establishing a causal connection between a given work of art and its author (Stalnaker 2005). This causal connection is a key aspect of the concept of authorship, as shall be explained in the next section. Historic authenticity guarantees that the artwork is part of the author’s personal world, comprising biographical, cultural, and social aspects. Benjamin’s (1936) concept of aura points to this kind of authenticity. But the historical link is missing in the artistic deepfake, even if it were molecule for molecule identical to a work created by the author. In summary, deepfakes threaten the usual ways of identifying authenticity or, even further, the very notion of authenticity itself, since it is also questionable to what extent the distinction between authentic and fake deepfakes makes sense. Distinguishing between the authentic and the fake is essential not only for traditional, but also for digital art (Innocenti 2014). The fantasy of a next Rembrandt, in addition to pleasing the public and encouraging its involvement, reveals the role of museums as producers of authority in the world of art and culture and, given the corporations that support The Next Rembrandt project, it is easy to see the need to situate artistic deepfakes in the context of authorship and copyright.

7.3 The Limits of Authorship

International treaties and copyright laws have been drafted on the basis of implicit consensus regarding the meaning of certain basic terms, especially "author" and "creation" (Salami 2021; Skiljic 2021). In other words, there are no explicit and independent definitions of the two concepts. The usual formulation is that an author
is the person who creates an intellectual work, or who carries out an intellectual creation. In turn, such intellectual creations, which are attributed the possibility of constituting a special type of property (non-industrial intellectual property), are defined ostensively. Article 2 of the Berne Convention states that literary and artistic works comprise any production in the literary, scientific or artistic domain. This treaty is one of the most widely recognized international agreements in the area of intellectual property law and guides many national regulations. For example, US copyright law states that copyright protection subsists "in original works of authorship" (17 U.S.C. § 102), but neither authorship nor original is defined. Instead, an open-ended list of works of authorship is given. Similarly, Article 2 of Directive 2001/84/EC on the resale right for the benefit of the author of an original work of art states that "original work of art means works of graphic or plastic art such as pictures, collages, paintings […] provided they are made by the artist himself or are copies considered to be original works of art". Once again, the meanings of made by the artist and original are not considered controversial. This procedure of ultimately resorting to the conventional meanings of natural language seems a reasonable practice in terms of clarity and economy of exposition. However, it is not clear that deepfakes imitating the style of a real author "are copies considered to be original works of art", nor that an AI-generated poem is original. Artificial intelligence now permits the generation of results comparable to intellectual creations, to the extent that AI technologies are already considered by many as a new kind of author, while the need arises to legislate to whom such results belong and, therefore, whether AI should be considered an author that creates literary and artistic works.

When asked what should be understood by author, the AI chatbot ChatGPT replies that it is the person who creates or is responsible for the creation of a literary, artistic, or scientific work. To the question of whether ChatGPT itself is an author, its answer (correctly, as I shall explain) is that it is not. It rightly places the word in quotation marks when it states that "my 'creativity' [sic] is based on my ability to combine and generate answers". The problem is that we are already at a stage in AI where outputs such as deepfakes are produced that are indistinguishable from works of human authorship, including the kind protected by copyright, i.e. what we can loosely call works of Culture. Is it correct, then, to posit a digital authorship (Bridy 2016) for works of Culture? Would it be like traditional human authorship?

To answer the above questions, we must begin by analyzing the normative foundations of copyright law and seeing whether they are applicable to the case at hand. As expected, we find the possibility of a deontological approach and one of a consequentialist character (Craig 2002; Fisher 2007). In the first case, copyright, i.e. the author's property and control over his works, is based on some argument that the work is a type of extension of the author. Within this approach, one possibility is of Lockean origin, and holds that the author has a right to his work because it is the fruit of his labor. Another possibility is of Kantian inspiration and considers that the author's autonomy is externalized in his work. In both versions of the deontological approach, the author must be human, otherwise the argument would be meaningless. Current copyright legislation establishes as a general rule that only natural persons can be authors. Therefore, the deontological approach refers to a
definition of the author as a human being who performs a creative activity, and the meaning of creation is defined differently in each variant. In the Lockean approach, creating is a kind of laborious mental task whereby the author adds something of his own to what did not exist before (as in a painting), or was an undifferentiated part of a common good (as in popular music), or did not have the symbolic dimension that his action has conferred on it (as in a surrealist readymade object) (Locke 1963; Moore 2012). This effortful addition justifies the ownership of the work. The other version of the deontological approach would be an application of personal autonomy to space and time, which justifies copyright on human personality rather than effort. This Kantian-inspired argument (Drassinower 2015; Ulmer 1980) can assume a Hegelian form if creation is seen as an objectification of subjective will, and thus creation would be the imprinting on the world of a unique personal meaning (Drahos 1996; Hughes 1988). It is not difficult to link this Hegelian argument with the Lockean approach to arrive at a Marxist notion of intellectual creation as objectification of the subject through labor. In any case, unlike the Lockean argument, the other deontological versions allow us to consider the right of authorship as a personal right that goes beyond the right to possess what has been created. According to these conceptions, authors must be humans, or more precisely, that which the Western metaphysical tradition characterizes as persons. Moreover, the implicit concept of creation is that of externalization of that personality. For both reasons, it would be incorrect to admit digital authorship, namely that AI can be an author.

The consequentialist approach to copyright argues that intellectual creations should be protected by granting authors exclusive rights because this incentivizes the creation of such works, which ultimately benefits the general welfare (Fisher 2007). It should be noted that the social goals justifying copyright do not necessarily have to be the maximization of individual utility (Landes and Posner 1989). It could also be promoting the production of cultural items to achieve a richer or more just cultural life (Cejudo 2022; Netanel 1996). On the other hand, it is necessary to point out that copyright is then no longer inherent to the personality of its holder, but is rather a moral and legal instrument. As Mill says in Utilitarianism, having a right "is to have something which society ought to defend me in the possession of. If the objector goes on to ask why it ought, I can give no other reason than general utility" (Mill 2015, p. 167). The challenge of AI in the artistic field suggests extending the scope of the consequentialist approach to refer not only to how to justify copyright, but also to whether there should be authors; that is, whether the moral and legal institution we call authorship, the institution used to attribute merit, recognition, remuneration and responsibility, is worth it in terms of the general goals pursued. The potential attribution of authorship to AI would be a particular case. In short, the consequentialist approach allows us to reformulate the problem of authorship, while suggesting a way to approach it. The more general question would be: what are the sufficient conditions of authorship given a particular context? And in the case of AI and artistic deepfakes, what kind of authorship and author suffices for AI in the cultural domain?
I will thus address the above questions (Is it correct, then, to propose “digital authorship” through AI? Would it be like traditional authorship?) under the consequentialist umbrella; that is, what kind of authorship and author is sufficient for AI in the cultural sphere if we want to achieve a richer cultural life? I will begin by delving a little deeper into the implicit concept of author that we deal with in works of culture: the romantic author. It is important to do so because the deepfakes that falsify style disturbingly threaten this unnoticed but widely internalized conception. While the deontological approach provides strong support for the implicit belief in the romantic author, this belief is also compatible with the consequentialist approach, the difference being that only the latter can accommodate conceptions of authorship that differ from that of the romantic author. This is a modern conception, largely created by nineteenth-century literary figures and the idealistic philosophy of the time (Bennett 2006; Bracha 2008; Chon 2020; Craig and Kerr 2021). However, this conception lies at “the normative heart of our view of copyright” (Bracha 2008, p. 188). It consists in conceiving the author as an individual who works alone creating original pieces. His work is due to his autonomy, to the extent that only the author is deserving of moral recognition for his artistic or cultural works. The underlying notion of creativity is not that of a collective and coordinated endeavor that achieves novelty by reformulating what already exists. On the contrary, the model would be the biblical creatio ex nihilo (Hartley et al. 2013), where the paradigmatic author is the genius whose inventiveness and skill breaks out of the inertia of custom and tradition. Like the biblical God or Plato’s demiurge, romantic authors infuse new meaning into what already exists. They have the right to leave in their works of culture the mark of their authorship, their signature, and since the work is a result of their personality, to falsify it is not only stealing from them but causing serious moral harm by attempting to conceal a part of their personality. In conclusion, the romantic author treasures a personal and untransferable resource, inspiration, which entitles him to a quasi-rent guaranteed through copyright. This conceptualization perfectly fits into the Lockean or Hegelian justifications of copyright, but also into the consequentialist approach, under which copyright is the result of a social contract that gives certain individuals (the romantic authors) ownership of their cultural creations as an incentive to make them available to the public in exchange for recognition and compensation, thus encouraging them to keep doing their intellectual work. Although there are alternatives to copyright such as creative commons licenses or monetizable reputational gains (Benkler 2006), such alternatives do not require abandoning the romantic author model, as authors remain human individuals who have access to mechanisms for remuneration and protection of their works. AI is a direct challenge to the model of the romantic author, since it would be non-human, ontologically incapable of producing personal results that externalize a mental life. Deepfakes that reproduce the style of great masters appear to us as the factual challenge to the model of the romantic author: it is not necessary to be a genius to paint like Rembrandt; it is enough to be a machine. 
But it is not so easy to eliminate the presupposition of the romantic author and, on the other hand, it is

necessary to determine to what extent it is worth doing so. In the current phase of its development, the anthropomorphization of AI is a pervasive feature of its social impact (Alabed et al. 2022; Epstein et al. 2020; Ryan 2020; Salles et al. 2020) and, given the prevalence of the romantic author in cultural production, even more so in AI-for-the-arts. Headlines such as The Kindle Store has a prolific new author: ChatGPT are common (Shanklin 2023). They provide examples not only of the anthropomorphization of AI, but also of the romanticization of its authorial role. This idealization of the function of a machine, be it physical or logical, is based on distinguishing between results and creative processes. Creative processes stricto sensu have to be human, because they involve psychological processes such as imagination or understanding that are very different from the procedures used by AI. However, the argument continues, creative results can be obtained by AI, and it is by reference to them that it would be legitimate to accept digital authors. However, deepfakes reveal that such an approach still uses the concept of creativity of the romantic author, which does not correspond to what AI actually does. The romantic author is an idealization. Arts and culture creation can be conceived under the model of a genius who creates ex novo, but also as a process of cultural appropriation that goes beyond the admissible expedients in the romantic author model, namely multiple authorship and acknowledgements. Alternatively, cultural creation can be understood as “the interplay between similar differences and different similarities” (Hartley et al. 2013), that is to say it would be the result of an endless process of modification of any initial situation. This model is closer to what AI does. As deepfaking a style shows, AI works with historical data, that is, with images, sounds or words previously created by other machines, and ultimately by human authors. ChatGPT and The Next Rembrandt predict which constitutive elements go behind others (words, touches) to fit a symbolic result understandable by a human. It is possible to partially describe the human creative process in a similar way. According to a quite influential part of literary theory, the arts and culture creative process is relational, and authors occupy a node in an indefinitely wide network through which they receive data from their cultural and historical context and, after altering these data, return them to that network where they will be re-­processed by the public (Barthes 1984; Foucault 1969; Kristeva 1980). In his critique of the romantic author, Foucault (1969) suggests replacing the subject (i. e. a human individual) with a function (the authorial function) which, according to Foucault, should not be considered inherently human. Be that as it may, the romantic author model includes ethically relevant features that are worth preserving as authorial functions. Chon (2020) identifies two main prongs in the romantic author: the genius effect and the authorization effect. The former refers to the break with tradition that makes an original work possible. The authorization effect means that the author provides authoritative interpretations of his work. Thus, the author approves a prioritization of cultural practices, imposing order and acting as an arbiter of value. Likewise, he approves stylistic patterns and their replication. 
Other functions are knowing the meaning of the work, not simply recognizing that it has some meaning, as AI can do. This is possible because the author has a certain

aesthetic sense, which also enables the function of participating together with the public in the interpretation, enjoyment, and dissemination of the work. The above functions are not only aesthetically, but also ethically relevant. Indeed, Foucault (1969) begins by asking why it should even matter who is speaking. Yet it does matter: it matters to know who the owner of the work will be, to whom the creation shall be attributed, and who will be held liable for any damage caused by the work. On the other hand, the above-mentioned function of generating something original is a necessary condition for copyright. Furthermore, the authorization effect has consequences for the variety of available cultural items, since the author has power over the legitimate cultural interpretations and appropriations, etc., when the work becomes part of the public domain. In conclusion, the above-mentioned authorial functions can be summarized in the following non-exhaustive list:

(a) Produce something original
(b) Provide authorized interpretations of the work
(c) Authorize stylistic patterns and thus the replication of the work
(d) Understand the meaning of the work
(e) Take responsibility for the harm that the work could cause
(f) Possess the work
(g) Participate with the public in the interpretation, enjoyment, and dissemination of the work.

To be an author is to be able to perform authorial functions. Thus, it would be possible to recognize less than full authors, depending on whether they perform more or fewer functions. The romantic author is the full author typical of the fine arts, literature, or philosophy. In fact, legislation allows for non-full forms of authorship, such as in Spain, where it is possible to be the author of a contribution to a collective work (R.D. 1/1996, art. 8), or the author of a "work made for hire" in US law (Copyright Act of 1976, §101). According to the consequentialist approach, it is the effects of the different authorial functions that are valuable. Therefore, the romantic author could be justified as a socially useful idealization. On the other hand, it is not necessary for these functions to be performed by a human individual, in contrast to the deontological approach. So, should authorial functions be attributed to AI-for-the-arts? The famous forger Hans van Meegeren was an author despite being a forger, and can be blamed for forgery precisely because it is taken for granted that he is an author. So, should the AI technology that makes The Next Rembrandt project possible be considered an author?

The first authorial function is to produce an original work. But different degrees of originality are possible. The highest is originality that breaks with tradition, which is a basic assumption of the romantic author. Deepfakes that copy a style are technologies based on neural networks that find patterns in human behavior (the tradition) and then generate particular instances that fit the pattern and are therefore potentially predictable by humans. This is the opposite of the romantic artist who is deepfaked. The deepfake algorithm is not like an artist but like a craftsman, i.e. it cannot bring about radical innovations. In the cultural field, these do not even consist of unexpected findings, but of new ways of symbolizing (as the invention of
the essay by Montaigne, or the first use of photography for artistic purposes by Nadar). Although the creation of a cultural item may be due to AI, its identification as such, and especially as a valuable element that enriches the cultural heritage, depends on the prior existence of that heritage. Since actual copying is eliminated in GAN deepfakes and other AI-for-the-arts technologies, “coincidence, however improbable, is the truth”, to paraphrase Lord Diplock (Craig and Kerr 2021, p. 38). Therefore, AI-for-the-arts can produce original results if “original” is understood objectively or perceptually, i. e. a cultural item is original simply if it is outwardly different from those already known within its genre (Saiz 2019). The AI cannot perform other authorial functions, such as knowing the meaning of the work, enjoying it or authorizing interpretations as a connoisseur of the work’s genre. But other authorial functions could be attributed to AI-for-the-arts through fictio iuris, in a similar way as the law ascribes criminal liability to companies. From the consequential approach, a cost-benefit analysis is needed to justify the granting of copyright to AI, and to hold it accountable for the deepfakes generated. Such an analysis has to be based largely on objective information on how cultural life is enriched or impoverished by the various alternatives, for example, whether failure to grant AI royalties for the deepfakes generated will discourage research into other potentially useful algorithms. On the other hand, cultural life is “the general intellectual environment in which we all live” (Dworkin 1985, p.  225). It is clear that the proliferation of deepfakes will have an effect on this general environment, but judging it is not only an empirical problem, but also a normative one (Cejudo 2022). The richness of a cultural life is not only a quantitative magnitude, as it also depends on the originality of cultural novelties or the role of cultural heritage in fostering the ability of society to question norms for the benefit of people’s lives. As acknowledged by the developers of the deep learning model DALL-E: While we are highly uncertain which commercial and non-commercial use cases might get traction and be safely supportable in the longer-term, plausible use cases of powerful image generation and modification technologies like DALL⋅E 2 include education (e.g. illustrating and explaining concepts in pedagogical contexts), art/creativity (e.g. as a brainstorming tool or as one part of a larger workflow for artistic ideation), marketing (e.g. generating variations on a theme or “placing” people/items in certain contexts more easily than with existing tools), architecture/real estate/design (e.g. as a brainstorming tool or as one part of a larger workflow for design ideation), and research (e.g. illustrating and explaining scientific concepts) (OpenAI employees 2022).

Indeed, we do not know how this and other technologies will be used when they become widespread, or how they will interact with the different fields and agents of cultural life (from the audiovisual industry to cultural consumption patterns, from cultural heritage to contemporary art). Leaving aside whether current legal protection is sufficient (Hugenholtz and Quintais 2021; Salami 2021; Skiljic 2021), we can venture that it would be counterproductive to attribute copyright to AI-for-the-­ arts. These technologies produce texts and audiovisual content that make sense to humans from other sources that have human origins. Deepfakes are remotely derived from images or voices of real people and mimic the features or style of real authors

(Degli Esposti et al. 2020). As AI parasitizes, we could say, human creations, and does so at an increasing rate, a future in which AI-for-the-arts has utilized the full content of pre-AI cultural life is not far off. It is therefore more useful that the results of AI-for-the-arts be considered as authorless works that can be incorporated into the digital public domain; thus, they will be freely available for the kind of radical cultural innovation that, for now, can only be achieved by human agents.

7.4 Conclusion

Deepfakes are an extraordinary tool at the service of filmmaking (as in the Star Wars saga) or of the fine arts (as in Klingemann's Memories of Passerby I) (Klingemann 2018). However, the ethical drawbacks of artistic deepfakes have been the main topic of Sect. 7.2. Through case studies, the specific ethical risks of using deepfakes in the field of arts and culture have been identified and assessed. Even when deepfakes are free from the ethical pitfalls of their non-cultural uses, arts and culture pose specific challenges both for cultural managers and the public. The application of deepfakes in the film industry and their use to reproduce the style of famous authors raise the question of whether AI should be considered an author capable of producing artistic items. Other aspects of AI-for-the-arts, such as algorithms that generate images or text, also raise this question. This has been the subject of the last section of the chapter. An adequate response requires an analysis of the normative foundations of copyright. It has been argued that a consequentialist view of copyright allows for accommodating the new role of AI, once authorship is decomposed into a set of authorial functions. However, at the current point in the evolution of AI-for-the-arts, it is more desirable for cultural life that its results become part of the cultural public domain.

References Ajder, A., G. Patrini, F. Cavalli, and L. Cullen. 2019. The State of Deepfakes: Landscape, Threats, and Impact. Deeptrace. https://regmedia.co.uk/2019/10/08/deepfake_report.pdf. Accessed 2 November 2023. Alabed, A., A.  Javornik, and D.  Gregory-Smith. 2022. AI anthropomorphism and its effect on users’ self-congruence and self-AI integration: A theoretical framework and research agenda. Technological Forecasting and Social Change 182: 1–19. https://doi.org/10.1016/ j.techfore.2022.121786. Barthes, R. 1984. Le Bruissement de la langue. Essais critiques IV. Paris: Seuil. Baudrillard, J. 1981. Simulacres et Simulation. Paris: Galilée. Benjamin, W. 1936. The work of art in the age of mechanical reproduction. New  York: Prism Key Press. Benkler, Y. 2006. The wealth of networks: How social production transforms markets and freedom. New Haven: Yale University Press.

Bennett, A. 2006. Expressivity: The romantic theory of authorship. In Literary theory and criticism, ed. P. Waugh, 48–58. Oxford: Oxford University Press. Berne Convention for the Protection of Literary and Artistic Works, art. 2, 9 September 1886, revised at Paris on 24 July 1971, and amended on 28 September 1979, 1161 U.N.T.S. 3. Blankert, A. 2007. The case of Han van Meegeren’s fake Vermeer supper at Emmaus reconsidered. In His Milieu. Essays on Netherlandish art in memory of John Michael Montias, ed. A. Golahny et al., 47–57. Amsterdam: Amsterdam University Press. Bracha, O. 2008. The ideology of authorship revisited: Authors, markets, and Liberal values in early American copyright. The Yale Law Journal 118 (2): 186–271. https://doi.org/10.2307/ 20454710. Bridy, A. 2016. The evolution of authorship: Work made by code. Columbia Journal of Law & the Arts 39: 1–25. https://doi.org/10.2139/ssrn.2836568. Cejudo, R. 2021. J. S. Mill on artistic freedom and censorship. Utilitas 33: 180–192. https://doi. org/10.1017/S0953820820000230. ———. 2022. ¿Es posible apropiarse de la vida cultural? Mercantilización y patrimonialización de comunes culturales. Isegoría. Revista de filosofía moral y política, 66, enero-junio. https:// doi.org/10.3989/isegoria.2022.66.19. Chesney, R., and D.  Citron. 2019a. Deep fakes: A looming challenge for privacy, democracy, and national security. California Law Review 107: 1753–1819. https://doi.org/10.15779/ Z38RV0D15J. Chesney, R., and Citron, D.K. 2019b. Deepfakes and the new information war. Foreign Affairs, January/February: 147–155. https://www.foreignaffairs.com/articles/world/2018-­12-­11/ deepfakes-­and-­new-­disinformation-­war. Chon, M. 2020. The romantic collective author. Vanderbilt Journal of Entertainment and Technology Law 14 (4): 829–849. Cole, S. 2018. We are truly fucked: Everyone is making AI-generated fake porn now, Motherboard, January 24. https://motherboard.vice.com/en_us/article/bjye8a/reddit-­fake-­ porn-­app-­daisy-­ridley. Copyright Act of 1976, 17 U.S.C. Craig, C.J. 2002. Labour, and limiting the Author’s right: A warning against a Lockean approach to copyright law. Queen’s Law Journal 28 (1): 1–60. Craig, C., and I. Kerr. 2021. The death of the AI author. Ottawa Law Review 52 (1): 31–86. De Rancourt-Raymond, A., and N.  Smaili. 2022. The unethical use of Deepfakes. Journal of Financial Crime. https://doi.org/10.1108/JFC-­04-­2022-­0090. De Ruiter, A. 2021. The distinct wrong of Deepfakes. Philosophy & Technology 34: 1311–1332. https://doi.org/10.1007/s13347-­021-­00459-­2. Degli Esposti, M., F. Lagioia, and G. Sartor. 2020. The use of copyrighted works by AI systems: Art works in the data Mill. European Journal of Risk Regulation 11 (1): 51–69. https://doi. org/10.1017/err.2019.56. Deleuze, G. 1990. The logic of sense. New York: Columbia University Press. Directive 2001/84/EC of the European Parliament and of the Council of 27 September 2001, art. 2, O.J. (L 272) 32 (2001). Drahos, P. 1996. A philosophy of intellectual property. London: Routledge. Drassinower, A. 2015. What’s wrong with copying? Cambridge, MA: Harvard University Press. Dworkin, R. 1985. Can a liberal state support art? In A matter of principle, 221–236. Cambridge, MA: Harvard University Press. Epstein, Z., et al. 2020. Who gets credit for AI-generated art? iScience 23 (9): 1–10. https://doi. org/10.1016/j.isci.2020.101515. Feinberg, J. 1992. Freedom and fulfilment. Princenton: Princenton University Press. Fisher, W. 2007. Theories of intellectual property law. 
In New essays in the legal and political theory of property, ed. S.R. Munzer, 159–177. Cambridge: Cambridge University Press. Foucault, M. 1969. Qu’est-ce qu’un auteur? Bulletin de la Société française de philosophie 63 (3): 73–104.

Franks, A., and A.E.  Waldman. 2019. Sex, lies, and videotapes: Deep fakes and free speech illusions. Maryland Law Review 78 (4): 892–898. Gombrich, E.H. 1982. The image and the eye. Further studies in the psychology of pictorial representation. New York: Cornell University Press. Gombrich, E.H. 1987. Reflections on the History of Art. Views and Reviews. Oxford: Phaidon. Goodman, N. 1976. Languages of art. An approach to a theory of symbols. Indianapolis: Bobes-Merrill. Graber-Mitchell, N. 2021. Artificial illusions: Deepfakes as speech. Intersect 14 (3): 1–19. Habgood-Coote, J. 2023. Deepfakes and the epistemic apocalypse. Synthese 201: 103. https://doi. org/10.1007/s11229-­023-­04097-­3. Harris, K.R. 2021. Video on demand: What deepfakes do and how they harm. Synthese 199: 13373–13391. https://doi.org/10.1007/s11229-­021-­03379-­y. Hartley, J., J.  Potts, S.  Cunningham, T.  Flew, M.  Keane, and J.  Banks. 2013. Key concepts in creative industries. Los Angeles: Sage. Ham, S.H. 1992. Environmental Interpretation. A Practical Guide for People with Big Ideas and Small Budgets. Golden: North American Press. Hugenholtz, P.B., and J.P. Quintais. 2021. Copyright and artificial creation: Does EU copyright law protect AI-assisted output? Iic-International Review of Intellectual Property and Competition Law 52 (9): 1190–1216. https://doi.org/10.1007/s40319-­021-­01115-­0. Hughes, J. 1988. The philosophy of intellectual property. Georgetown Law Review 77: 299–330. Hume, D. 1826. Of the standard of taste. In The philosophical works of David Hume, vol. III, 256–282. Edimburgo: Black and Tait. Innocenti, P. 2014. Bridging the gap in digital art preservation: Interdisciplinary reflections on authenticity, longevity and potential collaboration. In Preserving complex digital objects, ed. J. Delve and D. Anderson, 73–91. London: Facet Publishing. Klingemann, M. 2018. Memories of Passerby I. www.quasimodo.com. Accessed 3 Apr 2023. Knight, W. 2017. Meet the fake celebrities dreamed up by AI. MIT Technology Review. October 31. https://perma.cc/D3A3-­JFY4. Kristeva, J. 1980. Desire in language: A semiotic approach to literature and art. New  York: Columbia University Press. Landes, W.M., and R.A.  Posner. 1989. An economic analysis of copyright law. The Journal of Legal Studies 18 (2): 325–363. Lee, D. 2019. Deepfake Salvador Dalí takes selfies with museum visitors. The Verge, May 10. https://www.theverge.com/2019/5/10/18540953/salvador-­d ali-­l ives-­d eepfake-­m useum. Accessed 28 Mar 2023. Locke, J. 1963. Two treatises of government. X vols. Vol. V. The works of John Locke. Londres/ Darmstadt: Scientia Verlag Aalen. Lowenthal, D. 2015. The Past is a Foreign Country. Cambridge: Cambridge University Press. Meskys, E., J. Kalpokiene, J. Paulius, and A. Liaudanskas. 2020. Regulating deep fakes: Legal and ethical considerations. Journal of Intellectual Property Law & Practice 15 (1): 24–31. https:// ssrn.com/abstract=3497144. Mihailova, M. 2021. To Dally with Dalí: Deepfake (inter)faces in the Art Museum. Convergence: The International Journal of Research into New Media Technologies 27 (4): 882–898. https:// doi.org/10.1177/13548565211029401. Mill, J.S. 2015. On liberty, utilitarianism and other essays. Oxford: Oxford University Press. Moore, A.D. 2012. A Lockean theory of intellectual property. San Diego Law Review 49 (4): 1069–1104. https://digital.sandiego.edu/sdlr/vol49/iss4/6. Mori, M., K.F.  MacDorman, and N.  Kageki. 2012. The Uncanny Valley [from the field]. 
IEEE Robotics & Automation Magazine 19 (2): 98–100. https://doi.org/10.1109/MRA.2012.2192811. Naude, I. 2010. Photography as inventor of new memories. De Arte 82: 24–32. Netanel, N.W. 1996. Copyright and a democratic civil society. The Yale Law Journal 106, no. 2 (1996): 283–387. https://doi.org/10.2307/797212.

OpenAI employees. 2022. DALL·E 2 preview – Risks and limitations. https://github.com/openai/ dalle-­2-­preview/blob/main/system-­card.md. Accessed 3 Apr 2023. Panofsky, E. 1991. Perspective as symbolic form. New York: Zone Books. Paris, B., and J. Donovan. 2019. Deepfakes and cheap fakes: The manipulation of audio and visual evidence. Data & Society. September 18. https://datasociety.net/library/deepfakes-­and-­cheap-­ fakes/. Accessed 28 Mar 2023. Paterson, T., and L. Hanley. 2020. Political warfare in the digital age: Cyber subversion, information operations, and “deep fakes”. Australian Journal of International Affairs 74 (4): 439–454. https://doi.org/10.1080/10357718.2020.1734772. Pyatt, C. 2020. The art of interrogation: An interview with bill posters. Juxtapoz. Art and Culture, June 19. https://www.juxtapoz.com/news/street-­art/the-­art-­of-­interrogation-­an-­interview-­with-­ bill-­posters/. Accessed 28 Mar 2023. Rel Decreto Legislativo 1/1996, of 12 April, approving the revised text of the Law on Intellectual Property, BOE no. 97, 22 April 1996. Rini, R. 2020. Deepfakes and the epistemic backstop. Philosophers’ Imprint 20 (24): 1–16. http:// hdl.handle.net/2027/spo.3521354.0020.024. Rosen, D. 2023. Pornography and the Erotic Phantasmagoria. Sexuality & Culture 27: 242–265. https://doi.org/10.1007/s12119-­022-­10011-­9. Ryan, M. 2020. In AI we trust: Ethics, artificial intelligence, and reliability. Science and Engineering Ethics 26 (5): 2749–2767. https://doi.org/10.1007/s11948-­020-­00228-­y. Saiz, C. 2019. Las obras creadas por sistemas de inteligencia artificial y su protección por el derecho de autor. InDret. Revista para el análisis del Derecho 1: 1–45. Salami, E. 2021. AI-generated works and copyright law: Towards a union of strange bedfellows. Journal of Intellectual Property Law & Practice 16 (2): 124–135. https://doi.org/10.1093/jiplp/ jpaa189. Salles, A., K. Evers, and M. Farisco. 2020. Anthropomorphism in AI. AJOB Neuroscience 11 (2): 88–95. https://doi.org/10.1080/21507740.2020.1740350. Shanklin, W. 2023. The Kindle Store has a prolific new author: ChatGPT. Engadget, February 21. www.engadget.com. Accesed 2 Apr 2023. Skiljic, A. 2021. When art meets technology or vice versa: Key challenges at the crossroads of AI-generated artworks and copyright law. International Review of Intellectual Property and Competition Law 52 (10): 1338–1369. https://doi.org/10.1007/s40319-­021-­01119-­w. Spivak, R. 2019. ‘Deepfakes’: The newest way to commit one of the oldest crimes. Georgetown Law Technology Review 3 (2): 339–400. Stalnaker, N. 2005. Fakes and forgeries. In The Routledge companion to aesthetics, ed. B. Gaut and D.M. Lopes, 513–526. London: Routledge. Starl, T. 1998. A new world of pictures. The use and spread of the daguerreotype process. In A new history of photography, ed. M. Frizot, 32–50. Köln: Könemann. Stump, S. 2021. Man behind viral Tom Cruise deepfake videos calls the technology ‘morally neutral’. https://www.today.com/news/man-­tom-­cruise-­deepfakes-­tiktok-­speaks-­ ethics-­technology-­rcna10163. Talbot, W.H.F. 1844. The pencil of nature. London: Longman, Brown, Green & Longmans. https:// gutenberg.org/ebooks/33447. Accessed 3 Apr 2023. Tilden, F. 1957. Interpreting Our Heritage. Chapel Hill: The University of North Carolina Press. Ulmer, E. 1980. Urheber und Verlagsrecht. New York: Springer. Wang, S., S.O. Lilienfeld, and P. Rochat. 2015. The Uncanny Valley: Existence and explanations. Review of General Psychology 19 (4): 393–407. 
https://doi.org/10.1037/gpr0000056. Westerlund, M. 2019. The emergence of Deepfake technology: A review. Technology Innovation Management Review 9 (11): 40–53. https://doi.org/10.22215/timreview/1282.

Chapter 8

Exploring the Ethics of Interaction with Care Robots

María Victoria Martínez-López, Gonzalo Díaz-Cobacho, Aníbal M. Astobiza, and Blanca Rodríguez López

Abstract  The development of assistive robotics and anthropomorphic AI allows machines to increasingly enter into the daily lives of human beings and gradually become part of their lives. Robots have made a strong entry in the field of assistive behaviour. In this chapter, we will ask to what extent technology can satisfy people’s personal needs and desires as compared to human agents in the field of care. The industry of assistive technology burst out of the gate at the beginning of the century with very strong innovation and development and is currently attracting large sources of public and private investment and public attention. We believe that a better-defined and more fundamental philosophical-ethical analysis of the values at stake in care robots is needed. To this end, we will focus on the current status of care robots (types of care robots, their functioning and their design) and we will provide a philosophical-ethical analysis that offers a solid framework for the debate surrounding the potential risks and benefits of implementing assistive robots in people’s daily lives.

8.1 Introduction

Global demography is changing: the world is getting older. Globally, the median age increased from slightly over 20 in 1970 to just over 30 in 2022 (Ritchie and Rose 2019). In particular, Europe is facing a demographic decline due to low fertility rates, low mortality rates (however, due to the COVID-19 pandemic, Europe experienced a sharp increase in mortality in 2020), and negative net migration. In
In the next three decades, Europe will face two significant demographic challenges: a rapid population drop and a rapidly ageing population. According to a report by the European Parliamentary Research Service (Kiss 2021), by the year 2070 the proportion of individuals aged 65 and above in the overall population will increase from 20% to 30%, while that of people aged 80 and older will more than double (from 6% to 13%).

This demographic shift is leading to a progressive increase in the dependent population. Although ageing is not necessarily synonymous with dependency, several studies show the correlation between age and disability. In fact, more than 32% of people aged over 65 have some kind of disability, while this percentage is just 5% for the rest of the population. In addition, dependence due to illness and other causes of disability or limitation has increased in recent years due to changes in the survival rates of certain chronic diseases. Examples of diseases causing high dependency are neurodegenerative diseases such as Alzheimer's disease, amyotrophic lateral sclerosis (ALS) and Parkinson's disease. Of these, Alzheimer's disease is the most common cause of dementia, and it leads to increased morbidity, mortality, disability and dependency, with a significant decrease in patients' quality of life and survival. Eighty percent of patients are cared for by their families, who bear on average 87% of the total cost, with the consequent overburdening and impairment of caregivers' health and quality of life. The economic impact of dementia is thus enormous. It is necessary to develop comprehensive programmes and increase resources focused on promoting research, prevention, early diagnosis, multidimensional treatment, and a multidisciplinary approach in order to reduce the health, social and economic burden of dementia.

The care of dependent persons and the promotion of their personal autonomy is one of the main challenges of social policy in developed countries, and it requires a firm, sustained response adapted to the current model of our society. The challenge is none other than to meet the needs of those people who, because they are in a situation of special vulnerability, require support to carry out the essential activities of daily living, achieve greater personal autonomy and be able to fully exercise their citizenship rights. It should not be forgotten that, until now, it has been families, and especially women, who have traditionally assumed the care of dependent persons, constituting what has come to be known as "informal support". Changes in the family model and the progressive incorporation of almost three million women into the labour market in the last decade introduce new factors into this situation, making it essential to review the traditional system of care for dependent people and to ensure the adequate implementation of new care tools.

In this chapter, we will look at some challenges that care robots may present in our society. To this end, the chapter is structured as follows. Firstly, we will provide a working definition of care robots; to do so, we will dive into the history of these artefacts and propose a taxonomy from which to analyse them, since not all care robots are of the same type and the typology itself needs clarifying. Secondly, we will explore how care robots should be designed and what place they should have in our society. Finally, in the section "An Ethical Framework for Care Technologies", we will try to outline an ethical framework capable of reflecting the main challenges we face in our relationship with care robots.

8.2 State of Art

Dependency is understood as the permanent state of individuals who, for reasons derived from age, illness or disability, and linked to the lack or loss of physical, mental, intellectual or sensory autonomy, require the care of others or significant assistance to carry out basic activities of daily living or, in the case of people with intellectual disabilities or mental illness, other support for their personal autonomy. The recognition of the rights of people in a situation of dependency has been highlighted by numerous documents and decisions of international organisations, such as the World Health Organisation, the Council of Europe and the European Union.

We recall that vulnerability is a universal human condition, but one whose degree is highly dependent on the environment. In this sense, assistive technology, in the form of assistive robots, can be a useful tool to care for potentially vulnerable people. However, it is important to pay attention to this kind of technology because, while it can prevent or eliminate some dangers to which we are exposed, it can also create new situations of vulnerability (Liedo and Ausín 2022).

Care robots are a type of assistive technology that may offer people all kinds of support and assistance, particularly in health and social care settings. They have been developed for a variety of purposes, such as companionship, monitoring, physical assistance, education and/or cognitive stimulation. The development of care robots to meet the needs of dependent people would not be possible if artificial intelligence (AI) did not develop along similar lines. A robot is nothing more than embodied AI. If the physical realization of a machine, in other words a robot, did not have the intelligence to respond to stimuli and exhibit goal-directed behaviour, it would not be reliable.1 The goal of AI, a subfield of computer science, is to build machines and systems that are capable of reasoning, learning, making decisions, and perceiving in the same way that humans do (Minsky 2003). AI has several uses in many industries, including healthcare, where it can enhance older people's quality of life and overall well-being.

Providing care robots to dependent people is one of the ways AI may help them. AI-powered care robots are devices that can help vulnerable and dependent people with daily tasks including taking their medications, cooking, cleaning, and moving around. Vulnerable and dependent people who live alone or have little human contact can also benefit from the emotional and social assistance that care robots can offer.

1  There is an ongoing and extensive debate about the need for robots to possess properties such as consciousness in order to be considered intelligent. Given the epistemological limitations we have in accessing the internal properties of other entities, some authors have argued that we should be wary of categorically denying that robots can think or relate to human beings in functionally equivalent manners (Llorca Albareda 2023; Llorca Albareda and Díaz-Cobacho 2023).


Such care robots can, for instance, play games with them, read books to them, remind them of key anniversaries and events, and provide comfort (Kyrarini et al. 2021). Care robots can also lessen the load on human caregivers, who frequently experience high levels of stress and burnout. By taking over some everyday activities and offering companionship to vulnerable and dependent people, they can free up time and resources so that human caregivers can concentrate on the more sophisticated and individualised parts of care.

However, the large-scale arrival of robots in the service of health care requires that human caregivers be prepared to interact with them. This means that appropriate policies must be implemented by the administration, taking into account the particularities of care. There is no "one size fits all" solution: although technology is crucial, comprehensive policies are needed, and these can only come from a state committed to social justice. As human caregivers and care robots gradually come to share space, one aspect to be taken into consideration will be caregivers' attitudes and perspectives towards care robots, including whether they view them as tools, collaborators, threats, or friends.

There is no doubt that the future of work shared between humans and machines will have to be legislated. At present there is no regulation, at international or European level, on robotics for the care of dependent people. Despite this, various institutions have shown interest, such as the International Labour Organisation (ILO) and the International Organisation for Standardisation (ISO), as have recent European Parliament resolutions. There have been widespread efforts to highlight the importance of assistive robotics in improving the lives of people in situations of dependency. An important aspect in achieving this is the certification of the safety of robots.

In relation to the implementation of care robots, concern has been expressed about the impossibility of replacing human carers in their entirety: care involves multiple tasks that cannot yet be automated. There is no question that the introduction of care robots in healthcare poses more challenges than any other welfare technology. Consequently, the relationship between human caregivers, dependent people and care robots can be both advantageous and difficult for all parties (Andtfolk et al. 2022). Nonetheless, it is the responsibility of governments and states to ensure that adequate policies are in place to guarantee that care robots are used in a way that safeguards the autonomy and dignity of the end-users. Special attention must be paid because, as noted above, this technology can not only prevent or eliminate some dangers to which we are exposed but can also create new situations of vulnerability.

Care robots can offer many benefits, such as enhancing the independence, quality of life and social inclusion of those who require care, particularly the elderly and the disabled. However, they also present ethical, legal and social issues that need to be resolved before they can be extensively used. Some of these challenges are the potential effects of care robots on human autonomy and dignity, and liability and even moral responsibility for their actions and decisions, particularly when they involve life-and-death situations or the treatment of sensitive data.


Another major challenge is the economic impact of the implementation of care robots: the possible replacement or displacement of human caregivers, and the implications for their employment, their skills and their relationships with care recipients.

8.3 What Are Care Robots?

8.3.1 Definition

A robot is considered to be a complex sensorimotor machine that extends the human capability to act (Monasterio Astobiza et al. 2019; Christaller 2001). If we try to define a care robot, we face a more complex task, given that the definition of care robots is conditioned by the definition of care itself. If we understand care for human beings as the holistic and inclusive protection of all aspects that make up the human being (physical, psychological, social, spiritual), we find that no robot can comprehensively address all of these aspects. Care is a complex, multifaceted and synchronic phenomenon (Ausín et al. 2023). Perhaps, in the future, a general AI will be able to achieve this, but so far we only have robots that compartmentalise actions considered to be caring, rather than robots that care holistically. It is foreseeable that robotics will evolve to complete tasks autonomously, more efficiently and more accurately, but for the moment care itself cannot be replaced by robots. In other words, some care tasks are automated, but there are no "care robots" in the full sense.

With this in mind, we can define care robots as those devices, equipment, instruments and software that have functions related to care and/or the protection of health in clinical, welfare or social settings. Most of them serve to automate technical tasks or to provide assistance to users with special needs. Among these care actions we find activities such as health care, physical and cognitive rehabilitation, activities of domestic daily life, and educational activities in different environments (hospitals, nursing homes, private homes and schools). Target groups include the elderly, people with dementia, children with autistic spectrum disorders, convalescent patients and people with other types of functional diversity needs (Pareto Boada et al. 2021). Technologies can also be a source of support for both professional and informal caregivers, as they offer greater autonomy and the possibility of improving care.

8.3.2 A Bit of History

Robotics applications for disability assistance began to be developed in the 1970s, although the construction of prosthetic arms and legs dates back to antiquity (Bliquez 1983). The first robotic prosthetics were based on the recognition of myoelectric signals, the signals produced by muscles when they contract or flex (Montero et al. 2010).


The first robots in the medical field, created in the 1980s, provided surgical assistance using robotic arm technologies (Kwon et al. 2001). Over the years, AI-enabled machine vision and data analytics have transformed health care robots and expanded their capabilities into many other areas of health care (Montaño-Serrano et al. 2021). Technically, assistive robotics has been feasible for years, but its high cost means that most countries are unable to implement it, and it is more economical to hire people, sometimes from more vulnerable social strata, to carry out care work. From an economic point of view, the human worker is still more cost-effective than the robotic worker.

8.3.3 Taxonomy

There are numerous ways to categorise care robots according to their function, design or components. The following is a suggested classification of care robots according to the type of function they perform, based on other proposed classifications (Case School of Engineering 2017; Intel n.d.; Fernández Bernat et al. 2019; Organización Mundial de la Salud 2018):

• Clinical robots: They perform tasks such as assisting in the operating theatre, monitoring biomedical parameters or helping to diagnose pathologies.

–– Surgical assistance robots: They are used in the operating theatre to help professionals make more precise movements during surgery. Some robots may even be able to complete tasks automatically, allowing practitioners to monitor procedures from a console. The ability to share video streams from the operating theatre to other locations allows practitioners to stay in contact with specialists in other parts of the world. Robotics also plays a key role in educating professionals; for example, simulation platforms use artificial intelligence and virtual reality to provide surgical training.

–– Monitoring robots: These robots analyse users' biometric and behavioural parameters to detect significant variations, which may be associated with health problems. Recently, digital medicines are being developed that can track missed doses and send a report to the patient, preventing medication errors due to forgetfulness.

–– Examination or diagnostic robots: When equipped with light detection and mapping capabilities, these robots can navigate on their own in examination rooms or hospitals, allowing professionals to interact from afar. Some robots can assist professionals before they examine a patient; for example, the RoomieBot robot can help medical staff with high-risk COVID-19 patients, classifying patients by their temperature, blood oxygen level and medical history upon arrival at the hospital.


• Service robots: They perform logistical tasks and automate manual, repetitive, high-volume work. They aim to simplify routine tasks to ease the daily burden on workers in health and social care settings. Many of these robots operate autonomously and can send a report upon completion of a task. They can track inventory, transport supplies and bedding in hospitals (Aethon's TUG) or autonomously clean and disinfect rooms, helping to limit person-to-person contact in the case of infectious diseases (for example, the Akara robot disinfects contaminated surfaces with UV light).

• Robots for rehabilitation: They help with rehabilitation for patients after stroke, paralysis or traumatic brain injury, as well as with disabilities caused by multiple sclerosis. When robots are equipped with AI and depth cameras, they can monitor a patient's form as they go through prescribed exercises, measure degrees of movement in different positions and track their progress more accurately than the human eye. Video games and virtual and augmented reality can also be used for learning and rehabilitation, for people with disabilities or the elderly.

• Assistive robots: They function similarly to rehabilitation robots, although their objective is different: they serve to facilitate users' mobility, providing the physical support people need to carry out their Activities of Daily Living (ADL). Depending on the degree of disability of the patient, their control model can vary from fully autonomous navigation to minimal shared control, where the patient makes all decisions and performs all but the most complex manoeuvres, which are taken care of by the robot. Within assistive robotics, we can highlight walkers, wheelchairs and exoskeletons.

–– Walkers: These can be supporting structures attached to autonomous robots or conventional walkers transformed into robots. Their functionalities can range from merely capturing information to providing haptic feedback (a type of tactile communication, usually based on vibration patterns) or moving the wheels to alter the user's trajectory.

–– Wheelchairs: These can range from conventional wheelchairs with minimal hardware to robotic wheelchairs incorporating sensors, push-button control, voice control or even BCI (Brain Computer Interface) headsets. They can incorporate various devices adapted to each type of disability.

–– Exoskeletons: These are intended to strengthen the lower or upper limbs of severely disabled users. Similar to a suit, this kind of robot captures the user's intention through different types of sensors, such as electromyography, and configures its motors and joints to reproduce the desired movement in the patient's limbs. Experimental models include Cyberdyne's HAL.

• Social robots: This group includes companion robots, telepresence robots and virtual assistants. Their functions include keeping people company, reminding them to take their medication, or making emergency calls. Their main objectives are to monitor patients and their habits, and to report potentially dangerous irregularities to their caregivers or to social services. Some robots can even perform tasks such as fetching small objects. In addition, these robots offer telepresence services, allowing them to initiate video conferencing and other types of communication with caregivers, relatives, doctors, etc. In case of emergency, e.g. fall detection, communication could be initiated by the robot to assess the situation before proceeding to help. Companion robots are intended for people with cognitive disorders. Social robots can also be used to provide directions to visitors and patients within the hospital environment, such as the PACA robot.


In this chapter, we will focus on assistive and social robots, as they are the most relevant to the ethical discussion.

8.3.4 Some More Examples of Existing Robots

Robots are evolving as fast as the underlying technology, which is to say very fast. New robots are continually being designed and brought to market, but at the time of writing these are some examples of current robots:2

–– EC (Excretion Care): these robots automatically dispose of excreted waste and assist bedridden patients with their excretory activity in their rooms.

–– Clothilde: in hospital logistics, it can help with bed-making and collecting used linen. Outside the hospital, it may be able to locate a home, ring the doorbell, take the lift and, once at the home, collect patient data and send it to medical teams.

–– HSR: this robot has a folding mechanical arm with a two-finger gripper that can open curtains, pick up objects of any size and shape or take them out of a cupboard, and can hold them thanks to a suction cup.

–– ROBEAR: this robot helps people with reduced mobility to get up and moves them from a bed to a wheelchair. It has sensors whose main task is to calculate the force and position needed to lift a person without harming them.

–– RIBA: this robot helps patients by lifting them out of their beds and wheelchairs and moving them to new positions.

–– OBI: this is a robotic arm designed to feed people with reduced mobility. It has two buttons that allow the person to select the type of food and to have the spoon pick it up and bring it to their mouth. The robot is programmed by the caregiver, who can adapt its rhythm and speed, and it detects when the patient's mouth is open in order to feed him or her. If the user is distracted during feeding, the robot will speak to warn them and, if they persist, the arm will start to move to get their attention.

–– PR2: this robot, consisting of two arms and two wheels, can move objects from one room to another and help people with self-care tasks such as shaving.

2  The following examples are taken from commercial websites and from the scientific literature.


–– STEVIE II: it stands out for its social component. This robot can give medication reminders, identify the person and respond to requests such as calling the emergency services if someone is incapacitated.

–– KiKi: a robot pet capable of identifying and remembering the user. It adapts its head movements to follow the user and responds to their emotions; if it considers its companion is sad, the robot sings and dances to cheer up its owner. It needs to be cared for and fed: it responds to caresses and sticks out its nose, and when it is hungry it growls and its owner must feed it through an application. Similarly, the owner can reward it with treats when it learns something new.

–– ARI: designed to offer companionship to elderly people to counter loneliness, as well as offering help such as medication reminders.

–– BUDDY: it acts as a personal assistant, reminding the patient of important dates, taking and making calls, reading books aloud, and connecting to apps and smart devices via the internet.

–– PEPPER: it can recognise faces and basic human emotions and is able to engage with people through conversation as well as through its touchscreen.

–– TIAGo: its functions include dispensing pills on time, monitoring vital signs and locating keys.

–– LEA: a robotic walker that uses sensor technology to scan the environment, enabling localisation, autonomous navigation and the ability to react intelligently to various conditions. For example, when it encounters an object on the ground that may pose a tripping hazard, the walker slows down for safety.

–– TEFI: a dog-shaped robot with artificial intelligence that can serve as a guide for the visually impaired. Connected to Google, it can learn information in real time, such as the traffic situation, and is able to communicate this information to its owner by voice. It is also capable of guiding patients to a doctor's surgery or requesting a taxi.

–– Care-O-bot: a mobile robotic assistant for human assistance in domestic environments and public spaces such as museums or airports. The robot can be equipped with one or two arms or a tray and is able to pick up and deliver objects. It is capable of displaying different moods via a screen integrated into its head.
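To make the taxonomy in Sect. 8.3.3 and the examples above a little more concrete from an engineering point of view, the following minimal Python sketch shows one way a developer might represent the functional categories and a robot's capability profile in software. It is purely illustrative: the class names, fields, default values and the simplified profiles for LEA and ARI are our own assumptions and do not correspond to the actual software of any existing robot.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class CareFunction(Enum):
    """Functional categories from the taxonomy in Sect. 8.3.3."""
    CLINICAL = auto()        # surgical assistance, monitoring, diagnosis
    SERVICE = auto()         # logistics, cleaning, disinfection
    REHABILITATION = auto()  # guided exercise and progress tracking
    ASSISTIVE = auto()       # walkers, wheelchairs, exoskeletons
    SOCIAL = auto()          # companionship, telepresence, reminders

@dataclass
class CareRobotProfile:
    """Illustrative capability profile for a single care robot (hypothetical fields)."""
    name: str
    functions: set
    target_users: list = field(default_factory=list)
    control_model: str = "shared control"  # from fully autonomous to minimal shared control

# Simplified, hypothetical profiles loosely based on two of the examples above
lea = CareRobotProfile(
    name="LEA (robotic walker)",
    functions={CareFunction.ASSISTIVE},
    target_users=["elderly users with reduced mobility"],
)
ari = CareRobotProfile(
    name="ARI (companion robot)",
    functions={CareFunction.SOCIAL},
    target_users=["elderly people living alone"],
)

def relevant_to_this_chapter(robot: CareRobotProfile) -> bool:
    """This chapter focuses on assistive and social robots."""
    return bool(robot.functions & {CareFunction.ASSISTIVE, CareFunction.SOCIAL})

print([r.name for r in (lea, ari) if relevant_to_this_chapter(r)])
```

Such a profile does no more than record how a robot is classified; its only purpose here is to show that the categories used in the ethical discussion map naturally onto the kind of metadata designers already attach to their systems.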

8.4 Design

Care robots are very useful instruments for the care and assistance of people who cannot perform certain actions on their own. This is the case, for example, of elderly people or people with some kind of physical or intellectual disability. Due to the various factors detailed in the introduction (an ageing population, the cost of care, and labour shortages), creating robotic assistants capable of helping the most vulnerable population with particular domestic and care tasks has been a concern of technology developers since the beginning of the millennium.


From the moment of their conception until the present day, when these care robots are more of a reality than a fantasy (with examples such as Clothilde or PACA), one of the major concerns of the scientific community has been the correct design of these robotic assistants. Given the characteristics of the potential users of these robots, it is essential to adequately address the user acceptability of their design (Meng and Lee 2006). The first studies on the user acceptability of this new type of assistant showed that simply exporting industrial robotics techniques is not a valid approach for domestic and healthcare use (Meng and Lee 2006). Another key aspect, pointed out in a study conducted with potential elderly patients (Cortellessa et al. 2008), is the need to adapt care robots to the patient's environment; this requires adapting their design to help elderly persons in countries where the elderly are not very familiar with new technologies and the Internet. Sabanovic et al. (2006) found that observation and behavioural analysis of social interactions between humans and robots in real environments is necessary to identify the different factors relevant to the design of assistive robots.

The design of assistive robots also raises many ethical issues that need to be discussed in order to guide system designers. Turkle et al. (2006) explored some of the ethical implications of human-robot interaction, particularly the type of trustworthiness we demand of the technology and the selection of the most appropriate relationships between children or the elderly and relational artefacts.

Regarding the external appearance of assistive robots, the industry seems to have opted to adapt this type of robot to human morphology, providing them, almost always, with humanoid features and a friendly appearance. There are also cases of robots with a zoological appearance (ROBEAR) and robots with a much more industrial look (Jardón et al. 2008). A priori, the latter would give the impression of being the most hostile to users. However, a study by van Vugt et al. (2007) concluded that realism in the external representation of robots does not necessarily affect robot performance. Regarding the relationship between appearance and treatment, some studies conclude that there might be a relationship between a more human-like appearance and users' engagement and satisfaction with the robotic assistant (Lester et al. 1997). However, other studies argue that the human-likeness of relational agents may affect the perceived success of the interaction, but does not directly contribute to its actual success (Catrambone et al. 2002). Ultimately, these studies seem to indicate that user responsiveness and the effectiveness of assistive robot interfaces are highly dependent on the specific environment and function in which the robots are deployed (Cortellessa et al. 2008).

Assistive technologies face serious difficulties when using systems that require direct or indirect physical contact with the user. Meng and Lee's study (2006) classified various household and assistive robots according to their interaction levels. Devices that have no robotic characteristics, i.e., lack spatial machinery, and do not present unique safety issues beyond those of any household appliance or computer device, are included in level 0. Level 1 devices are those that can move in the user's environment but are generally required to avoid physical contact with users, and even to do their best to stay away from their personal space.


Examples are autonomous vacuum cleaners or lawnmowers. Level 2 robots are those that must take extra safety precautions in case a user unintentionally bumps into or falls onto the robot; robots that assist in household tasks such as cooking or ironing are some examples (Meng and Lee 2006).

There are some other very important factors to be taken into account in the design and development of these robotic assistants, such as safety, cost, autonomy, usability and flexibility (Meng and Lee 2006). It is worthwhile discussing them in more detail here, before developing them further in other sections of the chapter.

One of the most obvious factors is safety. One of the premises of any assistive robot must be to avoid endangering the physical or intellectual integrity of the human being with whom it interacts. To this end, these robots must have safety mechanisms capable of identifying potential or immediate risks and deactivating themselves in such cases. Park et al. (2020) point out that, in the case of a robotic arm for feeding, it is essential to have a switch that stops the arm immediately and to ensure that the hardware is correctly programmed, so that in the event of an error the robot does not violently hit the person being assisted. Likewise, an intuitive and easy-to-use interface can be crucial for managing the interaction with the device in a safe way. Finally, implementing a multimodal performance monitor that allows the robot to recognise if something is wrong with the patient is also critical. In fact, we believe that robots should not only avoid posing any risk to the person interacting with them but should also be able to protect the user from, or alert them to, other imminent risks arising around them.

Another undoubtedly key factor for the large-scale implementation of this type of robot is their manufacturing and design cost and, consequently, their final price. New robotic and AI technologies require large public or private investment, and this is usually reflected in the cost of the product. Cost reduction is a priority in the design of these types of machines because, if cost-competitive products are not offered to the population, people will never be able to take advantage of their benefits. Turning the use of care robots into an economic privilege can create or increase gaps in the quality of life of the elderly or of people with physical or intellectual disabilities.

Meng and Lee (2006), in their taxonomy of the needs and requirements of an assistive robot, speak of autonomy in terms of robot self-improvement and the ability to anticipate possible failures in its system. While we understand this to be very important, we believe it is also relevant to add another perspective to the concept of autonomy: the ability to remain in operation for long periods and to regulate its energy so that it does not shut down in the middle of an action, which could harm the person it is assisting. An autonomous robot is one that is capable of performing a series of functions on its own (like self-improvement and the ability to anticipate possible system failures) and can help others for a prolonged period of time.

Finally, the last remaining issues are flexibility and usability. These two points are connected because both relate to the ability of the care robot to interact with its environment. In terms of flexibility, we are talking about the robot's ability to adapt to its physical environment.
In the same way that a vacuuming robot can map a house so as not to continually bump into the various heavy objects in it, a care robot must be able to recognise the space it inhabits in order to avoid potential complications. At this point, it is interesting to include another view of flexibility that Meng and Lee do not take into account in their classification but that we think is key to complementing it: the ability to customise the device. As Carme Torras points out (Torras 2019), in order to achieve better usability it is important to be able to adapt certain features of the device to our needs.

When we talk about usability, we refer to the ability to simplify the functions of the device so that its recipients (mostly elderly people) can operate it correctly. Although complex care robots exist, they need to be simplified so that they are accepted and well received in the homes of the elderly people who need them.

Design is a very important part of care robotics because it determines how the robots will be in the world and how they will interact with users, as well as who will be able to access them. Good design of such complex devices will promote low costs without sacrificing product quality, ease of learning, safety and adaptability to different spaces and situations. For example, a feeding robotic arm must adapt the path and execution of the feeding task. Similarly, a social robot that has a direct and interactive relationship with people must be able to adapt to a human's preferences. Just as with the assistive devices that many of us have in our homes (Del Valle et al. In Press), where we can choose to modify the voice or the location used to recommend a good restaurant, we should be able to adjust how social robots address us, what tasks they do and what they remind us to do (e.g. take a particular medication, go to the doctor, etc.), so that they are really useful and end users can make proper use of them.
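The safety, autonomy and customisation requirements discussed in this section can be made slightly more tangible with a short control-loop sketch. The following Python fragment is a purely hypothetical illustration written for this discussion: the class names, sensor fields, thresholds and messages are our own assumptions and do not describe the software of OBI, the system studied by Park et al. (2020), or any other real feeding robot.

```python
from dataclasses import dataclass

@dataclass
class UserPreferences:
    """Per-user customisation, in the spirit of the usability and flexibility discussion."""
    feeding_pace_s: float = 8.0   # seconds between spoonfuls
    max_arm_speed: float = 0.1    # metres per second, kept deliberately low

@dataclass
class SensorSnapshot:
    """One multimodal reading (assumed camera, force and audio channels)."""
    unexpected_force_n: float     # contact force measured at the arm, in newtons
    mouth_open: bool
    user_distress_detected: bool

class FeedingArmSupervisor:
    """Minimal safety supervisor: check every cycle and stop before acting if anything looks wrong."""

    FORCE_LIMIT_N = 5.0  # illustrative threshold, not a certified safety value

    def __init__(self, prefs: UserPreferences):
        self.prefs = prefs
        self.stopped = False

    def emergency_stop(self, reason: str) -> None:
        # In a real system this would cut motor power through a dedicated hardware channel,
        # not merely set a flag in software.
        self.stopped = True
        print(f"EMERGENCY STOP: {reason}")

    def step(self, reading: SensorSnapshot) -> str:
        """Decide the next action for one control cycle."""
        if self.stopped:
            return "halted"
        if reading.unexpected_force_n > self.FORCE_LIMIT_N:
            self.emergency_stop("unexpected contact force")
            return "halted"
        if reading.user_distress_detected:
            self.emergency_stop("possible problem with the user; alerting the caregiver")
            return "halted"
        if not reading.mouth_open:
            return f"wait {self.prefs.feeding_pace_s} s and gently prompt the user"
        return f"bring spoon to mouth at no more than {self.prefs.max_arm_speed} m/s"

# Example cycles with assumed sensor values
supervisor = FeedingArmSupervisor(UserPreferences())
print(supervisor.step(SensorSnapshot(unexpected_force_n=0.5, mouth_open=True, user_distress_detected=False)))
print(supervisor.step(SensorSnapshot(unexpected_force_n=9.0, mouth_open=True, user_distress_detected=False)))
```

The sketch is only meant to show where the design choices discussed above (an emergency stop, multimodal monitoring, deliberately low speeds, per-user settings) would live in software; how such mechanisms should actually be implemented, validated and certified is an engineering and regulatory question beyond the scope of this chapter.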

8.5 An Ethical Framework for Care Technologies

As with any new technology, the manufacture and introduction of assistive robots raises legal, political, social and ethical issues. Some of these issues, like those related to safety, have been mentioned in Sect. 8.4. As assistive robots are intended to function in close contact with users, it is fundamental that their design and construction pay special attention to safety issues, minimising risks. Beyond design, it is important not only that manufacturers provide instructions and information on residual risks, but also that these instructions are as clear and understandable as possible, taking into account the general characteristics of the population to which they are addressed, and that the robots are as easy to handle as possible. Another related question is digital competence: it is fundamental to ensure that human carers and users have the relevant knowledge and skills. This is probably the most pressing socio-political challenge, but not an insurmountable one, as we have experience of addressing a similar question in various contexts, especially the educational one.

There are a couple of other questions that we can also consider socio-political in nature. The first is related to the economic cost of care robots.


One of the main worries regarding the advent of new technologies is that they are only accessible to a few, those who are already privileged, thus increasing the social gap between the poor and the rich. For anyone with a sensitivity to equality, this is a problem. There are some things that can be said in this respect. If the history of technology shows anything, it is that technology is usually expensive when it first comes onto the market but becomes cheaper with time (think of cars, computers or mobile phones). We should also consider the fact that human care does not come cheap. This is not to say that there are no problems. The provision of care is a challenge that different societies face in varying ways, depending on their social security system and on whether care is publicly funded or privately managed. This can lead to disparities in access to care and support for individuals who require it. This problem affects both human and robotic care.

The second question is about responsibility. Robots are complex machines, and responsibility is one of the main topics in the ethics of AI. In the case of care robots, the problem can be explained as follows. The use of care robots can bring consequences, mainly good ones (or so we hope) but sometimes also bad ones. The same can be said about human carers, and in both cases we can ask about the attribution of responsibility. Who is responsible? This question goes beyond ethics into the realm of law and politics. Though sometimes this question is not so easy to answer, in the case of human carers it is a relatively straightforward one: humans are the kind of beings that can be held responsible because they have agency. In the case of robots, the problem is bigger by far. As they are not (yet) considered agents, questions of responsibility fall on humans; the problem is to know on whom exactly. In trying to answer this question we face what philosophers of technology call "the many-hands problem": when many people are involved in an activity, it is often difficult, if not impossible, to pinpoint who is morally responsible for what (Poel et al. 2015). "Many people" does not only refer to people involved in design or production, but also to users. This problem is not to be underestimated, but it is not impossible to solve either; for instance, Nagenborg et al. (2008) propose some ways of addressing it.

Let us turn now to ethical questions. When it comes to a new technology, application or activity, the most radical question that can be asked from an ethical point of view is whether it is really necessary or convenient. We have addressed this question in Sect. 8.1, but some might wonder whether there are no lower-tech devices that could meet those needs. In fact, we do have such low-tech help, from wearable alarms that can be triggered with a simple touch to pillboxes with audio, visual and vibrating reminders that can help users remember to take their tablets and pills. We also have kitchen robots (of the Thermomix variety) that make cooking easy, and other assistive technology to help with other household chores. There are also mobile phones adapted to elderly users or those with some type of disability that help them keep in touch with family and friends. We even have mobile apps designed to help the elderly living on their own.3 Thus, to question the need for assistive and social robots is not unreasonable.

3  A very good example is the app cuYdo, recently developed and commercialized in Spain.


There are at least three answers to this question. The first one is about convenience: a multitask robot able to perform several tasks (like Care-O-bot 4) is more convenient than having to deal with three single-task devices. The second is related to the fact that some users might need more help than these low-tech devices can offer (see, for instance, the help offered by TEFI). The third and most important one is that both humanoid and non-humanoid robots have "presence" (Sorell and Draper 2014). "Presence" is understood as more than just "being there"; it is "the kind of co-location of a thing with a person that brings it about that the person no longer feels alone" (p. 184). Some can even have what they call "sophisticated presence" as they move "around with the older person, appearing to take interest in activities in which the older person is engaged, prompting him or her to undertake beneficial behaviours, communicating through a touch screen and reacting to the older person's commands" (p. 184).

Some authors, such as Sharkey and Sharkey (2012), have raised concerns that robots may result in the elderly or disabled having less human contact. The replacement of caregivers by care robots could mean that opportunities for human social contact are reduced, and vulnerable people could be even more neglected by society and their families. Robots could serve as an excuse for this neglect if others mistakenly believe that machines holistically address the physical and emotional needs of people in need of care. Reduced social interaction can have a measurable impact on the health and well-being of these people, which reinforces the idea that depriving them of such contact is unethical, and even a form of cruelty. Of course, what is not yet fully understood is the extent to which reduced or non-existent interaction with humans can be compensated for by interaction with robots. Conversely, it could also be argued that robotic assistive technology can promote social contact for the elderly by enabling them to travel to and from social gathering locations, which is likely to improve their psychological wellbeing in addition to giving them a greater sense of control and autonomy.

Care robots also raise more specific problems of a distinctly ethical nature. The first one, data and information privacy, is by no means new, though robots give it a new twist (Lin et al. 2012). Robots created to provide personal care frequently gather private information about their owners or users, including parameters about their health and daily schedules. Users must have control over the collection, storage and use of their data in order for it to be protected and utilized solely for the purposes for which it was collected. Manufacturers and health care providers alike must make sure that their data management procedures adhere to all applicable privacy and data protection laws. It is also crucial for robot manufacturers and for health care providers using robots in their services to be open and honest about how they manage user data, and to let users know what information is being gathered, how it will be put to use, and who will have access to it. Furthermore, they must make sure that the information gathered is secure and not exposed to hacker attacks or unauthorised access. Likewise, it is critical that users' personal data not be utilised for anything other than what was intended, such as for commercial gain or for prejudicial purposes.


However, in addition to the danger of unauthorised use of data and information, there is also the risk of biased use of data. If the training data is not sufficiently diverse or the algorithm is not sufficiently transparent, embodied AI, in other words robots, may also be prejudiced against underrepresented patients.

Another very important ethical issue relates to questions of autonomy, consent and freedom. In this regard, it is necessary to distinguish between different kinds of users, as they can be cognitively or physically impaired (or both). Sorell and Draper (2014) propose a user-centred approach to technology to address this question that we find especially attractive. Physically impaired users should be treated equally, whether they are old or young. We cannot ignore the tendency, both among professional carers and among family members, to treat the physically impaired with a certain amount of paternalism, especially if they are very young or very old. Nevertheless, a 70- or 80-year-old is not a child, and their physical problems do not diminish their capacity to act and make decisions autonomously. Assistive technology, including care robots, allows them to live in their homes on their own, if this is what they prefer. And they are perfectly able to make these decisions and many others which may not be approved of by their families, typically those related to diet, the hours they keep or the company they like to have. The fact that they are old cannot be an excuse to force unwanted options on them. Indeed, one of the reasons that this group of people may need to use a care robot to keep living on their own is precisely to avoid this well-intentioned meddling. By the same token, users who are not cognitively impaired have the last word on questions such as what counts as an emergency and what does not, when to call for medical or family help, and so on.

Cognitively impaired users present a bigger challenge. As cognitive capabilities diminish, so does autonomy: decision making ceases to rest exclusively with the user and gradually becomes shared with others, usually family members. But we must note that this problem also arises with human carers.

When discussing the ethics of care robots, it is common to draw a distinction between robots as replacements for human care and robots as assistance to human care. Though some consider both to be very problematic (Sparrow 2016; Sparrow and Sparrow 2006), the first is generally considered much more problematic from an ethical point of view. Objections to robots as replacements go beyond mere worries about the desirability of having a human carer to supervise the robot, or about human carers losing their jobs, though both are reasonable worries. They focus on the concept of care and the incapacity of robots to provide real care. Coeckelbergh (2009) analyses the objections as demands for deep care, good care, private care, and real care. Two of these demands are reasonable, but the objections related to them are not impossible to face. For instance, good care focuses on the undeniable fact that care does not only involve the preservation of life, bodily integrity and bodily health, but also of other human capabilities, such as being able to use one's senses, imagination and thought, or those related to affiliation (p. 185). Whether robots can preserve and promote these capabilities will have to be judged on a case-by-case basis (and it must be said, for the sake of justice, that this is also the situation with human carers).


Private care is also a reasonable demand. We have already stressed the importance of privacy, but it has to be noted that this problem is not exclusive to care robots and that, as Coeckelbergh remarks, privacy is far from being the only good to consider: when it conflicts with other goods, a balance has to be struck, and that balance cannot always be resolved in the same direction. The other two demands are different, as they involve complex questions of feelings (deep care) and deception (real care). As robots do not have feelings, at least today, we can admit that robots cannot provide deep care, while at the same time acknowledging that not all human care is deep. On the other hand, if we consider this demand excessive, the fact that care is not deep may not be a sufficient reason to object to the use of care robots. That users are deceived about robots is something that cannot be taken for granted; most of them, especially those who are not cognitively impaired, will not be. Perhaps this demand is not very reasonable either, and it has to face the very damaging question of what real care should be and the not-so-easy task of explaining what "real" exactly means and why it is so important. In Coeckelbergh's words, "If AI assistive technologies constitute a 'good demon' (eudemonia in another sense) that gives you that experience of the good life, what is wrong with it?"

Maybe the worries about robots as replacements arise from a deeper source, not always made explicit, and could be based on a misconception. Even if care robots replace human carers, this does not mean that users are, for this reason, bound to be socially isolated and lonely. This does not need to happen. Though it could happen in some cases, the dystopian picture painted by Sparrow (2016), even if valuable as a caution, is probably not accurate as a prediction.

Robots could nonetheless compromise people's privacy, liberty and security. The development of caregiving robots and assistive technologies has the potential to revolutionise how we perceive and understand caregiving. While these technologies have many useful advantages, some of them mentioned above, such as greater user freedom and support, there are also important concerns to consider. The potential lack of interpersonal interaction and socialisation (that is, between human beings) is one of the most severe of these hazards.

8.6 Conclusion

Some ethical issues (e.g. information privacy, safety, collaboration…) raised by assistive technologies and care robots have been examined in this chapter. However, there are many other ethical issues that we have not been able to address in sufficient depth. For example, given that it is typically women who provide care, it is essential to conduct an analysis from a gender perspective (it is crucial to ensure that these robots are designed and programmed to meet the needs and preferences of individuals of all genders), giving priority to interdisciplinary research and also conducting educational campaigns to raise awareness of the benefits robotics can provide for the care of the elderly.


It is also crucial to begin examining the potential integration of robotics for elderly care into social welfare systems. As robotics for elderly care is not widely marketed, primarily due to its high cost, it is also crucial to take steps to prevent a robotics gap and inequality of access between those who can afford to use it and those who cannot. Furthermore, the need for both formal and informal caregivers to acquire competences in handling new technologies such as care robots becomes evident.

To understand how care robots can interact with people and be incorporated into society, a taxonomy of care robots has been established and their design has been examined in conceptual terms. Furthermore, we have presented an ethical framework that takes into account the difficulties faced when interacting with assistive robots. In the end, our analysis highlights the significance of creating assistive technologies with an awareness of their ethical consequences and with a focus on the autonomy and well-being of the users who engage with them. We must make sure that these technologies are human-centred.

We think that the introduction of care robots will bring both opportunities and difficulties to society. To make sure that these technologies are used in a way that is advantageous and respectful to everyone involved, it is crucial that we carefully explore their ethical implications. By doing so, we hope to stimulate further discussion and research on these emerging technologies and their potential impacts on society.

Acknowledgements  María Victoria Martínez-López would like to thank Belén Liedo and Joan Llorca Albareda for their support and review of this chapter. Aníbal M. Astobiza gratefully acknowledges the support of the project EthAI+3 (PID2019-104943RB-100). All the authors thank Mar Díaz-Millón and Jan Deckers for their help with the English revision of the manuscript.

References

Andtfolk, Malin, Linda Nyholm, Hilde Eide, and Lisbeth Fagerström. 2022. Humanoid robots in the care of older persons: A scoping review. Assistive Technology: The Official Journal of RESNA 34 (5): 518–526. https://doi.org/10.1080/10400435.2021.1880493.
Ausín, Txetxu, Belén Liedo, and Daniel López Castro. 2023. Robótica Asistencial. Una reflexión ética y filosófica. In Tecnología para la salud. Madrid: Plaza y Valdés. https://www.plazayvaldes.es/libro/tecnologia-para-la-salud.
Bliquez, L.J. 1983. Classical prosthetics. Archaeology 36 (5): 25–29. https://doi.org/10.2307/41729063.
Case School of Engineering. 2017. 5 Medical robots making a difference in healthcare. Cleveland: Case Western Reserve University.


Catrambone, Richard, John T. Stasko, and Jun Xiao. 2002. Anthropomorphic agents as a UI paradigm: Experimental findings and a framework for research.
Christaller, Thomas. 2001. Robotik. Perspektiven für menschliches Handeln in der zukünftigen Gesellschaft. Berlin: Springer. https://publica.fraunhofer.de/handle/publica/290881.
Coeckelbergh, Mark. 2009. Personal robots, appearance, and human good: A methodological reflection on roboethics. International Journal of Social Robotics 1 (3): 217–221. https://doi.org/10.1007/s12369-009-0026-2.
Cortellessa, Gabriella, Gion Svedberg, Amy Loutfi, and Federico Pecora. 2008. A cross-cultural evaluation of domestic assistive robots. AAAI Fall Symposium – Technical Report, January.
Del Valle, Juan Ignacio, Joan Llorca Albareda, and Jon Rueda. In Press. Ethics of virtual assistants. In Ethics of artificial intelligence. Springer.
Fernández Bernat, Juan Antonio, María Dolores García Valverde, Antonio López Peláez, Enrique Norro Gañan, Carolina Serrano Falcón, Natalia Tomás Jiménez, and Cristina Urdiales García. 2019. Los Robots Para El Cuidado de Mayores. Granada: Universidad de Granada.
Intel. n.d. Robótica En La Asistencia Sanitaria: El Futuro de Los Robots En Medicina. https://www.intel.es/content/www/es/es/healthcare-it/robotics-in-healthcare.html.
Jardón, Alberto, Antonio Giménez, Raúl Correal, Santiago Martinez, and Carlos Balaguers. 2008. Asibot: Robot Portátil de Asistencia a Discapacitados. Concepto, Arquitectura de Control y Evaluación Clínica. Revista Iberoamericana de Automática e Informática Industrial RIAI 5 (2): 48–59. https://doi.org/10.1016/S1697-7912(08)70144-4.
Kiss, Monica. 2021. Demographic outlook for the European Union. European Parliamentary Research Service. https://www.europarl.europa.eu/RegData/etudes/STUD/2021/690528/EPRS_STU(2021)690528_EN.pdf.
Kwon, Dong-Soo, Yong-San Yoon, Jung-Ju Lee, Seong Young Ko, Kwan-Hoe Huh, Jong-Ha Chung, Young-Bae Park, and Chung-Hee Won. 2001. ARTHROBOT: A new surgical robot system for total hip arthroplasty. In Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the Next Millennium (Cat. No.01CH37180), vol. 2, 1123–1128.
Kyrarini, Maria, Fotios Lygerakis, Akilesh Rajavenkatanarayanan, Christos Sevastopoulos, Harish Ram Nambiappan, Kodur Krishna Chaitanya, Ashwin Ramesh Babu, Joanne Mathew, and Fillia Makedon. 2021. A survey of robots in healthcare. Technologies 9 (1): 8. https://doi.org/10.3390/technologies9010008.
Lester, James C., Sharolyn A. Converse, Susan E. Kahler, S. Todd Barlow, Brian A. Stone, and Ravinder S. Bhogal. 1997. The persona effect: Affective impact of animated pedagogical agents. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, 359–366. Atlanta, Georgia, USA: ACM. https://doi.org/10.1145/258549.258797.
Liedo, Belén, and Txetxu Ausín. 2022. Alcance y Límites de La Tecnologización Del Cuidado: Aprendizajes de Una Pandemia. Revista Española de Salud Pública 96: 12.
Lin, Patrick, Keith Abney, and George Bekey. 2012. The rights and wrongs of robot care. In Robot ethics: The ethical and social implications of robotics, 267–282. Cambridge, MA: MIT Press.
Llorca Albareda, J. 2023. El estatus moral de las entidades de inteligencia artificial. Disputatio. Philosophical Research Bulletin 12 (24): 241–249. https://doi.org/10.5281/zenodo.8140967.
Llorca-Albareda, J., and G. Díaz-Cobacho. 2023. Contesting the consciousness criterion: A more radical approach to the moral status of non-humans. AJOB Neuroscience 14 (2): 158–160. https://doi.org/10.1080/21507740.2023.2188280.
Meng, Qinggang, and Mark H. Lee. 2006. Design issues for assistive robotics for the elderly. Advanced Engineering Informatics 20: 171–186.
Minsky, Marvin. 2003. Semantic information processing. Cambridge, MA: The MIT Press.
Monasterio Astobiza, Aníbal, Daniel López Castro, Manuel Aparicio Payá, Ricardo Morte, Txetxu Ausín, and Mario Toboso-Martín. 2019. Conceptual analysis: Technology, machine and robot. Springer. https://doi.org/10.13039/501100000780.


Montaño-Serrano, Victor M., Juan M. Jacinto-Villegas, Adriana H. Vilchis-González, and Otniel Portillo-Rodríguez. 2021. Artificial vision algorithms for socially assistive robot applications: A review of the literature. Sensors 21 (17). https://doi.org/10.3390/s21175728.
Montero, Victor A., Christopher M. Frumento, and Ethan A. Messier. 2010. History and future of rehabilitation robotics. Worcester: Worcester Polytechnic Institute. Digital WPI. https://digital.wpi.edu/show/bn999724d.
Nagenborg, Michael, Rafael Capurro, Jutta Weber, and Christoph Pingel. 2008. Ethical regulations on robotics in Europe. AI & Society 22 (August): 349–366. https://doi.org/10.1007/s00146-007-0153-y.
Organización Mundial de la Salud, ed. 2018. Tecnología de asistencia. https://www.who.int/es/news-room/fact-sheets/detail/assistive-technology.
Pareto Boada, Júlia, Begoña Roman, and Carme Torras. 2021. The ethical issues of social robotics: A critical literature review. Technology in Society 67 (November): 101726. https://doi.org/10.1016/j.techsoc.2021.101726.
Park, Daehyung, Yuuna Hoshi, Harshal P. Mahajan, Ho Keun Kim, Zackory Erickson, Wendy A. Rogers, and Charles C. Kemp. 2020. Active robot-assisted feeding with a general-purpose mobile manipulator: Design, evaluation, and lessons learned. Robotics and Autonomous Systems 124 (February): 103344. https://doi.org/10.1016/j.robot.2019.103344.
Poel, Ibo, Lambèr Royakkers, Sjoerd Zwart, T. Lima, Neelke Doorn, and Jessica Fahlquist. 2015. Moral responsibility and the problem of many hands. https://doi.org/10.4324/9781315734217.
Ritchie, Hannah, and Max Roser. 2019. Age structure. Our World in Data (blog). https://ourworldindata.org/age-structure.
Sabanovic, S., M.P. Michalowski, and R. Simmons. 2006. Robots in the wild: Observing human-robot social interaction outside the lab. In 9th IEEE International Workshop on Advanced Motion Control, 2006, 596–601. https://doi.org/10.1109/AMC.2006.1631758.
Sharkey, Amanda, and Noel Sharkey. 2012. Granny and the robots: Ethical issues in robot care for the elderly. Ethics and Information Technology 14 (1): 27–40. https://doi.org/10.1007/s10676-010-9234-6.
Sorell, Tom, and Heather Draper. 2014. Robot carers, ethics, and older people. Ethics and Information Technology 16 (3): 183–195. https://doi.org/10.1007/s10676-014-9344-7.
Sparrow, Robert. 2016. Robots in aged care: A dystopian future? AI & Society 31 (4): 445–454. https://doi.org/10.1007/s00146-015-0625-4.
Sparrow, Robert, and Linda Sparrow. 2006. In the hands of machines? The future of aged care. Minds and Machines 16 (2): 141–161. https://doi.org/10.1007/s11023-006-9030-6.
Torras, Carme. 2019. Assistive robotics: Research challenges and ethics education initiatives. Dilemata 30: 63–77.
Turkle, Sherry, Will Taggart, Cory D. Kidd, and Olivia Dasté. 2006. Relational artifacts with children and elders: The complexities of cybercompanionship. Connection Science 18 (4): 347–361. https://doi.org/10.1080/09540090600868912.
van Vugt, H.C., E.A. Konijn, J.F. Hoorn, I. Keur, and A. Eliëns. 2007. Realism is not all! User engagement with task-related interface characters. HCI Issues in Computer Games 19 (2): 267–280. https://doi.org/10.1016/j.intcom.2006.08.005.

Chapter 9

Ethics of Autonomous Weapon Systems

Juan Ignacio del Valle and Miguel Moreno

Abstract  The use of weapons without humans in the loop in modern warfare has been a contentious issue for several decades, from land mines to more advanced systems like loitering munitions. With the emergence of artificial intelligence (AI), particularly machine learning (ML) technologies, the ethical difficulties in this complex field have increased. The challenges related to adherence to International Humanitarian Law (IHL) or to human dignity are compounded by ethical concerns related to AI, such as transparency, explainability, human agency, and autonomy. In this chapter, we aim to provide a comprehensive overview of the main issues and current positions in the field of autonomous weapons and technological warfare. We will begin by clarifying the concept of autonomy in warfare, an area that still needs attention, as evidenced by the latest discussions within the United Nations (UN) Convention on Certain Conventional Weapons (CCW). We will also introduce the current legal basis in this field and the problems in its practical application, and offer sound philosophical grounds to better understand this highly complex and multifaceted field.

9.1 Introduction

AI-enabled technologies are rapidly expanding into various sectors such as transportation, healthcare, education, and aviation, among others. However, the deployment of these technologies, particularly those with increased autonomy enabled by AI, raises serious concerns about safety, privacy, robustness, and compliance with existing rules and regulations. One area where the deployment of such technologies is particularly contentious is the field of technological warfare.





Although AI can be used in different military systems such as munitions, platforms, or operational systems (Horowitz 2016b), here we will focus on munitions and platforms, for which we will use the term Autonomous Weapon System (AWS), encompassing both lethal and non-lethal systems. Notwithstanding the importance of operational systems, e.g., military operations planning systems in which military leaders would be substituted by a machine, such systems are unlikely to be deployed anytime soon (Horowitz 2016b), in stark contrast with AWS, which are already being used in the Ukrainian war and about which we have seen public declarations stating that the use of new AWS in this conflict will be "logical and inevitable" [1]. In addition, AWS, and particularly Lethal AWS (LAWS), have been at the center of the discussions at the United Nations (UN) Convention on Certain Conventional Weapons (CCW) for over a decade [2], and of the current literature and public concern at large.

In this chapter, we will provide some insight into current AWS, their most relevant features, and real examples of AWS currently in use. Then we will introduce the legal basis that should govern the development and deployment of this technology: International Humanitarian Law (IHL). IHL was developed well before the emergence of autonomous systems and it is legislation made for humans, not for machines, which significantly complicates its current application. We will introduce some of the main problems we encounter in current discussions of AWS and argue that the current debate is hampered not only by political factors and different ethical views (e.g., regarding the value of human dignity), but also by other elements that are potentially easier to settle, such as a definition of autonomy, the operational context, the concept of unpredictability, and other high-level technical concepts. Understanding these characteristics would likely enable a more fruitful debate than the one we often encounter today, with the involved parties talking past each other.

9.2 Autonomous Weapon Systems

The first issue we are going to deal with in this chapter, and arguably one of the main problems in this field, is the very definition of AWS. Starting with "Weapon", the term is often used to encompass a wide variety of systems (offensive, defensive, land, air or sea systems, anti-personnel, anti-materiel…), not all of them having a lethal effect. Things get more complex with the term "Autonomous", where an apocalyptic view of technology, often fueled by sci-fi movies, plays an important role at all levels, from laypeople to governmental experts discussing AWS at the highest spheres [3].

[1] https://www.wired.com/story/ukraine-war-autonomous-weapons-frontlines/
[2] See e.g., the 11 guiding principles adopted by the 2019 Meeting of the High Contracting Parties to the CCW, which summarize these meetings and provide some insight about these discussions (https://www.un.org/disarmament/the-convention-on-certain-conventional-weapons/background-on-laws-in-the-ccw/).
[3] As Paul Scharre has put it: "I am continually struck by how much the Terminator films influence debate on autonomous weapons. In nine out of ten serious conversations on autonomous weapons I have had, whether in the bowels of the Pentagon or the halls of the United Nations, someone invariably mentions the Terminator." (Scharre 2018)



This lack of definition has also been used to divert the debate, sometimes intentionally, towards speculative discussions of autonomy focused on even more ambiguous elements, such as intentionality or consciousness, rather than on weapons that will be available in the shorter term or are already being used on the battlefield. Recently, there have been calls for more reality-based discussions focusing on existing weapons (Ekelhof 2019), or even for abandoning the search for a widely accepted definition, as there will always be cases that fall between the cracks (Boshuijzen-van Burken 2023). This section provides some examples of definitions of AWS currently being proposed but, taking the more pragmatic approach now common in the philosophy of technology (Coeckelbergh 2020; Keulartz et al. 2004), we will not attempt to provide a final definition of these systems. Rather, we will offer some real examples of military technology with increasing levels of autonomy and assess some of their characteristics, paying due attention also to other important elements such as the operational context or several technical details relevant to the discussion.

9.2.1 Definitions

Anyone involved in the discussion on AWS would agree that there is no agreement, not even a shared understanding, of what is meant by "autonomous weapon" (Horowitz 2016a; Scharre 2018; Ekelhof 2019; Taddeo and Blanchard 2021; Boshuijzen-van Burken 2023). The most recent systematic attempt in academia to provide a definition has been carried out by Mariarosaria Taddeo and Alexander Blanchard (2021). In their paper, the authors collect and compare the different definitions of AWS that have been provided in the last decade and offer their own proposal of an AWS definition: "an artificial agent which, at the very minimum, is able to change its own internal states to achieve a given goal, or set of goals, within its dynamic operating environment and without the direct intervention of another agent and may also be endowed with some abilities for changing its own transition rules without the intervention of another agent, and which is deployed with the purpose of exerting kinetic force against a physical entity (whether an object or a human being) and to this end is able to identify, select and attack the target without the intervention of another agent is an AWS. Once deployed, AWS can be operated with or without some forms of human control (in, on or out the loop). A lethal AWS is specific subset of an AWS with the goal of exerting kinetic force against human beings" (Taddeo and Blanchard 2021).

Another example, recently published by the Department of Defense (DoD) of the United States, includes the essential elements of the previous definition in a simpler way:



"A weapon system that, once activated, can select and engage targets without further intervention by an operator. This includes, but is not limited to, operator-supervised autonomous weapon systems that are designed to allow operators to override operation of the weapon system, but can select and engage targets without further operator input after activation" (DoD 2023).

Rather than discussing the differences among the many definitions that have been provided, or the nuances between the two definitions above (e.g., agent vs. operator, or the reference to "transition rules" and "internal states" in the first one), we will take the DoD's definition as our starting point in a pragmatic way, i.e., as always fallible and subject to revision as our understanding of the domain evolves. In addition, we will provide some examples of current weapons and one speculative but realistic example that could fall under this definition (spoiler alert: Terminator is not on the list) and thus involve the ethical issues that we will be addressing in this chapter.

9.2.2 Examples

Weapon systems with higher levels of autonomy have already been developed and deployed. This section will describe some of them, providing some publicly available details of their operation. Before this, we will introduce two features that we deem important for the characterization of AWS: the operational context and the level of autonomy.

The operational context includes three main environments:
• Demilitarized zones (DMZ): a designated area, usually a strip of land or a buffer zone, that is free of military presence or military activity and which also normally encompasses restrictions on civilian activities or movement.
• Battlefield frontline: the line or boundary separating opposing military forces engaged in combat. It is the forward-most position of military units in a warzone, where direct contact and exchange of fire between opposing forces typically occur.
• Urban assault: a military operation in which military forces conduct an attack on an urban area, typically a city or town, with the objective of gaining control of strategic positions or neutralizing enemy forces. Urban assault is one of the most complex and challenging military operations, as the urban environment presents unique and often unpredictable challenges such as a dense civilian population, intricate urban infrastructure, and complex terrain.

Regarding the level of autonomy, there is no widely accepted classification applicable to AWS. In 2011 the DoD published an Unmanned Systems Integrated Roadmap, which included four levels of autonomy, from (1) human operated to (2) human delegated, (3) human supervised, and (4) fully autonomous (DoD 2011). The fact that this classification disappeared in newer editions of the roadmap (DoD 2017) might be an indication of the difficulty of finding an all-encompassing yet simple taxonomy.



However, the level of autonomy of the system and, particularly, the level of human involvement in its operation will be key for the rest of the discussions in this chapter and, to this end, we will rely on the traditional human in/on/out-of-the-loop classification:
• Human in the loop (HITL): the human operator is involved in overseeing and intervening in the actions of the system. The human plays an active role in setting the objective, providing guidance and feedback, and validating the different phases of the operation.
• Human on the loop (HOTL): the human operator is responsible for setting the system's objective, overseeing and managing the system's operation, but intervenes mainly to re-task or override its operation.
• Human out of the loop (HOOTL): the system operates entirely autonomously, without human oversight or intervention, to achieve a predetermined goal.

9.2.2.1 Sentry Robots: SGR-A1

The SGR-A1 sentry robot is an autonomous security robot developed by Samsung Techwin for use along the DMZ between North and South Korea. It is designed to detect and track intruders using a combination of sensors including cameras, infrared sensors, and a laser range finder. When the sensors detect movement, the system automatically locks onto the target and sends an alert to the command center. The system also has the capability to use a loudspeaker and a flashing light to warn intruders to stop or leave the area. If the intruder does not comply with the warning, the SGR-A1 can engage the target with a mounted machine gun. The human involvement in this last step is not clear, as the technical specifications of the SGR-A1 are not public. Whereas the weapon's manufacturer initially clarified that the system requires a human operator to approve any use of lethal force, the SGR-A1 is often cited as fully autonomous (Scharre 2018).

9.2.2.2 Loitering Munitions with Human in the Loop: Switchblade and Shahed-136

Loitering munitions (kamikaze drones) with a human in the loop have reportedly been used on the Ukrainian battlefield frontline, the best-known examples being the US-manufactured Switchblade used by Ukraine and the Iran-manufactured Shahed-136 deployed by the Russian army, even in the urban environment [4]. Although they show significant technical differences (e.g., in weight, range, operational versatility, and mission duration), the important characteristic of both systems for this section is that, once they are airborne (both are launched from the ground), they are operated remotely by a human operator, who selects and engages the target.
[4] The Guardian: "'Kamikaze' drones hit Kyiv despite Putin's claim of no further strikes" https://www.theguardian.com/world/2022/oct/17/kyiv-hit-by-a-series-of-explosions-from-drone-attack



Once the target is identified, both drones can be armed and directed to dive-bomb the target, delivering their explosive payload.

9.2.2.3 Autonomous Loitering Munitions: HARPY

The HARPY is an unmanned aerial vehicle (UAV) developed by Israel Aerospace Industries (IAI). It is a loitering munition designed for Suppression of Enemy Air Defense (SEAD) type missions. It can autonomously detect, attack, and destroy radar and communication systems of enemy air defense units. HARPY is an all-weather, day/night "Fire and Forget" autonomous weapon, launched from a ground vehicle behind the battle zone. It is programmed before launch to perform an autonomous flight to a pre-defined "Loitering Area", in which it loiters and searches for radiating targets. Once the HARPY detects a radar signal, it attacks the radar source by diving toward it at high speed and exploding upon impact. The manufacturer's commercial information [5] does not include any detail about the operator's capability to override its operation, which makes HARPY a real example of a human-out-of-the-loop weapon.

9.2.2.4 Autonomous Cluster Bomb: Sensor Fuzed Weapon (SFW)

A Sensor Fuzed Weapon (SFW) is a type of precision-guided munition designed to destroy armored vehicles and other high-value targets. The weapon consists of a cluster bomb containing several submunitions or "skeets," each equipped with a sensor that detects and targets armored vehicles. When the SFW is released from an aircraft, it deploys a parachute to slow its descent and opens up, releasing the submunitions. Each submunition contains an infrared sensor that detects the heat signature of a vehicle's engine or exhaust, and a small explosive charge designed to penetrate the armor and destroy the vehicle's internal components. This is another example of a human-out-of-the-loop system, in which the operator only selects the geographical area where the weapon is released, with the rest of the operation being fully autonomous.

9.2.2.5 Hypothetical AWS: SFW + Quadcopter + Image Recognition Capabilities

We provide here a hypothetical AWS that will illustrate the main issues discussed in the following sections. As we said, we do not have to think of a Terminator to start assessing the ethical impact of AWS: we already have several weapons that are allegedly fully autonomous, and in our example we will just upgrade the SFW described above with some AI-powered capabilities.

[5] https://www.iai.co.il/p/harpy




Our example will replace the "skeets" of the SFW with 40 small quadcopters (small hovering drones) that are deployed at their operational altitude of around 500 m above ground. These quadcopters have an operational time of around 30 min and are equipped with an Electro-optical/Infrared (EO/IR) camera and an airborne Machine Learning powered image recognition computer. The rest of the operation is unchanged: once a vehicle is detected (accurately classified in this case), the quadcopters home in on the target and release their explosive charges, creating a high-velocity armor-penetrating projectile. An alternative variant of this weapon would be the anti-personnel version, which would use facial recognition, reportedly already being researched [6], to select and engage enemy combatants. In this case, the quadcopters would be significantly lighter, and a single SFW could accommodate up to 400 submunitions.

This AWS would combine the effectiveness of the SFW and the HARPY. Like these weapons, our example is a "Fire and Forget" weapon: once deployed in a specific geographical area, it is a HOOTL system. Unlike the SFW, the submunitions have over 30 min to perform their operation (SFW skeets would only have several seconds), and the target classification would be much more accurate, using the latest image recognition capabilities. However, this example is particularly relevant because it also involves several characteristics that permeate the current discussions on AWS and their legal and ethical assessment. These include the availability of the enabling technology, also widely used for civil applications, the concept of "meaningful human control", and the use of Machine Learning, which involves all the issues often found in the AI ethics literature, amongst which the (lack of) explainability and its probabilistic nature will be key problems. All these issues will be discussed in the following sections.
[6] New Scientist: "US Air Force is giving military drones the ability to recognize faces" (https://www.newscientist.com/article/2360475-us-air-force-is-giving-military-drones-the-ability-to-recognise-faces/)

9.3 Legal Basis

The attempts to regulate artificial intelligence applications delimit a complex domain of problems, where the guiding criteria depend on how certain questions are answered, the legal framework of reference, and the specific characteristics of the technologies being integrated (Chauhan 2022). Given the multifunctional nature of many technological developments, the application context introduces additional elements of complexity when they are integrated into services or activities in civil and military industries (Pražák 2021; Gómez de Ágreda 2020; Arkin et al. 2012).

The establishment of acceptable regulatory mechanisms is also problematic, since regulatory regimes vary across the countries most actively contributing to the development and application of AI-based technologies.



While there is presumably broad agreement on the justification for attempting to regulate AI-based technologies to prevent potential abuses in their application (Kayser 2023), it is more difficult to specify the list of sufficient reasons, delineate the precise content of what needs to be regulated, and identify the processes that best serve this purpose. In a field that attracts vast investment in a highly competitive global ecosystem, it is important to examine the feasibility of the best articulated proposals for the regulation of artificial intelligence applications, as well as the ethical issues involved, against the backdrop of regulatory systems anchored in different legal, political and moral traditions (Lin et al. 2008).

International Humanitarian Law (IHL), also known as the Law of War or the Law of Armed Conflict (LOAC), is the set of regulations that govern the conduct of armed conflicts. IHL aims to limit the effects of armed conflict and protect individuals who are not or are no longer participating in hostilities. It seeks to ensure that even during times of armed conflict there are limits on the methods and means of warfare, and that the human dignity and rights of all individuals, including civilians and prisoners of war, are respected and protected. The principles of IHL are designed to minimize the suffering caused by armed conflicts and promote the values of humanity and morality in times of war.

However, several authors and reports have raised concerns about the inadequacy of the existing framework of IHL/LOAC to regulate the use of advanced weapons on the battlefield (in bello), in particular those that are capable of integrating information from multiple sensors, identifying (i.e., searching for or detecting, tracking, selecting) and engaging targets (i.e., using force against, neutralizing, or destroying them) without direct human intervention (Roff and Moyes 2016; Sehrawat 2017; Boulanin et al. 2020; Morgan et al. 2020; Gómez de Ágreda 2020; Sari and Celik 2021; Kahn 2023).

For military planners, AWS have the potential to operate in ways and in contexts that humans cannot, helping to save costs or reduce military casualties. From a geopolitical perspective, AI is expected to become a crucial component of economic and military power in the near future. AI-based technologies may alter fundamental aspects of the distinction between automation and autonomy, and the role that notions of predictability, reliability and accountability play in processes of assessment and public acceptance (Roff 2014; Roff and Moyes 2016). In this context, high or full automation combined with object recognition and swarming in systems or platforms like battle tanks, navy vessels and unmanned aerial vehicles raises relevant legal and ethical questions regarding the guarantees of the system's compliance with IHL norms when selecting and attacking targets. In the event of a violation of IHL, it will be necessary to consider how accountability and responsibility for individual or state actors can be ensured to make such systems ethically and socially acceptable (ICRC 2016; Boutin 2023). The unpredictability of scenarios in which the use of AWS might be considered the preferred option complicates the application of the IHL rules of distinction, proportionality, and precaution in attack.
The integration of advanced AI-based technologies may limit the sequence of human decisions to the mere activation of the system when the use of force is agreed, distancing it from the context where the selection of specific targets is made and the attack is decided.



A further detachment between the human decision phase and the actual use of force may undermine the conventional ways of attributing responsibility and the degree of human judgement, control, or involvement. According to IHL (Article 36 of Additional Protocol I to the 1949 Geneva Conventions), humans must make life and death decisions (ICRC 2016, p. 11), including restrictions on a weapon's operation in time and space, to ensure that only legitimate targets are attacked. This is a robust criterion, applicable even under deteriorating communications or extreme pressure to guarantee defensive effectiveness and the minimization of civilian casualties. But it is essential to enable other aspects of discrimination and interpretation of the context associated with deliberate human control over the consequences (rationality, planning, and accountability).

Article 51 of IHL prohibits the use of weapons that have indiscriminate effects, strike military objectives and civilians indiscriminately, or employ a method or means of combat whose effects cannot be limited (i.e., nuclear, biological, or chemical weapons) (ICRC n.d.). AWS with target recognition systems are designed to be more accurate. However, the challenges in attributing to them sufficient capacity to distinguish between combatants and civilians in pertinent contexts, where both the prevention of collateral damage and the proportionality of damage to infrastructure, homes, and facilities are crucial, limit their social acceptance (Morgan et al. 2020). Assessing these aspects requires the degree of complexity, empathy, and prudence associated with human judgment within the framework of professionally responsible actions (ICRC 2016, p. 82).

The criterion of military necessity states that "a combatant is justified in using those measures, not prohibited by international law, which are indispensable to ensure the complete submission of an enemy at the earliest possible moment" (Sehrawat 2017, p. 46). Military necessity is a context-dependent, value-based judgment under certain restraints, applied through the targeting process and justified when the military advantage gained is clear. Even if AWS could take greater risks than human combatants to trigger more precise actions, estimating the number of acceptable civilian casualties and how to achieve the objectives while avoiding unnecessary suffering requires a complex level of moral reasoning, challenging to accomplish without direct human perception (eyes on the target) to specify the requirements of the precautionary principle when deciding to attack (Scharre 2018). Allusions to the doctrine of double effect (according to which collateral damage is acceptable as long as it was unforeseen, unintended, and necessary to achieve the military objective) are relevant here (Lin et al. 2008). But there is reasonable doubt about the capacity that AWS may incorporate to compromise their own integrity when applying operational criteria to distinguish between combatants, wounded or surrendered soldiers, and non-combatants, tempering the final decision with additional IHL requirements (Cantrell 2022, p. 647, 651).

The Martens Clause (the appeal to the public conscience, as stated in Additional Protocol I of 1977 to the Geneva Conventions) was initially introduced to avoid permitting everything not explicitly forbidden.



This is not public opinion in the "doxa" sense but informed debate, i.e., "public discussion, academic scholarship, artistic and cultural expression, individual reflection, collective action, and additional means by which society deliberates its collective moral conscience" (Scharre 2018). But the arguments surrounding AWS cannot treat AWS as if they all shared all morally relevant features, without reference to specific weapon platforms with specific abilities and limitations when deployed for specific purposes (Wood 2023). On a case-by-case basis, the analysis focuses on the capabilities of weapons deployed in specific contexts and for particular purposes, rather than appealing to generic notions of human dignity to justify, for example, a moratorium on the development of AWS (Birnbacher 2016). A ban would not halt the development of AWS; it would lead to its development in even more secrecy and outside of public debate and assessment (HRW 2018; Lin et al. 2014, p. 246).

Suppose the way to take the temperature of the collective public conscience is to survey public attitudes about a particular issue, such as nuclear weapons or AWS. Even then, further effort is needed, since public attitudes may be uncritical, uninformed, or unethical, such as supporting racist and sexist policies (Morgan et al. 2020). Doing ethics by survey alone is the same mistake as doing ethics by numbers alone. Nonetheless, surveys do provide essential data points in this conversation. The main reason to consider AWS to contravene the Martens Clause, even without the publicly voiced opposition of faith leaders, scientists, tech workers, and civil society organizations, depends on a right not to be killed arbitrarily, unaccountably, or otherwise inhumanely (Lin 2015, p. 2). But such a generic formulation brings up the problem of value alignment, a characteristic of plural societies that poses significant challenges for designing and regulating technically complex systems such as AWS (Russell 2016; Peterson 2019; Boshuijzen-van Burken 2023).

9.4 Main Issues Posed by AWS

This section provides insight into several problems that are often encountered in the current discussion on AWS, particularly those powered by AI. These issues commonly involve legal, ethical, and practical matters.

9.4.1 Low Bar to Start a Conflict: Jus Ad Bellum

Given the difficulties in building the integrated system of capabilities needed to coordinate and optimize the performance of AWS in conflict zones, it is unlikely in the short term that the adoption of AI-based technologies by the militaries of major R&D spenders will contribute to lowering the caution required to initiate a conflict under reasonable jus ad bellum criteria (Sari and Celik 2021, p. 10; Kahn 2023, p. 10, 21, 40).



Apart from technologically implausible scenarios of autonomous operational battle systems deciding to go to war, it is doubtful that AWS create novel problems from an ethical perspective regarding just war theory (Horowitz 2016a, p. 33). Even a negative assessment of the use of AWS on moral grounds has to consider their plausible contribution to conflict scenarios in which unnecessary suffering is minimized by using military equipment (means and methods) more precise than conventional systems. However, this does not in any way exclude the central role of the human factor in decision-making (Horowitz 2016a, p. 34). Each phase (Ante Bellum, In Bello, Post Bellum) incorporates complex decision-making sequences and readjustments of the relevant aspects or parameters under human control, ranging from design and manufacturing to the social and political context that justifies the use of AWS (Roff and Moyes 2016, p. 3–4). A fortiori, and except for particular circumstances that might restrict reaction time, it is unlikely that human control will be transferred in the critical phase of justifying the use of force and estimating its extent, the actors involved, and the absence of alternatives (Boulanin et al. 2020, p. 28; May 2015, p. 65–86).

9.4.2 Availability of Enabling Technologies and the Dual Use Problem

Unlike previous evolutions in weapon systems, such as nuclear weapons, AWS do not require complex facilities or specific hardware within the reach of only a few states. Autonomy, particularly when using AI, is mainly a software upgrade, and the underlying technology is accessible to anyone conversant with publicly available programming frameworks, mainly used for civil purposes such as image recognition, recommender systems, gaming, etc. Other technologies widely available for civil purposes but reportedly already being used in today's armed conflicts are commercial drones and 3D printing. This use turns AI, drones, or 3D printing into what are called dual-use technologies, i.e., technologies that have both civil and military applications. For example, the drones used as submunitions in our speculative AWS could also be used for medical delivery; the underlying AI image recognition used for target classification could be the same technology that these medical delivery drones would use to classify landing spots.

The dual-use nature of these technologies creates an additional challenge for anyone involved in the discussion on AWS, as they must balance the benefits of technological advancement with the potential risks of misuse. In addition, the accessibility and affordability of these technologies also mean that they can be easily obtained by non-state actors, or by state actors that do not abide by the law, which entails a further complication in this domain.



9.4.3 Meaningful Human Control

Meaningful human control and responsibility is one of the key issues of AWS, entailing both legal and moral aspects. We have already seen in Sect. 9.2 the complexity of providing an all-encompassing taxonomy to categorize the level of autonomy of a system, and the fact that no classification has yet been agreed upon for AWS. Although we proposed to use the traditional human in/on/out-of-the-loop categories when describing our AWS examples, there is another concept that has emerged in the UN CCW discussions since 2014 (Horowitz and Scharre 2015): meaningful human control [7].
[7] A similar concept (and equally ambiguous) has been introduced in the DoD Directive 3000.09 Autonomy in Weapon Systems: "appropriate level of human judgment" (DoD 2023).

Although this concept might seem intuitive at first, it suffers from a familiar issue of other intuitive concepts: nobody knows what it actually means in practice (Ekelhof 2019). This lack of a detailed, agreed theory of meaningful human control prevents the development of any legal regulation, design guideline, or technical standard that could be derived from the concept. Taken to the extreme, meaningful human control could require a human being overseeing the whole operation of the weapon, always keeping an eye on the target and having full control of the weapon, including the possibility to abort the attack at any time. However, this would make the sword the only lawful weapon, since we humans have lacked this level of control over our weapons since the first rock was thrown in anger. As clearly put by Horowitz and Scharre, "If meaningful human control is defined in such a way that it has never existed in war, or only very rarely, then such a definition sheds little light on the new challenge posed by increased autonomy" (Horowitz and Scharre 2015).

Other approaches to meaningful human control offer a more nuanced, realistic view of technology, not necessarily requiring "the act of direct controlling from a position that is contiguous in space and time or is a proximate cause, as control in a morally relevant sense allows for technological mediation and separation of the human agent and the relevant moral effects of the acts that he is involved in" (Santoni de Sio and van den Hoven 2018). In line with this, Santoni de Sio and van den Hoven provide a philosophically grounded concept of meaningful human control, which is composed of two elements: tracking and tracing. The tracking condition is a sort of explainability: all actions performed by the AWS must be responsive to human moral reasons. The tracing condition requires the commander deploying the AWS (or at least a single person in the development and deployment chain) to have enough training on the system and knowledge of the situation, and to accept the responsibility of launching the system.

Finally, an additional definition of meaningful human control is the proposal by Horowitz and Scharre which, without resorting to technical philosophical terminology like the previous one, rightly puts the shared burden on the weapon's operator and on the weapon's designer. This has three essential components:



1. Human operators are making informed, conscious decisions about the use of weapons.
2. Human operators have sufficient information to ensure the lawfulness of the action they are taking, given what they know about the target, the weapon, and the context for action.
3. The weapon is designed and tested, and human operators are properly trained, to ensure effective control over the use of the weapon. (Horowitz and Scharre 2015)

Meaningful human control is thus the condition for a human being to take responsibility for launching the system. And this could be extrapolated to the different steps in the chain of command since, as rightly highlighted by Merel Ekelhof, any attempt to define meaningful human control should recognize "the distributed nature of control in military decision-making in order to pay due regard to a practice that has shaped operations over the past decades and continues to be standard in contemporary targeting" (Ekelhof 2019).

9.4.4 Unpredictability of AWS

Directly linked with (the lack of) meaningful human control is the unpredictable nature of AWS, which is often cited as a key issue of these systems. This view is clearly described as follows: "The sheer complexity of autonomous weapon systems' methods for making these determinations may make it impossible for human beings to predict what the systems will do, especially to the extent they operate in complex environments and are subject to various types of malfunction and corruption. More advanced autonomous weapon systems might even "learn" from in-field experiences or make probabilistic calculations" (Crootof 2016).

The unpredictability of AWS is the main reason why Mariarosaria Taddeo and Alexander Blanchard call it a "moral gambit" to develop and deploy an AWS, since everyone in the chain of command should be "aware of the risk that unpredicted outcomes may occur and […accept the…] moral responsibility also for the unpredictable effects that may follow the decision to deploy AWS" (2022). This concern should be taken seriously, as deploying an unpredictable weapon, besides having very limited operational benefits [8], would be ruled out upfront by International Humanitarian Law (IHL) (McFarland 2020, p. 85–112). However, assuming that unpredictability is inherent to AWS would be a rough oversimplification of this domain, probably caused by a wrong extrapolation of previous well-known failures of AI systems [9]. Thus, a more nuanced approach to autonomy is needed for more fruitful discussions about this important topic. One way to start would be to analyze the sources of unpredictability that a weapon system with higher levels of autonomy might present.

[8] For a dissenting, yet controversial, point of view based on the deterrence effect, see Scharre (2018, p. 315): "Deploying an untested and unverified autonomous weapon would be even more of a deterrent, since one could convincingly say that its behavior was truly unpredictable."
[9] Typical examples often cited to underpin unpredictability claims of AI systems are Microsoft Tay, deployed in 2016, which soon started publishing offensive and racist comments, or Google image recognition software that in 2015 classified black people as "gorillas".



The excerpt above includes different elements that would be better analyzed on their own: the complexity of the system, its learning capabilities, the use of the weapon in a dynamic environment, and the system's malfunctions.

We will start with the learning capabilities, which are the only distinct feature compared to other complex weapon systems. Referring to learning systems assumes that the autonomy of the system, or some of the tasks that endow the system with autonomy, is based upon Machine Learning technologies. However, it is very important, particularly when discussing the unpredictable behavior of these systems, to distinguish between two main approaches in Machine Learning: offline learning, also called batch learning, and online learning, also called incremental learning.

In offline learning, the system is trained on a fixed dataset or batch of data before it is deployed for use. The training process will provide several performance metrics (e.g., accuracy) that will remain unchanged during operation. In addition, the training data, which should be representative of the design operational envelope, is well known and controlled by the developers. Finally, any further updates or changes require a full retraining of the model. This means that after the training phase, and particularly once the system is in operation, the system will not keep learning. It does not mean that the system will not fail: perfect accuracy will never be achieved, or the system could be deployed outside its design operational envelope, but this should be addressed as a system malfunction or misuse rather than unpredictability.

In online learning, the model is updated continuously as new data becomes available, without requiring a full retraining of the model on the entire dataset. In this case, the system keeps learning during its operation, the developers will not have control over the training data, and the performance metrics will evolve during operation, hopefully towards better figures, but this cannot be assured. Thus, online learning is the technique that would feature the degree of unpredictability under concern. A weapon relying on this type of Machine Learning for any critical task would indeed be against IHL and should be ruled out, which is a position held even by well-known advocates of AWS [10]. However, inferring that all AWS are inherently unpredictable would be a hasty generalization that would be counterproductive for any useful discussion on the lawfulness of AWS.

The complexity of the system is another major concern when discussing the unpredictable behavior of autonomous systems, including, but not limited to, weapons, particularly when some of their functions are powered by Machine Learning.

[10] See e.g., Ronald Arkin's position: "Arkin says the only ban he 'could possibly support' would be one limited to the very specific capability of 'target generalization through machine learning.' He would not want to see autonomous weapons that could learn on their own in the field in an unsupervised manner and generalize to new types of targets" (Scharre 2018).
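To make the offline/online distinction above concrete for readers with some programming background, the following minimal sketch contrasts the two regimes on a purely synthetic, civil-style classification task. It assumes Python with the scikit-learn library and invented data; it illustrates only the general learning regimes discussed in this section, not any system mentioned in this chapter.

```python
# Minimal sketch contrasting offline (batch) and online (incremental) learning.
# Synthetic data and model choices are arbitrary assumptions for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Offline (batch) learning: trained once on a fixed, known dataset.
# After deployment the model is frozen, so its measured accuracy stays fixed.
offline = SGDClassifier(random_state=0).fit(X_train, y_train)
print("offline accuracy:", accuracy_score(y_test, offline.predict(X_test)))

# Online (incremental) learning: the model keeps updating on data that arrives
# during operation, so its behaviour (and accuracy) can drift after deployment.
online = SGDClassifier(random_state=0)
online.partial_fit(X_train[:100], y_train[:100], classes=np.unique(y))
for start in range(100, len(X_train), 100):   # simulated stream of new data
    batch = slice(start, start + 100)
    online.partial_fit(X_train[batch], y_train[batch])
print("online accuracy:", accuracy_score(y_test, online.predict(X_test)))
```

The practical point of the sketch is the structural one made in the text: in the first regime, everything the model has learned is fixed and auditable before deployment; in the second, what the model has learned depends on data seen only during operation.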



The lack of transparency (the "black box" issue) for the operators of the system and even its designers, or the lack of explainability, are common issues in the AI ethics domain. Indeed, Machine Learning systems are not programmed but trained, which prevents any meaningful verification of the code, and their results are predictions based on the training data rather than preprogrammed decision trees. This does not make these systems necessarily unpredictable, but probabilistic. In addition, the training process offers valuable information about the system's reliability, e.g., accuracy, that could be used by the operators of the system to judge the suitability of the use of the weapon in specific contexts.

Another source of unpredictability in AWS discussions is the complexity of the environment. However, upon reflection, the battlefield has always been a dynamic, uncertain setting characterized by a lack of sufficient knowledge of the enemy's location, capabilities, and intentions, often illustrated by the term "fog of war", attributed to the nineteenth-century war theorist Carl von Clausewitz. Operating in an unpredictable environment is new neither to modern warfare nor to the use of AWS. What is arguably new with AWS is the increasing distance in time and space between their triggering and their effects. The concern is that AWS can be launched hours or days before the attack from remote locations, using information that can quickly become out of date. In this case, the AWS can operate in a context that was not initially foreseen, or even in an environment for which it has not been trained, which is strongly linked with the capability of the operator to judge the suitability of the AWS for a particular operation, as discussed in the previous section. But then the discussion is not about unpredictability; it should rather revolve around the means to ensure that the operational context is the one for which the AWS has been launched (and designed) and what to do otherwise, e.g., how the weapon is confined to a certain time and space or how to autonomously abort the operation if the context changes significantly.

Finally, regarding AWS malfunction, it would be hard to argue that this is a problem inherent to AWS or that an AWS software or hardware "failure would have any legal implications beyond those which would attach to an equivalent software or hardware failure in a manual weapon system" (McFarland 2020, p. 97).
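As a generic illustration of the point that trained models are probabilistic rather than simply unpredictable, the short sketch below shows how a classifier's confidence scores, together with a pre-set threshold, can route low-confidence cases to a human reviewer instead of handling them automatically. It again assumes Python with scikit-learn and synthetic data; the model, threshold value, and routing policy are arbitrary assumptions made for the example, and the pattern is the kind of confidence gating used in many civil decision-support settings rather than a description of any weapon system.

```python
# Minimal sketch of confidence gating: act automatically only on high-confidence
# predictions, defer the rest to a human. Data, model and threshold are assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
THRESHOLD = 0.95  # assumed policy choice, fixed before deployment

auto, deferred = 0, 0
for probs in model.predict_proba(X_test):
    if probs.max() >= THRESHOLD:
        auto += 1        # confident enough for automated handling
    else:
        deferred += 1    # routed to a human for review
print(f"automated: {auto}, deferred to human review: {deferred}")
```

Reliability metrics estimated during training (such as held-out accuracy) play a similar role at a higher level: they give operators evidence for judging whether a system is suitable for a given context at all.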

9.4.5 Accountability

Morgan et al. (2020) show the results of a survey focused on public support for investment in military AI applications, including AWS designed to be more accurate and precise than humans. The main concern is the ethical risks that military AI poses for accountability and human dignity, particularly when attacks that take human lives are authorized. Less concern is associated with decision support systems and military AI applications in cyber defense or disinformation.



Beyond accuracy or technical precision, the key aspect is which actor(s) can be held accountable or punished for wrongful actions, in perspectives that vary depending on the threat and self-defence considerations. Accountability is also essential for safety and trust in clinical research and medical treatments, where it is associated with transparency and shared responsibility to avoid harm, bias and conflicts of interest (Montréal Institute for Learning Algorithms 2018, p. 90, 108). In addition, the attribution of responsibility to individuals and organizations for their actions and any unintended consequences or adverse effects they may have caused is relevant for gaining democratic legitimacy in the use of AWS, when policymakers, professionals, and industry stakeholders are involved in often confidential dynamics of cooperation, to satisfy additional demands of liability, oversight, and compliance with ethical and legal norms. Ultimately, transparency and accountability are the basis for gaining knowledge about other factors necessary to explain decisions and preserve the skills and role of human beings in the legal system (de Fine Licht and de Fine Licht 2020).

The traditional problem of many hands is relevant in cases where the use of AWS violates either jus ad bellum or jus in bello rules, since no principle in IHL states that there must be an individual to hold accountable for every death on the battlefield (Scharre 2018). There are weapon systems (the ship-based Aegis combat system, for example) that can detect incoming rockets and missiles and respond on their own to eliminate the threat. In many cases, "asking whether the operator exercises meaningful human control over the weapon may be the wrong question here. Instead, we should ask how the military organization as such can or cannot ensure meaningful human control over important targeting decisions." (Ekelhof 2019, p. 347).

Although individuals were traditionally not considered criminally liable under international law (instead, responsible states were liable to other states for reparations), in the last seventy years many efforts have been made to create enforceable individual liability mechanisms for war crimes. The concept of "war crimes" should now be interpreted as intertwined with individual criminal liability (Crootof 2016, p. 1355). A robust scheme of vicarious liability should include the commander in the chain of responsibility, regardless of how much the state's role could have been obfuscated, even in war crimes (Horowitz 2016a, p. 30–33).

9.4.6 Human Dignity—Dehumanization of Targets

The last issue we will discuss is also the one on which opponents and supporters of AWS, particularly of those capable of applying lethal force, are unlikely to find a trade-off, as it is rooted in deep moral considerations that people can legitimately accept or not: the requirement that initiating lethal force shall always involve human judgment because, otherwise, it would be an assault on human dignity (Lin et al. 2008).



This would go beyond the meaningful human control of the AWS or the "moral gambit" of accepting the responsibility to initiate an attack with these systems discussed previously [11], or even the accountability concern. It is not (or not only) about responsibility and redress but about the consequences that the use of these systems would have for us as moral agents. This view has been clearly put by Peter Asaro when he explains that technological evolution in warfare should enhance the moral agency of the people involved in the chain of command by providing additional capabilities to utilize their judgment, particularly regarding the principles of distinction and proportionality. This has been the case so far with precision-guided munitions and armed drones, which "actually increase our moral burden to ensure that targets are properly selected and civilians are spared". Delegating life-or-death decisions to a machine, regardless of its sophistication, would lead us in the opposite direction (Asaro 2012). As Paul Scharre put it: "The strongest ethical objection to autonomous weapons is that as long as war exists, as long as there is human suffering, someone should suffer the moral pain of those decisions" (Scharre 2018). This view is arguably supported by IHL [12], which "explicitly requires combatants to reflexively consider the implications of their actions, and to apply compassion and judgement in an explicit appeal to their humanity" (Asaro 2012). The fact that IHL is made for humans, and that it requires human interpretation, which prevents its codification in an AWS, is not an issue but a feature, one that should not be changed with the emergence of new technical capabilities.

Critics of this argument, such as Michael Horowitz, contend that the dignity argument has emotional resonance but may romanticize warfare (Horowitz 2016b). Reflecting further on this argument, there seems to be no clear reason why being killed by an AWS would lower the dignity of the combatant fallen on the battlefield (particularly if AWS would save the lives of non-combatants). Also, it is often said that human soldiers are not great at using their human judgment in warfare, as no one would be under extremely high pressure. This has led some people to take a purely pragmatic and consequentialist view and contend that, if AWS can perform more ethically than humans, e.g., abide by IHL better as they would not be subject to any stress to preserve their own lives, then we would have a moral imperative to use this technology to help save civilian lives (Arkin 2009). However, these critics would miss the central point of this objection: even if we could prove that AWS outperform human beings in abiding by IHL (which still needs to be proven empirically), we would still be facing a normative requirement: that there ought to be a moral weight in the active taking of a human life, which should be carried by a human being. This normative requirement should make us at least not jump too quickly to the otherwise appealing consequentialist view mentioned above, as it deals not with numbers and probabilities but with raising or lowering our moral standards through technological innovation.

[11] For the same reasons adduced herein, Taddeo and Blanchard (2022) contend that the moral gambit would not be applicable to Lethal Autonomous Weapon Systems (LAWS).
[12] For a dissenting view see Scharre (2018, p. 295): Considering the beyond-IHL principle of human dignity "would put militaries in the backwards position of accepting more battlefield harm and more deaths for an abstract concept."




9.5 Conclusions

AI-based technologies with increasing autonomy levels in modern warfare have raised public concern and high-level political debates, adding new and specific elements to decades of discussion about other weapon types without humans in the loop. Traditional problems such as adherence to International Humanitarian Law (IHL), the implementation of the Geneva Conventions or the Martens Clause, respect for human dignity, and accountability for collateral harm are exacerbated in combination with ethical concerns related to AI, such as transparency, explainability, human agency, and autonomy.

While there is a clear risk of distorting the public debate by promoting overly alarmist interpretations endorsing either a moratorium or a total ban on AWS, not necessarily based on a realistic assessment of the capabilities of systems deployed on various platforms, it is crucial to facilitate the conditions for policymakers, professionals, and industry stakeholders to cooperate in a way that meets genuine demands for responsibility, accountability, oversight, and compliance with ethical and legal standards.

The undeniable advantages of more accurate, effective, and versatile weaponry in complex battlefield conditions can help minimize the suffering of combatants and reduce civilian casualties. However, the international race to develop AWS with advanced AI and increasing degrees of autonomy in the relevant aspects has to face the complex challenge of building the entire system required to exploit their potential in eventual conflict scenarios. Therefore, it is unlikely that their deployment will be more conducive to initiating hostilities than other conventional technologies.

In recent decades, the trend in attributing legal responsibility for war crimes has evolved from a more state-centric approach to chain-of-command scrutiny of individual decisions. The object recognition and targeting capabilities that advanced AWS with AI-based technologies can provide certainly introduce a greater distance between the human decision to activate specific devices and the time or context in which combat actions occur. Nevertheless, similar circumstances are associated with the use of advanced weaponry with specific defense avoidance and target recognition capabilities, designed to act faster than the human reaction margin, and therefore they do not constitute insurmountable obstacles to social acceptance or relieve military organizations of the duty to ensure the highest possible degree of control over the systems they deploy.



References

Arkin, R. 2009. Governing Lethal Behavior in Autonomous Robots. Taylor & Francis.
Arkin, Ronald Craig, Patrick Ulam, and Alan R. Wagner. 2012. Moral decision making in autonomous systems: Enforcement, moral emotions, dignity, trust, and deception. Proceedings of the IEEE 100 (3): 571–589. https://doi.org/10.1109/JPROC.2011.2173265.
Asaro, Peter. 2012. On banning autonomous weapon systems: Human rights, automation, and the dehumanization of lethal decision-making. International Review of the Red Cross 94 (886): 687–709. https://doi.org/10.1017/S1816383112000768.
Birnbacher, Dieter. 2016. Are autonomous weapons systems a threat to human dignity? In Autonomous weapons systems, ed. Nehal Bhuta, Susanne Beck, Robin Geiß, Hin-Yan Liu, and Claus Kreß, 105–121. Cambridge University Press. https://doi.org/10.1017/CBO9781316597873.005.
Boshuijzen-van Burken, Christine. 2023. Value sensitive design for autonomous weapon systems – A primer. Ethics and Information Technology 25 (1): 11. https://doi.org/10.1007/s10676-023-09687-w.
Boulanin, Vincent, Neil Davison, Netta Goussac, and Moa Peldán Carlsson. 2020. Limits on autonomy in weapon systems: Identifying practical elements of human control. Stockholm International Peace Research Institute and the International Committee of the Red Cross.
Boutin, Bérénice. 2023. State responsibility in relation to military applications of artificial intelligence. Leiden Journal of International Law 36 (1): 133–150. https://doi.org/10.1017/S0922156522000607.
Cantrell, Hunter. 2022. Autonomous weapon systems and the claim-rights of innocents on the battlefield. AI and Ethics 2 (4): 645–653. https://doi.org/10.1007/s43681-021-00119-3.
Chauhan, Krishna Deo Singh. 2022. From 'what' and 'why' to 'how': An imperative driven approach to mechanics of AI regulation. Global Jurist, December. https://doi.org/10.1515/gj-2022-0053.
Coeckelbergh, M. 2020. Introduction to Philosophy of Technology. Oxford University Press.
Crootof, Rebecca. 2016. War torts: Accountability for autonomous weapons. University of Pennsylvania Law Review 164 (6): 1347–1402.
de Fine Licht, Karl, and Jenny de Fine Licht. 2020. Artificial intelligence, transparency, and public decision-making. AI & SOCIETY 35 (4): 917–926. https://doi.org/10.1007/s00146-020-00960-w.
DoD. 2011. Unmanned systems integrated roadmap FY2011–2036.
———. 2017. Unmanned systems integrated roadmap FY2017–2042.
———. 2023. DoD Directive 3000.09: Autonomy in weapon systems.
Ekelhof, Merel. 2019. Moving beyond semantics on autonomous weapons: Meaningful human control in operation. Global Policy 10 (3): 343–348. https://doi.org/10.1111/1758-5899.12665.
Gómez de Ágreda, Ángel. 2020. Ethics of autonomous weapons systems and its applicability to any AI systems. Telecommunications Policy 44 (6): 101953. https://doi.org/10.1016/j.telpol.2020.101953.
Horowitz, Michael C. 2016a. The ethics & morality of robotic warfare: Assessing the debate over autonomous weapons. Daedalus 145 (4): 25–36. https://doi.org/10.1162/DAED_a_00409.
———. 2016b. Why words matter: The real world consequences of defining autonomous weapons systems. Temple International & Comparative Law Journal. https://sites.temple.edu/ticlj/files/2017/02/30.1.Horowitz-TICLJ.pdf.
Horowitz, Michael C., and Paul Scharre. 2015. Meaningful human control in weapon systems: A primer. https://www.cnas.org/publications/reports/meaningful-human-control-in-weapon-systems-a-primer.
HRW. 2018. Heed the call: A moral and legal imperative to ban killer robots. Human Rights Watch.
ICRC. 2016. Autonomous weapon systems: Implications of increasing autonomy. International Committee of the Red Cross, March.



———. n.d. Article 51: Protection of the civilian population. https://ihl-databases.icrc.org/en/ihl-treaties/api-1977/article-51.
Kahn, Lauren A. 2023. Risky incrementalism: Defense AI in the United States. Hamburg.
Kayser, Daan. 2023. Why a treaty on autonomous weapons is necessary and feasible. Ethics and Information Technology 25 (2): 25. https://doi.org/10.1007/s10676-023-09685-y.
Keulartz, J., M. Schermer, M. Korthals, and T. Swierstra. 2004. Ethics in technological culture: A programmatic proposal for a pragmatist approach. Science, Technology, & Human Values 29 (1): 3–29. https://doi.org/10.1177/0162243903259188.
Lin, Patrick. 2015. The right to life and the Martens Clause. Pravo – Teorija i Praksa. https://cyberlaw.stanford.edu/sites/default/files/publication/files/ccw_testimony.pdf.
Lin, Patrick, George Bekey, and Keith Abney. 2008. Autonomous military robotics: Risk, ethics, and design. Fort Belvoir: Defense Technical Information Center. https://doi.org/10.21236/ADA534697.
Lin, Patrick, Keith Abney, and George A. Bekey. 2014. Robot ethics: The ethical and social implications of robotics. London: MIT Press. https://doi.org/10.1007/s10677-015-9638-9.
May, Larry. 2015. Contingent pacifism. Cambridge University Press. https://doi.org/10.1017/CBO9781316344422.
McFarland, Tim. 2020. Autonomous weapon systems and the law of armed conflict: Compatibility with international humanitarian law. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108584654.
Montréal Institute for Learning Algorithms. 2018. Montréal declaration for a responsible development of artificial intelligence, 1–21.
Morgan, Forrest, Benjamin Boudreaux, Andrew Lohn, Mark Ashby, Christian Curriden, Kelly Klima, and Derek Grossman. 2020. Military applications of artificial intelligence: Ethical concerns in an uncertain world. RAND Corporation. https://doi.org/10.7249/RR3139-1.
Peterson, Martin. 2019. The value alignment problem: A geometric approach. Ethics and Information Technology 21 (1): 19–28. https://doi.org/10.1007/s10676-018-9486-0.
Pražák, Jakub. 2021. Dual-use conundrum: Towards the weaponization of outer space? Acta Astronautica 187: 397–405. https://doi.org/10.1016/j.actaastro.2020.12.051.
Roff, Heather M. 2014. The strategic robot problem: Lethal autonomous weapons in war. Journal of Military Ethics 13 (3): 211–227. https://doi.org/10.1080/15027570.2014.975010.
Roff, Heather M., and Richard Moyes. 2016. Meaningful human control, artificial intelligence and autonomous weapons. Article36.org. Geneva.
Russell, Stuart. 2016. Should we fear supersmart robots? Scientific American, June.
Santoni de Sio, Filippo, and Jeroen van den Hoven. 2018. Meaningful human control over autonomous systems: A philosophical account. Frontiers in Robotics and AI 5. https://www.frontiersin.org/articles/10.3389/frobt.2018.00015.
Sari, Onur, and Sener Celik. 2021. Legal evaluation of the attacks caused by artificial intelligence-based lethal weapon systems within the context of the Rome Statute. Computer Law & Security Review 42: 105564. https://doi.org/10.1016/j.clsr.2021.105564.
Scharre, Paul. 2018. Army of none: Autonomous weapons and the future of war. 1st ed. New York: W.W. Norton & Company.
Sehrawat, Vivek. 2017. Autonomous weapon system: Law of Armed Conflict (LOAC) and other legal challenges. Computer Law & Security Review 33 (1): 38–56. https://doi.org/10.1016/j.clsr.2016.11.001.
Taddeo, Mariarosaria, and Alexander Blanchard. 2021. A comparative analysis of the definitions of autonomous weapons. SSRN Scholarly Paper ID 3941214. Rochester: Social Science Research Network. https://doi.org/10.2139/ssrn.3941214.
———. 2022. Accepting moral responsibility for the actions of autonomous weapons systems—A moral gambit. Philosophy & Technology 35 (3): 78. https://doi.org/10.1007/s13347-022-00571-x.
Wood, Nathan Gabriel. 2023. Autonomous weapon systems and responsibility gaps: A taxonomy. Ethics and Information Technology 25 (1): 16. https://doi.org/10.1007/s10676-023-09690-1.

Part III

The Need for AI Boundaries

Chapter 10

Ethical Principles and Governance for AI

Pedro Francés-Gómez

Abstract  AI is a potentially very beneficial, but risky, new technology. There is broad consensus that AI requires a formal normative framework to minimize risks affecting human rights, fairness, and autonomy, not to mention the supposed existential and catastrophic risks derived from unaligned AGI. This framework could be entirely private or entirely public, but the reasonable option is a hybrid system: private governance procedures integrated into Corporate Responsibility policies, quality management systems and risk management systems at firm level; and public agencies in charge of monitoring high-risk systems during their lifecycle. This chapter links the main risks posed by AI with the need to resort to time-tested ethical principles as the ideological grounding of legislation and public policy. The governance systems of private companies based on ethical business policies represent the self-regulatory option for AI governance. It is argued, however, that self-regulation is not enough in this case: a public governance system for AI is required. The EU model of hybrid governance is described in detail as the most advanced model in this regard, including a reference to the main provisions and potential implications of the proposed, but not yet in force, AI Act. Finally, the chapter mentions future challenges and lines of research regarding AI governance.

Keywords  AI ethics · Responsible AI · AI governance · AI risks · European AI Act · Trustworthy AI

P. Francés-Gómez (*)
Department of Philosophy I, University of Granada, Granada, Spain
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
F. Lara, J. Deckers (eds.), Ethics of Artificial Intelligence, The International Library of Ethics, Law and Technology 41, https://doi.org/10.1007/978-3-031-48135-2_10


10.1 Intro: Risk and Governance

The rapid growth of risky industries and technologies calls for particularly stringent rules. Aviation is commonly cited. Most of us travel nonchalantly by plane in part because we trust that regulations and operational protocols are thorough and there is constant supervision by public authorities. If regulation of air transportation were lax, or non-existent, we would be very reluctant to fly.

In general, technologies precede their regulation. There are two main reasons: first, risks may be hidden, unknown or deliberately concealed; second, technology develops so quickly that regulators simply cannot catch up: the rhythm of market-driven innovation generally outpaces the democratic legislative process. AI is a technology in development, therefore some of its risks are not fully known. Nevertheless, most of them have already been identified (6 2001; Bostrom 2014; Coeckelbergh 2020; Bucknall and Dori-Hacohen 2022; cfr. in particular the Future of Life Institute open letter of March 23 2023: https://futureoflife.org/open-letter/pause-giant-ai-experiments/). Still, public debate and social pressure for AI regulation are relatively recent. This may be in part because firms developing the technology tried to hush or belittle public debates about the potential risks and misuses of their products. However, AI systems are making decisions that affect us directly, and in some areas of application, the potential threat to basic human rights and safety is significant (Bostrom 2014; Floridi 2021; O'Neill 2017; Bender et al. 2021).

AI is known to be extraordinarily difficult to explain to non-experts – even to experts! (cfr. Pégny et al. 2020). Its name covers a number of different computer programming techniques that give computers, computer-based systems, and sometimes hardware and equipment capabilities that replicate – and often outperform – those of humans. Given the diversity of software solutions and technologies identified as AI in different fields of application, the expert consensus is to offer a functional definition: AI systems are machine-based systems that can, for a given set of human objectives, make predictions, recommendations or decisions influencing real or virtual environments (OECD 2019b).

AI comes in seven main patterns that it is important to recall, for they pose different potential risks:

1. Hyper-personalization: profiling and updating profiles of persons for many uses, in general related to commerce, marketing, personalized recommendations, etc.
2. Human interaction: chatbots, digital assistants and other conversational systems using text, voice or image.
3. Applications using pattern-anomaly detection: this may concern anomalous transactions, for example in banking, or anomalous patterns or changes in tissues, images, etc., for medical diagnoses. In the future it may be applied to any kind of anomalous or deviant behavior of persons or machines.
4. Recognition: machine learning is used for the system to recognize unstructured data. It is applied to facial recognition, diagnosis, security, identification, etc.
5. Goal-driven systems: this refers to techniques – known as 'cognitive approaches' – through which computer systems learn through trial and error to solve problems. This pattern has been used in games, real-time bidding, and increasingly complex strategic interactions. It can potentially be extended to any kind of purposeful activity.
6. Predictive analytics and decision support: by using past data, computer systems can predict future events, for example improving the accuracy of weather forecasting by learning from past data and millions of prediction models. AI systems may be used to predict individual or collective human behavior.
7. Autonomous systems: hardware and software systems capable of accomplishing tasks, reaching goals and in general interacting with the environment with no human intervention. The case in point is self-driving vehicles, including autonomous lethal weapons.

AI systems are potentially beneficial, but they create or increase risks for the following reasons:

1. They make unsupervised decisions that end-users may not be able to explain or control (Russell 2020; Marjanovic et al. 2021).
2. They use huge amounts of personal and non-personal data. Legal data governance would require consent and proper care in the flow of data, which is virtually impossible in the training process of very large general-purpose AI systems (Kuner et al. 2018).
3. They are designed to act autonomously, therefore without the consent of the subjects in the process of generating results (cfr. Floridi et al. 2021, for example, on autonomous weapons).
4. They have been reported to make biased decisions that reproduce and intensify prevalent stereotypes and prejudices in some application fields (O'Neill 2017; D'Ignazio and Klein 2020; Scott and Yampolskiy 2020).
5. They are applied to sensitive activities, such as border and immigration control, public security, crime prevention, and others that by their very nature affect basic rights of people, sometimes without proper redress mechanisms.
6. Since there is no previous legislation or case-law applicable, they disturb the legal system of accountability, introducing great uncertainty in liability rules and processes (Bertolini and Episcopo 2021).
7. The training of large ML systems is energy-consuming. It has been calculated that the technology industry contributes above 3% of global greenhouse gas emissions (Bucknall and Dori-Hacohen 2022).
8. Diverse pressures for quick development – for example state-to-state competition or market competition between private firms – may cause design errors that end up in what experts call 'misalignment': AI systems (in particular AGI systems) fail to address the goals or purposes for which they were created and instead set goals of their 'own', non-human goals. This is generally considered a catastrophic risk.
9. Finally, let us not forget that AI may simply be used, as any other human invention, with criminal or unethical intentions (cfr. Schneier 2018). Think of the capacity to create deepfakes intended to damage someone's reputation, or the very real possibility of manipulating public opinion or even destabilizing democratic processes with biased or false information targeted at the most vulnerable or gullible audience, etc. (Coeckelbergh 2022; Floridi et al. 2021).


Despite all the dangers, market forces, as well as governmental agencies, are pressing for the quick development and uptake of AI-based services, products, and policies. Public and private funds are being funneled into AI research, and it is difficult to name an industry that is not currently being transformed by AI. When the new technology is not fully covered by extant legislation, or problems of interpretation create legal vacuums, the best we can do is to make an honest assessment of risks and take time-tested ethical principles and values to keep the development and implementation of the technology within the limits of the ethically permissible and the socially acceptable (6 2001; Floridi 2021; Véliz 2019; Blackman 2022; cfr. also the Montreal Declaration 2018; the Asilomar AI Principles 2017; AI Now Institute; Center for Humane Technology).

This is the approach already being adopted by responsible developers, who are trying to convey that their development of AI technologies is ethical and for the social good. They need to be trusted, which is why forms of self-regulation are common in industry and professional associations and bodies (cfr. AlgorithmWatch, Inventory of AI Ethics guidelines: https://algorithmwatch.org/en/ai-ethics-guidelines-inventory-upgrade-2020/). They are a welcome first layer of governance, but self-regulation has obvious limits. Voluntary commitments are easily used for ethics-washing and may not provide the level of security that society requires. It is probably the best we can hope for at present, when enforceable rules are not yet in place, but it is definitely not what we need. Even in the USA, where voluntary subjection to ethical codes is supposed to be the preferred approach to AI governance, it is accepted that the legal rights of citizens must be formally protected if AI systems are to be trusted (see for example the White House "Blueprint for an AI Bill of Rights": https://www.whitehouse.gov/ostp/ai-bill-of-rights/). In the EU the legislative approach has been deemed preferable. However, even after a future AI Act is passed by the EU Parliament, it will not cover many AI systems, and self-enforced norms will be the only guarantee we, as users, will get.

In this context, a stage of general analysis and ethical foundation is necessary to establish the norms and mechanisms of supervision and (self-)enforcement that may guarantee a socially responsible and ethical development and deployment of AI. This process of identifying key values and principles for AI and translating them into corporate commitments, legal provisions and both public and private oversight mechanisms – including the framework of offices and bodies responsible for the enforcement of the rules – is what we call AI governance. The underlying idea is that AI is not just another information technology application, or a service that may fall under previous legislation on consumer rights, manufacturer liability, or data protection, although relevant legislation is of course applicable to AI systems for the time being. AI is an entirely new aspect of technology, and even a new form of resource with disruptive potential. Professor Andrew Ng famously compared AI with electricity: an all-purpose resource. The capabilities of AI systems are so impressive that a low-risk application such as the chatbot "Chat GPT" from OpenAI, which is in a phase of testing, might very well be the best source of a definition of the object of this chapter:


AI governance refers to the frameworks, processes, and regulations that are put in place to ensure that artificial intelligence (AI) systems are developed, deployed, and used in a responsible, ethical, and accountable manner. This includes considerations related to transparency, fairness, privacy, and security, among other issues. Effective AI governance is essential for ensuring that AI systems benefit society as a whole and do not cause harm to individuals or groups. (Excerpt of a conversation with Chat GPT on March 2nd 2023).

Eventually AI systems might do most of the work of us professors; but before that happens, it is important that the principles for AI governance be discussed. This chapter will focus on these principles: what they are, why they are necessary, and how they can possibly be translated into legislation, in case it is advisable that all or some of them be made into compulsory procedures and obligations. The chapter focuses specifically on European initiatives and legislative projects. The EU has taken the lead in creating a framework of AI governance in accordance with the basic rights and principles embraced by the Union. While governance of AI is also a strategic issue in the USA and many other countries (Radu 2021), the European approach can be considered a reasonable midway between a voluntary approach within a free market framework – where governance is steered by private firms and associations – and a ‘command and control’ approach where the State steers and monitors the process. It will be argued that this makes the European approach the best example of a democratic governance of AI. This approach is based on a broad understanding of risks, including non-obvious risks like the potential threat to democracy itself, and also on a sensible attitude towards the social benefits of this technology. It will be shown that the European approach focuses on principles that gather wide social support and is carried out through a democratic and participative process – as much as the topic and the urgency permit. To explain the main governance principles included in the EU approach, the chapter is organized as follows. The next section focuses on the specific risks of AI and how this is related to the need to assure that basic ethical principles are respected and how this has been recognized by all relevant agents of this transformation. The third section will describe and explain the principles of the EU framework for AI Governance, as established by the multi-stakeholder Group of Experts appointed by the European Commission to draft a guide for ethical AI – the well-known High-­ Level Expert Group’s Ethical Guidelines for Trustworthy AI. The fourth section will describe the main provisions of the prospective AI Act, which should accompany the General Data Protection Regulation in establishing a legal framework for AI governance. The final section will explore future challenges and research questions.

10.2 AI Risks, Responsibility and Ethical Principles

In 1979, the German philosopher Hans Jonas staged a sort of revolution in the realm of ethics with his book The Imperative of Responsibility (Das Prinzip Verantwortung) (Jonas 1985). The book's main thesis was that traditional ethical principles had evolved for a situation in which the consequences of human action


were close in time and space and easily foreseeable. Technological capacities developed in the twentieth century wholly changed this. Our power expands to distant places and distant futures. Unpredictability and potentially catastrophic scenarios are now ordinary features of human technical action. Under such conditions, Jonas argued, traditional ethical principles based on virtues, mutual respect, or natural law should all be subordinated to a higher principle, namely, the principle of responsibility, or acting so that the consequences and effects of our actions are compatible with the permanence of an authentic human life on earth. This is a general precautionary principle regarding any new technology. The imperative of responsibility stressed the importance of carefully assessing new technologies and considering all possible avenues for unforeseen negative effects.

Responsibility and precaution are the fundamental ethical notions when dealing with disruptive new technologies, but they invite more questions: Responsible for what? For whom? To what extent? What does responsibility imply in each case? To say that you must be responsible and cautious is to say that in your considerations about whether to do something, and how to do it, you must take an impartial and universal perspective and think of the possible effects of your action, not in terms of probabilistic instrumental calculations, but in terms of principled decision. The principles that should be present in a responsible deliberation are basic ethical principles: no-harm, fairness, honesty, integrity, respect, benevolence, compassion.

Ethical principles are the best tool we have to establish among us what should and should not be done, what is acceptable and what is not. There is a constellation of legal and moral concepts that have become fixed in the form of rights, values, and principles. Ethical theory has singled out a set of values that form the core of our common social morality. These values are not uncontested, and their number and definition may not be as precise as a mathematical operation, but this does not mean that moral judgments are pointless. There are hundreds of texts and agreements that establish rights and values that can be taken to be universally shared. Moral philosophers may use these and other descriptive data about human morality, along with several traditions of moral thought and professional deontology, to argue for general principles for ordering and ranking ethical values, so that words such as "fair", "ethically permissible", "ethically unacceptable", and even "good" or "evil" can be given an operational meaning to be applied to the myriad of actions and goals that new technologies allow.

In the case of AI technologies, it was obvious to their developers that some applications require a great deal of trust to be accepted. No matter how artificially intelligent you are told your car is, would you let it drive you at high speed down a highway full of other vehicles? You need to put as much trust in the AI system driving the car as you would put in a human chauffeur. No less. Some other applications may have problems that are not so obvious. Apparently harmless recommendations are based on algorithms that register and analyze literally millions of data points about your behavior. Users are being profiled in ways they are unaware of.


Knowing these problems – and the more acute worries related to the use of AI systems in strategic industries, including the military and security – developers and manufacturers aspiring to maintain a good public reputation took the lead in trying to ensure that the research, development, and use of AI systems were in accordance with ethical principles. Since corporate governance has existed for decades (cfr. Bowen 2013), the same methods and procedures were adapted to the development of AI (Schultz and Seele 2023; cfr. OECD 2019a, b).

Let us note that the development of AI is basically a private endeavor. But as the technology matures, governments wish to gain control over it. Some form of social supervision of this all-purpose technology, based for the most part on our private data, seems reasonable in its own right. But it is even more justified if you consider that governments are increasingly relying on AI systems to perform tasks essential for security, immigration and trans-border mobility, defense, and criminal and penal systems. This means that civil rights and freedoms are increasingly managed and 'protected' by AI systems. Ensuring that human supervision is competent and constant, and that mistakes, biases or evil practices can be detected, corrected, and punished, if necessary, becomes a principal goal of democratic societies: hence the need for a framework of governance for AI.

Let us first concentrate on the principles and rules chosen and applied by private companies, private professional associations, and international organizations. There are several repertoires of company principles and procedures for ethical AI (see for example AlgorithmWatch, Inventory of AI Ethics guidelines: https://algorithmwatch.org/en/ai-ethics-guidelines-inventory-upgrade-2020/). The OECD (2019b) mapped the documents produced up to late 2019. Private and multi-stakeholder codes of conduct had been produced since 2016, but 2018 marked a turning point, with many large private corporations publishing their AI governance policies and EU and national governments adopting explicit strategies for AI governance. At this point – and the situation has not changed to a great extent since – inter-governmental regulations were for the most part calls for companies to adopt voluntary codes of conduct. Only a few countries established advisory boards or committees with some regulatory power.

In the private sector, AI governance is integrated into the 'ethical business' policies and compliance procedures of most large corporations. Corporate Social Responsibility has evolved towards a relatively standard set of legal and corporate norms usually integrated under the acronym 'ESG' (environmental, social and governance) and linked to the UN sustainable development goals (cfr. https://sdgs.un.org/es/goals). This is taken as the shared agenda for responsible business, where each company chooses a particular voluntary focus while complying with basic common regulation regarding the basic rights of workers, consumers, shareholders, and other stakeholders. ESG policies are often seen as 'risk management,' and in this light AI ethics and governance is easily integrated into corporate policies (Schultz and Seele 2023); AI products clearly increase reputational risks (cfr. Ohlheiser 2016 on the chatbot Tay case, for example).


In general, companies have correctly identified the ethical risks of AI: it may produce unfair results if the outcomes of certain tasks are biased or unjustifiably unequal; it may violate the right to privacy, because of the improper use of personal data; it may not respect the user's right to know, because of the opaque workings of algorithms; it may alter the system of liability; it may impede taking back control if necessary, meaning the annulment of human autonomy; it may affect the right to safety and security and the right to product quality, and so on. In general, companies have focused on accountability, fairness, transparency, privacy, and security. A second layer of principles would be reliability, inclusiveness, explicability, revision, and stakeholder dialogue. In European companies, perhaps under the influence of N. Bostrom's early warnings and the Asilomar AI Principles, it is common to make explicit mention of human-centric AI (for example in the Telia and Telefónica statements that will be used for illustration below).

Let us use an example of a corporate framework for AI governance in the private sector. Telefónica's approach is simple enough, and relatively complete for a telecom company. The governance framework includes, firstly, a set of five principles aligned with the principles of ethical business and with the company's explicit corporate policy toward human rights:

1. Fair AI. This implies a commitment to always consider the impact of an algorithm in its application domain, beyond its performance characteristics.
2. Transparent and explainable AI. This implies the need to focus on end-users, so that they are able to understand how their data are used and, to some extent, also understand how the system reaches the conclusions it does.
3. Human-centric AI. This refers to AI being at the service of society, or the social good. It also refers to AI systems being under human control and subordinated to human values.
4. Privacy and security by design. The statement mentions that proper data management is already assured in all operations and business activities. This responds to EU and national legislation.
5. Working with partners and third parties. The principle acknowledges the need for some democratic participation in the governance of AI.

These principles are presented as the expert consensus and the specialized agreements within the telecommunications sector, but it is clear that they also represent very basic ethical values: justice, autonomy, mutual respect, social welfare – not far from Floridi et al.'s proposal of an 'Ethical Framework for a Good AI Society', which is based on the four classic principles of bio-medical ethics, namely no-harm, autonomy, justice and beneficence, plus a principle of explicability (Floridi et al. 2021).

The principles require proper procedures, implementation, and control mechanisms to be secured. At company level this is undertaken through the training of workers and the monitoring of key processes, including training of end-users when necessary. In the case of Telefónica, the system includes three types of measures – training and awareness of employees, self-assessment questionnaires, and developing or adopting technical tools – plus an operational framework within the company.
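Before describing each of these measures in turn, it may help to see what the third type (a technical tool) can amount to in practice. The sketch below is a minimal, hypothetical group-fairness check of the kind such bias-detection tools perform; the function name, data layout and threshold are assumptions made for this illustration, not any company's actual tooling.

```python
# Illustrative sketch only: flag a model whose rate of favourable outcomes
# differs markedly between a protected group and the rest of the population.

from typing import Sequence


def demographic_parity_gap(outcomes: Sequence[int], protected: Sequence[int]) -> float:
    """Difference in favourable-outcome rates between the protected group and the rest.

    `outcomes` holds binary decisions (1 = favourable); `protected` marks
    membership of the protected group (1 = member).
    """
    in_group = [o for o, p in zip(outcomes, protected) if p == 1]
    out_group = [o for o, p in zip(outcomes, protected) if p == 0]
    return sum(in_group) / len(in_group) - sum(out_group) / len(out_group)


# Hypothetical audit of a loan-approval model on a held-out sample.
decisions = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
group = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
gap = demographic_parity_gap(decisions, group)
if abs(gap) > 0.1:  # arbitrary illustrative threshold
    print(f"Potential disparate impact detected: gap = {gap:.2f}")
else:
    print(f"No large gap detected: gap = {gap:.2f}")
```

A real tool would, of course, work on much larger samples, handle several protected attributes and fairness metrics at once, and feed its results into the self-assessment questionnaires described next.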


Training is important to raise awareness about the binding norms that must be respected (GDPR, for example) and to align AI ethics with the general ethical business approach or ESG policy of the company. The second measure is a set of on-line self-assessment questionnaires for designers and internal users of AI systems. Questionnaires come with recommendations, so that apart from a checklist they are conceived as a tool to remind employees what can be done to improve the privacy, fairness, security and explicability of the product. This element is prominent in the HiLEG Ethics Guidelines for Trustworthy AI that will be presented in the next section. The third measure refers to tools – either internally developed, or outsourced – that help achieve the objective of complying with the principles. Tools for anonymizing data are the obvious example, but there are also tools for detecting algorithmic discrimination against protected groups and enhancing explicability.

The application of these procedures is at product-manager level; however, the system includes a protocol in case of doubt. In such a situation a designated expert in each area will study the case and may make recommendations. Critical cases, however, pass to a third level of control, which is the Responsible Business Office. This internal AI governance system does not mention individual accountability, as would be advisable. Telia, for example, does explicitly refer to accountability in these terms: 'Our solutions come with a clear definition of who is responsible for which AI solution' (Telia Guiding Principles on Trusted AI Ethics). At any rate, the goal of the system is to protect the company from reputational risks.

Voluntary governance schemes such as the aforementioned ones are welcome but may fall short of what society might expect. First, voluntary commitments imply less guarantee of compliance: informal coercion – by peers, consumers, etc. – may work, but critical social norms are supported by stronger coercive mechanisms, and when the stakes are high, complex societies invariably formalize these by way of legal provisions, enforcement agents and deterrent penalties.

Enforcement is not the only problem for voluntary governance schemes. Companies and industries may use different definitions of AI. They may have a governance system for AI that they do not apply to products that others would classify as AI. Uncertainty would reign. A similar problem of vagueness relates to risks. Suppose there is a tool for checking whether an algorithm discriminates against vulnerable groups. The decision to use the tool would still be based on the self-assessment made by the developer or product manager, and they may be under pressure to finish the product, so if they are not obliged to do the check, would they do it? The general suspicion falls, unfortunately, on private companies: since they are driven by market competition and the profit motive, ethical considerations, risk assessment and compliance with governance provisions will always be seen as a cost.

The voluntary solution to this problem is a second layer of voluntary commitments. For example, companies may voluntarily accept third-party audits of their governance systems to 'bind themselves' to their own principles. Associations and standardization organizations may create auditable norms and certifications that further increase public trust. Each layer of commitments may be welcome, particularly


if coming from a collective decision of several key actors in the industry, but they are ultimately subject to the same suspicions stated above. Most large companies in the USA have in fact argued that governmental regulation of AI is necessary. Microsoft, for example, is taking the lead in developing tools for the responsible development and training of AI systems, but it needs governments to take a stance so that the vision of an ethical AI becomes an actual requirement for all players. Only governments have the enforcement power to turn AI governance principles into a real protection for all.

In sum, ethical principles are necessary, and the voluntary assumption of responsibilities shows a welcome awareness of the implications of AI technologies; but a public system of governance seems to be necessary regardless (cfr. Floridi 2021; Coeckelbergh 2020, 2022). This conclusion is present in all EU and CEPS reports dealing with AI, and consequently the last four years have witnessed a series of efforts at EU level to draft an AI Act as the backbone of a comprehensive and common system of governance. The first effort was directed towards clarifying what system of governance is preferable, and the second towards establishing the basic ethical principles and requirements that should be incorporated therein.

It is important to note that in the process of creating a governance system for a revolutionary technology there are competing goals. On the one hand, the rules must assure that individual rights are protected and basic ethical principles respected; on the other hand, the same rules must promote research and development of the new technology, insofar as it may be a valuable tool for improving our response to societal challenges. These two basic goals create tensions within the legislative bodies and between experts, and the result is a compromise. The EU proposal for AI governance is, therefore, a compromise. The next two sections present it to enable assessment on the part of the reader.

10.3 Ethical Guidelines and the European Option for AI Governance

Provided that the research and development of AI needs to be subject to ethical and responsibility rules in the public interest, there are several options for how public authorities may go about it (Steurer 2013). Table 10.1, adapted from an EU publication preceding the drafts of the AI Act, lists the main options considered by the EU Commission. The table shows five governance possibilities and what each means in terms of: the kind of legislative act from the EU Commission and Parliament that would be involved; how AI would be defined, and by whom; which AI products would fall under the regulation; which party (providers, users, etc.) would be mainly obliged by it; what actions would be required, both ex ante (that is, before the products are placed on the market – for example undergoing risk assessment or obtaining special certifications) and ex post (that is, what kind of monitoring would be implemented once the AI systems are in use); and which authority would be responsible for supervision. Finally, the table refers to the governance structure of each option: whether the supervisory system is mainly national or European, and what type of coordination between regulatory authorities is foreseen.


Table 10.1  Summary of the policy options analyzed by the EU

Option 1 – Voluntary labeling scheme
  Nature of act: An EU act establishing a voluntary labeling scheme.
  Scope/definition of AI: One definition of AI, however applicable only on a voluntary basis.
  Requirements: Applicable only to voluntarily labeled AI systems.
  Obligations: Only for providers who adopt the voluntary scheme; no obligations for users.
  Ex-ante enforcement: Self-assessment and an ex-ante check by national competent authorities.
  Ex-post enforcement: Monitoring by the authorities responsible for the EU voluntary label.
  Governance: National competent authorities responsible for the EU label, plus a light platform of cooperation between competent authorities.

Option 2 – Ad hoc sectoral approach
  Nature of act: Ad hoc sectoral acts (revised or new).
  Scope/definition of AI: Each sector can adopt its own definition of AI and determine the riskiness of systems.
  Requirements: Applicable only to sector-specific AI systems; possible additional safeguards for specific uses.
  Obligations: Sector-specific obligations for providers and users, depending on the use case.
  Ex-ante enforcement: Depends on the enforcement system under the relevant sectoral acts.
  Ex-post enforcement: Monitoring by competent authorities under the relevant sectoral acts.
  Governance: Depends on the sectoral acts at national and EU level; no cooperation mechanism.

Option 3 – Horizontal risk-based act on AI
  Nature of act: A single binding horizontal act on some AI (risk-based).
  Scope/definition of AI: One horizontally applicable AI definition and determination of high risk (risk-based).
  Requirements: Risk-based requirements for prohibited and high-risk AI, plus information requirements for others.
  Obligations: Horizontal obligations for providers and users of high-risk AI systems.
  Ex-ante enforcement: Conformity assessment for providers of high-risk systems (3rd party), plus registration in an EU database.
  Ex-post enforcement: Monitoring of high-risk systems by market surveillance authorities.
  Governance: At national level, but with reinforced cooperation between States and with the EU level (AI Board).

Option 3+ – Option 3 plus codes of conduct
  Nature of act: Option 3 + codes of conduct.
  Scope/definition of AI: Option 3 + industry-led codes of conduct for non-high-risk AI.
  Requirements: Option 3 + industry-led codes of conduct for non-high-risk AI.
  Obligations: Option 3 + commitment to comply with codes for non-high-risk AI.
  Ex-ante enforcement: Option 3 + self-assessment of compliance with codes of conduct for non-high-risk AI.
  Ex-post enforcement: Option 3 + unfair commercial practice in case of non-compliance with codes.
  Governance: Option 3 + codes of conduct without EU approval.

Option 4 – Horizontal act for all AI
  Nature of act: A single binding horizontal act on AI.
  Scope/definition of AI: One horizontal AI definition, but no methodology or gradation (all risks covered).
  Requirements: Applicable to all AI systems irrespective of the level of risk.
  Obligations: Same as Option 3, but applicable to all AI (irrespective of risk).
  Ex-ante enforcement: Same as Option 3, but applicable to all AI (irrespective of risk).
  Ex-post enforcement: Same as Option 3, but applicable to all AI (irrespective of risk).
  Governance: Same as Option 3, but applicable to all AI (irrespective of risk).

Adapted from EU Policy and legislation | Publication 21 April 2021, Impact Assessment of the Regulation on Artificial Intelligence

The five possibilities run from a voluntary labeling scheme to a horizontal act establishing compulsory requirements for all AI. According to Option 1, AI governance would be left almost entirely to self-regulation. Public authorities would oversee only those companies or industries that voluntarily adopt the labeling scheme, and the overseeing authority would be national compliance agencies, typically with little coercive power. This kind of governance may be the framework of choice if the public goal is the rapid development of AI technologies, leaving researchers and developers to decide whether they need to take any measures to show they have a formal system of ethical compliance.

Legislatures and governments maintain that AI governance requires more than a voluntary scheme, which is something leading private actors in the industry agree on. If you combine these positions and add the increasing public concern about


security and potential risks of AI systems, the conclusion is that the purely voluntary labeling scheme of Option 1 was a no-go possibility for the EU.

At the other extreme lies a horizontal act for all AI, which would mean applying a compulsory common (European) regulation, analogous to the GDPR, to all products defined as being or containing AI. There are several problems with this option. First, there is a problem of definition. Under certain definitions, too many technical solutions – not only systems based on machine learning – may be considered AI. A general regulation for all possible cases of AI would need to establish an arbitrary definition with many borderline cases. The alternative would be an attempt at covering all possible cases and applications described as AI, but this would be too broad to work. Second, compulsory requirements would be particularly detrimental to small and medium enterprises and start-ups, with a horizontal regulation being too burdensome for those actors most likely to innovate and offer potential benefits from new applications of AI. Third, there are intrinsic difficulties in regulating and monitoring AI. Due to the opacity of many AI systems, public authorities lack the technical capacity to effectively monitor them. This is a powerful reason for keeping compulsory regulation as circumscribed as possible, so that monitoring efforts can be concentrated on critical cases.

In conclusion, only Options 2, 3 and 3+ are deemed real possibilities. Option 3+ is simply Option 3 with the addition of codes of conduct. As seen above, codes are already being proposed by many associations and authorities, and they are used by most large companies; therefore, Option 3 can easily be improved at 'no extra cost' by opting for 3+. So, it seems the realistic options are 3+ and 2. The EU Commission opted for 3+. This is probably coherent with the general goal of keeping the single market as unified as possible. However, AI regulation under this option runs in parallel with many other sectoral norms that, insofar as AI systems are being adopted in different industries, need to be interpreted and adapted, and Option 2 cannot therefore be discarded as the provisional governance approach.

As an example, take regulation protecting equal access to credit in the USA. The Equal Credit Opportunity Act and its implementation rules establish the normative framework. Access to credit has been affected by AI algorithms that go beyond analyzing the creditworthiness of clients; they decide on their applications. Due to consultations and complaints, the regulation was recently interpreted by a Consumer Financial Protection Bureau circular of 2022. The decision stated that the relevant norms "apply equally to all credit decisions, regardless of the technology used to make them" and consequently "do not permit creditors to use complex algorithms when doing so means they cannot provide the specific and accurate reasons for adverse actions [such as declined applications]" (CFPB 2022). This sectoral ruling either requires algorithms to be fully explicable or bans them. Analogous norms will surely mushroom here and there every time 'complex algorithms' threaten fundamental rights.

Be that as it may, the EU Commission adopted Option 3+ with the deliberate aim of achieving a model of so-called hybrid governance of AI, and with the explicit


goal of protecting EU and human values while creating a safe environment for technological innovation. The hybrid scheme implies that there are legal obligations supported by sanctions for some types of AI products – those considered 'risky' – and a market-driven voluntary subjection to assessment and transparency rules for the rest. The first building blocks for this governance scheme are (i) a methodology for classifying AI systems and research lines as more or less risky and (ii) a set of basic ethical principles that AI systems should observe in any case. Once this is clear, a future AI Act should be specific about which uses of AI need to be bound by a compulsory legal norm, which uses should be straightforwardly prohibited, and which can keep the freedom to voluntarily abide by all or part of the requirements of the norm, using the tools and mechanisms established by the law as a way to assume an extra commitment towards ethical AI.

In June 2018 the European Commission set up a multi-stakeholder 'High Level Expert Group' with the task of drafting an ethical guide for lawful and trustworthy AI. The expert group included representatives of all the interests involved, including high-tech companies and legal experts, but also academics and independent experts from several fields and geographical regions, representing the general interest. The expert group released an influential document entitled Ethics Guidelines for Trustworthy AI, which establishes key principles for the development of AI in Europe and can be considered the ideological foundation of the much less digestible legislation that will be presented in the next section. The European framework for a common regulation of AI is established by the aforementioned guidelines, and endorsed by the Commission in the Communication COM (2019) 168, Building Trust in Human-Centric Artificial Intelligence. Even if the recommendations and assessment procedures included in the guidelines are not enforceable, they have been welcomed by different stakeholders, in particular industry, developers, and suppliers. This may indicate that the political and philosophical bases of the recommendations are widely accepted. The guidelines offer a standard and relatively simple tool for developers that they can use discretionarily, thus creating a level ethical playing field for the development of AI in Europe while avoiding excessive interference and formal sanctions.

The Ethics Guidelines for Trustworthy AI were first issued at the end of 2018, and, after feedback from stakeholders, a revised version was released in March 2019. The remainder of this section offers a brief description of the content of the guidelines, and some critical considerations. It should be pointed out that the mandate of the High Level Expert Group was to draft ethical guidelines on how to assure that AI is trustworthy, human-centered and aligned with core EU values, which were at the center of the EU Strategy on AI designed by mid-2018. According to this strategy, trust is a pre-requisite for ensuring a human-centric approach to AI, and such an approach is, in turn, what core EU values require. Any proposed AI regulation should be in line with previous


directives regarding personal data (Regulation 2016/679, GDPR), non-personal data (Regulation 2018/1807, FFD) and cybersecurity (Regulation 2019/881, Cybersecurity Act). The ultimate foundation of AI governance in the EU is fundamental individual rights.

The guide first states that trustworthy AI must be lawful, ethical and robust – meaning technically reliable and well adapted to the social environment where it will be used. It then alludes to the four ethical principles that must be respected during the whole life cycle of AI: respect for human autonomy, prevention of harm, fairness and explicability. These four key ethical principles resonate with standard medical ethics, as suggested above – no-harm, autonomy, fairness, beneficence. In this case, explicability, rather than beneficence, occupies a prominent place, for it is the antidote to the opacity of AI systems. The number of parameters and layers of learning turn most AI systems into true black boxes. Explicability has become, therefore, a very specific ethical principle for AI (Floridi et al. 2021), and it is still an open question whether it can be fully realized; nevertheless, efforts towards this end are exemplary of a responsible approach to AI.

As for the other three principles, they are well ingrained in our commonsense morality: taking all possible care to avoid or minimize harm seems to be a very basic moral demand; preserving human autonomy is essential in systems that are often designed to replace human decisions or to nudge humans in certain ways. Let us recall that some systems not only replace human decision, but human action altogether (as in the case of some robots, autonomous vehicles and autonomous weapon systems); but even in these cases human autonomy should be preserved in the sense that the systems must respond to human direction and remain under human supervision. The possibility of human control being recovered at any given time must be assured. Finally, the fairness principle should guarantee that AI systems do not interfere with the moral and legal equal rights of persons; they must be unbiased, and they must treat vulnerable groups and individuals with due care.

The ethical principles are realized in turn through seven key requirements:

• Human agency and oversight
• Technical robustness and safety
• Privacy and data governance
• Transparency
• Diversity, non-discrimination, and fairness
• Societal and environmental well-being
• Accountability

If the four ethical principles resemble the compass of medical ethics, these seven requirements set the tone for the corporate AI ethics developed so far, and picture the development and deployment of AI that may be socially desirable, namely: first, a technology that is applied to improve social well-being and environmental


protection; and second, a technology that is reliable, robust and safe in practice, protected against hacking and misuse, producing precise and consistent results. Immediately after this, AI must respect human autonomy, meaning private data and eventual decisions based on them must be treated according to extant rules – and note that these rules prohibit 'automated' decisions affecting basic rights of individuals. This is the requirement of data governance.

The requirements of human agency and transparency are also related to autonomy. Transparency operationalizes the principle of due respect for the dignity of end-users and the public. Transparency and explicability must of course be adapted to the technical capacity of the citizens. Explainable AI does not simply mean a product whose owners or designers share lines of code that only experts can interpret – something that might even be illegal, as it could break intellectual and industrial property laws. The idea is that the average citizen can understand the workings of the system and the data on which the decision or result depends.

The requirement of human oversight also expresses respect for the goal of human-centered AI. Systems must ensure by design that a human can intervene at several points in the cycle, a condition that was relatively trivial in the case of most previous technologies. The case of AI may be entirely different (Russell 2020; Bostrom and Yudkowsky 2014). Nowadays it is safe to say that AI systems are produced and used under our (human) control, but given the rapid development of the technology this may change dramatically in a matter of days. Some AI systems are truly autonomous in the sense that, once set in motion, they decide how to attain their goals, and since these may be vague, they may take unforeseen courses of action with no further human intervention. In this context, it seems that preserving human agency must be a deliberate decision from the moment of conception, and a prudent attitude demands that developers commit to respecting this principle and to detecting violations – that is, attempts to create AI systems that may reject or effectively avoid human control.

The requirement of non-discrimination, diversity and fairness is supported by the principles of justice, mutual respect, equality, and equal individual rights. In the context of AI technology, it implies verifying that the systems do not reproduce or amplify biases and prejudices that may be incorporated in data, which requires a long period of testing and assessment, and may require specific instruction and possible monitoring of users when they are supplied with general AI systems to be trained with their own data. Finally, the requirement of accountability should help authorities to locate responsibility and eventually impose liabilities and penalties. Operationally, this will require a well-documented process from conception to use.

One important feature of the guidelines is that they acknowledge that innovative research on AI may be slowed down or entirely blocked if researchers need to verify that each step is respectful of the aforementioned requirements. In order to deal with this, the experts include a call for the establishment of normative sandboxes, under the supervision of national authorities, where new ideas can be securely developed until they reach maturity, undergoing the proposed assessment process only in the event that they are successful and intended for the market.
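As a rough illustration of how the human-oversight requirement discussed above can be built in 'by design', the following sketch wraps an automated decision so that a human reviewer takes over whenever the system's confidence is low or an operator has paused automation. All names, thresholds and stub functions are hypothetical; the guidelines do not prescribe any particular implementation.

```python
# Illustrative sketch of a human-in-the-loop intervention point (not a normative design).

from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    label: str
    confidence: float
    decided_by: str  # "model" or "human"


def decide_with_oversight(
    model_predict: Callable[[dict], Decision],
    ask_human: Callable[[dict, Decision], str],
    case: dict,
    min_confidence: float = 0.9,      # illustrative threshold
    automation_paused: bool = False,  # an operator-controlled "stop switch"
) -> Decision:
    """Route the case to a human reviewer if confidence is low or automation is paused."""
    proposal = model_predict(case)
    if automation_paused or proposal.confidence < min_confidence:
        label = ask_human(case, proposal)
        return Decision(label=label, confidence=1.0, decided_by="human")
    return proposal


# Stub functions standing in for a real model and a real review queue.
def stub_model(case: dict) -> Decision:
    return Decision(label="reject", confidence=0.72, decided_by="model")


def stub_reviewer(case: dict, proposal: Decision) -> str:
    print(f"Reviewing case {case['id']}: model proposed '{proposal.label}'")
    return "approve"


final = decide_with_oversight(stub_model, stub_reviewer, {"id": 42})
print(final.decided_by)  # -> "human", because confidence fell below the threshold
```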


[Figure not reproduced here: an image representing the structure of the guidelines. Source: EU Communication (COM/2019/168) (HiLEG Report)]

It is important to note that chapter three of this document is devoted to operationalization, including a very detailed assessment list. If well used, this would be an important contribution towards attaining uniform standards of ethical AI across Europe. The document is taken as inspiration for corporate codes and procedural governance initiatives.

The guidelines have received more support than criticism so far. One possible objection is that the process was not really democratic, relying on a group of 'experts' and corporate representatives appointed by the EU Commission. But the HiLEG was probably as democratic as circumstances allowed. First, it was broad in its composition, including all key stakeholders; second, the document went through a phase of public consultation; and third, extant legislation and even established opinions from advisory boards had to be considered, so the document is aligned with key values and legislation, which are themselves the result of democratic processes. It is always possible to deepen democracy, but at the moment, the course followed in the EU seems appropriate enough from a democratic perspective.


The ideological foundation of the document could be questioned. Note that the intent of the EU Commission can be construed as instrumental: Europe needs a common ethical norm for AI in order to preserve a single market and to protect European industry. The protectionist logic would be that if our products and research practices are ethically superior, other products will be made unlawful in the EU market. This opens the door to prohibitions of foreign AI systems under the allegation that they have violated EU standards at some point in their lifecycle. It must be said that there is indeed a certain instrumental tone in the general EU policy regarding AI. There is the ambition to preserve AI sovereignty, and given that the technological edge has been lost, this can be done through the 'ethical edge'. But is this objectionable? First, the EU is a very large market, and using the political sovereignty of the Union to impose ethical limits on AI may help change some habits everywhere. Something analogous has occurred with data protection and many other sectoral regulations. Utilitarian ethics and other teleological models of ethics would subscribe to that policy. Other schools of thought would disagree, pointing out that an ethical stance should lead to a respectful and circumspect attitude towards others, even if they choose to act wrongly. But a middle ground is always possible: using the best means to a good end is ethical provided that well-defined individual rights are not violated in the process, and these guidelines make every effort to secure this twofold condition.

Now, the stress on human-centered AI is to some extent in contradiction with the requirement that AI be for the well-being of the environment – included in requirement six, but little developed. The excessive weight given to the idea that AI is human-centered may not be sufficiently justified by simply appealing to the policy decision of the European Commission (see Coeckelbergh 2020, 2022). Advocates of environmental ethics may claim that if humans are part of a larger community of living beings, the privilege this governance framework bestows on them is arbitrary. If we accept that humans, and only humans, form a particularly valuable form of community, the sixth key requirement should clearly place the environment as a secondary interest. This is not to say that AI systems should be permitted to work without human oversight. The EU Commission is clear that human control is essential for trustworthy AI: "All other things being equal, the less oversight a human can exercise over an AI system, the more extensive testing and stricter governance is required" (Communication 2019/168). But human oversight is fully compatible with environmentally oriented AI ethics. This has not been the choice of the EU.

Several legal and ethical traditions converge in the guidelines for trustworthy AI, making it a document of consensus, despite it only being a report with no statutory value. The EU is committed to passing a law that imposes a common framework for AI governance, meaning the full implementation of Option 3+ and, in particular, the prohibition of some forms and uses of AI in the EU, and the subjection of uses considered high-risk to an enforceable common system of guarantees and conditions.


10.4 The Artificial Intelligence Regulation in Europe

The proposed AI Act implements the demands of the guidelines described in the previous section and sets up a system of public verification and control of high-risk AI. The main object of the Act is high-risk AI systems. The requirements of the Act are thought of as a form of control of beneficial but highly risky AI applications.

We may imagine the classification proposed as a pyramid of risk. At its base, there would be applications with no or minimal risk – such as certain industrial applications, videogames or purchase recommendations. For these cases the Act simply recommends the adoption of codes of conduct (art. 69). At the second level there are systems that pose limited risk, like chatbots and deepfakes. For these systems, besides the recommendation to adopt codes of conduct, there is a requirement of transparency: users must be aware that they are interacting with an AI system (art. 52). The third level is the high-risk systems that will be described below. And there is a fourth level that is considered unacceptable in civilian use.1 Facial recognition and remote biometrics, social scoring, and subliminal manipulation are banned (art. 5).

The critical third level of risk receives all the attention of the norm. The Act defines three categories of risky AI: systems increasing health risk, including diagnosis, triage and the dispatch of emergency first-response services; systems that may compromise fundamental rights, such as approval or selection procedures in education and other areas of social services, including finance, employment, etc., as well as law enforcement; and systems that involve a safety risk: autonomous vehicles and other autonomous machinery interacting with the environment, as well as safety components of critical infrastructure (essential utilities, energy, water supply, etc.). Apart from these categories, the Act acknowledges that some all-purpose AI systems can be high-risk if they are used to support high-risk functions or form part of high-risk appliances. On the other hand, if a low-risk product could be used for purposes that would make it high-risk, but the provider makes clear that the ordinary and legal use is restricted to the low-risk functions, it may avoid the regulation (Engler and Renda 2022).

Once an AI application or system is categorized as high-risk, detailed obligations are imposed on developers, providers, and distributors, and even on end-users. Only products that pass a stringent assessment of conformity will be authorized. National authorities reporting to the European Commission, assisted by a European Advisory Board on AI, will assure compliance. Because the requirements are deemed too burdensome, the Act makes sure that 'regulatory sandboxes' will be available for entrepreneurs and scientists to research and develop new tools. The Act is intended to protect the public; stifling the development of AI is avoided at all costs.

1. The AI Act will not regulate the military use of AI. Public international law would be the appropriate legal framework for the regulation of AI systems in the context of the use of lethal force and other military activities.
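To see the tiered logic of the Act at a glance, the following deliberately simplified sketch maps the four risk levels to the consequences just described. It is an illustration only: the tier names, examples and obligation summaries paraphrase this chapter's text and do not constitute a legal classification tool.

```python
# Drastically simplified illustration of the risk pyramid described above.
# Examples and obligation summaries paraphrase the chapter; not a legal test.

from enum import Enum


class RiskTier(Enum):
    MINIMAL = "minimal"            # e.g. videogames, purchase recommendations
    LIMITED = "limited"            # e.g. chatbots, deepfakes
    HIGH = "high"                  # e.g. triage systems, critical infrastructure
    UNACCEPTABLE = "unacceptable"  # e.g. social scoring, subliminal manipulation


OBLIGATIONS = {
    RiskTier.MINIMAL: "Adoption of codes of conduct recommended (cf. art. 69).",
    RiskTier.LIMITED: "Codes of conduct plus transparency: users must know they face an AI system (cf. art. 52).",
    RiskTier.HIGH: "Conformity assessment, risk management, logging, human oversight, registration.",
    RiskTier.UNACCEPTABLE: "Banned in civilian use (cf. art. 5).",
}


def obligations_for(tier: RiskTier) -> str:
    """Return the summary of obligations attached to a given risk tier."""
    return OBLIGATIONS[tier]


print(obligations_for(RiskTier.HIGH))
```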


Below is a non-exhaustive list of the obligations for developers and designers cited in articles 9 to 15:

1. A fully documented risk management system must be implemented, covering the entire life cycle of the product and including all necessary mitigation measures.
2. Data governance. Training data must be the minimum possible, and their use must be in accordance with extant regulations, implying duties to supervise the quality, integrity, and pertinence of data.
3. Technical documentation of systems must allow public authorities to check whether the system complies with the law (the AI Act and all applicable legislation) before the system is commercialized.
4. The systems must technically permit the logging and registering of all events – every single session – of the system along its entire life.
5. The design of the systems must ensure transparency by informing of all circumstances that can affect the function of the system.
6. The systems must allow human oversight. This can be done through measures integrated in the system before commercialization, or through measures defined by the provider to be put in practice by the user. The provider is obliged to make sure that users are trained and that they are aware of the tendency to put too much confidence in the AI system’s outcomes.
7. The design must make these systems precise, robust, and as resistant as possible to hacking.

These requirements refer to features of AI systems and the companies that design them. But the chain of checks and controls involves providers and users: they must “ensure that their high-risk AI systems are compliant with the requirements set out in Chapter 2 of this [third] Title” (article 16 (a)). Providers are, in fact, the keystone of the control mechanism. Article 21 states, for example, that “Providers of high-risk AI systems which consider or have reason to consider that a high-risk AI system which they have placed on the market or put into service is not in conformity with this Regulation shall immediately investigate, where applicable, the causes in collaboration with the reporting user and take the necessary corrective actions to bring that system into conformity, to withdraw it or to recall it, as appropriate.” The system includes further obligations for importers, distributors, and users. The underlying idea is that public authorities receive updated information and access to technical details, including all the activity of the system and the detailed distribution of responsibility, so that all the parties are accountable and risk incidents can be timely and effectively discovered and fixed.

An important element in the law is the so-called ‘governance’ and enforcement system. A complex scheme of conformity assessment, public notification, certifications, and markings of conformity is established. This will work at the national level, but there will be a European AI board to which national authorities will report. Article 39 deserves mention. It is about “Conformity assessment bodies of third countries”. The legislators state that the EU will accept conformity assessments from bodies outside the EU under certain agreements and provided they work on the same conditions as European assessment bodies – thus facilitating compliance of providers located outside the borders of the EU.


The enforcement is buttressed by the possibility of imposing fines on companies that fail to comply with the relevant dispositions.

These few paragraphs give only a rough idea of the possible effects of the EU regulation and governance scheme for AI. It will certainly create an entirely new niche in the market for consultants and experts in several fields: risk management, quality management, data governance, algorithm explicability, etc., and new governmental agencies and subsidiary authorized semi-public bodies. Perhaps this is as it should be, given that AI is itself a new industry that so far lacks a public framework of legal guidelines and supervision. However, it is reasonable to ask whether a simpler system could not have been designed. Precisely because AI is a new industry, it could have been an occasion for the EU authorities to try a less bureaucratic approach. Perhaps a simpler form of regulation may be as effective (cf. Engler and Renda 2022; Bietti 2021). Actually, the requirements imposed by the law will be so complex that private firms will be the only agents with the expertise and capacity to provide authorities with the tools they require to exercise the control they want to have. Paradoxically, only with the help of privately developed AI tools will it be possible to assure permanent compliance with the requirements of the AI Act.

10.5 AI Governance: Open Questions, Future Paths

There is no doubt that AI needs a formal normative framework to minimize risks affecting human rights, fairness, and autonomy, not to mention the supposed existential and catastrophic risks derived from unaligned AGI. This framework could be entirely private or entirely public, but the reasonable option is a hybrid system: private governance procedures integrated in Corporate Responsibility policies, quality management systems and risk management systems at firm level; and public agencies in charge of monitoring high-risk systems during their lifecycle. The governance of AI must involve ex ante assessment and authorizations and ex post monitoring, redress mechanisms and eventually sanctions. The weight given to ex ante and ex post mechanisms is important.

This chapter has described the options chosen by the EU. The Ethical Guidelines for Trustworthy AI plus the EU communication COM (2019)168 Building Trust in Human-Centric Artificial Intelligence establish the rationale for a public governance system of AI. These documents focus on ex ante voluntary procedures, emphasizing the ethical principles AI must abide by. In contrast, the proposed AI Act – still in development, although an advanced draft issued in December 2022 is being discussed in the EU Parliament at the time of finishing this chapter – will erect a comprehensive governance system that will go from principles to enforceable obligations and constraints. The law draws upon previous legislation in the EU, and upon corporate procedures like risk assessment and quality management. Perhaps it relies too much on the documental and procedural nature of these systems.


AI is developing exponentially, and it is doubtful that it can be harnessed by the relatively traditional means proposed by the AI Act. If the EU regulation is successful, it may have a global impact, steering AI governance in a standardized way. But it may encounter resistance or technical and conceptual difficulties. Here are some of the challenges and open questions discussed in the literature.

The problem of definition will be permanent. It is true that the OECD and the EU AI Act have adopted a functional definition that seems to cut through conceptual and technical issues. However, Bertolini and Episcopo (2021) are right when they say “AI is used to indicate ground-breaking technologies, regardless of the specific mechanisms applied, so that what is considered “AI” today might not be classified as such tomorrow”. This adds to the fact that AI comes in a vast variety of forms and products. Some are designed to be autonomous, but others, like robotic prosthetics, are designed to be as little autonomous as possible. Providing a single definition of AI will remain a challenge, opening the door for interpretations about the scope of any legislation.

The precise structure of accountability proposed by the AI Act is questionable. It places responsibilities on developers, manufacturers, importers, distributors, and users. The diversity of products and the number of agents involved will challenge application. Perhaps trying to devise a single liability scheme for systems so diverse is not the right approach: “just because a doctor, a financial intermediary, an autonomous driving (car) or flying (drone) system, a platform or consumer might use AI-based applications, this does not imply that they should be subject to an identical liability regime. Indeed, a unitary regulation of advanced technologies, with respect to sole liability, is unjustifiable from a technological and legal perspective, and is a potential source for market failures and distortions” (Bertolini and Episcopo 2021: 648). The AI Act involves modifications of several previous EU acts. Furthermore, it insists that AI regulations must be interpreted in accordance with liability regulations and consumer rights regulations. But overlaps and contradictions are to be expected.

A problem of vagueness might also arise as regards the damages that AI may cause. We seem to be readier to understand material risks (flooding, fire, crashes) than to understand immaterial risks (even if they involve violations of basic rights). On most occasions, the risks posed by AI systems will not be life threatening; in fact, they will seem mild by comparison, which may be an obstacle to taking preventive measures. If AI manufacturers are influenced by public perception of risks, they may assess most of their products as low-risk and avoid regulation altogether, even if they objectively fall within the categories of high risk. In parallel, it is always somewhat arbitrary to classify AI products as low risk. Even an innocent application can be hacked and become a threat.

Engler and Renda (2022) point to the business model of software with AI code, in which a provider develops software with code to train an AI model (but no data) and sells the software to a user who adds their own training data. This model is particularly challenging for the AI Act, because the user is unable to change, or even potentially precisely know, the code or model object of the AI system, and the provider may never know the data; it may be functionally impossible for either entity to meet the requirements of the AI Act.


An open question remains about regulation itself. Radu (2021) found that all the countries analyzed had prioritized an ethical orientation, but as of 2021 no legislative initiative had been reported. It may be that, given the extraordinary complexity of the issue, there is a long way to go before statutory law can both satisfy the public interest and be acceptable to the many stakeholders: industry, consumers, scientists/developers, etc. This may be connected to a view underlying the debate on AI: the idea that AI is a strategic tool. Radu (2021: 178) reproduces this statement of Vladimir Putin, referring to AI: ‘Whoever becomes the leader in this sphere will become the ruler of the world’. Governance options may represent different strategies in the race to become the world leader in AI. Are ethical constraints a way for the EU to protect its own AI industry or an attempt to stifle the rapid growth of AI in other economic areas? Mixing ethical principles with the goal of gaining strategic technological power would be very paradoxical.

AI governance is an unresolved issue. While ethically speaking we have a somewhat clear set of requirements that AI systems should respect, there are many open questions about how to succeed in securing an ethical development of AI and a predictable system of accountability. A well-developed governance scheme is, however, key to preventing the worst scenarios that the technology enables. This chapter has focused on the EU model of governance, as it is today, as an example. At least three issues have not been discussed, and they will need attention in the future.

The global nature of AI will require global AI governance. On the one hand, risks and ethical issues are the same everywhere. On the other hand, AI systems do not stop at borders. Recommendation systems incorporated into on-line services are global, and companies manufacturing AI solutions are also global. It has been pointed out how EU regulation will probably have a global impact, but this could be considered paternalistic, or neo-colonial. A wider dialogue is most likely a better way of envisaging a global governance system for AI.

The second issue is related to the environment and anthropogenic global warming. AI is advocated by some as a factor contributing towards the fight against climate change and securing environmental sustainability. It is possible that AI reveals itself as our best strategy against climate change. AI tools may outperform human ingenuity in finding ways to mitigate and adapt, being necessary for making predictions and modelling future threats. It may be that AI systems also become essential as policy designers. Notwithstanding the above, the resources that ever larger servers and data centers consume are an environmental concern. The question about the limits imposed by energy resources on AI technologies is also an open question. In the event that we need to prioritize AI services, which ones will be sacrificed?


Thirdly, this chapter has throughout interpreted AI governance as the process of harnessing the power of AI, restraining AI to the limits of the ethically acceptable and keeping its agents accountable. However, there is a growing literature focusing on the use of AI in governance, that is, the use of AI tools to improve – or, perhaps, to deteriorate – democratic governance: in the face of institutional failure, citizens may want ‘AI assisted governance.’ This topic is directly relevant to political philosophy and legal theory, and it is plagued with riddles of its own (Coeckelbergh 2022; Nowotny 2021). Some of them are the same issues treated in this chapter – namely, the reliability, accountability, and fairness of the systems used – but some others are very different, as the use of AI systems for public governance may rob the people of their sovereignty (Sætra et al. 2022). This is a complex issue that deserves an independent study.

It is obvious that the challenges and open questions vastly outnumber the already large number of initiatives and proposals. AI is a resource that is likely to change our lives, as the internet did before it. Most industries and private and public activities, including perhaps the exercise of political power, will be greatly transformed. Mapping the risks and setting up a control panel presided over by time-tested ethical principles seems essential at this stage. The governance of AI is the science and art of turning principles into procedures and institutional structures that secure its beneficial uptake.

Acknowledgements This work was made possible in part by Grant PID2021-128606NB100, funded by the Spanish MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”.

References 6, Perri. 2001. Governing by technique: Judgement and the prospects for governance of and with technology. In Governance in the XXI Century, ed. OECD. OECD Future Studies 2001. Available at: https://www.oecd.org/futures/17394484.pdf. Bender, E.M., T. Gebru, A. McMillan-Major, and S. Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 610–623. ACM. Bertolini, A., and F.  Episcopo. 2021. The expert group’s report on liability for artificial intelligence and other emerging digital technologies: A critical assessment. European Journal of Risk Regulation 12: 644–659. https://doi.org/10.1017/err.2021.30. Bietti, E. 2021. From ethics washing to ethics bashing: A moral philosophy view on tech ethics. Journal of Social Computing 2 (3): 266–283. https://doi.org/10.23919/JSC.2021.0031. Blackman, R. 2022. Ethical machines. Your concise guide to totally unbiased, transparent and respectful AI. Cambridge (Mass): Harvard Business Review Press. Bostrom, N. 2014. Superintelligence. Paths, dangers, strategies. Oxford University Press. Bostrom, N., and E.  Yudkowsky. 2014. The ethics of artificial intelligence. In The Cambridge handbook of artificial intelligence, ed. K.  Frankish and W.  Ramsey, 316–334. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139046855.020. Bowen, H. 2013. Social responsibilities of the businessman. Iowa City: University of Iowa Press. (Original edition 1953, New York, Harper & Brothers).


Bucknall, B.S., and S. Dori-Hacohen. 2022. Current and near-term AI as a potential existential risk factor. In AIES '22: Proceedings of the 2022 AAAI/ACM conference on AI, ethics, and society. ACM. https://doi.org/10.1145/3514094.3534146. Coeckelbergh, M. 2020. AI ethics. Cambridge (Mass): The MIT Press. ———. 2022. The political philosophy of AI. An introduction. Cambridge (UK): Polity Press. D’Ignazio, C., and L.  Klein. 2020. Data feminism. Cambridge, MA, USA: MIT Press. Open sourced: https://data-­feminism.mitpress.mit.edu/pub/frfa9szd/release/6. Engler, A.C., and A.  Renda. 2022. Reconciling the AI value chain with the EU’s artificial intelligence act. Brussels: CEPS Papers. https://www.ceps.eu/ceps-­publications/ reconciling-­the-­ai-­value-­chain-­with-­the-­eus-­artificial-­intelligence-­act/. Floridi, L., ed. 2021. Ethics, governance and policies in artificial intelligence. Cham (Switzerland): Springer. Floridi, L., et  al. 2021. An ethical framework for a good AI society: Opportunities, risks, principles, and recommendations. In Ethics, governance, and policies in artificial intelligence, Philosophical Studies Series, ed. L.  Floridi, vol. 144. Cham: Springer. https://doi. org/10.1007/978-­3-­030-­81907-­1_3. Jonas, H. 1985. The imperative of responsibility: In search of an ethics for the technological age. Chicago: University of Chicago Press. (Original Das Prinzip Verantwortung. Versuch einer Ethik für die technologische Zivilisation, Frankfurt: Suhrkamp, 1979). Kuner, C., F.H. Cate, O. Lynskey, C. Millard, N. Loideain, D. Jerker, and B. Svantesson. 2018. Expanding the artificial intelligence-data protection debate. International Data Privacy Law 8 (4): 289–292. https://doi.org/10.1093/idpl/ipy024. Marjanovic, O., D.  Cecez-Kecmanovic, and R.  Vidgen. 2021. Algorithmic pollution: Making the invisible visible. Journal of Information Technology 36 (4): 391–408. https://doi. org/10.1177/02683962211010356. Nowotny, H. 2021. In AI we trust: Power, illusion and control of predictive algorithms. Wiley. Ohlheiser, A. 2016. Trolls turned Tay, Microsoft’s fun millennial AI bot, into a genocidal maniac. The Washington Post, March 25, 2016 at 6:01 p.m. https://www. washingtonpost.com/news/the-­intersect/wp/2016/03/24/the-­internet-­turned-­tay-­microsofts-­fun-­ millennial-­ai-­bot-­into-­a-­genocidal-­maniac/. O’Neill, C. 2017. Weapons of math destruction: How big data increases inequality and threatens democracy. New York, NY, USA: Broadway Books. Pégny, M., E.  Thelisson, and I.  Ibnouhsein. 2020. The right to an explanation. Delphi  – Interdisciplinary Review of Emerging Technologies 2 (4): 161–166. https://doi.org/10.21552/ delphi/2019/4. Radu, R. 2021. Steering the governance of artificial intelligence: National strategies in perspective. Policy and Society 40 (2): 178–193. https://doi.org/10.1080/14494035.2021.1929728. Russell, S. 2020. Human compatible: AI and the problem of control. London: Penguin Books. Sætra, H.S., H. Borgebund, and M. Coeckelbergh. 2022. Avoid diluting democracy by algorithms. Nature Machine Intelligence 4: 804–806. https://doi.org/10.1038/s42256-­022-­00537-­w. Schenier, B. 2018. Click here to kill everybody. Security and survival in a hyper-connected world. New York: W. W. Norton & Company. Schultz, M.D., and P.  Seele. 2023. Towards AI ethics’ institutionalization: Knowledge bridges from business ethics to advance organizational AI ethics. AI and Ethics 3: 99–111. https://doi. org/10.1007/s43681-­022-­00150-­y. Scott, P., and R.  Yampolskiy. 2020. 
Classification schemas for artificial intelligence failures. Delphi  – Interdisciplinary Review of Emerging Technologies 2 (4): 186–199. https://doi. org/10.21552/delphi/2019/4/8. Steurer, R. 2013. Disentangling governance: A synoptic view of regulation by government, business and civil society. Policy Sciences 46: 387–410. https://doi.org/10.1007/s11077-­013-­9177-­y. Véliz, C. 2019. Three things digital ethics can learn from medical ethics. Nature Electronics 2: 316–318. https://doi.org/10.1038/s41928-­019-­0294-­2.


EU Legislation and Official Documents Cited Communication (COM/2019/168) from the commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions Building Trust in Human-Centric Artificial Intelligence https://eur-­lex.europa.eu/legal-­content/EN/ALL /?uri=CELEX%3A52019DC0168. EU Charter of Fundamental Rights. https://fra.europa.eu/en/eu-­charter. EU Policy and legislation | Publication 21 April 2021 Impact Assessment of the Regulation on Artificial intelligence https://digital-­strategy.ec.europa.eu/en/library/ impact-­assessment-­regulation-­artificial-­intelligence. European Commission, Directorate-General for Research and Innovation, European group on ethics in science and new technologies, statement on artificial intelligence, robotics and ‘autonomous’ systems: Brussels, 9 March 2018, Publications Office, 2018, https://data.europa.eu/ doi/10.2777/531856. European Group on Ethics in Science and New Technologies (EGE). https://research-­and-­ innovation.ec.europa.eu/strategy/support-­p olicy-­m aking/scientific-­s upport-­e u-­p olicies/ european-­group-­ethics_en#what-­is-­the-­ege. Proposal for a Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artificial Intelligence. (Artificial Intelligence Act) and Amending Certain Union Legislative Acts https://eur-­lex.europa.eu/legal-­content/EN/TXT/PDF/?uri=CELEX:52021 PC0206&from=EN. Proposal for amending Directive 2009/148/EC on the protection of workers from the risks related to exposure to asbestos at work (2022). https://eur-­lex.europa.eu/legal-­content/EN/TXT/PDF/? uri=CELEX:52022PC0489&from=EN. Regulation (EU) 2018/1807 of the European Parliament and of the Council of 14 November 2018 on a framework for the free flow of non-personal data in the European Union. https://eur-­lex. europa.eu/legal-­content/EN/TXT/?uri=CELEX%3A32018R1807. ——— 2019/881 of the European Parliament and of the Council of 17 April 2019 on ENISA (the European Union Agency for Cybersecurity) and on information and communications technology cybersecurity certification and repealing Regulation (EU) No 526/2013 (Cybersecurity Act) (Text with EEA relevance) Cybersecurity Act: https://eur-­lex.europa.eu/legal-­content/EN/ ALL/?uri=CELEX%3A32019R0881. Regulation EU 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). https://eur-­lex.europa.eu/legal-­content/EN/TXT/?uri=CELEX%3A02016R0679-­20160504.

Other Resources Mentioned AI Now Institute: https://ainowinstitute.org/. AlgorithmWatch. Inventory of AI Ethics guidelines: https://algorithmwatch.org/en/ai-­ethics-­ guidelines-­inventory-­upgrade-­2020/. Blueprint for an AI Bill of Rights. https://www.whitehouse.gov/ostp/ai-­bill-­of-­rights/. Center for Humane Technology. https://www.humanetech.com/. CEPS. Center for European Policy Studies. https://www.ceps.eu/. CFPB. 2022 Consumer Financial Protection Circular 2022–03 Adverse action notification requirements in connection with credit decisions based on complex algorithms https://www. consumerfinance.gov/compliance/circulars/circular-­2022-­03-­adverse-­action-­notification-­ requirements-­in-­connection-­with-­credit-­decisions-­based-­on-­complex-­algorithms/.


Future of Life Institute. 2017. Asilomar AI Principles https://futureoflife.org/open-­letter/ ai-­principles/. ———. 2023. Pause Giant AI Experiments: An Open Letter March 23 2023: https://futureoflife. org/open-­letter/pause-­giant-­ai-­experiments/. ———. The AI Act. Webpage with updates about the AI Act. https://artificialintelligenceact.eu/. Montréal Declaration Activity Reports (2018. 2022): https://www.montrealdeclaration-­ responsibleai.com/reports-­of-­montreal-­declaration. Montréal Declaration for a Responsible Development of Artificial Intelligence 2018.: https:// www.montrealdeclaration-­responsibleai.com/_files/ugd/ebc3a3_506ea08298cd4f819663554 5a16b071d.pdf. OECD, Recommendation of the Council on Artificial Intelligence. 2019a. https://legalinstruments. oecd.org/en/instruments/OECD-­LEGAL-­0449. ———, Artificial Intelligence & Responsible Business Conduct. 2019b. https://mneguidelines. oecd.org/RBC-­and-­artificial-­intelligence.pdf. Telefónica. Telefónica’s Approach to the Responsible Use of AI https://www.telefonica.com/es/ wp-­content/uploads/sites/4/2021/06/ia-­responsible-­governance.pdf. Telia. Guiding Principles on Trusted AI Ethics: https://www.teliacompany.com/globalassets/ telia-­company/documents/about-­telia-­company/public-­policy/2021/tc-­guiding-­principles-­on-­ trusted-­ai_jan11.pdf.

Chapter 11

AI, Sustainability, and Environmental Ethics

Cristian Moyano-Fernández and Jon Rueda

Abstract  Artificial Intelligence (AI) developments are proliferating at an astonishing rate. Unsurprisingly, the number of meaningful studies addressing the social impacts of AI applications in several fields has been remarkable. More recently, several contributions have started exploring the ecological impacts of AI. Machine learning systems do not have a neutral environmental cost, so it is important to unravel the ecological footprint of these techno-scientific developments. In this chapter, we discuss the sustainability of AI from environmental ethics approaches. We examine the moral trade-offs that AI may cause in different moral dimensions and analyse prominent conflicts that may arise from human and more-than-human-centred concerns.

11.1 Introduction

Until recently, the environmental costs of Artificial Intelligence (AI) developments have been largely neglected. The lack of attention to the negative externalities of the environmental impacts of AI has, in fact, been labelled as a “blind spot of AI ethics” (Hagendorff 2022). Fortunately, the debate on the environmental burdens of machine learning and deep learning applications is gaining traction in the literature on AI’s ethical aspects (Dauvergne 2020; Coeckelbergh 2021; Van Wynsberghe 2021; Mulligan and Elaluf-Calderwood 2022; Richie 2022; Heilinger et al. 2023). The discussion on the sustainability of AI is even becoming a burgeoning interdisciplinary field, fed by scientific publications that attempt to measure its specific carbon
emissions and other environmental impacts (Strubell et  al. 2019; Patterson et  al. 2021; Wu et al. 2022). The relationship between sustainability and AI is ambiguous, though. In relation to sustainability, AI has a dual role (Dhar 2020). We can distinguish between AI for sustainability and the sustainability of AI (Van Wynsberghe 2021). On the one hand, AI for sustainability refers to the use of machine learning systems to help mitigate the adverse effects of climate change and other environmental challenges. Many discussions have primarily focused on how AI can have a positive impact on our management of the climate emergency – as mentioned, but not endorsed by Mulligan and Elaluf-Calderwood (2022). Indeed, AI may be an ally in tackling problems related to environmental degradation and the changes in climate. As Mark Coeckelbergh summarises in a very informative way, AI can help to gather and process data on temperature change and carbon emissions, predict weather events and climate, show the effects of extreme weather, improve predictions of how much energy we need and manage energy consumption (e.g., by means of smart grids), process data on endangered species, transform transportation in a way that leads to less carbon emissions and more efficient energy management and routing (car traffic, shipping, etc.), track deforestation and carbon emissions by industry, monitor ocean ecosystems, predict droughts and enable precision agriculture, contribute to smart recycling, assist carbon capture and geoengineering, and nudge consumers to behave in more climate friendly ways and create more awareness about the environmental and climate impact of their behaviour. (Coeckelbergh 2021).

On the other hand, the very sustainability of AI is increasingly in question. AI is a data-intensive technology. Data centres require considerable energy, with an estimated use of 1% of worldwide electricity consumption (Masanet et al. 2020), and with many other detrimental environmental impacts (Lucivero 2020). The vast majority of AI research has been catalogued as “Red AI”, being very environmentally unfriendly, despite its significant scientific contributions (Schwartz et al. 2020). Greening AI is becoming, therefore, a pressing challenge in the context of the climate emergency. Cutting carbon emissions, moreover, is not the only rationale; it is also necessary to develop different approaches to manage resources at a time of heightened social sensitivity to the escalation of energy consumption.

Since AI is resource-intensive, it is not surprising that calls for sustainability are becoming commonplace. Carole-Jean Wu et al. (2022) advocated that the field of AI should develop and adopt a “sustainability mindset”. In the third wave of AI ethics, likewise, sustainability stands out as a focal issue according to Aimee van Wynsberghe (2021). The role of AI in the accomplishment of sustainable development goals is also increasingly considered (Sætra 2021; Monasterio Astobiza et al. 2021). In the ethics of technology in general, moreover, the consolidation of the value of sustainability – whose emergence has been gaining prominence over the last few decades thanks also to the trade-offs of technological developments – becomes clear (Van de Poel 2021; de Wildt et al. 2022). Sustainability is also sometimes referred to as an ethical principle for AI, as described in the Statement on Artificial Intelligence, Robotics and ‘Autonomous’ Systems of the European Group on Ethics in Science and New Technologies of the European Commission (2018).1


However, we need to be careful with the very label ‘sustainable’. The rhetoric of sustainability may lead to ethics-washing and greenwashing, insofar as “being able to label something ‘sustainable’ significantly increases the likelihood of political and public acceptance” (Heilinger et al. 2023). Certainly, in growth-oriented economies, the genuine sustainability of AI is not free from controversy. As shown in the context of big data’s environmental impacts, sustainable discourses often coexist with neoliberal practices (Lucivero 2020, p. 1018). These practices have a questionable reputation as predators of the planet’s finite resources and often put economic benefits before environmental interests. Consequently, the sustainability of AI requires a more thorough analysis from environmental ethics approaches.

In this chapter, we offer an ethical overview of the ecological impacts of AI.2 Our aim is to introduce an environmental ethics perspective to deal with the increasing concern about the unsustainability of many AI developments. The structure of this chapter goes as follows. In the second section, we provide some empirical evidence of AI’s energy demands and environmental costs (including its carbon footprint). In the third section, we clarify the term ‘sustainability’ and summarise three dominant approaches to understanding the importance of that concept. Then, in the fourth section, we address the issues related to the sustainability of AI from different environmental ethics approaches. Finally, in the fifth section, we identify various ethical values that are helpful to rethink the development of AI technologies in a more environmentally friendly way. We conclude, after that, with some final remarks.

1 Other authors have included the notion of sustainability within the principle of beneficence, as the widest interpretation of beneficence is concerned also about the welfare of future generations and of non-human species, and the health of the environment (Floridi et al. 2018).
2 Although the terms ‘ecological’ and ‘environmental’ are often used interchangeably, these notions have different nuances. Whereas ‘environmental’ refers to what surrounds the human being, ‘ecological’ refers to the oikos (i.e. the house) in which humans and other beings live.

11.2 Energy Demands and Environmental Impacts of AI Applications

The energy demands and environmental impacts of AI are receiving increasing academic and societal attention. The concerns about the ecological footprint of AI have also been present among the developers and researchers of the most advanced machine learning models themselves. In fact, some of the most discussed publications in this recent literature have involved AI engineers from leading companies in the ICT industry. Below, we summarise some significant evidence that is relevant to assess, in ethical terms, the (un)sustainability of these technological developments. It should be noted, however, that some of the articles on the environmental costs of AI come from publications in digital repositories, and not in traditional peer-reviewed scientific journals – as they are sometimes uploaded by the teams and engineers of the technology companies themselves.


The carbon footprint of AI is one of the most pressing issues, especially regarding deep learning applications. In a seminal article, Emma Strubell et al. (2019) put the spotlight on the environmental costs of natural language processing (NLP) systems trained with deep neural networks. They analysed four models (Transformer, BERT, ELMo, and GPT-2) which were quite advanced at the time, but which required high computational resources. Consequently, the training of these models produced considerable carbon emissions – training BERT on GPU, for instance, was similar to a trans-American flight (Strubell et al. 2019). Or, to draw an often-repeated analogy from that study, training a single NLP model emitted approximately 600,000 lb (roughly 272,000 kg) of carbon dioxide, which is equivalent to the lifetime emissions of five cars (Strubell et al. 2019; Coeckelbergh 2021; Van Wynsberghe 2021). Similarly, although estimating the energy costs of the training of large neural networks is a difficult task, David Patterson and colleagues have also studied the substantial carbon footprint of training other famous models (T5, Meena, GShard, Switch Transformer, and GPT-3), showing that the huge computing demands of deep learning research induce significant carbon emissions (Patterson et al. 2021). Again, by way of illustration, the carbon footprint of training one such model (namely, Meena) would be equivalent to 242,231 miles (i.e. 389,833 km) driven by an average passenger vehicle (Patterson et al. 2021; Wu et al. 2022). Undoubtedly, the high energy consumption of training these large models is a manifest concern (Bender et al. 2021).

The production of the electricity needed to train AI is something that generates, as we have seen, considerable carbon emissions. However, the total emissions from AI systems are higher. Concentrating carbon footprint metrics on the energy required for the training and use of these applications is, unfortunately, a narrow view. Instead, by focusing on the entire lifecycle of these systems, we could have a more complete picture of their entire carbon footprint, along with other significant environmental costs. In this sense, from a more holistic point of view, it is interesting to distinguish between operational and embodied carbon footprints (Mulligan and Elaluf-Calderwood 2022; Wu et al. 2022). The operational carbon footprint includes the emissions produced during the use of AI models, such as their training, as reported in the previous articles. By contrast, the embodied carbon footprint captures all the emissions generated to create and sustain the digital infrastructure involved in the end-to-end machine learning pipeline, including the full supply chain, the manufacturing of hardware, material extraction, the building of the systems, and so on. Therefore, the operational cost of deep learning models is only one aspect of the problem. From a holistic approach to the AI ecosystem, the carbon footprint of AI goes beyond operational emissions, so its embodied carbon footprint needs to be increasingly addressed (Mulligan and Elaluf-Calderwood 2022; Wu et al. 2022).

Thus, considering their full lifecycle, the most developed AI systems are energy-intensive. Since fossil-fuel-based energy supply is a significant contributor to greenhouse gases, using non-renewable sources may increase the climate emergency.
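To make the arithmetic behind such operational estimates more concrete, the following Python sketch shows how a back-of-the-envelope training footprint can be computed from power draw, training time, data-centre overhead (PUE), and the carbon intensity of the electricity grid. It is only an illustrative approximation under assumed placeholder values, not the methodology of the studies cited above, and it ignores embodied emissions entirely.

    # Minimal illustrative sketch of an operational carbon-footprint estimate.
    # All numeric values are hypothetical placeholders, not figures from this chapter.

    def training_co2e_kg(device_power_kw: float, num_devices: int, hours: float,
                         pue: float, grid_kg_co2e_per_kwh: float) -> float:
        """Rough operational CO2e (kg) for a single training run."""
        # Energy drawn from the grid, including data-centre overhead (PUE).
        energy_kwh = device_power_kw * num_devices * hours * pue
        # Emissions depend on how carbon-intensive the local electricity mix is.
        return energy_kwh * grid_kg_co2e_per_kwh

    if __name__ == "__main__":
        # Hypothetical run: 8 accelerators at 0.3 kW each for 240 hours,
        # a PUE of 1.5, and a grid intensity of 0.4 kg CO2e per kWh.
        print(f"{training_co2e_kg(0.3, 8, 240, 1.5, 0.4):.0f} kg CO2e")

The same arithmetic makes clear why the energy source matters: halving the grid carbon intensity halves the operational footprint, while the embodied emissions discussed in this section remain unaffected.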
This partly explains the fact that many large companies (such as Google, Amazon, Microsoft, or Apple) are trying to move towards reliance on renewable energy (Mulligan and Elaluf-Calderwood 2022). However, it is questionable whether this shift towards a renewable energy supply will eliminate environmental concerns. For some, it is uncertain to what extent the adoption of renewable energy would offset the climate and environmental footprint of AI technologies at the global level (Coeckelbergh 2021). Moreover, building infrastructure for carbon-free energy requires substantial economic and human resources and is often dependent on rare metals and materials (Wu et al. 2022).

The environmental burdens of AI are not restricted to energy consumption and its related carbon footprint. If we consider again the digital infrastructure on which these AI models depend, there are other types of costs worth mentioning, given that even those models work with data from “cloud” service providers. For instance, the production of hardware requires materials such as plastics, semiconductors, and microchips. Some of these materials depend on scarce natural resources, including cobalt, nickel, or lithium. The extraction of raw materials may even lead to abiotic depletion in some cases. Microchips also have a non-negligible environmental impact, despite their small size. According to Semiconductor Review (2020), the fabrication of a 2 g microchip requires 1.6 kg of petroleum, 72 g of chemicals, and 32 kg of water, along with the generation of toxic chemicals. Finally, the e-waste of AI deserves to be mentioned. After becoming outdated or defunct, digital devices generate highly toxic waste (Heilinger et al. 2023).3 The end-of-life stage of AI systems’ lifecycle, moreover, generates carbon through transportation, waste processing, and disposal (Mulligan and Elaluf-Calderwood 2022).

Hence, being concerned about the sustainability of AI means, firstly, dealing with its cradle-to-grave environmental costs and, secondly, attempting to develop a cradle-to-cradle AI design. Still, there is a plurality of sustainability models. As we shall explain in the next section, there are different approaches to engaging with sustainability, some weaker and some stronger.

3 The problem of e-waste seems to be even greater when digital devices are processed in “informal recycling contexts”, as sometimes occurs in low- and middle-income countries with fewer environmental control protocols and through polluting practices such as incineration (Lucivero 2020).

11.3 What Is Sustainability?

The contemporary conception of sustainability was born from the Brundtland report of 1987, which highlighted two concerns that should be reconciled: development and the environment (WCED 1987). When sustainability is applied to well-being, the tension between needs and resources is made explicit. However, the perception of (un)sustainability goes back further. At the end of the eighteenth century, when Thomas Malthus (1798) explained his famous population theory, according to which population tends to grow faster than resources, it became
evident that the planet needed time to regenerate. Also, when 18th and nineteenth-­ century naturalists began to notice that some animal species, such as the dodo or the mammoth, had become extinct forever, a certain awareness of sustainability emerged (Barrow 2009). There are human activities that do not respect the reproductive rhythms of the biosphere and, in such cases, become unsustainable in the long term. This conclusion was indeed popularised by the well-known report of the Club of Rome, which predicted that many natural resources crucial to our survival would be exhausted within the forthcoming two generations (Meadows et al. 1972). Then, if we have been aware of the importance of sustainable development for decades, is the way forward clear? Not at all. When addressing sustainability as a socioecological concern, there can be different approaches to understanding it, depending on the theoretical model used. In what follows we shall explain three different approaches to sustainability. The first model would be the so-called Triple Bottom Line (TBL), which is widely used because it puts the same importance on the 3Ps: planet, people, and profit. Sustainability would be found within the space that overlaps these three dimensions. Concern for these factors has been the conceptual basis for other models of sustainability, but sometimes, depending on the model, asymmetric values have been placed on them. The TBL model was coined by John Elkington in the 1990s. Elkington (1994) suggested that the social, economic, and environmental dimensions should be given equal value. At least two broad criticisms could be raised against this first model. The first has to do with a question of content or definition: what does the social domain cover? Or what do we mean by “profit”? The TBL model mainly reduced the people factor to the triad of equity, inclusion, and health, and equated profit to GDP (Kuhlman and Farrington 2010). However, it would be debatable whether these aspects can adequately cover the whole meaning of “people” and “profit”. The second objection would point to why wealth is considered a dimension different from the social dimension, and not a part of it. Is it fair that sustainability is defined by three equivalent factors, when two appeal to human well-being and only one to the environment? Since socio-economic aspects are primarily concerned with the well-being of the present generation and environmental aspects with caring for the future, this would mean that the former is implicitly twice as important as the latter. In the end, we should conclude that this model would lead to a moderate concept of sustainability. A second approach to sustainability would be based on the Mickey Mouse (MM) model. It also starts from the same three dimensions suggested by the TBL model, but focuses more on profit, while treating people and the planet as secondary focuses (Sanz 2009). The view underlying this type of model underpins many of the companies and much of the current global economic and political decision-making. The MM model has been largely criticised because it considers the three sustainability dimensions as separate factors, rather than understanding that there are interdependencies between them. This model has also the problem that it subjugates, to the maintenance of the economy, not only the environment necessary for future generations’ development, but also today’s society itself with its present needs. 
For these reasons, the MM model represents a weak conception of sustainability
and is mostly committed to a biased view towards the short term (Aruga 2022). What it ultimately seeks to sustain is economic value. Finally, there is the Bullseye (BE) model, which would be defined as the strongest sustainability model, because it puts the planet factor as the most important aspect (Giddings et al. 2002). Then, it considers people to be the second highest, and profit as the third most important aspect in providing sustainable development. The BE model is represented by three concentric circles: the broadest circle, which would include the other circles, would be the ecosystem; in the middle would be society; and within this, as a subsection of it, would be the economy. Thus, it emphasises the supremacy of the environment, without which neither society nor the economy can exist (Griggs et al. 2013). In other words, by destroying the planet, people and their profits will certainly be destroyed. The BE model moves away from sociocultural and economic sustainability (represented by the TBL and MM models) and moves more towards ecological sustainability. Here, human society would become a wholly-owned subsidiary of the environment, which would be the dimension with the highest moral value. Recently, Kate Raworth (2017) popularised a parallel version of the BE model of sustainability with the so-called Doughnut Economics framework. According to it, there would be a base or threshold of human needs and a ceiling corresponding to environmental limits: the safe space in which human beings could develop in a sustainable way would only be within these boundaries. In fact, Raworth combined the planetary boundaries (in the outer ring) modelled by the scientists of the Stockholm Resilience Centre (Steffen et al. 2015) with a set of social targets (in the inner ring) that resemble the UN’s Sustainable Development Goals (Griggs et al. 2013). Each of these sustainability models has a stronger or weaker commitment to the environment, depending on the weight that is placed on the other dimensions (such as the economy or society). Moreover, implicit in each of them is a particular environmental ethic, which gives moral support to the defence of sustainability. We will explore this in the following section.

11.4 A Path to Make AI More Sustainable from Environmental Ethics

Environmental ethics was born from a triple conjunction: the vindication of civil rights by socially marginalised groups (Hiller et al. 2014), American transcendentalism with its quest to unite man and nature (Robinson 2007), and British utilitarianism with its concern for nonhuman animals (Bentham 1996 [1789]). These three movements gave rise to a historical juncture that would fertilise the discipline of ethics, which in the Anglo-Saxon sphere in the early 1970s took the name of “environmental ethics” (Nash 1989). The implications of each of these sources that nourished the environmental ethics framework can be traced in the various approaches it presents, as well as in their respective approaches to justice: strong anthropocentrism with social justice or ecosocialism, biocentrism with multispecies justice or animal rights, and ecocentrism with ecological justice or deep ecology.


11.4.1 The Anthropocentric Concern for the Environmental Costs of AI

The softer version of environmental ethics recognises that we have a duty to protect nature not because nonhuman natural entities have intrinsic value, but because they have instrumental value. Care for the environment serves to ensure human development in the present and in the future (Passmore 1974). Our health and well-being depend on not damaging ecosystems. Hence, there are good anthropocentric reasons to be concerned about the environmental impacts of AI. In this view, the sustainability of AI would be projected primarily through a TBL model, for which the environment matters to the extent that it overlaps with human interests and which, moreover, prioritises interests that do not always protect our overriding moral interest. That is, if AI development wreaks havoc on ecosystems or consumes so much energy that it accelerates climate change, this will be good or bad depending on whether it pays off for humans.

To analyse these trade-offs from an anthropocentric environmental ethic, we can focus on different movements. For example, if the resources demanded by AI and its ecological impacts harm society asymmetrically, we could conclude that this is a problem of environmental justice that could be criticised by ecosocialism (Croesser 2020). A MM model of sustainability that puts the profit derived from the expansion of AI at the centre would be partly neglecting even the sphere of people. This could be the case for green AI capitalism. If people become aware that AI contributes to climate change and threatens human safety, then they may demand limiting AI’s use of fossil fuels, switching to the consumption of renewable energy sources. However, from an ecosocialist perspective, this may negatively impact poorer societies, which might lose access to AI technology. The richest countries may have an advantage in using green AI based on renewable resources, since many of them have more funding to invest in newer energy systems. This tendency may exclude lower-income countries from the uses of AI, even though they would also suffer many of the most negative environmental impacts.

Ecosocialism would dispute this capitalist vision of AI, even if the latter assumes a green direction or takes into account a certain planetary sustainability. The development of AI should not be driven by unlimited economic growth and, in order to save it, should not be content to look only for less polluting resources. For ecosocialism, the use of AI should be moderated and redirected towards meeting basic needs. This could lead us to justify the use of AI for, perhaps, medicine, engineering, or conservation biology, but perhaps not for economics, transport, or military weaponry. As AI has an environmental impact that may negatively harm humans in the long term, we should choose those AIs that are intended to meet minimum social thresholds for well-being and reject, above all, those that are speculative tools or whose social value is questionable. Eco-capitalist efforts to, for example, increase the energy efficiency of AI sooner or later run up against ecological limits, as shown by the Jevons paradox (Polimeni et al. 2009). According to this paradox, a reduction in the environmental and
economic costs of a technology does not necessarily imply that its use will lead to lower environmental impacts, but that, in aggregate, these may increase if demand also increases. Ecosocialism denounces that the environmental impacts of AI negatively discriminate against the least advantaged collectives in the capitalist industrial race. Hence, the solution for sustainable AI is not simply to change its energy sources to be greener, but to rethink whether it is necessary for the well-being of our societies and those of the future to develop so much AI. Ecosocialists reject a model of MM sustainability because it may harm minority human groups or those with less socio-economic power. Ecofeminists should discuss the environmental effects of AI if they impact negatively upon the roles of women (Shiva and Mies 2014). There are good reasons to care about the ecological footprint of AI without having to sacrifice moral anthropocentrism altogether. Even if what we are concerned about are the environmental trade-offs of AI at different timescales, that is, whether its current development offsets the ecological impacts it will have on the environment of our descendants, nonhuman nature matters only instrumentally in this approach. Only human beings of the present and of the future have intrinsic value. To ensure this moral inclusion of future generations, indirect duties or non-­ contractual duties can be claimed, without necessarily involving rational human agents in the socio-ecological contract (Midgley 1996). In short, the relevant questions that can be asked from anthropocentric environmental ethics to assess the sustainability of AI would have human well-being as their moral focus, which would have intrinsic value, while nonhuman beings and ecosystems would have only instrumental value. According to the TBL model of sustainability, the human sphere as comprising profit and people would ethically outweigh the sphere of the planet. From this perspective, the following questions could be asked: If AI has an environmental footprint, are its use and development socially worthwhile? Should we reduce the environmental impacts of AI by using fewer polluting resources? Should we abandon AI that is not aimed at satisfying the basic needs of people, but rather seeks above all to increase the economic profit of companies? Should we refrain from using AI applications whose environmental impacts mainly harm minority groups and benefit the most economically advantaged (who are also the better equipped to deal with the costs of mitigating or adapting to ecological deterioration)?

11.4.2 The Biocentric Concern for the Environmental Costs of AI

Expanding the moral circle to include more than human beings is a greater commitment to sustainability on a horizontal level, because it includes respect for the interests of more individuals living in the present. The thesis of biocentrism is that every individual life has an inherent value, a value for its own sake. According to Albert Schweitzer, one of its leading representatives, the intrinsic value of all life is based
on the «will to live» (a concept borrowed from Schopenhauer), which both human beings and the rest of living beings share (Schweitzer 1923). This idea would morally oblige us to be humbler and to embrace moral concern for all living beings that have an interest in developing, growing, and reproducing. Along these lines, Kenneth Goodpaster (1978) argued that both plants and animals have interests, and that all beings with interests deserve moral consideration, if not as moral agents, then as «moral patients». A more moderate version of biocentrism would consist in placing moral relevance not in the fact of being alive, but more specifically in the fact of having pleasurable experiences or suffering, namely, of being a sentient being (Bentham 1996 [1789]). This is the thesis of sentientism accepted by some animal advocates. According to them, those who have particular interests (such as the interest in avoiding suffering) deserve more respect, regardless of whether they are human animals or not. Otherwise, we would fall into a speciesist bias (Singer 1975) of unjust discrimination based on whether or not one belongs to a particular species. Peter Singer (1975), for example, rejected the Kantian criterion that the ability to reason should be the main attribute justifying rights or moral concern, arguing that we would then be excluding babies, the senile insane, or the mentally handicapped. It is the capacity to feel suffering that matters. And in addition to sentience, the interest that an entity has in continuing its life, expecting to remain living, is fundamental. By this, what Singer is trying to say is that any being with interests and preferences to achieve a better state in the future deserves equal moral respect. Expanding environmental ethics into these biocentric approaches implies understanding the sustainability of AI in a broader sense than the TBL model, but less systemic than the BE model. It does not follow a TBL model at all because nonhumans do not care about profit as much as we do, so it would not be fair to attach the same moral weight to this dimension as to other dimensions, such as the planet. However, it does maintain concern for the planet insofar as it matters for people and for other individuals. Thus, perhaps the people sphere should be changed to (not only human) individuals and the profit sphere could be omitted, so that we end up with a double bottom line model. Alternatively, one could add different spheres, such as that of animals or plants, to the same level as that occupied by people, thus forming a quadruple bottom line model. It depends on what kind of biocentric ethics is assumed. However, in any case, it would not be entirely a BE model because an ecological or systemic view is not assumed and the planet is not considered as the sphere from which individual beings are able to develop. If this environmental ethics approach is taken as a reference for assessing the sustainability of AI, the questions to be asked are: How are the lives of human and nonhuman living beings harmed by the human use of AI? Which living beings are the most affected? If non-sentient beings, such as plants, without a nervous system, are affected more than anything else, can this be compensated for? AI ethics should begin to contemplate the moral implications of harming non-human beings because they are rarely mentioned (Owe and Baum 2021; Hagendorff 2022; Hagendorff et al. 2022; Singer and Tse 2022).

The environmental impacts of AI can perhaps be justified from a consequentialist biocentrism if the individual lives benefited by AI outweigh the lives harmed. For example, if the water consumption of AI to cool the computers that store and process data leaves a thousand individuals without local resources, but AI will serve to better manage water (Zhu et al. 2022) and thereby provide access to clean water for ten thousand individuals, then the outcome can be considered positive. On the other hand, if we assume a biocentrism based on respect for individual rights, and if, for example, the water consumption of AI would violate the basic rights of even a few individuals, then the development of such AI should be rejected on principle. Individualistic metrics have the advantage that trade-offs can be evaluated from a focus on consequentialist or deontological arguments. If every life counts the same morally, then the key issue will be to identify what the welfare needs of those lives are, and from there to count the number of individuals affected or benefited. However, the paradox of biocentrism – both in its extended version that would include plant organisms and in its animalistic version – is that its moral individualism can lead to overlooking concerns about the systemic aspects of nature and thus prove to be an insufficient moral approach to address the problems unleashed by the ecological crisis, which is damaging the lives of so many living beings.

11.4.3 The Ecocentric Concern for the Environmental Costs of AI Finally, there is a last approach within environmental ethics: the ecocentric concern. What ecological science tells us about the complex interactions and interdependencies in nature is one of the bases for questioning the moral individualism of biocentrism. That the human being is at the pinnacle of moral relevance is already rejected even by sentientism, but it is necessary to go further in order to contest that the most important thing is the capacity to suffer. The more radical biocentrism, which would embrace respect for the intrinsic value of plants and other non-sentient organisms, has already taken this step by considering that they too have primitive interests in developing, growing, and reproducing. However, does an ecosystem or an entire biotic community have such interests in self-realization, and can it be considered a moral subject? Biocentrism would stop here, but ecocentrism and the so-called “deep ecology” movement would make such a moral extension (Naess 1984). One may think that, by defending the interests of every living individual, we are already maintaining a maximum commitment to sustainability, and that a biocentric ethic for AI is as far as we can expand our environmental ethical concern. Why might it be relevant to include the inherent value of ecosystems? Let us imagine the following case. If we discover that in order to feed an AI system we need to cut down forests to get wood to burn, is the choice between destroying forest A, where two thousand animals and five thousand plant organisms live, and destroying forest B, where one thousand animals and twenty thousand plant

organisms live, irrelevant? From a radical biocentric ethic, where each individual interest matters, we would say that it is morally more acceptable to destroy forest A, where there are fewer living individuals. From sentientism, we could ask where there are more species of sentient animals, since this would be an important ethical question, which would lead us to prioritise the protection of forest A. But from a radical biocentrism, we would not discriminate between plants and animals: in theory, every organism counts equally. So let us imagine forest C, in which only 800 animals and 3000 plants live. Any kind of biocentrism would accept sacrificing this forest if the energy it provides feeds an AI, for example, used for conservation biology or for health, which helps to protect a greater number of individual lives. However, if forest C is home to apex predators like wolves, whose trophic role in ecosystems is key because of the top-down relationships they can facilitate, home to ecological engineers like beavers, and to trees whose mycorrhizae are 500 years old, would this be considered ethically irrelevant? If our intuition inclines us to think that perhaps this forest C suddenly becomes more valuable regardless of the benefit to individuals, it may be because we are adopting an ecocentric perspective, where the value of species or whole ecosystems intrinsically matters. From an anthropocentric or biocentric perspective, we might also conclude that this forest C is more valuable than others if we find that it provides more benefits (for aesthetic, health, co-evolutionary or survival reasons) to human and non-human individuals in the long run. But if we do not identify instrumental benefits from its conservation and yet consider it worth preserving by arguing that this particular ecosystem as a whole, with its unique interdependencies, is more important than the sum of its individual components, then there is an ecocentric intuition because we are presupposing an intrinsic value in it. According to Lawrence Johnson (1991), species and ecosystems are holistic entities that have interests and should not therefore be valued only instrumentally. The ecocentric approach might include James Lovelock’s Gaia hypothesis: the theory that even the planet Earth alters its geo-physiological structure over time in order to ensure the continuation of an equilibrium of evolving organic and inorganic matter (Lovelock 1972). Here, Holmes Rolston III (1988) would point out that being teleologically organised, as an ecosystem, would justify the attribution of systemic and intrinsic value. Ecocentrism can be objected to as perpetuating a logic of human centrality, in that we are the ones who decide that nonhuman nature has values (Bookchin 1987). But the fact that values are anthropogenically assigned does not imply that they must respond to anthropocentric egoism. Value judgements can be directed towards the nonhuman realm through sympathy or respect (Callicot 1989). Making AI more sustainable based on ecocentric ethics would mean adopting the BE model of sustainability, for which the most important sphere is the planet as a whole. If the development of AI puts the biosphere, its ecosystems, and nonhuman species at risk, then no matter how many human needs it satisfies, it cannot be considered sustainable AI. Assessing AI’s environmental impacts does not assume an individualistic ontology, but a relational and systemic one.
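To make the contrast between these individualistic tallies explicit, the following minimal sketch (using only the hypothetical forest figures from the example above, not data from any study) computes which forest each counting rule would sacrifice; the point is precisely that neither tally can register the systemic features that make forest C special.

```python
# Illustrative only: the hypothetical forests from the example above.
# Each entry: (sentient animals, plant organisms).
forests = {
    "A": (2000, 5000),
    "B": (1000, 20000),
    "C": (800, 3000),
}

def biocentric_cost(animals, plants):
    """Radical biocentrism: every living organism counts equally."""
    return animals + plants

def sentientist_cost(animals, plants):
    """Sentientism: only sentient individuals count."""
    return animals

for label, metric in [("biocentric", biocentric_cost), ("sentientist", sentientist_cost)]:
    choice = min(forests, key=lambda f: metric(*forests[f]))
    print(f"{label} counting would sacrifice forest {choice}")

# Neither tally can register the trophic role of wolves, the engineering work
# of beavers, or 500-year-old mycorrhizal networks: systemic value of that
# kind is exactly what the ecocentric view adds.
```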
To understand how ecocentric concern starts from a holistic ontology, let us consider this other example.

Let us imagine that we do not want to cut down a whole forest to provide energy to an AI system, but that it is enough to cut down only a part of it. Is it morally indifferent whether we cut down trees in areas where there are riparian trees or in other, denser areas? From a holistic ontology, we can consider that riparian trees have a systemic effect on river ecosystems, helping drainage and containing the river, which can prevent future landslides or floods. Conversely, another area of dense woodland may facilitate the spread of fires, and by taking advantage of logging there, firebreaks could be formed that would prevent systemic risks. Attending to the relationships and interdependencies of biotic communities and ecosystems means taking care of synergies that cannot be understood merely as the atomised sum of individual units.

11.5 Ethical Values for a Sustainable AI In this last section, we shall introduce some of the key ethical values that are frequently mentioned for the development and use of AI in an environmentally sustainable manner. For the identification of these values, both specific publications on the ethics of sustainable AI and contributions from studies on AI’s environmental impacts have been taken into account. We will briefly develop the following: transparency, efficiency, fairness and equity, recognition, and utility. Greater transparency is a growing demand. Unlike other debates in AI ethics in which transparency is commonly linked to the algorithmic performance of AI models and their lack of explainability (see Rueda et al. 2022), this term serves another function here. Transparency is advocated in order to be able to estimate AI’s environmental impacts, and it is advisable to report the training time and computational resources of the AI models (Strubell et al. 2019). In this vein, Patterson et al. (2021) stated: “To make the carbon costs of training transparent, we encourage more researchers to measure energy usage and CO2e – or to get a rough estimate using a tool like ML Emissions Calculator – and publish the data.” Again, transparency is not only needed for knowing the operational carbon footprint of AI, but also for estimating its embodied carbon footprint. Regarding the latter, Carole-Jean Wu et al. (2022) suggested that AI researchers should publish the hardware used, the number of machines, and the total runtime necessary to produce results. So, in addition to performance metrics, AI developers should be required to track energy use (Coeckelbergh 2021). Efficiency is another commonly emphasised value. Efficiency is a means-ends rationality and encourages doing more with less. A desirable innovation in the interest of efficiency would be to develop hardware that requires less energy. Another way is to reduce the number of floating-point operations (the amount of work performed by a computational process) required to generate a result (Schwartz et al. 2020, p. 60). Energy reduction coupled with increased efficiency is not only ethically desirable from an ecological perspective, but also profitable for companies as they would be saving money (Patterson et al. 2021). Therefore, efficiency could

bring prudential and moral interests together. Efficiency, however, is not a magic recipe for reducing environmental costs. As an example of the Jevons paradox, as the efficiency of AI improves (with the corresponding reduction in operational power footprint), the overall demand for AI seems to increase with time (Wu et al. 2022).4 Furthermore, it may also be the case that, to combat the climate emergency, we not only need more efficient technologies, but also to rethink our lifestyles and the role of technologies in them (Coeckelbergh 2021).

4  Even if total electricity consumption increases, when measuring efficiency gains it is still relevant to assess whether this trend is worthwhile in proportion to the technological improvement in question. For example, an article published in Science on global energy consumption by data centres estimated that, from 2010 to 2018, computer instances had increased by 550% while electricity use had only increased by 6% (Masanet et al. 2020).

Equity and fairness are also often mentioned when considering whether the benefits of using AI and its environmental costs are distributed in a just manner. Strubell et al. (2019), for example, argued that academic researchers need equitable access to computational resources. More importantly, lower GDP countries may be disadvantaged in the cost-benefit balance of AI. As mentioned by Mulligan and Elaluf-Calderwood (2022), countries that are benefitting the least from the use of AI are often those facing the most important burdens in terms of AI’s deleterious environmental impacts. From a climate justice perspective, it is unfair that those who use AI the least will suffer the most from its negative ecological effects (Bender et al. 2021). Similarly, this concern may extend to future generations (Coeckelbergh 2021). From an intergenerational justice approach, it is important to consider the environmental costs that future generations will inherit without contributing to the problem themselves.

Recognition is another value commonly used in the literature on justice, as a complement to simple distributive justice. Recognition is important for AI environmental ethics because it can highlight those communities or individuals that are being overlooked when analysing the environmental trade-offs of AI. It is a value that is concerned with who the subjects of justice can be and who is being marginalised. While some researchers have extended this recognition to minority or indigenous communities from decolonial epistemologies (Mohamed et al. 2020), others have extended it to future generations (Coeckelbergh 2021), or to nonhuman animals (Owe and Baum 2021; Hagendorff et al. 2022). There is as yet no literature that discusses the importance of recognising non-animal nature as a subject of justice for an in-depth assessment of the sustainability or environmental ethics of AI. However, there are authors (Schlosberg 2014; Romero and Dryzek 2021; Kortetmäki et al. 2022) who have long argued for the environmental value of recognising non-animal entities as stakeholders with intrinsic values and (ecocentric) interests that deserve to be “listened” to and represented.

In a world with competing priorities, moreover, it is relevant to pay attention to the utility of technological developments. Utility requires maximising the value of the outcomes provided by applying AI. To measure utility, it is convenient to look at opportunity costs, i.e. to consider the value of the alternatives that are foregone. In the case of energy expenditure, this issue is apparent. As some researchers denounce, the colossal expenditure of energy for training AI is embarrassing when in high-resource countries there are still families that have problems heating their homes (Strubell et al. 2019). The value of the very purpose of some specific AI developments must also be questioned. Consider the case of AlphaGo – the powerful AI programme of Google’s DeepMind that outperforms human professionals in the board game of Go. Strikingly, the best version of AlphaGo required “1,920 CPUs and 280 GPUs to play a single game of Go, with an estimated cost to reproduce this experiment of $35,000,000” (Schwartz et al. 2020). Its corresponding emissions are alarming: 40 days of research training would have generated around 96 tonnes of CO2, which is equivalent to a carbon footprint of 23 American homes or 1000 h of air travel (Van Wynsberghe 2021). Is the utility of AlphaGo greater than what could be achieved if we allocated those resources to other ends? To paraphrase Aimee van Wynsberghe (2021), in a world with approximately 600 million people without access to modern electricity, training extremely energy-dependent AI models to beat the world champion of Go is a questionable accomplishment that requires further societal discussion.
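As a rough illustration of the kind of reporting that the transparency and utility arguments above call for, the following sketch estimates the operational carbon footprint of a training run from its energy use; all the figures (GPU count, average power draw, PUE, and grid carbon intensity) are assumed placeholder values, not measurements from any of the studies cited.

```python
def training_co2e_kg(num_gpus, avg_gpu_power_w, hours, pue=1.5, kg_co2e_per_kwh=0.4):
    """Accelerator energy, scaled by an assumed data-centre overhead (PUE) and
    converted with an assumed grid carbon intensity. Illustrative values only."""
    energy_kwh = num_gpus * avg_gpu_power_w * hours / 1000
    return energy_kwh * pue * kg_co2e_per_kwh

# Hypothetical 30-day run on 512 GPUs drawing 300 W on average.
print(round(training_co2e_kg(512, 300, 30 * 24) / 1000, 1), "tonnes CO2e")
```

A calculation of this kind captures only the operational footprint; the embodied footprint of the hardware itself, which Wu et al. (2022) also ask researchers to report, requires additional information about manufacturing and supply chains.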

11.6 Conclusions In this chapter, we have attempted to delve into the relationship of AI with sustainability and environmental ethics. Some researchers and developers claim that machine learning can help achieve sustainable development. Therefore, AI could play an important role for sustainability. In contrast, other scholars highlight the importance of addressing the sustainability of AI, because its development and application have undeniable energy and environmental costs. Hence, analysing AI systems ethically cannot leave aside the search for why and how it should be “greener”. To face the question of how to ethically green AI, we first considered it appropriate to outline three representative sustainability models in discussions about how to develop our human societies and economies without destroying nature. We then mapped the different approaches prevalent in the environmental ethics literature to deal with the question of why AI should be greener or more sustainable. Here, we have identified that the different frameworks of environmental ethics have certain theoretical links with the models of sustainability presented above, indirectly engaging more with one or the other. Finally, we have presented some ethical values that we consider helpful to reasonably guide human criteria on how and why to make AI more sustainable, and, based on them, to assess the balance of AI morally. Last but not least, we think that future research should increasingly consider the sustainability of AI from environmental ethics approaches. To consider whether particular developments of AI are good or fair, adopting largely unattended normative frameworks from environmental ethics can help to move the debate forwards. In particular, environmental ethics can reinforce the commitment to more robust

models of sustainability and discuss the energy and ecological trade-offs of AI in more depth from more-than-human-centred perspectives, which should mean looking beyond narrow human interests and adopting more holistic viewpoints. Acknowledgments We are grateful for the insightful comments of Jan Deckers on a previous version of this chapter. Cristian Moyano-Fernández thanks the funding of the project “Ética del Rewilding en el Antropoceno: Comprendiendo los Escollos de Regenerar Éticamente lo Salvaje (ERA-CERES)”, with reference PZ618328 / D043600, funded by Fundación BBVA; and the project “La solidaridad en bioética (SOLBIO)” with reference PID2019-105422GB-100, funded by the Spanish Ministry of Science and Innovation. Jon Rueda also thanks the funding of the research project EthAI+3 (Digital Ethics. Moral Enhancement through an Interactive Use of Artificial Intelligence) of the State Research Agency of the Spanish Government (PID2019-104943RB-I00), the project SOCRAI3 (Moral Enhancement and Artificial Intelligence. Ethical aspects of a virtual Socratic assistant) of FEDER Junta de Andalucía (B-HUM-64-UGR20), and an INPhINIT Retaining Fellowship of the La Caixa Foundation (LCF/BQ/DR20/11790005).

References Aruga, K. 2022. Future issues of environmental economics. In Environmental and natural resource economics, 79–92. Cham: Springer. Barrow, M.V. 2009. Nature’s ghosts: Confronting extinction from the age of Jefferson to the age of ecology. Chicago: University of Chicago Press. Bender, E.M., T. Gebru, A. McMillan-Major, and S. Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 610–623. ACM. Bentham, J. 1996 [1789]. An introduction to the principles of morals and legislation. New York: Oxford University Press. Bookchin, M. 1987. Social ecology versus deep ecology: A challenge for the ecology movement. In Green perspectives: Newsletter of the green program project, 4–5. Callicot, B. 1989. In defense of the land ethic. Essays in environmental philosophy. State University of New York Press. Coeckelbergh, M. 2021. AI for climate: Freedom, justice, and other ethical and political challenges. AI and Ethics 1 (1): 67–72. Croesser, E. 2020. Ecosocialism and climate justice. An ecological neo-Gramscian analysis. London: Routledge. Dauvergne, P. 2020. Is artificial intelligence greening global supply chains? Exposing the political economy of environmental costs. Review of International Political Economy 29 (3): 696–718. https://doi.org/10.1080/09692290.2020.1814381. de Wildt, T.E., I.R. van de Poel, and E.J.L.  Chappin. 2022. Tracing long-term value change in (energy) technologies: Opportunities of probabilistic topic models using large data sets. Science Technology and Human Values 47 (3): 429–458. https://doi.org/10.1177/01622439211054439. Dhar, P. 2020. The carbon impact of artificial intelligence. Nature Machine Intelligence 2 (8): 423–425. Elkington, J. 1994. Towards the sustainable corporation: Win-win-win business strategies for sustainable development. California Management Review 36: 90–100. European Commission, Directorate-General for Research and Innovation, European Group on Ethics in Science and New Technologies. 2018. Statement on artificial intelligence, robotics and ‘autonomous’ systems: Brussels, 9 March 2018, Publications Office. Available at: https:// data.europa.eu/doi/10.2777/531856. Accessed 23 Feb 2023.

Floridi, L., J. Cowls, M. Beltrametti, R. Chatila, P. Chazerand, V. Dignum, et al. 2018. AI4People— An ethical framework for a good AI society: Opportunities, risks, principles, and recommendations. Minds and Machines 28: 689–707. Giddings, B., B.  Hopwood, and G.  O’brien. 2002. Environment, economy and society: Fitting them together into sustainable development. Sustainable Development 10 (4): 187–196. Goodpaster, K.E. 1978. On being morally considerable. The Journal of Philosophy 75: 308–325. Griggs, D., M. Stafford-Smith, O. Gafney, and J. Rocktröm. 2013. Policy: Sustainable development goals for people and planet. Nature 495 (7441): 305–307. Hagendorff, T. 2022. Blind spots in AI ethics. AI and Ethics 2 (4): 851–867. Hagendorff, T., L.N. Bossert, Y.F. Tse, and P. Singer. 2022. Speciesist bias in AI: How AI applications perpetuate discrimination and unfair outcomes against animals. AI and Ethics. https://doi. org/10.1007/s43681-­022-­00199-­9. Heilinger, J.C., H. Kempt, and S. Nagel. 2023. Beware of sustainable AI! Uses and abuses of a worthy goal. AI and Ethics: 1–12. Hiller, A., R.  Ilea, and L.  Kahn. 2014. Consequentialism and environmental ethics. New  York: Routledge. Johnson, L.E. 1991. A morally deep world. An essay on moral significance and environmental ethics. Cambridge: Cambridge University Press. Kortetmäki, T., A. Heikkinen, and A. Jokinen. 2022. Particularizing nonhuman nature in stakeholder theory: The recognition approach. Journal of Business Ethics. https://doi.org/10.1007/ s10551-­022-­05174-­2. Kuhlman, T., and J. Farrington. 2010. What is sustainability? Sustainability 2 (11): 3436–3448. https://doi.org/10.3390/su2113436. Lovelock, J.E. 1972. Gaia as seen through the atmosphere. Atmospheric Environment 6 (8): 579–580. https://doi.org/10.1016/0004-­6981(72)90076-­5. Lucivero, F. 2020. Big data, big waste? A reflection on the environmental sustainability of big data initiatives. Science and Engineering Ethics 26 (2): 1009–1030. Malthus, T. 1798. An essay on the principle of population. London: Pickering & Chatto Publishers. Masanet, E., A. Shehabi, N. Lei, S. Smith, and J. Koomey. 2020. Recalibrating global data center energy-use estimates. Science 367 (6481): 984–986. Meadows, D.H., D.L.  Meadows, J.  Randers, and W.W.  Behrens. 1972. The limits to growth. Washington DC: Potomac Associates, New American Library. Midgley, M. 1996. Duties concerning islands. In Environmental ethics, ed. R.  Elliot, 89–103. Oxford University Press. Mohamed, S., M.T. Png, and W. Isaac. 2020. Decolonial AI: Decolonial theory as sociotechnical foresight in artificial intelligence. Philosophy and Technology 33: 659–684. https://doi. org/10.1007/s13347-­020-­00405-­8. Monasterio Astobiza, A., M. Toboso, M. Aparicio, and D. López. 2021. AI ethics for sustainable development goals. IEEE Technology and Society Magazine 40 (2): 66–71. Mulligan, C., and S. Elaluf-Calderwood. 2022. AI ethics: A framework for measuring embodied carbon in AI systems. AI and Ethics 2: 363–375. https://doi.org/10.1007/s43681-­021-­00071-­2. Naess, A. 1984. A defense of the deep ecology movement. Environmental Ethics 6: 265–270. Nash, R.F. 1989. The rights of nature. A history of environmental ethics. University of Wisconsin Press, Wisconsin. Owe, A., and S.D. Baum. 2021. Moral consideration of nonhumans in the ethics of artificial intelligence. AI and Ethics 1: 517–528. https://doi.org/10.1007/s43681-­021-­00065-­0. Passmore, J. 1974. Man’s responsibility for nature: Ecological problems and Western traditions. 
London: Gerald Duckworth & Co. Patterson, D., J. Gonzalez, Q. Le, C. Liang, L.M. Munguia, D. Rothchild, D. So, M. Texier, and J. Dean. 2021. Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350. Polimeni, J.M., K. Mayumi, M. Giampietro, and B. Alcott. 2009. The myth of resource efficiency: The Jevons paradox. New York: Routledge.

Raworth, K. 2017. Doughnut economics: Seven ways to think like a 21st-century economist. New York: Random House Business. Richie, C. 2022. Environmentally sustainable development and use of artificial intelligence in health care. Bioethics 36 (5): 547–555. https://doi.org/10.1111/bioe.13018. Robinson, D.M. 2007. Emerson, Thoreau, fuller, and transcendentalism. American Literary Scholarship: 3–34. Rolston, Holmes, III. 1988. Environmental ethics: Duties to and values in the natural world. Philadelphia: Temple University Press. Romero, J., and J.S. Dryzek. 2021. Grounding ecological democracy: Semiotics and the communicative networks of nature. Environmental Values 30 (2): 407–429. Rueda, J., J. Delgado Rodríguez, I. Parra Jounou, J. Hortal, T. Ausín, and D. Rodríguez-Arias. 2022. “Just” accuracy? Procedural fairness demands explainability in AI-based medical resource allocations. AI & Society. https://doi.org/10.1007/s00146-­022-­01614-­9. Sætra, H.S. 2021. AI in context and the sustainable development goals: Factoring in the unsustainability of the sociotechnical system. Sustainability (Switzerland), 13 (4): 1–19. https://doi. org/10.3390/su13041738. Schlosberg, D. 2014. Ecological justice for the anthropocene. In Political animals and animal politics, ed. M. Wissenburg and D. Schlosberg, 75–89. Palgrave Macmillan. Schwartz, R., J. Dodge, N.A. Smith, and O. Etzioni. 2020. Green AI. Communications of the ACM 63 (12): 54–63. Schweitzer, A. 1923. Civilization and Ethics. London: A&C Black. Semiconductor Review. 2020 How can the Semiconductor Industry Achieve Sustainability? Semiconductor Review, 12. Available at: https://www.semiconductorreview.com/news/how-­ can-­the-­semiconductor-­industry-­achieve-­sustainability%2D%2Dnwid-­135.html. Accessed 22 Feb 2023. Shiva, V., and M. Mies. 2014. Ecofeminism. London: Zed Books. Singer, P. 1975. Animal liberation: A new ethics for our treatment of animals. New  York: New York Review. Singer, P., and Y.F. Tse. 2022. AI ethics: The case for including animals. AI and Ethics. https://doi. org/10.1007/s43681-­022-­00187-­z. Steffen, W., K.  Richardson, J.  Rockström, S.E.  Cornell, I.  Fetzer, E.M.  Bennett, et  al. 2015. Planetary boundaries: Guiding human development on a changing planet. Science 347 (6223): 1259855. https://doi.org/10.1126/science.1259855. Strubell, E, A. Ganesh, and A. McCallum. 2019. Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243. Sustainable Aotearoa New Zealand (SANZ). 2009. Strong sustainability for New Zealand: Principles and scenarios. Nakadize Limited. Van de Poel, I. 2021. Design for value change. Ethics and Information Technology 23 (1): 27–31. Van Wynsberghe, A. 2021. Sustainable AI: AI for sustainability and the sustainability of AI. AI and Ethics 1 (3): 213–218. World Commission on Environment and Development (WCED). 1987. Our common future. New York: Oxford University Press. Wu, C.J., R. Raghavendra, U. Gupta, B. Acun, N. Ardalani, K. Maeng, et al. 2022. Sustainable AI: Environmental implications, challenges and opportunities. Proceedings of Machine Learning and Systems 4: 795–813. Zhu, M., J. Wang, X. Yang, Y. Zhang, L. Zhang, H. Ren, et al. 2022. A review of the application of machine learning in water quality evaluation. Eco-Environment & Health 1 (2): 107–116. https://doi.org/10.1016/j.eehl.2022.06.001.

Chapter 12

The Singularity, Superintelligent Machines, and Mind Uploading: The Technological Future? Antonio Diéguez and Pablo García-Barranquero

Abstract This chapter discusses the question of whether we will ever have an Artificial General Superintelligence (AGSI) and how it will affect our species if it does so. First, it explores various proposed definitions of AGSI and the potential implications of its emergence, including the possibility of collaboration or conflict with humans, its impact on our daily lives, and its potential for increased creativity and wisdom. The concept of the Singularity, which refers to the hypothetical future emergence of superintelligent machines that will take control of the world, is also introduced and discussed, along with criticisms of this concept. Second, it considers the possibility of mind uploading (MU) and whether such MU would be a suitable means to achieve (true) immortality in this world—the ultimate goal of the proponents of this approach. It is argued that the technological possibility of achieving something like this is very remote, and that, even if it were ever achieved, serious problems would remain, such as the preservation of personal identity. Third, the chapter concludes by arguing that the future we create will depend largely on how well we manage the development of AI. It is essential to develop governance of AI to ensure that critical decisions are not left in the hands of automated decision systems or those who create them. The importance of such governance lies not only in avoiding the dystopian scenarios of a future AGSI but also in ensuring that AI is developed in a way that benefits humanity.

12.1 Introduction Many of the opinions that can be read about the Singularity express strong convictions about the future of our species and how the emergence of superintelligent machines will change it. However, the question is sufficiently complex for us to be

cautious about categorical positions. Will we ever have an Artificial General Superintelligence (AGSI), and not merely a particular or narrow one—which we have now—only capable of solving concrete, specific, and well-delimited problems? If we have it, will it collaborate with us or kill our species? If we have the possibility of living with it, will it relegate us to a life of leisure and perhaps of practical uselessness, or, on the contrary, will it enhance our creativity and be a means to enrich our lives and increase our wisdom? In this chapter, we will briefly outline how AGSI has been characterized, we will discuss some proposed definitions, and we will provide a general description of this research field. Supposedly, the emergence of AGSI would lead to the advent of the Singularity, a scientific-technological concept that we will explain in detail (along with its best-known criticisms many experts have offered in recent decades). Finally, we will end with speculative comments on the possibility of mind uploading (MU) and whether such MU would be a suitable means to achieve (true) immortality in this world—the ultimate goal of the proponents of this approach.

12.2 The Advent of the Singularity: Raymond Kurzweil’s Predictions 12.2.1 Is the Singularity Near? Despite undoubted advances in recent years, the current limitations of Artificial Intelligence (AI) are well-known (Larson 2021; Marcus and Davis 2019). Therefore, it is surprising that some opinion makers, including notable scientists, take it for granted that an AGSI will eventually dominate the world and displace humans. For instance, Raymond Kurzweil (2005) has revived the idea of the Singularity and has even put a date on such an event. The Singularity, however, is something so strangely singular that, despite the popularity of the term after appearing in movies such as Ex Machina and having given its name to a university, there is no agreement about what it is (Sandberg 2013). Kurzweil uses it to designate the advent at some future time of the first superintelligent system capable of improving itself or capable of manufacturing other systems more intelligent than itself, which in turn will be able to do the same, and so on. All this will produce an exponential growth of intelligence, and eventually superintelligent machines will take control of the world. The term “Singularity” has a long-established use in physics. In the general theory of relativity, a space-time singularity designates a system or state in an area of space-time in which the known laws of physics do not apply. Examples such as the first moments after the Big Bang or the inside of a black hole illustrate this. There is, however, no direct connection, but rather a mere analogy between this use of the term and the sense in which Kurzweil uses it. Implicit in both cases is the idea of discontinuity and unpredictability. In the first case, events cannot be established according to our physical standards. In the second one, the effects of AGSI would

be incommensurable regarding those produced so far by the deployment of human intelligence. In 1993, in an essay entitled “The Coming Technological Singularity: How to Survive in the Post-Human Era”, the writer and mathematician Vernor Vinge used the term to refer to this hypothetical future explosion of machine intelligence.1 Such an event would have enormous repercussions for humans. Vinge states that it would mean the end of our era. But years earlier, one of the first to note the possibility of an AI explosion, Irving J. Good, had already written: “Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control” (Good 1966, p. 33). Kurzweil is the great promoter of this idea. He is well-known for his bold predictions. He maintains, for example, that in the year 2029, a machine will pass the Turing Test and therefore show intelligence equal to human intelligence. He also estimates that the advent of the Singularity will be around 2045. Kurzweil believes that within 15 years, starting in 2030, intelligent machines will perfect themselves so much that everything will be under their control, including the material and energy resources necessary to maintain their growth, therefore beginning their cosmic expansion. Human civilization will then end, and a post-biological era under the domination of machines will begin. But if it happens so soon, why do we not already see any clear signs of an event of such magnitude? The answer is simple: exponential growth—what is inappreciable at a given moment becomes, in a short time, colossal in its magnitude. Kurzweil’s proposal to avoid our disappearance or annulment at the expense of superintelligent machines is the human-machine integration. Of course, the future will belong to superintelligent machines, but these will have an intelligence whose origin would be human, because it would have arisen from the enhancement of human intelligence itself. Kurzweil describes the process in these words: The Singularity will represent the culmination of the merger of our biological thinking and existence with our technology, resulting in a world that is still human but that transcends our biological roots. There will be no distinction, post-Singularity, between human and machine or between physical and virtual reality. If you wonder what will remain unequivocally human in such a world, it’s simply this quality: ours is the species that inherently seeks to extend its physical and mental reach beyond current limitations (Kurzweil 2005, p. 9).

There is, of course, good reason to wonder what substantial difference there will be for us between a future in which superintelligent machines would have annihilated us and a future in which they would have incorporated our mind, diluting it into a cosmic intelligence. What exactly is human about such a future? David Chalmers, in an extensive and influential paper, warns us: If there is a singularity, it will be one of the most important events in the history of the planet. An intelligence explosion has enormous potential benefits: a cure for all known diseases, an end to poverty, extraordinary scientific advances, and much more. It also has enormous potential dangers: an end to the human race, an arms race of warring machines, the power to destroy the planet. So if there is even a small chance that there will be a singularity, we would do well to think about what forms it might take and whether there is anything we can do to influence the outcomes in a positive direction (Chalmers 2010, p. 10).

1  Similarly, the term had already been used orally by John von Neumann in 1957 in a conversation with the physicist Stanislaw Ulam. In addition to Vinge and Kurzweil, other authors who have developed the content of the term include David J. Chalmers (2010) and Anders Sandberg (2013), as well as several of those writing in the collective book by Amnon H. Eden et al. (2012).

Chalmers argues that the Singularity is certain to happen sooner or later. He maintains that once we have AI (and almost everyone assumes that we already do), it is only a matter of time (he suggests the date of 2100) before we have an AI of higher-than-human capacity. Once we have higher-than-human AI, it is a matter of time before we have an AGSI, which is the one that will trigger the Singularity. In his extensive paper, he analyzes the problem of whether this phenomenon can be controlled so that it does not harm humans, intentionally or unintentionally. His idea is to insert our moral values into machines as we develop their intelligence and, as an initial precautionary measure, build the first superior-to-human AI systems in virtual worlds, not directly in real machines. However, this would only be a temporary solution because Chalmers believes that an AGSI would eventually know how to escape into the real world.

12.2.2 From Moore’s Law to Law of Accelerating Returns Kurzweil bases his assurance that the Singularity is near and that it will also be an extremely fast process on Moore’s Law, named after Gordon E. Moore, the engineer who formulated it in 1965 and who later co-founded Intel. Then, expanding on it, Kurzweil establishes his Law of Accelerating Returns (LAR). Moore’s law states that the number of transistors that can be placed on a microprocessor (or, in short, the computational power of computers) doubles in periods ranging from 18 months to 2 years, and thus has exponential growth. Although this is its initial formulation, after decades of going unnoticed, it has subsequently been extrapolated—on more than debatable grounds on many occasions—to various technologies, always predicting, of course, the exponential growth of the technology in question. This thesis, however, is far from being able to support Kurzweil’s claims on its own, as evidenced by the fact that Moore has stated that, in his opinion, the Singularity will never occur. Moreover, even if it continues to be fulfilled for years to come, the law expresses a contingent regularity. There is no scientific basis for believing it could express any stronger regularity like those embodied in genuine scientific laws. Moore presents it not as a law but as an aspiration to be maintained in the microelectronics industry over subsequent years. But it is a difficult aspiration. It has become popular among computer scientists to joke that the number of researchers predicting the death of Moore’s law doubles every 2 years. The appellation of law does not rightfully apply to it. No physical cause or underlying mechanism compels such regularity to continue to hold, even in a probabilistic way. And obviously, there are physical limits to the postulated growth. For example, in

no finite system can there be exponential growth that does not reach its plateau phase sooner or later, if it does not end up decaying. This is something that Kurzweil (2005) knows, of course, and admits in the specific case of Moore’s law but does not consider it applicable to the LAR. To save the idea that exponential growth will continue beyond the point at which the plateau phase is reached, Kurzweil resorts to the Kuhnian notion of paradigm shifts. He applies it not to scientific change, as did Thomas Kuhn (1970), but to technological change, specifically to changes in the paradigms of computer technologies. When one of these technologies stagnates, a change allows a new technological paradigm to take over and act as a new growth engine. He is confident that this will happen with quantum computing. But even this exponential growth, periodically boosted by technological paradigm shifts, would have to cease at some point. What is important to determine then is whether the stagnation would occur before or after the AI generated up to that point would have a real impact on the lives of human beings, and in particular, whether that AI would be in a position to take control of those lives. It is not clear whether this plateau phase in the logistic curve that stabilizes what has been achieved during the growth phase will occur at a level where superintelligent machines are so sophisticated as to acquire the will to replace humans in the control of our planet. Kurzweil (2005, p. 499), however, takes this for granted. He states this with utter conviction but with no argument other than his confidence in the LAR. In fact, neither Moore’s law nor the LAR are physical (or chemical or biological) laws, but at most, historical “laws”. This type of “laws” was already convincingly discredited by Karl Popper (1957). Furthermore, the alleged historical “laws” are only trends maintained as long as they are indeed maintained and nothing more. For Moore’s law, evidence is accumulating that its validity can no longer be sustained. In February 2016, the journal Nature published a paper on the matter that began with the following statement: “Next month, the worldwide semiconductor industry will formally acknowledge what has become increasingly obvious to everyone involved: Moore’s law, the principle that has powered the information-technology revolution since the 1960s, is nearing its end” (Waldrop 2016, p. 145). The paper explained that the main reason for this slowdown was economic, but it was not the only one. The end of Moore’s law does not necessarily mean the end of progress in processing power. On the contrary, new types of computers are on the horizon and they will bring great surprises. However, the next objective of the companies that manufacture them may be not so much the quantitative increase in power as diversification and adaptation to the complex needs of users. In any case, skepticism about the idea of prolonged exponential growth in technological development is not solely related to the belief in the current decline in technological inventiveness. Indeed, it is enough to consider that technologies are very diverse and that while some advance rapidly, others may stagnate or slow down after growing for some time. Moreover, we must also consider the material and especially the cultural constraints this development may encounter.
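The contrast drawn here between open-ended doubling and growth that saturates can be made concrete with a small numerical sketch; the doubling period and the ceiling below are arbitrary illustrative choices, not empirical estimates of any technology.

```python
import math

def exponential(t, doubling_period=2.0):
    """Moore-style growth: capacity doubles every fixed period, without limit."""
    return 2 ** (t / doubling_period)

def logistic(t, doubling_period=2.0, ceiling=1e6):
    """Same early doubling, but saturating at a finite ceiling (plateau phase)."""
    k = math.log(2) / doubling_period
    return ceiling / (1 + (ceiling - 1) * math.exp(-k * t))

for years in (10, 20, 40, 60):
    print(years, round(exponential(years)), round(logistic(years)))

# Early on the two curves are nearly indistinguishable; only later does the
# logistic curve flatten out, which is why extrapolating from early data
# cannot by itself settle whether growth will continue or plateau.
```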

Precisely for these reasons, predictions about the future of technology have so often been mistaken.2 We do not have the flying cars of the movie Blade Runner, and we have cell phones connected to the Internet instead of cumbersome video phones fixed to the wall, like the one Rick Deckard uses to call Rachael from a bar. But, in the end, these also constitute reasons to reject the technological determinism that tries to convince us that technology has long been out of our hands and that its internal logic leads autonomously to a development whose pace and direction we have no control over.

12.3 The Roadmap to Superintelligent Machines 12.3.1 Concerns and Uncertainties But there is still a fundamental question: Do all these technological advances lead undeniably to creating an AGSI? Even leaving the difficult problem of consciousness aside (Chalmers 1996), or the objections to the possibility of creating an AI in the strict sense raised by the philosophers Hubert Dreyfus (1992) and John Searle (1984), or by the physicist Roger Penrose (1989), the primary problem remains to determine whether an increase in the computing power of machines, however great it may be, is sufficient to generate intelligence in the sense in which the singularists claim. It is one thing if we can increase (even enormously) that computing power and processing speed, and another if that will give rise to something like a mechanical brain with intelligence comparable to or greater than human intelligence. Moreover, it is far from clear that all the problems that an intelligent entity must face are computable and, therefore, that an intelligent machine can solve them by operating with algorithms alone. Instead, good reasons exist to think this is not the case. AI has provided more than remarkable achievements, such as the IBM-developed Watson system, which won the TV quiz show Jeopardy!, the AlphaGo program, which has beaten human Go champions, Google’s powerful search algorithms, or ChatGPT.  It has not succeeded—nor is it expected to do so in the foreseeable future—in creating machines with general and versatile intelligence, sensitive to changes in context, applicable to diverse fields, and capable of setting their own goals. Its results have been limited to particular areas of application. An AI system can be much more intelligent than a human being in performing a specific task but not in anything else. No AI system currently has a general capability applicable to any problem, as with human intelligence. While there are no solid reasons to completely rule out that it will ever be possible to achieve it—although there are substantial reasons to rule out that algorithmic procedures can reach it—there are also no compelling reasons to think that it will become a reality in the foreseeable future.

2  For a critique of the scientific-technological basis of the Singularity, see Modis (2006, 2012).

Engineers and scientists working on AI tend to take a cautious view. Quite a few, it seems, believe that we will have a machine with human-like intelligence before the end of the century (Bostrom 2014; Müller and Bostrom 2016; Stein-Perlman et al. 2022), but when they speak out in academic publications, their statements are often far from the radical views of the singularists. For example, in September 2016, under the auspices of Stanford University, a report prepared by a panel of experts on the past and future of AI was published, which states the following: The frightening, futurist portrayals of Artificial Intelligence that dominate films and novels, and shape the popular imagination, are fictional. In reality, AI is already changing our daily lives, almost entirely in ways that improve human health, safety, and productivity. Unlike in the movies, there is no race of superhuman robots on the horizon or probably even possible (Grosz et al. 2016, p. 6).

Some renowned specialists are highly skeptical of these announcements and promises, and dispute them with the same conviction with which the above defend them. Thus, Luc Julia, the creator of Siri, writes on the field of AI: All this has led to a misunderstanding about even the name given to the discipline, which, as we have seen, has nothing to do with intelligence. I argue that there is no such thing as artificial intelligence. If we want to keep the acronym, AI should no longer stand for ‘artificial intelligence’, but for ‘augmented intelligence’ (Julia 2019, pp. 122–123).

Erik Larson, a computer scientist and entrepreneur, makes this clarification: We might also give further voice to a reality that increasing numbers of AI scientists themselves are now recognizing, if reluctantly: that, as with prior periods of great AI excitement, no one has the slightest clue how to build an artificial general intelligence (Larson 2021, p. 275).

However, significant advances towards Artificial General Intelligence (AGI) are announced from time to time. A recent one is Gato, a multimodal system developed by DeepMind, which has been presented as a “precursor of AGI” and as a “generalist agent”. Using the same neural network with the same weights, Gato can perform 604 different tasks, including recognizing images, controlling a robotic arm, playing Atari, or chatting. Gary Marcus (2022) points out that, although Gato can perform many different tasks, when faced with a new task, it would not be able to analyze it logically, to reason about it, and to connect this new task with the others, understanding that there are relevant implications between them despite belonging to very different contexts. However, something like that would be possible if it truly understood what it was doing. According to Marcus and other critics, it cannot be said that Gato has a better understanding of the world than the systems so far in use. For example, in image recognition—a task in which machines already outperform humans—it still recognizes relationships between pixels but has yet to understand what the image means. Nor does it really understand the texts it reads or produces. AlphaZero became the best chess player in the world with 9 h of training, but it does not know what chess is and whether it is appropriate to play it during a fire. It is worth clarifying, on the other hand, that we could ever have AGI without it being superior to human intelligence, at least in many relevant aspects. Indeed, in order to have AGSI one  day, i.e., an AI far superior to human intelligence and

creativity in all aspects, not only would it be necessary to have AGI first, but it would also have to be able to create machines more intelligent than itself, leading to exponential growth (Kurzweil 2005). And we cannot take this for granted, at least in the sense that these improvements are unlimited. One thing is that we can use an algorithm to improve the efficiency of the solutions to a problem (genetic algorithms and genetic programming) or that we have algorithms capable of improving their code. It is quite another thing for an algorithm to be able to create machines with human-like intelligence through a process of improvement for the development of specific tasks. As Roman Yampolskiy says: “[a]lthough performance of the software as a result of such optimization may be improved, the overall algorithm is unlikely to be modified to a fundamentally more capable one” (Yampolskiy 2016, p. 84). Despite the more enthusiastic who think otherwise, critics believe that, in order to achieve superintelligent machines, it is not enough to increase the computational capacity of hardware and the amount of data processed. Nor does it appear at the moment that genetic algorithms or other existing AI tools alone can ever produce true AGI (Braga and Logan 2017).

12.3.2 The Future of Superintelligence by Nick Bostrom Whether or not it is worthwhile to take the research leading to AGI beyond a certain point will depend to a large extent on whether we are able to develop the means to avoid the dystopian scenarios that Hans Moravec (1988), Marvin Minsky (1958), Kurzweil (2005) himself and others present to us as inevitable and even—surprisingly—as desirable, at least for them. Because what seems clear is that, if the creation of an autonomous and conscious AGSI were possible and these authors are correct in their predictions, this would not bring anything positive for humans. If we are to believe them, even if such AGSI did not attempt to annihilate us, its growth and deployment, or the inescapable attainment of its objectives—as in the well-worn example of the machine dedicated to manufacturing as many office paper clips as possible at all costs—could present a threat to our species (Agar 2010). Perhaps its hostility would not destroy us, but its indifference would. In this case, we should limit the implementation of advances in AI to the development of systems capable of performing specific tasks and forever abandon the project of creating an AGSI. It is not an eccentricity of crackpot professors that a manifesto signed by many great scientists and engineers is already circulating online in favor of stricter control of AI research so that measures can be taken in time against the risks involved in its development. For example: (1) the manufacture of weapons capable of making autonomous decisions; or (2) the severe privacy problems we are facing, among others (Agar 2019; Véliz 2020). This is the first step. While bioethics is a discipline that has reached a remarkable level of development, we are still far from a mature techno-ethics.

Kurzweil’s theses have aroused very mixed reactions.3 On the one hand, he has devoted followers who consider him almost a spiritual guru and spread his ideas with a somewhat naive enthusiasm and conviction. A more recent and better-­justified defense of the possibility that human beings may shortly create an AGSI is that made by Nick Bostrom (2014). Bostrom does not call this the “Singularity”, nor does he agree with Kurzweil regarding its characteristics. Rather, Bostrom acknowledges that the difficulties in this project are much more significant than previously thought. His portrait of the situation is less confident than Kurzweil’s. The possibility of superintelligent machines is, he tells us: “quite possibly the most important and most daunting challenge humanity has ever faced, and – whether we succeed or fail – it will probably be the last challenge we face” (Bostrom 2014, p. 7). Precisely because the risks involved are enormous (although he also sees possible benefits), it is a possibility that should concern us. Bostrom criticizes those who, working in this research field, have taken the matter lightly or have been unable or unwilling to see any problem with it: The AI pioneers for the most part did not countenance the possibility that their enterprise might involve risk. They gave no lip service–let alone serious thought–to any safety concern or ethical qualm related to the creation of artificial minds and potential computer overlords: a lacuna that astonishes even against the background of the era’s not-so-impressive standards of critical technology assessment (Bostrom 2014, p. 5).

The effort Bostrom devotes in his book to the intricate question of how we might prevent an AGSI from being adverse to human interests, purposes, and values is particularly noteworthy. The control mechanisms would all be useless unless AGSI had somehow internalized that our human interests should be respected and promoted. Still, Bostrom’s reflections on the possible ways of infusing ethical behavior into it (“coherent extrapolated volition” and “indirect normativity”) are, in practice, a recognition of the futility of the attempt (Agar 2016). The achievement of AGSI could occur in a single decisive step that could not be reversed. In his words: “attempts to build artificial general intelligence might fail pretty much completely until the last missing critical component is put in place, at which point a seed AI might become capable of sustained recursive self-improvement” (Bostrom 2014, p. 29). And Bostrom is not alone in believing in this impossibility of control or confinement of a future AGSI (Alfonseca et al. 2021; Russell 2020). It would certainly be unwise to expect that an AGSI whose situation in the world, goals, “embodiment”, experience, history, socialization (or lack thereof), and emotions (if it has any) are so far removed from all the attributions we can make about these traits in the case of our species will nevertheless end up developing ethical behavior similar to ours. At least Bostrom has the intellectual honesty to admit that creating an AGSI could be an existential risk for humans (even if it could also contribute to diminishing other existential risks) and that we would be practically unarmed in the face of its power (Bostrom 2002). Bostrom is trying to warn us of

3  Piero Scaruffi (2013) proposes an interesting and well-documented critical analysis.

the danger and suggests that it would be better to slow down AI research until we are sure we can solve the control problem. Nevertheless, there remains much in common between Kurzweil’s position and that of Bostrom. Both are convinced that AGSI will arrive in the not-too-distant future, will come suddenly, and will dominate humans if they do not take radical measures. For instance, to merge with it, in the case of Kurzweil, or, in the case of Bostrom, to somehow insert into it an ethical behavior sensitive to the purposes and values of humans. The problem is that the current state of AI research does not allow such predictions to be made rigorously. As a result, the scenarios drawn by Kurzweil and Bostrom are based more on their futuristic visions than on real data. There are no irrefutable reasons to think that creating an AGSI is not and will never be possible. Proving that would be as much as proving that AGSI is impossible a priori, and the arguments put forward by critics fail to do so. But it is one thing if it cannot be proved that it is impossible and quite another if it is guaranteed that we will have it. It could be a perfectly possible entity, and yet the difficulties in creating it could be so great that we would never be able to make it a reality. As Luciano Floridi (2016) rightly states: “True AI is not logically impossible, but it is utterly implausible. We have no idea how we might begin to engineer it, not least because we have very little understanding of how our own brains and intelligence work”.

12.4 What if We Can Live Forever? Dreams of Digital Immortality

12.4.1 Types of MU: The Analysis of David Chalmers

The belief that superintelligent machines await us in the not-too-distant future is usually complemented by another view that has achieved a certain popularity: the possibility of mind uploading (MU) as a means to achieve (true) immortality (Diéguez 2021; García-Barranquero 2021). Kurzweil is one of the main supporters of this idea. For him, the ultimate solution to the biological problem of the limitation of our life is to transfer our minds into intelligent machines. This MU process need not be a one-time event. The most detailed analysis of this issue has been provided by Chalmers (2010), who distinguishes between destructive, gradual, and non-destructive uploading.4 Briefly:

1. Destructive uploading involves the destruction of the original brain at the very moment of uploading.
2. Gradual uploading, which would supposedly be achieved by nanotechnology, would consist in replacing all our neurons, one by one, with artificial neurons that had beforehand copied the exact functioning of the neuron they replace. It could also consist in the functional uploading of each neuron into a machine connected to the brain, so that the machine would take over the function of the uploaded neurons and could replace them. In a way, this is a modality of destructive uploading, since the substituted neurons are supposed to be destroyed. Still, there would be the possibility (however remote) that the substituted neurons were preserved and rejoined with all the gradually replaced neurons, thus finally recomposing the original brain. It is even conceivable that the brain could be copied neuron by neuron while leaving it intact in the process.
3. Non-destructive uploading would consist in the (gradual or simultaneous) copying of the entire brain structure and the information contained therein, without destroying the original brain.

However, before we can lend ourselves to uploading, we must have an answer to two crucial questions: (1) will I survive the upload or, in other words, will my uploaded version be self-aware; and (2) will I be me, that is, will my personal identity have been preserved?5 Chalmers believes that both questions can be answered positively, especially in the case of gradual uploading. Still, this is only a hope he holds, based on his functionalist convictions and, by his own admission, on his instinctively optimistic view of technology.

Functionalism is an approach in the philosophy of mind that opposes the identification made by reductionist materialism between the brain and the mind or, to be more precise, between mental and brain processes. For functionalism, mental processes are functional states of organisms. They are therefore not characterized by their material support but by the function they perform within the cognitive system, a function that can be realized in different psychophysical states or material substrates. Another way of saying this is that mental states can be multiply realized. They are internal processes that causally mediate between the inputs and outputs of the system. What matters is not the material basis on which they are realized or executed but the causal relations that some mental states exert on others and that generate behavior. The mind is seen as an informational structure of inputs and outputs, as software, in short, that can run on different hardware without changing its structure. Accordingly, a machine equipped with a program capable of perfectly simulating the pattern of inputs and outputs that a human brain presents when a particular mental process occurs in it (the memory of a face or the vision of a color, for example) also exhibits that mental process (Fodor 2000; Putnam 1981). If functionalism is correct, it would make no difference whether our mind is the result of the functioning of a biological brain, made up of nerve cells, or of a mechanical brain, made up of silicon chips or of other organic or inorganic materials. Assuming that all the functional states of the former can be realized by the latter, both brains would have the same mental processes. Thus it is possible, in principle, for intelligent entities to exist without being endowed with a brain constituted by organic matter; they would have a "brain" of some other composition and structure capable of adopting the same functional states.

4. But see also Sandberg and Bostrom (2008).
5. See, in this sense, Bamford and Danaher 2017; García-Barranquero 2021; Pigliucci 2014.

Indeed, if the type of material support is not essential for having a mind, and if the mind is a set of functional states or of patterns of such functional states, then it would be possible, in principle, to make a copy of our mind that maintained all its functional states, including all our memories, on a non-organic substrate; and, if the copy were sufficiently accurate, that mind would be our mind. This is how transhumanists who believe in the possibility of MU as a means to achieve immortality seem to interpret the matter. But things are a bit more complex (Diéguez 2021; García-Barranquero 2021; Hauskeller 2012). Let's see why.

12.4.2 Will I Still Be Myself in a Virtual World? Problems with Personal Identity

Functionalism rejects the identity between types of mental processes and types of brain processes (type-type identity), as we have just said, and in this sense it rejects the reduction of mental processes to physicochemical processes. However, it accepts the identity between a concrete mental process and a concrete functional state in a physical system, be it an organic brain or a machine (token-token identity). To the extent that a mental process is functionally characterized, it will have non-physical properties and therefore will not be reduced to physicochemical processes. Still, for any mental process to exist, it requires a support capable of presenting the functional state that characterizes that process, since that instance of the mental process consists precisely in a causal structure that the support adopts. This implies that if we have two material supports in two equal functional states, we will have two equal mental states but not a single mental state, for the simple reason that one and the same mental state cannot be identified with two functional states in different supports. Therein lies the materialist or physicalist commitment of functionalism understood in the strict sense.

According to the above, and following functionalism, a machine capable of simulating all the functional states of my brain will have the same mental processes as me (it will be able to remember the same things I remember and form the same judgments I would form), and yet our mental processes will not be identical; that is, I would not be the machine, nor would the machine be me. My concrete mental process is identified with my concrete functional state, and the machine's with its own. This means that an exact mechanical copy of my mind would not be myself, and the fact that such a copy may survive my death does not make me immortal, nor does it diminish one iota the fact that the person I am has ceased to exist (at least in this world) at the moment of death. Functionalism, therefore, is far from providing arguments to support the belief in the possibility of uploading the mind into a machine while maintaining the subject's identity in the process. Instead, it provides arguments to be skeptical about it. Andrew Brook and Robert Stainton have illustrated the point with a striking example:

Imagine that you've gone into the Eternal Life Center to have your body rejuvenated and your mind transferred into the fixed-up body. You climb onto the table, hear some whirring sounds, and then the lights go out. As you descend the table, an embarrassed attendant explains that there's been a slight glitch. He tells you, "The way the technology normally works is this: a new body is created, the information from your brain is put into its brain, and your old body is then destroyed. The problem is, though a new body was created and your program was put into it, the power unfortunately went out before the old body (i.e., you!) could be atomized". Now, they can't let two of you leave the center. So the attendant makes a simple request: "Please return to the table, so we can destroy the old body" (Brook and Stainton 2000, pp. 131–132).

Brook and Stainton state that "many people would resist this request". The reason for this rejection, as these authors see it, is that "having your mind moved to another body may, despite initial appearances, actually be a way of dying, not a way of continuing to live" (Brook and Stainton 2000, p. 132; see also Agar 2010). This statement puts its finger on the problem. It is not only that a copy of my mind would not be myself; it is also that, very possibly, my own mind in another body would not be myself. This is at least what one would have to conclude if we consider personal identity to be something more than the possession of a mind, and a mind to be something more than a set of information or a computer program (Häggström 2021; Hauskeller 2012; Schneider 2019). However, this is precisely what Moravec, for example, explicitly denies. For him, the rejection of the idea that a copy of myself (or, rather, of my mind) is myself, as well as the refusal to accept that I do not die as long as a copy of myself (of my mind) remains alive, stem from a common but erroneous view, which he calls the "body-identity position". According to this view, a person is defined by the material of which the human body is made. Without a human body that maintains continuity, a person ceases to be themselves. Therefore, they would not be themselves in a body other than their own, whether human or mechanical. Against this view, Moravec proposes another one that would save his vision of (true) immortality: the "pattern-identity position". Similar to what functionalism maintains, on this second approach what matters would not be the material we are made of, but "the pattern and the process" that occurs in the body and the brain. "If the process is preserved," he writes, "I am preserved. The rest is mere jelly" (Moravec 1988, p. 117). Kurzweil (2005, p. 371) defends a similar view. For him, we would not cease to be who we are once we have uploaded our minds into a computer; he also believes that the maintenance of the informational pattern is what matters for the maintenance of identity. Moravec compares this situation to what happens in our bodies over time. Within a few years, a human being has replaced every one of the atoms that constituted their body at a given moment. Nevertheless, that human remains the same person despite having completely replaced the matter that made them up. According to Moravec, this would mean that personal identity resides in the only thing preserved in the process, that is, in the pattern or model structure. From this he draws the idea that the mind and the body can be separated, a position that he rightly calls dualistic since, in effect, it goes a step beyond functionalism to fall squarely into dualism. What can we say about this argument?

First, a person remains the same despite the changes that time produces in their body, but it does not follow (1) that their body is not part of their personal identity, or (2) that this personal identity can be reduced to a mere pattern or functional structure. The changes that come with age do not give individuals another body. They have the same body (in the sense that another has not replaced it), even if it is a different body from that of their youth (in the sense that it has changed in appearance, components, and capacities). If personal identity is maintained through changes in the body, it is, among other reasons, precisely because these changes are experienced as changes in one's own body, not as a change of body. It must be recognized, however, that possessing the same body throughout life does not exclude the possibility of a change of personal identity caused, for example, by severe mental disorders. In other words, the body alone is not enough to guarantee identity. Neither does the mind seem to be enough to guarantee it.6

Unless a radical mind-body dualism is assumed, it is not easy to accept that we would remain the same person, that is, that our personal identity would be preserved, if our mind were to leave our body or if we were to exchange our body for another. A human without a biological body would lose its animal condition and could no longer be human (García-Barranquero 2021; Pigliucci 2014). Nor would it be human if the body it adopted were not human. Rather, as Hilary Putnam (1981) points out, it seems naïve to conceive of the mind:

As a sort of ghost, capable of inhabiting different bodies (but without change in the way it thinks, feels, remembers, and exhibit personality, judging from the spate of popular books about reincarnation and 'remembering previous lives') or even capable of existing without a body (and continuing to think, feel, remember, and exhibit personality) (Putnam 1981, p. 77).

A fundamental problem in fruitfully elucidating this question is that it remains difficult to answer. Not only is it unclear what a future merger with machines would look like; there is also no agreement on what personal identity consists of or on what the criteria for its permanence over time are. To speak of MU is to speak of transferring the mind from a human brain to a machine. Of course, something like this would imply an exact copy of our brain, of its entire connectome, but also of the specific details of each synapse, as well as the implementation of this copy through the appropriate procedures. This would therefore open the possibility, at least in principle, of making more than one copy of our mind. Thus, as Patrick Hopkins (2012, p. 237) says, to admit that MU can work, it would have to be shown that the act of uploading a mind is very different from the act of copying an object, because, if the upload of my mind is like making a copy of an object, there would not exist an identity, in the strict sense, between my mind and its copy. The impossibility of accepting that a copy of me can be myself does not stem from the body-identity thesis but from the very concept of identity. As Kant pointed out in his reply to Leibniz's principle of indiscernibles, it is enough that two things are in different places for them not to be considered indiscernible or identical. If it is a question of two things, they are no longer identical; identity can only be of a thing with itself.

6. Antonio Diéguez (2022) and Susan Schneider (2019) offer a more detailed analysis of this issue.

A copy can be very similar to the original but, strictly speaking, it cannot be identical to it; that is to say, it cannot be simultaneously the original and the copy (see Hauskeller 2012). Moreover, the pattern-identity thesis leads to absurd situations. The hypothetical case elaborated by Brook and Stainton is a good example, but others can be imagined. Suppose, for illustration, that someone makes a copy of their mind to be uploaded to machine A, but then, for security reasons, makes another copy and saves it to machine B. Are the two copies, A and B, the same individual, the same person? It does not seem that they can be, especially if we consider that they are two different entities, and one can be destroyed without the other being destroyed. But if we accept the pattern-identity thesis, we should consider them the same person, at least in the first instants, before their histories begin to diverge. Furthermore, my mind is conscious of itself, but nothing guarantees that a copy of my mind would also be conscious of itself. And even supposing it were, the consciousness that the copy would have of itself would not be the consciousness that I have of myself. If my consciousness ceases with death, the fact that a self-conscious copy of my mind remains in operation does nothing to remedy that disappearance.

It could be objected that, although token identity would be lost in this copying process, type identity would not. According to Mark Walker (2014), this distinction could allow us to differentiate those who defend the psychological theory of identity from those who defend the somatic (physical) theory. Suppose I have the text of Cervantes' Rinconete y Cortadillo on my computer and I print a hard copy. The next day a clumsy movement spills coffee on it, so I burn it in the fireplace and print a new copy. That new copy is not the same as the previous one, in the sense that they are different, physically independent copies (tokens) of the same text. Nevertheless, the work's identity (as type) has been maintained in the process: the new copy remains the text of Rinconete y Cortadillo. If the psychological theory of the permanence of personal identity were correct, we could say that a copy of my mind in a machine would not be identical to me in the sense of token identity (it would not maintain material or bodily continuity), but it would be identical in the sense of type identity (a new token of my mind understood as a type). The question is whether that type identity retains enough of our personal identity to make MU desirable. This is hard to accept. Such type identity presupposes that the mind is a pattern of information that can be stored or transported in different ways, an "organizational invariant" in Chalmers' (2010) sense. And that is precisely what is in question in this whole debate.

Moreover, we must remember that these transhumanist ideas are based on a rejection of the psychophysical or mind-brain identity theory. Although functionalism still enjoys popularity, it has received strong criticisms, and there has been a certain revival of strong physicalism, which defends precisely the mind-brain identity (type-type identity). For such physicalists, the mind is nothing other than the psychophysical processes that produce it; mental states are biological states of the brain. Such a position demolishes in one fell swoop all the transhumanists' claims about MU: you cannot upload a mind into a computer, for the simple reason that the mind would be the brain at work, not software that the brain is running.

The very analogy of the mind as software that different hardware can run is misleading to many, because it presupposes a computational theory of mind and a rigid view of the brain that not all authors, strong physicalists or not, are willing to accept (Fitch 2014).

It is possible, however, to maintain positions in favor of MU even while accepting the pessimistic conclusion about the maintenance of personal identity that we have reached.7 This would be, for example, the position of Derek Parfit (1984), for whom what matters is only survival, regardless of identity. Taking a further philosophical step, one might also think that, even if MU did not preserve personal identity, it would still be desirable for many people if the process generated a personal entity that, at least in some central aspects, maintained some continuity with the original subject or preserved some important things about it. This would be the case, for example, if the copy were capable of fulfilling roles similar to those the original subject fulfilled for their loved ones, or of retaining some of the subject's valuable qualities. This possibility is discussed in detail by Joseph Corabi and Susan Schneider (2014), but it is not very clear, in our opinion, that MU would, for these reasons, gain much real acceptance at the end of the day. Note that the loss of identity is assumed, so the resulting person would not be the one who originally had the qualities that are preserved. However, this could only be known with certainty through empirical studies once such a thing became possible. Finally, there will be those who, even accepting that MU implies the loss of personal identity, would be willing to risk it, because they judge that the identity that should characterize them is the future personal identity that their existence in the machine can provide. Such a person will believe that their true identity is the one yet to come, once their existence merges with the machine. It will be this new identity that constitutes them as the individual they really want to be, even at the cost of the disappearance of the one they are now, whom they will probably see as an inauthentic entity in the transition towards their true fullness. This is also a possibility that cannot be ruled out, but it remains the closest thing to death, however much some famous transhumanists may defend it.

7. For an empirical analysis of the reasons why people may decide to upload, see Laakasuo et al. (2021).

12.5 Conclusions

The future we are going to create will depend, to a large extent, on whether we are capable of managing the development of AI well. We must not leave the most critical decisions in the hands of these AI systems, which are beginning to be known by the more appropriate name of "automated decision systems" (ADS), for doing so would be tantamount to leaving those decisions in the hands of those who run the companies dedicated to their creation. It is therefore essential to develop the governance of AI.

However, as one might expect to infer from this chapter, the importance of such governance lies not so much in the need to avoid the worst dystopian scenarios of a future AGSI (although this issue should be addressed) as in the urgency of avoiding the harmful ways in which AI is already being used and the undesirable uses that are realistically foreseeable in the near future. We pay perhaps too much attention to the problems posed by the possibility of superintelligent machines eventually taking over, and we tend to lose sight of the essentials. There are, for the moment, much more pressing risks, such as the growing pressure to cede our privacy and decision-making power, not so much to the machines as to the owners of the machines (Véliz 2020). There is one point on which Bostrom (2014, p. 231) agrees with Marcus and Davis (2019, p. 199), two critics we have already quoted: we need time to think and to develop trustworthy AI. Only by strengthening the governance of this technology can we have a chance of preventing the announced threats from being fulfilled. For that very reason, this task should be incorporated into the agenda of all political parties. It is time to get serious about regulatory and legislative work.

The development of particular or narrow AI, which is the only kind we know so far, poses political, ethical, social, and anthropological risks: (1) who controls the information; (2) how its biased decisions affect humans; (3) how it will be used in intelligent weapons; (4) what image of the human being we forge in contrast to it or as an adaptation to it; and so on. Nevertheless, this narrow AI raises no obvious existential risks. The creation (if ever feasible) of an AGSI, on the other hand, would. Various analysts (Bostrom 2014; Russell 2020; Tegmark 2017) have argued that it would be virtually impossible to control it, and that without control our interests and even our survival as a species would be severely threatened. Other specialists, such as Marcus and Davis (2019) or Larson (2021), believe this is a rather unlikely possibility. We do not know if AGSI is possible, although no one has proved that it is impossible. We do not know if AGSI is feasible; perhaps it is not, as Floridi (2016) suggests. We do not know if AGSI would acquire will and consciousness; it may not, and, as far as we know, there seems to be little chance that it will. We do not know if AGSI would destroy us. In any case, it is crucial to think about these questions, since the danger at stake is significant.

Regarding the possibility of MU as a means to achieve (true) immortality: despite the inevitably speculative nature of any reflection on whether personal identity could be maintained after the transfer of a mind into a machine, it can be argued that such maintenance is highly problematic when analyzed from the main theoretical perspectives that can be adopted on personal identity. In short, even if superintelligent machines come into existence and lead to the Singularity, the hypothetical transfer of our minds to such machines would not represent a viable alternative for achieving fusion with them.

References

Agar, N. 2010. Humanity's end: Why we should reject radical enhancement. Cambridge, MA: The MIT Press.

———. 2016. Don't worry about superintelligence. Journal of Evolution and Technology 26 (1): 73–82.
———. 2019. How to be human in the digital economy. Cambridge, MA: The MIT Press.
Alfonseca, M., et al. 2021. Superintelligence cannot be contained: Lessons from computability theory. Journal of Artificial Intelligence Research 70: 65–67.
Bamford, S., and J. Danaher. 2017. Transfer of personality to a synthetic human ('mind uploading') and the social construction of identity. Journal of Consciousness Studies 24 (11–12): 6–30.
Bostrom, N. 2002. Existential risks: Analyzing human extinction scenarios and related hazards. Journal of Evolution and Technology 9: 1–36.
———. 2014. Superintelligence: Paths, dangers, strategies. Oxford: Oxford University Press.
Braga, A., and R.K. Logan. 2017. The emperor of strong AI has no clothes: Limits to artificial intelligence. Information 8: 156.
Brook, A., and R.J. Stainton. 2000. Knowledge and mind. Cambridge, MA: The MIT Press.
Chalmers, D.J. 1996. The conscious mind. New York: Oxford University Press.
———. 2010. The singularity: A philosophical analysis. Journal of Consciousness Studies 17 (9–10): 7–65.
Corabi, J., and S. Schneider. 2014. If you upload, will you survive? In Intelligence unbound: The future of uploaded and machine minds, ed. R. Blackford and D. Broderick, 131–145. Malden, MA: Wiley Blackwell.
Diéguez, A. 2021. Cuerpos inadecuados: El desafío transhumanista a la filosofía. Barcelona: Herder Editorial.
———. 2022. El volcado de la mente en la máquina y el problema de la identidad personal. Revista de Filosofía (La Plata) 52 (2): e054.
Dreyfus, H. 1992. What computers still can't do: A critique of artificial reason. Cambridge, MA: The MIT Press.
Eden, A.H., et al., eds. 2012. Singularity hypotheses: A scientific and philosophical assessment. Berlin: Springer.
Fitch, W.T. 2014. Toward a computational framework for cognitive biology: Unifying approaches from cognitive neuroscience and comparative cognition. Physics of Life Reviews 11 (3): 329–364.
Floridi, L. 2016. Should we be afraid of AI? Aeon, 9 May. Available on: https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible. Accessed 1 Apr 2023.
Fodor, J.A. 2000. The mind doesn't work that way: The scope and limits of computational psychology. Cambridge, MA: The MIT Press.
García-Barranquero, P. 2021. Transhumanist immortality: Understanding the dream as a nightmare. Scientia et Fides 9 (1): 177–196.
Good, I.J. 1966. Speculations concerning the first ultraintelligent machine. Advances in Computers 6: 31–88.
Grosz, B., et al. 2016. Artificial intelligence and life in 2030: One-hundred-year study on artificial intelligence. Stanford University. Available on: https://ai100.stanford.edu/2016-report. Accessed 1 Apr 2023.
Häggström, O. 2021. Aspects of mind-uploading. In Transhumanism: Proper guide to a posthuman condition or a dangerous idea?, ed. W. Hofkirchner and H.-J. Kreowski, 3–20. Cham: Springer.
Hauskeller, M. 2012. My brain, my mind, and I: Some philosophical assumptions of mind-uploading. International Journal of Machine Consciousness 4 (1): 187–200.
Hopkins, P.D. 2012. Why uploading will not work, or, the ghosts haunting transhumanism. International Journal of Machine Consciousness 4 (1): 229–243.
Julia, L. 2019. L'intelligence artificielle n'existe pas. Paris: First Editions.
Kuhn, T. 1970. The structure of scientific revolutions. Chicago: University of Chicago Press.
Kurzweil, R. 2005. The singularity is near: When humans transcend biology. New York: Viking.
Laakasuo, M., et al. 2021. The dark path to eternal life: Machiavellianism predicts approval of mind upload technology. Personality and Individual Differences 177: 110731.

Larson, E. 2021. The myth of artificial intelligence: Why computers can't think the way we do. Cambridge, MA: The Belknap Press.
Marcus, G. 2022. The new science of alt intelligence: AI has lost its way, let's take a step back. The Road to AI We Can Trust (blog). Available on: https://garymarcus.substack.com/p/the-new-science-of-alt-intelligence?s=r. Accessed 1 Apr 2023.
Marcus, G., and E. Davis. 2019. Rebooting AI: Building artificial intelligence we can trust. New York: Pantheon.
Minsky, M. 1958. Society of mind. New York: Simon and Schuster.
Modis, T. 2006. The singularity myth. Technological Forecasting and Social Change 73 (2): 104–112.
———. 2012. Why the singularity cannot happen. In Singularity hypotheses: A scientific and philosophical assessment, ed. A.H. Eden, J.H. Moor, J.H. Soraker, and E. Steinhart, 311–340. Berlin: Springer.
Moravec, H. 1988. Mind children: The future of robot and human intelligence. Cambridge, MA: Harvard University Press.
Müller, V.C., and N. Bostrom. 2016. Future progress in artificial intelligence: A survey of expert opinion. In Fundamental issues of artificial intelligence, ed. V.C. Müller, 553–571. Berlin: Springer.
Parfit, D. 1984. Reasons and persons. Oxford: Oxford University Press.
Penrose, R. 1989. The emperor's new mind. Oxford: Oxford University Press.
Pigliucci, M. 2014. Mind uploading: A philosophical counter-analysis. In Intelligence unbound: The future of uploaded and machine minds, ed. R. Blackford and D. Broderick, 119–130. Oxford: Wiley-Blackwell.
Popper, K. 1957. The poverty of historicism. London: Routledge.
Putnam, H. 1981. Reason, truth and history. Cambridge: Cambridge University Press.
Russell, S. 2020. Human compatible: AI and the problem of control. London: Penguin Books.
Sandberg, A. 2013. An overview of models of technological singularity. In The transhumanist reader: Classical and contemporary essays on the science, technology, and philosophy of the human future, ed. M. More and N. Vita-More, 376–394. Oxford: Wiley.
Sandberg, A., and N. Bostrom. 2008. Whole brain emulation: A roadmap. Technical report. Future of Humanity Institute, Oxford University. Available on: www.fhi.ox.ac.uk/reports/2008-3.pdf. Accessed 1 Apr 2023.
Scaruffi, P. 2013. Demystifying machine intelligence. Omniware Publishing. Available on: http://www.scaruffi.com/singular/download.pdf. Accessed 1 Apr 2023.
Schneider, S. 2019. Artificial you: AI and the future of your mind. Princeton: Princeton University Press.
Searle, J. 1984. Minds, brains and science. Cambridge, MA: Harvard University Press.
Stein-Perlman, Z., et al. 2022. 2022 expert survey on progress in AI. AI Impacts, 3 Aug. Available on: https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/. Accessed 1 Apr 2023.
Tegmark, M. 2017. Life 3.0: Being human in the age of artificial intelligence. New York: Alfred A. Knopf.
Véliz, C. 2020. Privacy is power: Why and how you should take back control of your data. London: Penguin Books.
Vinge, V. 1993. The coming technological singularity: How to survive in the post-human era. Presented at the VISION-21 Symposium, March 30–31.
Waldrop, M. 2016. More than Moore. Nature 530 (7589): 144–147.
Walker, M. 2014. Personal identity and uploading. In Intelligence unbound: The future of uploaded and machine minds, ed. R. Blackford and D. Broderick, 161–177. Oxford: Wiley Blackwell.
Yampolskiy, R.V. 2016. Artificial superintelligence: A futuristic approach. Boca Raton, FL: CRC Press.